public class BVDecomposeSegCVSub extends java.lang.Object implements OptionHandler, TechnicalInformationHandler, RevisionHandler
@misc{Webb2002,
   address = {School of Computer Science and Software Engineering, Victoria, Australia},
   author = {Geoffrey I. Webb and Paul Conilione},
   institution = {Monash University},
   title = {Estimating bias and variance from data},
   year = {2002},
   PDF = {http://www.csse.monash.edu.au/\~webb/Files/WebbConilione04.pdf}
}

@inproceedings{Kohavi1996,
   author = {Ron Kohavi and David H. Wolpert},
   booktitle = {Machine Learning: Proceedings of the Thirteenth International Conference},
   editor = {Lorenza Saitta},
   pages = {275-283},
   publisher = {Morgan Kaufmann},
   title = {Bias Plus Variance Decomposition for Zero-One Loss Functions},
   year = {1996},
   PS = {http://robotics.stanford.edu/\~ronnyk/biasVar.ps}
}

@article{Webb2000,
   author = {Geoffrey I. Webb},
   journal = {Machine Learning},
   number = {2},
   pages = {159-196},
   title = {MultiBoosting: A Technique for Combining Boosting and Wagging},
   volume = {40},
   year = {2000}
}

Valid options are:
-c <class index> The index of the class attribute. (default last)
-D Turn on debugging output.
-l <num> The number of times each instance is classified. (default 10)
-p <proportion of objects in common> The average proportion of instances common between any two training sets
-s <seed> The random number seed used.
-t <name of arff file> The name of the arff file used for the decomposition.
-T <number of instances in training set> The number of instances in the training set.
-W <classifier class name> Full class name of the learner used in the decomposition, e.g. weka.classifiers.bayes.NaiveBayes
Options specific to learner weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated sub-learner.
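The role of the `--` separator can be illustrated with a small, self-contained sketch. It does not call Weka; the class `OptionSplitSketch`, its two helper methods, and the file name `data.arff` are hypothetical names introduced only for this illustration. The sketch assumes that everything before the first `--` belongs to the decomposition itself and everything after it is handed to the sub-learner.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: shows how an argument array
// splits at the first "--" separator.
class OptionSplitSketch {

    /** Returns the options that appear before the first "--" (all of them if there is none). */
    static String[] mainOptions(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        return sep < 0 ? args.clone() : Arrays.copyOfRange(args, 0, sep);
    }

    /** Returns the options that appear after the first "--" (empty if there is none). */
    static String[] subLearnerOptions(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        return sep < 0 ? new String[0] : Arrays.copyOfRange(args, sep + 1, args.length);
    }

    public static void main(String[] argv) {
        String[] args = {
            "-t", "data.arff",                     // hypothetical ARFF file name
            "-W", "weka.classifiers.rules.ZeroR",  // learner used in the decomposition
            "-l", "10",                            // classify each instance 10 times
            "--", "-D"                             // this -D is meant for the sub-learner
        };
        System.out.println(Arrays.toString(mainOptions(args)));
        System.out.println(Arrays.toString(subLearnerOptions(args)));
    }
}
```

In this sketch a `-D` placed before `--` would switch on debugging output for the decomposition, while the `-D` placed after it is only seen by the sub-learner.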
Constructor and Description |
---|
BVDecomposeSegCVSub() |
Modifier and Type | Method and Description |
---|---|
void | decompose() - Carry out the bias-variance decomposition using the sub-sampled cross-validation method. |
java.util.Vector<java.lang.Integer> | findCentralTendencies(double[] predProbs) - Finds the central tendency, given the classifications for an instance. |
Classifier | getClassifier() - Gets the classifier being analysed. |
int | getClassifyIterations() - Gets the number of times an instance is classified. |
int | getClassIndex() - Get the index (starting from 1) of the attribute used as the class. |
java.lang.String | getDataFileName() - Get the name of the data file used for the decomposition. |
boolean | getDebug() - Gets whether debugging is turned on. |
double | getError() - Get the calculated error rate. |
double | getKWBias() - Get the calculated bias squared according to the Kohavi and Wolpert definition. |
double | getKWSigma() - Get the calculated sigma according to the Kohavi and Wolpert definition. |
double | getKWVariance() - Get the calculated variance according to the Kohavi and Wolpert definition. |
java.lang.String[] | getOptions() - Gets the current settings of this BVDecomposeSegCVSub. |
double | getP() - Get the proportion of instances that are common between two training sets. |
java.lang.String | getRevision() - Returns the revision string. |
int | getSeed() - Gets the random number seed. |
TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
int | getTrainSize() - Get the training size. |
double | getWBias() - Get the calculated bias according to the Webb definition. |
double | getWVariance() - Get the calculated variance according to the Webb definition. |
java.lang.String | globalInfo() - Returns a string describing this object. |
java.util.Enumeration<Option> | listOptions() - Returns an enumeration describing the available options. |
static void | main(java.lang.String[] args) - Test method for this class. |
void | randomize(int[] index, java.util.Random random) - Accepts an array of ints and randomises the order of the values in the array, using the supplied Random. |
void | setClassifier(Classifier newClassifier) - Sets the classifier being analysed. |
void | setClassifyIterations(int classifyIterations) - Sets the number of times an instance is classified. |
void | setClassIndex(int classIndex) - Sets the index (starting from 1) of the attribute used as the class. |
void | setDataFileName(java.lang.String dataFileName) - Sets the name of the dataset file. |
void | setDebug(boolean debug) - Sets debugging mode. |
void | setOptions(java.lang.String[] options) - Sets the OptionHandler's options using the given list. |
void | setP(double proportion) - Set the proportion of instances that are common between two training sets used to train a classifier. |
void | setSeed(int seed) - Sets the random number seed. |
void | setTrainSize(int size) - Set the training size. |
java.lang.String | toString() - Returns description of the bias-variance decomposition results. |
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface OptionHandler: makeCopy
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
public java.util.Enumeration<Option> listOptions()
Specified by:
listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-c <class index> The index of the class attribute. (default last)
-D Turn on debugging output.
-l <num> The number of times each instance is classified. (default 10)
-p <proportion of objects in common> The average proportion of instances common between any two training sets
-s <seed> The random number seed used.
-t <name of arff file> The name of the arff file used for the decomposition.
-T <number of instances in training set> The number of instances in the training set.
-W <classifier class name> Full class name of the learner used in the decomposition, e.g. weka.classifiers.bayes.NaiveBayes
Options specific to learner weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by:
getOptions in interface OptionHandler
public void setClassifier(Classifier newClassifier)
Parameters:
newClassifier - the Classifier to use.
public Classifier getClassifier()
public void setDebug(boolean debug)
Parameters:
debug - true if debug output should be printed
public boolean getDebug()
public void setSeed(int seed)
Parameters:
seed - the random number seed
public int getSeed()
public void setClassifyIterations(int classifyIterations)
Parameters:
classifyIterations - number of times an instance is classified
public int getClassifyIterations()
public void setDataFileName(java.lang.String dataFileName)
Parameters:
dataFileName - name of dataset file.
public java.lang.String getDataFileName()
public int getClassIndex()
public void setClassIndex(int classIndex)
Parameters:
classIndex - the index (starting from 1) of the class attribute
public double getKWBias()
public double getWBias()
public double getKWVariance()
public double getWVariance()
public double getKWSigma()
public void setTrainSize(int size)
Parameters:
size - the size of the training set
public int getTrainSize()
public void setP(double proportion)
Parameters:
proportion - the proportion of instances that are common between training sets.
public double getP()
public double getError()
public void decompose() throws java.lang.Exception
Throws:
java.lang.Exception - if the decomposition couldn't be carried out
public java.util.Vector<java.lang.Integer> findCentralTendencies(double[] predProbs)
For example, suppose instance 'x' can be assigned to one of 3 classes, y = {1, 2, 3}, and is classified 10 times with the following results: '1' = 2 times, '2' = 5 times and '3' = 3 times. The central tendency is then '2'.
Note, however, that this method returns a list of all classes that have the highest number of classifications. If several classes tie for the largest number of classifications, all of them are returned. For example, if 'x' is classified '1' = 4 times, '2' = 4 times and '3' = 2 times, then '1' and '2' are returned.
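The tie-handling rule described above can be sketched in plain Java. This is an illustration only, not the Weka source: the class `CentralTendencySketch` and its method are hypothetical, the sketch assumes that `predProbs[i]` holds the count (or proportion) of classifications for class `i`, and it returns 0-based array indices rather than the 1-based class labels used in the prose example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the Weka implementation.
class CentralTendencySketch {

    /**
     * Returns the indices of all entries that share the largest value.
     * Assumption: predProbs[i] is how often class i was predicted;
     * indices are 0-based. Exact comparison is adequate for whole counts.
     */
    static List<Integer> centralTendencies(double[] predProbs) {
        List<Integer> modes = new ArrayList<>();
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < predProbs.length; i++) {
            if (predProbs[i] > best) {
                // New maximum: discard earlier candidates.
                best = predProbs[i];
                modes.clear();
                modes.add(i);
            } else if (predProbs[i] == best) {
                // Tie with the current maximum: keep both.
                modes.add(i);
            }
        }
        return modes;
    }

    public static void main(String[] args) {
        // Counts from the first example: classes '1', '2', '3' seen 2, 5 and 3 times.
        System.out.println(centralTendencies(new double[] {2, 5, 3}));
        // Counts from the tie example: 4, 4 and 2 times.
        System.out.println(centralTendencies(new double[] {4, 4, 2}));
    }
}
```

With the counts from the two examples, the first call yields the single index 1 (class '2') and the second yields indices 0 and 1 (classes '1' and '2').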
Parameters:
predProbs - the array of classifications for a single instance.
public java.lang.String toString()
Overrides:
toString in class java.lang.Object
public java.lang.String getRevision()
Specified by:
getRevision in interface RevisionHandler
public static void main(java.lang.String[] args)
Parameters:
args - the command line arguments
public final void randomize(int[] index, java.util.Random random)
Parameters:
index - the array of integers to randomise
random - the java.util.Random instance used to randomise the array
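As a rough illustration of what randomising an int array with a supplied `java.util.Random` can look like, here is a minimal Fisher-Yates shuffle. The source does not state which algorithm `randomize` uses, so treat this as one plausible approach under that assumption; the class name `RandomizeSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; the algorithm used by the documented randomize
// method is not stated in the source and may differ.
class RandomizeSketch {

    /** Shuffles index in place with a Fisher-Yates pass driven by random. */
    static void randomize(int[] index, Random random) {
        for (int j = index.length - 1; j > 0; j--) {
            int k = random.nextInt(j + 1); // 0 <= k <= j
            int tmp = index[j];
            index[j] = index[k];
            index[k] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] index = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        randomize(index, new Random(1)); // seed 1 is arbitrary
        System.out.println(Arrays.toString(index));
    }
}
```

Because the permutation depends only on the state of the `Random` passed in, two generators created with the same seed produce the same ordering, which is what makes a seeded run repeatable.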