public class Bagging extends RandomizableParallelIteratedSingleClassifierEnhancer implements WeightedInstancesHandler, AdditionalMeasureProducer, TechnicalInformationHandler, PartitionGenerator, Aggregateable<Bagging>
@article{Breiman1996,
   author = {Leo Breiman},
   journal = {Machine Learning},
   number = {2},
   pages = {123-140},
   title = {Bagging predictors},
   volume = {24},
   year = {1996}
}

Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-print Print the individual classifiers in the output
-store-out-of-bag-predictions Whether to store out of bag predictions in internal evaluation object.
-output-out-of-bag-complexity-statistics Whether to output complexity-based statistics when out-of-bag evaluation is performed.
-represent-copies-using-weights Represent copies of instances using weights rather than explicitly.
-S <num> Random number seed. (default 1)
-num-slots <num> Number of execution slots. (default 1 - i.e. no parallelism)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
-I Initial class value count (default 0)
-R Spread initial count over all class values (i.e. don't use 1 per value)

Options after -- are passed to the designated classifier.
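The `--` convention can be illustrated with a small helper that separates the ensemble's options from those meant for the base classifier. This is a minimal sketch and not Weka's own option parser; the class `OptionSplitSketch` and its `split` method are hypothetical names introduced here, and only the option flags themselves come from the list above.

```java
import java.util.Arrays;

final class OptionSplitSketch {

    /** Returns {ensembleOptions, baseClassifierOptions}, split at the first "--". */
    static String[][] split(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        if (sep < 0) {
            // No separator: every option belongs to the ensemble.
            return new String[][] { args.clone(), new String[0] };
        }
        String[] ensemble = Arrays.copyOfRange(args, 0, sep);
        String[] base = Arrays.copyOfRange(args, sep + 1, args.length);
        return new String[][] { ensemble, base };
    }

    public static void main(String[] argv) {
        String[] args = {
            "-P", "80", "-I", "25", "-S", "7",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "5", "-L", "4"
        };
        String[][] parts = split(args);
        System.out.println("Bagging options: " + String.join(" ", parts[0]));
        System.out.println("REPTree options: " + String.join(" ", parts[1]));
    }
}
```

Note that `-P`, `-S` and `-I` mean different things on each side of the separator, which is why the position relative to `--` matters.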
BATCH_SIZE_DEFAULT, NUM_DECIMAL_PLACES_DEFAULT
Constructor and Description |
---|
Bagging(): Constructor. |
Modifier and Type | Method and Description |
---|---|
Bagging | aggregate(Bagging toAggregate): Aggregate an object with this one |
java.lang.String | bagSizePercentTipText(): Returns the tip text for this property |
java.lang.String | batchSizeTipText(): Tool tip text for this property |
void | buildClassifier(Instances data): Bagging method. |
java.lang.String | calcOutOfBagTipText(): Returns the tip text for this property |
double[] | distributionForInstance(Instance instance): Calculates the class membership probabilities for the given test instance. |
double[][] | distributionsForInstances(Instances insts): Batch scoring method. |
java.util.Enumeration<java.lang.String> | enumerateMeasures(): Returns an enumeration of the additional measure names. |
void | finalizeAggregation(): Call to complete the aggregation process. |
void | generatePartition(Instances data): Builds the classifier to generate a partition. |
int | getBagSizePercent(): Gets the size of each bag, as a percentage of the training set size. |
java.lang.String | getBatchSize(): Gets the preferred batch size from the base learner if it implements BatchPredictor. |
boolean | getCalcOutOfBag(): Get whether the out of bag error is calculated. |
double | getMeasure(java.lang.String additionalMeasureName): Returns the value of the named measure. |
double[] | getMembershipValues(Instance inst): Computes an array that indicates leaf membership |
java.lang.String[] | getOptions(): Gets the current settings of the Classifier. |
Evaluation | getOutOfBagEvaluationObject(): Returns the out-of-bag evaluation object. |
boolean | getOutputOutOfBagComplexityStatistics(): Gets whether complexity statistics are output when OOB estimation is performed. |
boolean | getPrintClassifiers(): Get whether to print the individual ensemble classifiers in the output |
boolean | getRepresentCopiesUsingWeights(): Get whether copies of instances are represented using weights rather than explicitly. |
java.lang.String | getRevision(): Returns the revision string. |
boolean | getStoreOutOfBagPredictions(): Get whether the out of bag predictions are stored. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
java.lang.String | globalInfo(): Returns a string describing classifier |
boolean | implementsMoreEfficientBatchPrediction(): Returns true if the base classifier implements BatchPredictor and is able to generate batch predictions efficiently |
java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
double | measureOutOfBagError(): Gets the out of bag error that was calculated as the classifier was built. |
int | numElements(): Returns the number of elements in the partition. |
java.lang.String | outputOutOfBagComplexityStatisticsTipText(): Returns the tip text for this property |
java.lang.String | printClassifiersTipText(): Returns the tip text for this property |
java.lang.String | representCopiesUsingWeightsTipText(): Returns the tip text for this property |
void | setBagSizePercent(int newBagSizePercent): Sets the size of each bag, as a percentage of the training set size. |
void | setBatchSize(java.lang.String size): Set the batch size to use. |
void | setCalcOutOfBag(boolean calcOutOfBag): Set whether the out of bag error is calculated. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setOutputOutOfBagComplexityStatistics(boolean b): Sets whether complexity statistics are output when OOB estimation is performed. |
void | setPrintClassifiers(boolean print): Set whether to print the individual ensemble classifiers in the output |
void | setRepresentCopiesUsingWeights(boolean representUsingWeights): Set whether copies of instances are represented using weights rather than explicitly. |
void | setStoreOutOfBagPredictions(boolean storeOutOfBag): Set whether the out of bag predictions are stored. |
java.lang.String | storeOutOfBagPredictionsTipText(): Returns the tip text for this property |
java.lang.String | toString(): Returns description of the bagged classifier. |
getSeed, seedTipText, setSeed
getNumExecutionSlots, numExecutionSlotsTipText, setNumExecutionSlots
getNumIterations, numIterationsTipText, setNumIterations
classifierTipText, getCapabilities, getClassifier, postExecution, preExecution, setClassifier
classifyInstance, debugTipText, doNotCheckCapabilitiesTipText, forName, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
getCapabilities
makeCopy
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface TechnicalInformationHandler
public java.util.Enumeration<Option> listOptions()
listOptions
in interface OptionHandler
listOptions
in class RandomizableParallelIteratedSingleClassifierEnhancer
public void setOptions(java.lang.String[] options) throws java.lang.Exception
setOptions
in interface OptionHandler
setOptions
in class RandomizableParallelIteratedSingleClassifierEnhancer
options
- the list of options as an array of strings
java.lang.Exception
- if an option is not supported
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class RandomizableParallelIteratedSingleClassifierEnhancer
public java.lang.String bagSizePercentTipText()
public int getBagSizePercent()
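The bag size percentage determines how many instances each bag draws from the training set. The following is a minimal sketch of that idea, assuming plain sampling with replacement and truncating integer arithmetic; it is not taken from Weka's source, and `BagSizeSketch` and `sampleBag` are hypothetical names.

```java
import java.util.Random;

final class BagSizeSketch {

    /** Draws row indices with replacement; the bag holds bagSizePercent% of numInstances. */
    static int[] sampleBag(int numInstances, int bagSizePercent, long seed) {
        if (numInstances <= 0 || bagSizePercent <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Assumption: truncating integer arithmetic; the library's rounding may differ.
        int bagSize = (int) ((long) numInstances * bagSizePercent / 100);
        Random random = new Random(seed);
        int[] bag = new int[bagSize];
        for (int i = 0; i < bagSize; i++) {
            bag[i] = random.nextInt(numInstances); // the same index may be drawn repeatedly
        }
        return bag;
    }

    public static void main(String[] argv) {
        int[] bag = sampleBag(200, 50, 1L);
        System.out.println("Bag holds " + bag.length + " of 200 instances");
    }
}
```

Fixing the seed makes the drawn bag reproducible, which is the role the `-S` option plays for the ensemble.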
public void setBagSizePercent(int newBagSizePercent)
newBagSizePercent
- the bag size, as a percentage.
public java.lang.String representCopiesUsingWeightsTipText()
public void setRepresentCopiesUsingWeights(boolean representUsingWeights)
representUsingWeights
- whether to represent copies using weights
public boolean getRepresentCopiesUsingWeights()
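Representing copies using weights means that an instance drawn several times into a bag appears once with a larger weight instead of being duplicated. A minimal sketch of that conversion follows, assuming each draw contributes a weight of 1; how Weka handles original instance weights or zero-weight instances is not stated here, and `WeightedBagSketch` is a hypothetical name.

```java
final class WeightedBagSketch {

    /** Converts a bag of drawn row indices into one weight per training instance. */
    static double[] toWeights(int[] bag, int numInstances) {
        double[] weights = new double[numInstances];
        for (int index : bag) {
            weights[index] += 1.0; // each repeated draw adds to the same instance
        }
        return weights;
    }

    public static void main(String[] argv) {
        double[] weights = toWeights(new int[] {0, 2, 2, 3}, 5);
        System.out.println(java.util.Arrays.toString(weights));
    }
}
```

Instances left with a weight of zero were never drawn, so they are the out-of-bag instances for that bag.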
public java.lang.String storeOutOfBagPredictionsTipText()
public void setStoreOutOfBagPredictions(boolean storeOutOfBag)
storeOutOfBag
- whether the out of bag predictions are stored
public boolean getStoreOutOfBagPredictions()
public java.lang.String calcOutOfBagTipText()
public void setCalcOutOfBag(boolean calcOutOfBag)
calcOutOfBag
- whether to calculate the out of bag error
public boolean getCalcOutOfBag()
public java.lang.String outputOutOfBagComplexityStatisticsTipText()
public boolean getOutputOutOfBagComplexityStatistics()
public void setOutputOutOfBagComplexityStatistics(boolean b)
b
- whether statistics are calculated
public java.lang.String printClassifiersTipText()
public void setPrintClassifiers(boolean print)
print
- true if the individual classifiers are to be printed
public boolean getPrintClassifiers()
public double measureOutOfBagError()
public java.util.Enumeration<java.lang.String> enumerateMeasures()
enumerateMeasures
in interface AdditionalMeasureProducer
public double getMeasure(java.lang.String additionalMeasureName)
getMeasure
in interface AdditionalMeasureProducer
additionalMeasureName
- the name of the measure to query for its value
java.lang.IllegalArgumentException
- if the named measure is not supported
public Evaluation getOutOfBagEvaluationObject()
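The out-of-bag measures above rest on a simple idea: each training instance is scored only by the ensemble members whose bag did not contain it. The sketch below shows the textbook estimate for a nominal class using majority voting. It is an illustration under that assumption, not Weka's implementation (which may combine members differently and handles numeric classes too); `OutOfBagSketch` and its parameters are hypothetical.

```java
final class OutOfBagSketch {

    /**
     * inBag[m][i] is true if member m trained on instance i;
     * predictions[m][i] is the class index member m predicts for instance i.
     */
    static double outOfBagError(boolean[][] inBag, int[][] predictions,
                                int[] actual, int numClasses) {
        int evaluated = 0;
        int wrong = 0;
        for (int i = 0; i < actual.length; i++) {
            int[] votes = new int[numClasses];
            int voters = 0;
            for (int m = 0; m < inBag.length; m++) {
                if (!inBag[m][i]) { // only members that never saw instance i may vote
                    votes[predictions[m][i]]++;
                    voters++;
                }
            }
            if (voters == 0) {
                continue; // instance was in every bag: no out-of-bag estimate for it
            }
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) {
                    best = c; // ties resolve to the lowest class index
                }
            }
            evaluated++;
            if (best != actual[i]) {
                wrong++;
            }
        }
        return evaluated == 0 ? Double.NaN : (double) wrong / evaluated;
    }
}
```

The estimate needs no separate test set, because every member is evaluated only on data it did not train on.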
public void buildClassifier(Instances data) throws java.lang.Exception
buildClassifier
in interface Classifier
buildClassifier
in class ParallelIteratedSingleClassifierEnhancer
data
- the training data to be used for generating the bagged classifier.
java.lang.Exception
- if the classifier could not be built successfully
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
distributionForInstance
in interface Classifier
distributionForInstance
in class AbstractClassifier
instance
- the instance to be classified
java.lang.Exception
- if distribution can't be computed successfully
public java.lang.String batchSizeTipText()
batchSizeTipText
in class AbstractClassifier
public void setBatchSize(java.lang.String size)
setBatchSize
in interface BatchPredictor
setBatchSize
in class AbstractClassifier
size
- the batch size to use
public java.lang.String getBatchSize()
getBatchSize
in interface BatchPredictor
getBatchSize
in class AbstractClassifier
public double[][] distributionsForInstances(Instances insts) throws java.lang.Exception
distributionsForInstances
in interface BatchPredictor
distributionsForInstances
in class AbstractClassifier
insts
- the instances to get predictions for
java.lang.Exception
- if a problem occurs
public boolean implementsMoreEfficientBatchPrediction()
implementsMoreEfficientBatchPrediction
in interface BatchPredictor
implementsMoreEfficientBatchPrediction
in class AbstractClassifier
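The prediction methods above return class membership probabilities for the whole ensemble. A common way for a bagged ensemble to produce them is to average the members' distributions and renormalise. The sketch below assumes that combination rule for a nominal class; it is not taken from this class's source, and `AverageDistributionSketch` is a hypothetical name.

```java
final class AverageDistributionSketch {

    /** Combines one class distribution per ensemble member into a single distribution. */
    static double[] average(double[][] memberDistributions) {
        int numClasses = memberDistributions[0].length;
        double[] combined = new double[numClasses];
        for (double[] dist : memberDistributions) {
            for (int c = 0; c < numClasses; c++) {
                combined[c] += dist[c];
            }
        }
        double total = 0.0;
        for (double value : combined) {
            total += value;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                combined[c] /= total; // renormalise so the probabilities sum to 1
            }
        }
        return combined;
    }
}
```

For two members predicting {0.8, 0.2} and {0.4, 0.6}, the combined distribution is approximately {0.6, 0.4}.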
public java.lang.String toString()
toString
in class java.lang.Object
public void generatePartition(Instances data) throws java.lang.Exception
generatePartition
in interface PartitionGenerator
java.lang.Exception
public double[] getMembershipValues(Instance inst) throws java.lang.Exception
getMembershipValues
in interface PartitionGenerator
java.lang.Exception
public int numElements() throws java.lang.Exception
numElements
in interface PartitionGenerator
java.lang.Exception
public java.lang.String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class AbstractClassifier
public static void main(java.lang.String[] argv)
argv
- the options
public Bagging aggregate(Bagging toAggregate) throws java.lang.Exception
aggregate
in interface Aggregateable<Bagging>
toAggregate
- the object to aggregate
java.lang.Exception
- if the supplied object can't be aggregated for some reason
public void finalizeAggregation() throws java.lang.Exception
finalizeAggregation
in interface Aggregateable<Bagging>
java.lang.Exception
- if the aggregation can't be finalized for some reason
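The `aggregate` and `finalizeAggregation` pair suggests a two-phase pattern: merge any number of separately built ensembles, then signal that merging is complete. The sketch below mirrors only that call order. It assumes aggregation pools the member classifiers and that merging after finalization is rejected, neither of which is stated above; `EnsembleSketch` is a hypothetical stand-in that stores member names instead of classifiers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EnsembleSketch {

    private final List<String> members = new ArrayList<>();
    private boolean finalized = false;

    EnsembleSketch(String... initialMembers) {
        members.addAll(Arrays.asList(initialMembers));
    }

    /** Pools the other ensemble's members into this one and returns this ensemble. */
    EnsembleSketch aggregate(EnsembleSketch other) {
        if (finalized) {
            // Assumption of this sketch: no merging once aggregation is complete.
            throw new IllegalStateException("aggregation already finalized");
        }
        members.addAll(other.members);
        return this;
    }

    /** Marks the merged ensemble as complete. */
    void finalizeAggregation() {
        finalized = true;
    }

    int numMembers() {
        return members.size();
    }
}
```

A typical call sequence would be `first.aggregate(second)`, repeated for each partial ensemble, followed by a single `first.finalizeAggregation()`.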