public class RandomForest extends Bagging
@article{Breiman2001,
  author = {Leo Breiman},
  journal = {Machine Learning},
  number = {1},
  pages = {5-32},
  title = {Random Forests},
  volume = {45},
  year = {2001}
}
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-store-out-of-bag-predictions Whether to store out of bag predictions in internal evaluation object.
-output-out-of-bag-complexity-statistics Whether to output complexity-based statistics when out-of-bag evaluation is performed.
-print Print the individual classifiers in the output
-attribute-importance Compute and output attribute importance (mean impurity decrease method)
-I <num> Number of iterations (i.e., the number of trees in the random forest). (current value 100)
-num-slots <num> Number of execution slots. (default 1 - i.e. no parallelism) (use 0 to auto-detect number of cores)
-K <number of attributes> Number of attributes to randomly investigate. (default 0) (<1 = int(log_2(#predictors)+1)).
-M <minimum number of instances> Set minimum number of instances per leaf. (default 1)
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-S <num> Seed for random number generator. (default 1)
-depth <num> The maximum depth of the tree, 0 for unlimited. (default 0)
-N <num> Number of folds for backfitting (default 0, no backfitting).
-U Allow unclassified instances.
-B Break ties randomly when several attributes look equally good.
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
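The defaults described for -K, -P and -num-slots can be made concrete with a small sketch. It is not Weka code: the class and method names are hypothetical, and the rounding behaviour (truncation toward zero for the bag size, no upper cap on an explicit -K) is an assumption made for illustration only.

```java
/** Illustrative helper only; none of these names exist in Weka. */
final class ForestOptionSketch {

    /** -K: values below 1 fall back to int(log_2(#predictors) + 1). */
    static int resolveNumFeatures(int k, int numPredictors) {
        if (numPredictors < 1) {
            throw new IllegalArgumentException("need at least one predictor");
        }
        if (k >= 1) {
            return k; // assumption: an explicit value is used as given
        }
        // floor(log2(n)) + 1 equals the bit length of n for n >= 1,
        // which avoids floating-point rounding in the logarithm.
        return 32 - Integer.numberOfLeadingZeros(numPredictors);
    }

    /** -P: bag size as a percentage of the training set size (assumed to round down). */
    static int resolveBagSize(int numTrainingInstances, int bagSizePercent) {
        return (int) ((long) numTrainingInstances * bagSizePercent / 100);
    }

    /** -num-slots: 0 means "auto-detect the number of cores". */
    static int resolveNumSlots(int numSlots) {
        return numSlots == 0 ? Runtime.getRuntime().availableProcessors() : numSlots;
    }

    public static void main(String[] args) {
        System.out.println("K for 16 predictors: " + resolveNumFeatures(0, 16));
        System.out.println("Bag size for 150 instances at 66%: " + resolveBagSize(150, 66));
        System.out.println("Execution slots with auto-detect: " + resolveNumSlots(0));
    }
}
```

With the default -K 0, the number of attributes examined at each split grows only logarithmically with the number of predictors, so wide datasets may benefit from setting -K explicitly.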
BATCH_SIZE_DEFAULT, NUM_DECIMAL_PLACES_DEFAULT
Constructor and Description |
---|
RandomForest(): Constructor that sets the base classifier for bagging to RandomTree and the default number of iterations to 100. |
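Because the forest is a bagged ensemble of RandomTree learners, each iteration trains on a bootstrap sample, and the instances left out of that sample are what an out-of-bag estimate is computed on. The following is a generic sketch of that sampling step, not Weka's implementation; the class and method names are hypothetical, and plain index sampling with `java.util.Random` is an assumption.

```java
import java.util.Arrays;
import java.util.Random;

/** Generic bagging illustration; not taken from Weka. */
final class BootstrapSketch {

    /** Draws bagSize instance indices with replacement from [0, numInstances). */
    static int[] sampleBag(int numInstances, int bagSize, long seed) {
        Random random = new Random(seed); // same seed, same bag
        int[] bag = new int[bagSize];
        for (int i = 0; i < bagSize; i++) {
            bag[i] = random.nextInt(numInstances);
        }
        return bag;
    }

    /** Marks the instances that never appear in the bag (the out-of-bag set). */
    static boolean[] outOfBag(int numInstances, int[] bag) {
        boolean[] oob = new boolean[numInstances];
        Arrays.fill(oob, true);
        for (int index : bag) {
            oob[index] = false;
        }
        return oob;
    }

    public static void main(String[] args) {
        int[] bag = sampleBag(10, 10, 1L);
        System.out.println("bag: " + Arrays.toString(bag));
        System.out.println("out of bag: " + Arrays.toString(outOfBag(10, bag)));
    }
}
```

Fixing the seed makes the drawn bags reproducible, which is the practical reason for exposing a seed option on a randomized ensemble.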
Modifier and Type | Method and Description |
---|---|
java.lang.String | breakTiesRandomlyTipText(): Returns the tip text for this property. |
java.lang.String | computeAttributeImportanceTipText(): Returns the tip text for this property. |
double[] | computeAverageImpurityDecreasePerAttribute(double[] nodeCounts): Computes the average impurity decrease per attribute over the trees. |
boolean | getBreakTiesRandomly(): Get whether to break ties randomly. |
Capabilities | getCapabilities(): Returns default capabilities of the base classifier. |
boolean | getComputeAttributeImportance(): Get whether to compute and output attribute importance scores. |
int | getMaxDepth(): Get the maximum depth of the tree, 0 for unlimited. |
int | getNumFeatures(): Get the number of features used in random selection. |
java.lang.String[] | getOptions(): Gets the current settings of the forest. |
java.lang.String | getRevision(): Returns the revision string. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
java.lang.String | globalInfo(): Returns a string describing the classifier. |
java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for this class. |
java.lang.String | maxDepthTipText(): Returns the tip text for this property. |
java.lang.String | numFeaturesTipText(): Returns the tip text for this property. |
java.lang.String | numIterationsTipText(): Returns the tip text for the number of iterations. |
void | setBatchSize(java.lang.String size): Set the preferred batch size for batch prediction. |
void | setBreakTiesRandomly(boolean newBreakTiesRandomly): Set whether to break ties randomly. |
void | setClassifier(Classifier newClassifier): This method only accepts RandomTree arguments. |
void | setComputeAttributeImportance(boolean computeAttributeImportance): Set whether to compute and output attribute importance scores. |
void | setDebug(boolean debug): Set debugging mode. |
void | setMaxDepth(int value): Set the maximum depth of the tree, 0 for unlimited. |
void | setNumDecimalPlaces(int num): Set the number of decimal places. |
void | setNumFeatures(int newNumFeatures): Set the number of features to use in random selection. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setRepresentCopiesUsingWeights(boolean representUsingWeights): This method only accepts true as its argument. |
void | setSeed(int s): Sets the seed for the random number generator. |
java.lang.String | toString(): Returns description of the bagged classifier. |
Methods inherited from class Bagging: aggregate, bagSizePercentTipText, batchSizeTipText, buildClassifier, calcOutOfBagTipText, distributionForInstance, distributionsForInstances, enumerateMeasures, finalizeAggregation, generatePartition, getBagSizePercent, getBatchSize, getCalcOutOfBag, getMeasure, getMembershipValues, getOutOfBagEvaluationObject, getOutputOutOfBagComplexityStatistics, getPrintClassifiers, getRepresentCopiesUsingWeights, getStoreOutOfBagPredictions, implementsMoreEfficientBatchPrediction, measureOutOfBagError, numElements, outputOutOfBagComplexityStatisticsTipText, printClassifiersTipText, representCopiesUsingWeightsTipText, setBagSizePercent, setCalcOutOfBag, setOutputOutOfBagComplexityStatistics, setPrintClassifiers, setStoreOutOfBagPredictions, storeOutOfBagPredictionsTipText
Methods inherited from class RandomizableParallelIteratedSingleClassifierEnhancer: getSeed, seedTipText
Methods inherited from class ParallelIteratedSingleClassifierEnhancer: getNumExecutionSlots, numExecutionSlotsTipText, setNumExecutionSlots
Methods inherited from class IteratedSingleClassifierEnhancer: getNumIterations, setNumIterations
Methods inherited from class SingleClassifierEnhancer: classifierTipText, getClassifier, postExecution, preExecution
Methods inherited from class AbstractClassifier: classifyInstance, debugTipText, doNotCheckCapabilitiesTipText, forName, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setDoNotCheckCapabilities
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
makeCopy
public RandomForest()
public Capabilities getCapabilities()
Specified by: getCapabilities in interface Classifier
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class SingleClassifierEnhancer
Capabilities
public java.lang.String globalInfo()
Overrides: globalInfo in class Bagging
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
Overrides: getTechnicalInformation in class Bagging
public java.lang.String numIterationsTipText()
Overrides: numIterationsTipText in class IteratedSingleClassifierEnhancer
@ProgrammaticProperty public void setClassifier(Classifier newClassifier)
Overrides: setClassifier in class SingleClassifierEnhancer
Parameters: newClassifier - the RandomTree to use.
Throws an exception if the argument is not a RandomTree.
@ProgrammaticProperty public void setRepresentCopiesUsingWeights(boolean representUsingWeights)
Overrides: setRepresentCopiesUsingWeights in class Bagging
Parameters: representUsingWeights - must be set to true.
Throws an exception if the argument is not true.
public java.lang.String numFeaturesTipText()
public int getNumFeatures()
public void setNumFeatures(int newNumFeatures)
Parameters: newNumFeatures - Value to assign to numFeatures.
public java.lang.String computeAttributeImportanceTipText()
public void setComputeAttributeImportance(boolean computeAttributeImportance)
Parameters: computeAttributeImportance - true to compute attribute importance scores
public boolean getComputeAttributeImportance()
public java.lang.String maxDepthTipText()
public int getMaxDepth()
public void setMaxDepth(int value)
Parameters: value - the maximum depth.
public java.lang.String breakTiesRandomlyTipText()
public boolean getBreakTiesRandomly()
public void setBreakTiesRandomly(boolean newBreakTiesRandomly)
Parameters: newBreakTiesRandomly - true if ties are to be broken randomly
public void setDebug(boolean debug)
Overrides: setDebug in class AbstractClassifier
Parameters: debug - true if debug output should be printed
public void setNumDecimalPlaces(int num)
Overrides: setNumDecimalPlaces in class AbstractClassifier
public void setBatchSize(java.lang.String size)
Specified by: setBatchSize in interface BatchPredictor
Overrides: setBatchSize in class Bagging
Parameters: size - the batch size to use
public void setSeed(int s)
Specified by: setSeed in interface Randomizable
Overrides: setSeed in class RandomizableParallelIteratedSingleClassifierEnhancer
Parameters: s - the seed to be used
public java.lang.String toString()
public double[] computeAverageImpurityDecreasePerAttribute(double[] nodeCounts) throws WekaException
Parameters: nodeCounts - an optional array that, if non-null, will hold the count of the number of nodes at which each attribute was used for splitting
Throws: WekaException
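The averaging step and the role of the optional nodeCounts array can be illustrated with a standalone sketch. It is an assumption-laden illustration rather than Weka's code: the class name, the input arrays (per-tree sums of impurity decrease and per-tree split counts for each attribute), and the choice to divide by the total number of splitting nodes are all hypothetical.

```java
import java.util.Arrays;

/** Illustration only; input layout and names are assumptions, not Weka's API. */
final class ImpurityDecreaseSketch {

    /**
     * decreaseSums[t][a]: summed impurity decrease of attribute a in tree t.
     * splitCounts[t][a]:  number of nodes in tree t that split on attribute a.
     * nodeCounts:         optional output array; if non-null it receives the
     *                     total split count per attribute.
     */
    static double[] averageDecrease(double[][] decreaseSums,
                                    double[][] splitCounts,
                                    double[] nodeCounts) {
        int numAttributes = decreaseSums[0].length;
        double[] averages = new double[numAttributes];
        double[] counts = (nodeCounts != null) ? nodeCounts : new double[numAttributes];
        Arrays.fill(counts, 0.0); // start from a clean slate in either case

        // Accumulate totals over all trees.
        for (int tree = 0; tree < decreaseSums.length; tree++) {
            for (int att = 0; att < numAttributes; att++) {
                averages[att] += decreaseSums[tree][att];
                counts[att] += splitCounts[tree][att];
            }
        }
        // Assumed normalisation: divide by the number of nodes that used the attribute.
        for (int att = 0; att < numAttributes; att++) {
            if (counts[att] > 0) {
                averages[att] /= counts[att];
            }
        }
        return averages;
    }

    public static void main(String[] args) {
        double[][] sums = {{0.5, 0.0, 0.0}, {0.25, 1.5, 0.0}};
        double[][] splits = {{2, 0, 0}, {1, 3, 0}};
        double[] counts = new double[3];
        System.out.println(Arrays.toString(averageDecrease(sums, splits, counts)));
        System.out.println(Arrays.toString(counts));
    }
}
```

Passing null for the output array simply skips reporting the counts, which mirrors the "optional array" wording of the parameter description.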
public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class Bagging
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class Bagging
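Settings are exchanged with getOptions and setOptions as a flat array of strings, where a flag such as -I is followed by its value and a switch such as -B stands alone. The sketch below shows how such an array can be read. The helper names are hypothetical and written from scratch for illustration; they are not Weka's own option utilities.

```java
/** Illustration of the flat option-array layout; helper names are hypothetical. */
final class OptionArraySketch {

    /** Returns the value that follows the given flag, or null if the flag is absent. */
    static String optionValue(String flag, String[] options) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return null;
    }

    /** Returns true if a stand-alone switch such as -B is present. */
    static boolean hasFlag(String flag, String[] options) {
        for (String option : options) {
            if (option.equals(flag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 200 trees, default -K, depth limited to 10, ties broken randomly.
        String[] options = {"-I", "200", "-K", "0", "-depth", "10", "-B"};
        System.out.println("trees: " + optionValue("-I", options));
        System.out.println("max depth: " + optionValue("-depth", options));
        System.out.println("break ties randomly: " + hasFlag("-B", options));
    }
}
```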
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Valid options are those listed in the class description above.
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class Bagging
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Bagging
public static void main(java.lang.String[] argv)
Parameters: argv - the options