public class BFTree extends RandomizableClassifier implements AdditionalMeasureProducer, TechnicalInformationHandler
@mastersthesis{Shi2007,
address = {Hamilton, NZ},
author = {Haijian Shi},
note = {COMP594},
school = {University of Waikato},
title = {Best-first decision tree learning},
year = {2007}
}
@article{Friedman2000,
author = {Jerome Friedman and Trevor Hastie and Robert Tibshirani},
journal = {Annals of Statistics},
number = {2},
pages = {337-407},
title = {Additive logistic regression: A statistical view of boosting},
volume = {28},
year = {2000},
ISSN = {0090-5364}
}
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-P <UNPRUNED|POSTPRUNED|PREPRUNED> The pruning strategy. (default: POSTPRUNED)
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the pruning. (default 5)
-H Don't use heuristic search for nominal attributes in multi-class problems (heuristic search is used by default).
-G Don't use the Gini index for splitting (Gini is used by default); if set, information gain is used instead.
-R Don't use error rate in internal cross-validation (error rate is used by default); if set, root mean squared error is used instead.
-A Use the 1 SE rule to make pruning decisions (default no).
-C Percentage of training data size, in the range (0, 1] (default 1).
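The -G option switches the splitting criterion between the Gini index and information. As a rough illustration of how these two impurity measures behave, here is a minimal standalone sketch. It is not Weka's implementation: the class `ImpuritySketch`, its method names, and the binary-split helper are assumptions made for this example only.

```java
// Standalone illustration; ImpuritySketch and its methods are hypothetical
// names and are not part of Weka.
final class ImpuritySketch {

    /** Gini index of a node: 1 - sum over classes of p_k^2. */
    static double gini(double[] classCounts) {
        double total = 0.0;
        for (double c : classCounts) {
            total += c;
        }
        if (total == 0.0) {
            return 0.0; // empty node: treat as pure
        }
        double sumSquares = 0.0;
        for (double c : classCounts) {
            double p = c / total;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    /** Entropy of a node in bits: -sum over classes of p_k * log2(p_k). */
    static double entropy(double[] classCounts) {
        double total = 0.0;
        for (double c : classCounts) {
            total += c;
        }
        double h = 0.0;
        for (double c : classCounts) {
            if (c > 0.0) { // 0 * log(0) is taken as 0
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    /** Decrease in Gini impurity when a parent node is split into two children. */
    static double giniReduction(double[] left, double[] right) {
        double[] parent = new double[left.length];
        double nLeft = 0.0;
        double nRight = 0.0;
        for (int k = 0; k < left.length; k++) {
            parent[k] = left[k] + right[k];
            nLeft += left[k];
            nRight += right[k];
        }
        double n = nLeft + nRight;
        // Children are weighted by the share of instances they receive.
        return gini(parent) - (nLeft / n) * gini(left) - (nRight / n) * gini(right);
    }

    public static void main(String[] args) {
        double[] node = {5, 5};
        System.out.println("Gini:    " + gini(node));
        System.out.println("Entropy: " + entropy(node));
        System.out.println("Gini reduction of a perfect split: "
                + giniReduction(new double[] {5, 0}, new double[] {0, 5}));
    }
}
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, so either can rank candidate splits by how much impurity they remove.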
| Modifier and Type | Field | Description |
|---|---|---|
| static int | PRUNING_POSTPRUNING | pruning strategy: post-pruning |
| static int | PRUNING_PREPRUNING | pruning strategy: pre-pruning |
| static int | PRUNING_UNPRUNED | pruning strategy: un-pruned |
| static Tag[] | TAGS_PRUNING | pruning strategy |
| Constructor and Description |
|---|
| BFTree() |
| Modifier and Type | Method | Description |
|---|---|---|
| void | buildClassifier(Instances data) | Builds a best-first decision tree classifier. |
| double[] | distributionForInstance(Instance instance) | Computes class probabilities for an instance using the decision tree. |
| java.util.Enumeration | enumerateMeasures() | Returns an enumeration of the measure names. |
| Capabilities | getCapabilities() | Returns default capabilities of the classifier. |
| boolean | getHeuristic() | Gets whether heuristic search is used for nominal attributes in multi-class problems. |
| double | getMeasure(java.lang.String additionalMeasureName) | Returns the value of the named measure. |
| int | getMinNumObj() | Gets the minimal number of instances at the terminal nodes. |
| int | getNumFoldsPruning() | Gets the number of folds in internal cross-validation. |
| java.lang.String[] | getOptions() | Gets the current settings of the classifier. |
| SelectedTag | getPruningStrategy() | Gets the pruning strategy. |
| java.lang.String | getRevision() | Returns the revision string. |
| double | getSizePer() | Gets the training set size. |
| TechnicalInformation | getTechnicalInformation() | Returns a TechnicalInformation object containing detailed information about the technical background of this class, e.g., the paper or book this class is based on. |
| boolean | getUseErrorRate() | Gets whether error rate is used in internal cross-validation. |
| boolean | getUseGini() | Gets whether the Gini index is used as the splitting criterion. |
| boolean | getUseOneSE() | Gets whether the 1SE rule is used to choose the final model. |
| java.lang.String | globalInfo() | Returns a string describing the classifier. |
| java.lang.String | heuristicTipText() | Returns the tip text for this property. |
| java.util.Enumeration | listOptions() | Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] args) | Main method. |
| double | measureTreeSize() | Returns the size of the tree. |
| java.lang.String | minNumObjTipText() | Returns the tip text for this property. |
| java.lang.String | numFoldsPruningTipText() | Returns the tip text for this property. |
| int | numLeaves() | Computes the number of leaf nodes. |
| int | numNodes() | Computes the size of the tree. |
| java.lang.String | pruningStrategyTipText() | Returns the tip text for this property. |
| void | setHeuristic(boolean value) | Sets whether heuristic search is used for nominal attributes in multi-class problems. |
| void | setMinNumObj(int value) | Sets the minimal number of instances at the terminal nodes. |
| void | setNumFoldsPruning(int value) | Sets the number of folds in internal cross-validation. |
| void | setOptions(java.lang.String[] options) | Parses the options for this object. |
| void | setPruningStrategy(SelectedTag value) | Sets the pruning strategy. |
| void | setSizePer(double value) | Sets the training set size. |
| void | setUseErrorRate(boolean value) | Sets whether error rate is used in internal cross-validation. |
| void | setUseGini(boolean value) | Sets whether the Gini index is used as the splitting criterion. |
| void | setUseOneSE(boolean value) | Sets whether the 1SE rule is used to choose the final model. |
| java.lang.String | sizePerTipText() | Returns the tip text for this property. |
| java.lang.String | toString() | Returns a string representation of the decision tree, using the protected toString method. |
| java.lang.String | useErrorRateTipText() | Returns the tip text for this property. |
| java.lang.String | useGiniTipText() | Returns the tip text for this property. |
| java.lang.String | useOneSETipText() | Returns the tip text for this property. |
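The useOneSE property (option -A) refers to the 1SE rule for choosing the final model during pruning. In its usual formulation, the rule picks the smallest tree whose cross-validated error lies within one standard error of the lowest error. The sketch below illustrates that selection step only. It is a hedged example, not Weka's code: `OneSeRuleSketch`, `select`, and the array-based inputs are hypothetical, and how BFTree itself estimates the standard error is not shown here.

```java
// Standalone illustration of the 1SE selection rule; all names are hypothetical
// and not part of Weka.
final class OneSeRuleSketch {

    /**
     * Chooses among candidate trees.
     *
     * @param treeSizes size of each candidate tree (e.g. number of nodes)
     * @param cvErrors  cross-validated error estimate of each candidate
     * @param stdErrors standard error of each error estimate
     * @param useOneSE  if false, simply return the candidate with the lowest error
     * @return index of the chosen candidate
     */
    static int select(int[] treeSizes, double[] cvErrors, double[] stdErrors, boolean useOneSE) {
        // Find the candidate with the lowest cross-validated error.
        int best = 0;
        for (int i = 1; i < cvErrors.length; i++) {
            if (cvErrors[i] < cvErrors[best]) {
                best = i;
            }
        }
        if (!useOneSE) {
            return best;
        }
        // Accept any candidate within one standard error of the minimum,
        // and among those prefer the smallest tree.
        double threshold = cvErrors[best] + stdErrors[best];
        int chosen = best;
        for (int i = 0; i < cvErrors.length; i++) {
            if (cvErrors[i] <= threshold && treeSizes[i] < treeSizes[chosen]) {
                chosen = i;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        int[] sizes = {1, 3, 5, 7, 9};
        double[] errors = {0.40, 0.25, 0.21, 0.20, 0.22};
        double[] ses = {0.03, 0.03, 0.03, 0.03, 0.03};
        System.out.println("Minimum-error choice: size " + sizes[select(sizes, errors, ses, false)]);
        System.out.println("1SE choice:           size " + sizes[select(sizes, errors, ses, true)]);
    }
}
```

The trade-off is a slightly higher estimated error in exchange for a smaller, simpler tree.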
Methods inherited from class RandomizableClassifier: getSeed, seedTipText, setSeed
Methods inherited from class Classifier: classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
public static final int PRUNING_UNPRUNED
public static final int PRUNING_POSTPRUNING
public static final int PRUNING_PREPRUNING
public static final Tag[] TAGS_PRUNING
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler
public Capabilities getCapabilities()
getCapabilities in interface CapabilitiesHandler
getCapabilities in class Classifier
See Also: Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
buildClassifier in class Classifier
Parameters: data - set of instances serving as training data
Throws: java.lang.Exception - if decision tree cannot be built successfully
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
distributionForInstance in class Classifier
Parameters: instance - the instance for which class probabilities are to be computed
Throws: java.lang.Exception - if something goes wrong
public java.lang.String toString()
toString in class java.lang.Object
public int numNodes()
public int numLeaves()
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class RandomizableClassifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
setOptions in interface OptionHandler
setOptions in class RandomizableClassifier
Parameters: options - the options to use
Throws: java.lang.Exception - if setting of options fails
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class RandomizableClassifier
public java.util.Enumeration enumerateMeasures()
enumerateMeasures in interface AdditionalMeasureProducer
public double measureTreeSize()
public double getMeasure(java.lang.String additionalMeasureName)
getMeasure in interface AdditionalMeasureProducer
Parameters: additionalMeasureName - the name of the measure to query for its value
Throws: java.lang.IllegalArgumentException - if the named measure is not supported
public java.lang.String pruningStrategyTipText()
public void setPruningStrategy(SelectedTag value)
Parameters: value - the strategy
public SelectedTag getPruningStrategy()
public java.lang.String minNumObjTipText()
public void setMinNumObj(int value)
Parameters: value - minimal number of instances at the terminal nodes
public int getMinNumObj()
public java.lang.String numFoldsPruningTipText()
public void setNumFoldsPruning(int value)
Parameters: value - the number of folds
public int getNumFoldsPruning()
public java.lang.String heuristicTipText()
public void setHeuristic(boolean value)
Parameters: value - whether to use heuristic search for nominal attributes in multi-class problems
public boolean getHeuristic()
public java.lang.String useGiniTipText()
public void setUseGini(boolean value)
Parameters: value - whether to use the Gini index as the splitting criterion
public boolean getUseGini()
public java.lang.String useErrorRateTipText()
public void setUseErrorRate(boolean value)
Parameters: value - whether to use error rate in internal cross-validation
public boolean getUseErrorRate()
public java.lang.String useOneSETipText()
public void setUseOneSE(boolean value)
Parameters: value - whether to use the 1SE rule to choose the final model
public boolean getUseOneSE()
public java.lang.String sizePerTipText()
public void setSizePer(double value)
Parameters: value - training set size
public double getSizePer()
public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class Classifier
public static void main(java.lang.String[] args)
Parameters: args - the options for the classifier
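The useErrorRate property (option -R) selects between error rate and root mean squared error as the measure used in internal cross-validation. To make the difference concrete, the following sketch computes both from per-instance class distributions of the kind distributionForInstance returns. It is an assumption-laden illustration rather than Weka's implementation: `CvErrorSketch` and its methods are hypothetical, and the RMSE here averages squared differences over all instances and classes, which may not match the normalisation BFTree uses internally.

```java
// Standalone illustration; CvErrorSketch and its methods are hypothetical
// names and are not part of Weka.
final class CvErrorSketch {

    /** Index of the largest value in a class distribution (first one wins on ties). */
    static int argMax(double[] dist) {
        int best = 0;
        for (int k = 1; k < dist.length; k++) {
            if (dist[k] > dist[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Fraction of instances whose most probable class differs from the actual class. */
    static double errorRate(double[][] dists, int[] actual) {
        int wrong = 0;
        for (int i = 0; i < actual.length; i++) {
            if (argMax(dists[i]) != actual[i]) {
                wrong++;
            }
        }
        return (double) wrong / actual.length;
    }

    /**
     * Root mean squared error between predicted probabilities and 0/1 class
     * indicators, averaged over all instances and classes (assumed definition).
     */
    static double rmse(double[][] dists, int[] actual) {
        double sumSquares = 0.0;
        int terms = 0;
        for (int i = 0; i < actual.length; i++) {
            for (int k = 0; k < dists[i].length; k++) {
                double target = (k == actual[i]) ? 1.0 : 0.0;
                double diff = dists[i][k] - target;
                sumSquares += diff * diff;
                terms++;
            }
        }
        return Math.sqrt(sumSquares / terms);
    }

    public static void main(String[] args) {
        double[][] dists = {{0.9, 0.1}, {0.6, 0.4}, {0.2, 0.8}};
        int[] actual = {0, 1, 1};
        System.out.println("Error rate: " + errorRate(dists, actual));
        System.out.println("RMSE:       " + rmse(dists, actual));
    }
}
```

Error rate only counts wrong predictions, whereas RMSE also penalises correct predictions made with low confidence, so the two measures can rank candidate trees differently.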