public class BFTree extends RandomizableClassifier implements AdditionalMeasureProducer, TechnicalInformationHandler
@mastersthesis{Shi2007,
   address = {Hamilton, NZ},
   author = {Haijian Shi},
   note = {COMP594},
   school = {University of Waikato},
   title = {Best-first decision tree learning},
   year = {2007}
}

@article{Friedman2000,
   author = {Jerome Friedman and Trevor Hastie and Robert Tibshirani},
   journal = {Annals of statistics},
   number = {2},
   pages = {337-407},
   title = {Additive logistic regression : A statistical view of boosting},
   volume = {28},
   year = {2000},
   ISSN = {0090-5364}
}

Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-P <UNPRUNED|POSTPRUNED|PREPRUNED> The pruning strategy. (default: POSTPRUNED)
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the pruning. (default 5)
-H Don't use heuristic search for nominal attributes in multi-class problems. (default: heuristic search is used)
-G Don't use the Gini index for splitting; information gain is used instead. (default: Gini index is used)
-R Don't use error rate in internal cross-validation; root mean squared error is used instead. (default: error rate is used)
-A Use the 1 SE rule to make pruning decisions. (default no)
-C Training data size as a fraction of the full training set, in the range (0, 1]. (default 1)
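Three of the flags above (-H, -G, -R) are inverted: the feature is on by default and the flag turns it off, whereas -A turns a feature on. The following is a minimal, standalone sketch of that mapping. The class `BFTreeOptionsSketch` and its fields are hypothetical names chosen for illustration; this is not Weka's own option parser, and it does not depend on Weka.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for the settings listed above; not part of Weka. */
final class BFTreeOptionsSketch {
    static final List<String> PRUNING = Arrays.asList("UNPRUNED", "POSTPRUNED", "PREPRUNED");

    int seed = 1;                  // -S <num>
    boolean debug = false;         // -D
    String pruning = "POSTPRUNED"; // -P <strategy>
    int minNumObj = 2;             // -M <min no>
    int numFolds = 5;              // -N <num folds>
    boolean heuristic = true;      // switched OFF by -H
    boolean useGini = true;        // switched OFF by -G
    boolean useErrorRate = true;   // switched OFF by -R
    boolean useOneSE = false;      // switched ON by -A
    double sizePer = 1.0;          // -C <fraction in (0, 1]>

    /** Parses flags in the documented form; a flag missing its value fails with an index error. */
    static BFTreeOptionsSketch parse(String... args) {
        BFTreeOptionsSketch o = new BFTreeOptionsSketch();
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-S": o.seed = Integer.parseInt(args[++i]); break;
                case "-D": o.debug = true; break;
                case "-P": o.pruning = args[++i]; break;
                case "-M": o.minNumObj = Integer.parseInt(args[++i]); break;
                case "-N": o.numFolds = Integer.parseInt(args[++i]); break;
                case "-H": o.heuristic = false; break;
                case "-G": o.useGini = false; break;
                case "-R": o.useErrorRate = false; break;
                case "-A": o.useOneSE = true; break;
                case "-C": o.sizePer = Double.parseDouble(args[++i]); break;
                default: throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        if (!PRUNING.contains(o.pruning)) {
            throw new IllegalArgumentException("Unknown pruning strategy: " + o.pruning);
        }
        if (!(o.sizePer > 0 && o.sizePer <= 1)) {
            throw new IllegalArgumentException("-C must lie in (0, 1]: " + o.sizePer);
        }
        return o;
    }

    public static void main(String[] args) {
        BFTreeOptionsSketch o = parse("-P", "PREPRUNED", "-G", "-A", "-M", "5");
        System.out.println(o.pruning + " useGini=" + o.useGini
                + " useOneSE=" + o.useOneSE + " minNumObj=" + o.minNumObj);
    }
}
```

With no arguments, every field keeps the default stated in the option list.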
Modifier and Type | Field and Description |
---|---|
static int | PRUNING_POSTPRUNING: post-pruning strategy |
static int | PRUNING_PREPRUNING: pre-pruning strategy |
static int | PRUNING_UNPRUNED: un-pruned strategy |
static Tag[] | TAGS_PRUNING: pruning strategy |
Constructor and Description |
---|
BFTree() |
Modifier and Type | Method and Description |
---|---|
void | buildClassifier(Instances data): Builds a best-first decision tree classifier. |
double[] | distributionForInstance(Instance instance): Computes class probabilities for the given instance using the decision tree. |
java.util.Enumeration | enumerateMeasures(): Returns an enumeration of the measure names. |
Capabilities | getCapabilities(): Returns the default capabilities of the classifier. |
boolean | getHeuristic(): Returns whether heuristic search is used for nominal attributes in multi-class problems. |
double | getMeasure(java.lang.String additionalMeasureName): Returns the value of the named measure. |
int | getMinNumObj(): Gets the minimal number of instances at the terminal nodes. |
int | getNumFoldsPruning(): Gets the number of folds in internal cross-validation. |
java.lang.String[] | getOptions(): Gets the current settings of the classifier. |
SelectedTag | getPruningStrategy(): Gets the pruning strategy. |
java.lang.String | getRevision(): Returns the revision string. |
double | getSizePer(): Gets the training set size as a fraction in (0, 1]. |
TechnicalInformation | getTechnicalInformation(): Returns a TechnicalInformation object containing detailed information about the technical background of this class, e.g., the paper or book this class is based on. |
boolean | getUseErrorRate(): Returns whether error rate is used in internal cross-validation. |
boolean | getUseGini(): Returns whether the Gini index is used as the splitting criterion. |
boolean | getUseOneSE(): Returns whether the 1 SE rule is used to choose the final model. |
java.lang.String | globalInfo(): Returns a string describing the classifier. |
java.lang.String | heuristicTipText(): Returns the tip text for this property. |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] args): Main method. |
double | measureTreeSize(): Returns the size of the tree. |
java.lang.String | minNumObjTipText(): Returns the tip text for this property. |
java.lang.String | numFoldsPruningTipText(): Returns the tip text for this property. |
int | numLeaves(): Computes the number of leaf nodes. |
int | numNodes(): Computes the size of the tree. |
java.lang.String | pruningStrategyTipText(): Returns the tip text for this property. |
void | setHeuristic(boolean value): Sets whether heuristic search is used for nominal attributes in multi-class problems. |
void | setMinNumObj(int value): Sets the minimal number of instances at the terminal nodes. |
void | setNumFoldsPruning(int value): Sets the number of folds in internal cross-validation. |
void | setOptions(java.lang.String[] options): Parses the options for this object. |
void | setPruningStrategy(SelectedTag value): Sets the pruning strategy. |
void | setSizePer(double value): Sets the training set size as a fraction in (0, 1]. |
void | setUseErrorRate(boolean value): Sets whether error rate is used in internal cross-validation. |
void | setUseGini(boolean value): Sets whether the Gini index is used as the splitting criterion. |
void | setUseOneSE(boolean value): Sets whether the 1 SE rule is used to choose the final model. |
java.lang.String | sizePerTipText(): Returns the tip text for this property. |
java.lang.String | toString(): Returns the decision tree as a string, using the protected toString method. |
java.lang.String | useErrorRateTipText(): Returns the tip text for this property. |
java.lang.String | useGiniTipText(): Returns the tip text for this property. |
java.lang.String | useOneSETipText(): Returns the tip text for this property. |
Methods inherited from class RandomizableClassifier: getSeed, seedTipText, setSeed
Methods inherited from class Classifier: classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
public static final int PRUNING_UNPRUNED
public static final int PRUNING_POSTPRUNING
public static final int PRUNING_PREPRUNING
public static final Tag[] TAGS_PRUNING
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler
public Capabilities getCapabilities()
getCapabilities in interface CapabilitiesHandler
getCapabilities in class Classifier
Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
buildClassifier in class Classifier
data - set of instances serving as training data
java.lang.Exception - if the decision tree cannot be built successfully
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
distributionForInstance in class Classifier
instance - the instance for which class probabilities are to be computed
java.lang.Exception - if something goes wrong
public java.lang.String toString()
toString in class java.lang.Object
public int numNodes()
public int numLeaves()
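As an illustration of what numNodes() and numLeaves() report, here is a minimal sketch that counts nodes and leaves recursively. It assumes a binary tree and uses a hypothetical `Node` class; BFTree keeps its own internal tree structure, which this page does not describe.

```java
/** Standalone sketch; the Node type is hypothetical and not part of Weka. */
final class TreeCountSketch {
    static final class Node {
        final Node left;
        final Node right; // both children are null for a leaf

        Node() { this(null, null); }
        Node(Node left, Node right) { this.left = left; this.right = right; }

        boolean isLeaf() { return left == null && right == null; }
    }

    /** Total number of nodes (internal nodes plus leaves) in the subtree. */
    static int numNodes(Node n) {
        if (n == null) return 0;
        return 1 + numNodes(n.left) + numNodes(n.right);
    }

    /** Number of leaf (terminal) nodes in the subtree. */
    static int numLeaves(Node n) {
        if (n == null) return 0;
        if (n.isLeaf()) return 1;
        return numLeaves(n.left) + numLeaves(n.right);
    }

    public static void main(String[] args) {
        // Root with one leaf child and one internal child that has two leaves.
        Node root = new Node(new Node(), new Node(new Node(), new Node()));
        System.out.println(numNodes(root) + " nodes, " + numLeaves(root) + " leaves");
    }
}
```

In a binary tree where every internal node has exactly two children, the node count is always twice the leaf count minus one.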
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class RandomizableClassifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-P <UNPRUNED|POSTPRUNED|PREPRUNED> The pruning strategy. (default: POSTPRUNED)
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the pruning. (default 5)
-H Don't use heuristic search for nominal attributes in multi-class problems. (default: heuristic search is used)
-G Don't use the Gini index for splitting; information gain is used instead. (default: Gini index is used)
-R Don't use error rate in internal cross-validation; root mean squared error is used instead. (default: error rate is used)
-A Use the 1 SE rule to make pruning decisions. (default no)
-C Training data size as a fraction of the full training set, in the range (0, 1]. (default 1)
setOptions in interface OptionHandler
setOptions in class RandomizableClassifier
options - the options to use
java.lang.Exception - if setting of options fails
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class RandomizableClassifier
public java.util.Enumeration enumerateMeasures()
enumerateMeasures in interface AdditionalMeasureProducer
public double measureTreeSize()
public double getMeasure(java.lang.String additionalMeasureName)
getMeasure in interface AdditionalMeasureProducer
additionalMeasureName - the name of the measure to query for its value
java.lang.IllegalArgumentException - if the named measure is not supported
public java.lang.String pruningStrategyTipText()
public void setPruningStrategy(SelectedTag value)
value - the strategy
public SelectedTag getPruningStrategy()
public java.lang.String minNumObjTipText()
public void setMinNumObj(int value)
value - minimal number of instances at the terminal nodes
public int getMinNumObj()
public java.lang.String numFoldsPruningTipText()
public void setNumFoldsPruning(int value)
value - the number of folds
public int getNumFoldsPruning()
public java.lang.String heuristicTipText()
public void setHeuristic(boolean value)
value - whether to use heuristic search for nominal attributes in multi-class problems
public boolean getHeuristic()
public java.lang.String useGiniTipText()
public void setUseGini(boolean value)
value - whether to use the Gini index as the splitting criterion
public boolean getUseGini()
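To make the useGini switch concrete, the sketch below scores a binary split with either Gini impurity or entropy (the basis of information gain), using textbook definitions over class counts. The class `SplitCriterionSketch` and its methods are illustrative names, and the code is not taken from BFTree; Weka's exact computation may differ in details such as instance weights and missing values.

```java
/** Standalone sketch of two split criteria; not Weka's implementation. */
final class SplitCriterionSketch {
    static double sum(double[] v) {
        double s = 0;
        for (double x : v) s += x;
        return s;
    }

    /** Gini impurity: 1 - sum_k p_k^2, where p_k is the share of class k. */
    static double gini(double[] counts) {
        double n = sum(counts);
        if (n == 0) return 0;
        double sumSq = 0;
        for (double c : counts) {
            double p = c / n;
            sumSq += p * p;
        }
        return 1 - sumSq;
    }

    /** Entropy in bits: -sum_k p_k log2 p_k, skipping empty classes. */
    static double entropy(double[] counts) {
        double n = sum(counts);
        if (n == 0) return 0;
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / n;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    static double impurity(double[] counts, boolean useGini) {
        return useGini ? gini(counts) : entropy(counts);
    }

    /** Impurity reduction achieved by splitting the parent into left and right. */
    static double gain(double[] left, double[] right, boolean useGini) {
        double nL = sum(left), nR = sum(right), n = nL + nR;
        if (n == 0) return 0;
        double[] parent = new double[left.length];
        for (int k = 0; k < left.length; k++) parent[k] = left[k] + right[k];
        double after = (nL / n) * impurity(left, useGini) + (nR / n) * impurity(right, useGini);
        return impurity(parent, useGini) - after;
    }

    public static void main(String[] args) {
        double[] left = {8, 2};  // class counts in the left branch
        double[] right = {1, 9}; // class counts in the right branch
        System.out.println("Gini gain: " + gain(left, right, true));
        System.out.println("Information gain: " + gain(left, right, false));
    }
}
```

Both criteria give zero gain for a split that leaves the class proportions unchanged and their maximum for a split that separates the classes perfectly.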
public java.lang.String useErrorRateTipText()
public void setUseErrorRate(boolean value)
value - whether to use error rate in internal cross-validation
public boolean getUseErrorRate()
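The difference between the two measures selected by useErrorRate can be sketched as follows. Error rate only checks whether the most probable class is correct, while root mean squared error also penalizes poorly calibrated probabilities. The class `CvErrorSketch` is a hypothetical name, and the RMSE definition used here (averaged over instances and classes against a 0/1 target) is an assumption; this page does not state which exact formula BFTree uses.

```java
/** Standalone sketch comparing error rate and RMSE; not Weka's implementation. */
final class CvErrorSketch {
    /** Index of the largest probability (first one wins on ties). */
    static int argMax(double[] dist) {
        int best = 0;
        for (int k = 1; k < dist.length; k++) {
            if (dist[k] > dist[best]) best = k;
        }
        return best;
    }

    /** Fraction of instances whose most probable class differs from the actual class. */
    static double errorRate(double[][] dist, int[] actual) {
        int wrong = 0;
        for (int i = 0; i < actual.length; i++) {
            if (argMax(dist[i]) != actual[i]) wrong++;
        }
        return (double) wrong / actual.length;
    }

    /** RMSE between predicted probabilities and 0/1 class indicators (assumed definition). */
    static double rmse(double[][] dist, int[] actual) {
        double sumSq = 0;
        int terms = 0;
        for (int i = 0; i < actual.length; i++) {
            for (int k = 0; k < dist[i].length; k++) {
                double target = (k == actual[i]) ? 1.0 : 0.0;
                double diff = dist[i][k] - target;
                sumSq += diff * diff;
                terms++;
            }
        }
        return Math.sqrt(sumSq / terms);
    }

    public static void main(String[] args) {
        // Class distributions as returned by a method like distributionForInstance.
        double[][] dist = {{0.9, 0.1}, {0.6, 0.4}, {0.4, 0.6}};
        int[] actual = {0, 0, 0};
        System.out.println("error rate = " + errorRate(dist, actual));
        System.out.println("RMSE = " + rmse(dist, actual));
    }
}
```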
public java.lang.String useOneSETipText()
public void setUseOneSE(boolean value)
value - whether to use the 1 SE rule to choose the final model
public boolean getUseOneSE()
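The 1 SE rule is usually stated as: instead of the model with the lowest cross-validated error, choose the smallest model whose error is within one standard error of that minimum. The sketch below implements this textbook form. The class `OneSeRuleSketch` and the sample numbers are made up for illustration, and whether BFTree applies the rule in exactly this way is an assumption.

```java
/** Standalone sketch of the textbook 1 SE rule; not Weka's implementation. */
final class OneSeRuleSketch {
    /**
     * Picks a candidate tree given per-candidate size, cross-validated error
     * and standard error of that error. Returns the index of the chosen candidate.
     */
    static int select(int[] sizes, double[] errors, double[] stdErrs, boolean useOneSE) {
        // Find the candidate with the lowest estimated error.
        int best = 0;
        for (int i = 1; i < errors.length; i++) {
            if (errors[i] < errors[best]) best = i;
        }
        if (!useOneSE) return best;

        // Accept any candidate within one standard error of the minimum,
        // and among those prefer the smallest tree.
        double threshold = errors[best] + stdErrs[best];
        int chosen = best;
        for (int i = 0; i < errors.length; i++) {
            if (errors[i] <= threshold && sizes[i] < sizes[chosen]) chosen = i;
        }
        return chosen;
    }

    public static void main(String[] args) {
        int[] sizes = {1, 3, 5, 7, 9};                       // made-up tree sizes
        double[] errors = {0.40, 0.25, 0.21, 0.20, 0.22};    // made-up CV errors
        double[] stdErrs = {0.03, 0.03, 0.02, 0.02, 0.02};   // made-up standard errors
        System.out.println("minimum-error choice: size " + sizes[select(sizes, errors, stdErrs, false)]);
        System.out.println("1 SE choice: size " + sizes[select(sizes, errors, stdErrs, true)]);
    }
}
```

The rule trades a small, statistically insignificant increase in estimated error for a simpler tree.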
public java.lang.String sizePerTipText()
public void setSizePer(double value)
value - the training set size, as a fraction in (0, 1]
public double getSizePer()
public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class Classifier
public static void main(java.lang.String[] args)
args - the options for the classifier