public class SimpleCart extends RandomizableClassifier implements AdditionalMeasureProducer, TechnicalInformationHandler
@book{Breiman1984,
  address = {Belmont, California},
  author = {Leo Breiman and Jerome H. Friedman and Richard A. Olshen and Charles J. Stone},
  publisher = {Wadsworth International Group},
  title = {Classification and Regression Trees},
  year = {1984}
}
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use the minimal cost-complexity pruning. (default: pruning is used).
-H Don't use the heuristic method for binary split. (default: the heuristic is used).
-A Use 1 SE rule to make pruning decision. (default no).
-C Percentage of training data size (0-1]. (default 1).
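The flags above are passed to the classifier as an array of strings, with each flag followed by its value where it takes one. The following is a minimal sketch of assembling such an array in plain Java. The class `SimpleCartOptionsSketch` and the helper `buildOptions` are hypothetical names introduced for illustration and are not part of SimpleCart; the negative flags -U and -H are left out, so pruning and the heuristic stay at their defaults.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleCartOptionsSketch {

    /**
     * Hypothetical helper (not part of SimpleCart): lays out the documented
     * flags -S, -M, -N, -C and optionally -A as a flag/value string array.
     */
    static String[] buildOptions(int seed, double minNumObj, int numFolds,
                                 boolean useOneSE, double sizePer) {
        // -C is documented as a fraction in the half-open interval (0, 1].
        if (sizePer <= 0.0 || sizePer > 1.0) {
            throw new IllegalArgumentException("-C must lie in (0, 1], got " + sizePer);
        }
        List<String> opts = new ArrayList<>();
        opts.add("-S"); opts.add(String.valueOf(seed));
        opts.add("-M"); opts.add(String.valueOf(minNumObj));
        opts.add("-N"); opts.add(String.valueOf(numFolds));
        opts.add("-C"); opts.add(String.valueOf(sizePer));
        if (useOneSE) {
            opts.add("-A"); // boolean flag, carries no value
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(1, 5, 10, true, 0.8);
        // prints "-S 1 -M 5.0 -N 10 -C 0.8 -A"
        System.out.println(String.join(" ", options));
    }
}
```

An array of this shape is what `setOptions(java.lang.String[] options)` takes as its argument; calling it requires the Weka library on the classpath, which this sketch deliberately avoids.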
Constructor and Description |
---|
SimpleCart() |
Modifier and Type | Method and Description |
---|---|
void |
buildClassifier(Instances data)
Build the classifier.
|
void |
calculateAlphas()
Updates the alpha field for all nodes.
|
double[] |
distributionForInstance(Instance instance)
Computes class probabilities for instance using the decision tree.
|
java.util.Enumeration |
enumerateMeasures()
Return an enumeration of the measure names.
|
Capabilities |
getCapabilities()
Returns default capabilities of the classifier.
|
boolean |
getHeuristic()
Returns whether heuristic search is used for nominal attributes in multi-class problems.
|
double |
getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
|
double |
getMinNumObj()
Get minimal number of instances at the terminal nodes.
|
int |
getNumFoldsPruning()
Get number of folds in internal cross-validation.
|
java.lang.String[] |
getOptions()
Gets the current settings of the classifier.
|
java.lang.String |
getRevision()
Returns the revision string.
|
double |
getSizePer()
Get training set size, as a fraction of the data in (0, 1].
|
TechnicalInformation |
getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing
detailed information about the technical background of this class,
e.g., paper reference or book this class is based on.
|
boolean |
getUseOneSE()
Returns whether the 1SE rule is used to choose the final model.
|
boolean |
getUsePrune()
Returns whether minimal cost-complexity pruning is used.
|
java.lang.String |
globalInfo()
Return a description suitable for displaying in the explorer/experimenter.
|
java.lang.String |
heuristicTipText()
Returns the tip text for this property
|
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options.
|
static void |
main(java.lang.String[] args)
Main method.
|
double |
measureTreeSize()
Returns the size of the tree.
|
java.lang.String |
minNumObjTipText()
Returns the tip text for this property
|
void |
modelErrors()
Updates the numIncorrectModel field for all nodes when subtree (to be
pruned) is rooted.
|
java.lang.String |
numFoldsPruningTipText()
Returns the tip text for this property
|
int |
numInnerNodes()
Method to count the number of inner nodes in the tree.
|
int |
numLeaves()
Compute number of leaf nodes.
|
int |
numNodes()
Compute size of the tree.
|
void |
prune(double alpha)
Prunes the original tree using the CART pruning scheme, given a
cost-complexity parameter alpha.
|
int |
prune(double[] alphas,
double[] errors,
Instances test)
Method for performing one fold in the cross-validation of minimal
cost-complexity pruning.
|
void |
setHeuristic(boolean value)
Sets whether heuristic search is used for nominal attributes in multi-class problems.
|
void |
setMinNumObj(double value)
Set minimal number of instances at the terminal nodes.
|
void |
setNumFoldsPruning(int value)
Set number of folds in internal cross-validation.
|
void |
setOptions(java.lang.String[] options)
Parses a given list of options.
|
void |
setSizePer(double value)
Set training set size, as a fraction of the data in (0, 1].
|
void |
setUseOneSE(boolean value)
Sets whether the 1SE rule is used to choose the final model.
|
void |
setUsePrune(boolean value)
Sets whether minimal cost-complexity pruning is used.
|
java.lang.String |
sizePerTipText()
Returns the tip text for this property
|
java.lang.String |
toString()
Prints the decision tree using the protected toString method from below.
|
void |
treeErrors()
Updates the numIncorrectTree field for all nodes.
|
java.lang.String |
useOneSETipText()
Returns the tip text for this property
|
java.lang.String |
usePruneTipText()
Return the tip text for this property
|
Methods inherited from class RandomizableClassifier
getSeed, seedTipText, setSeed
Methods inherited from class Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface TechnicalInformationHandler
public Capabilities getCapabilities()
getCapabilities
in interface CapabilitiesHandler
getCapabilities
in class Classifier
Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
buildClassifier
in class Classifier
data
- the training instances
java.lang.Exception
- if something goes wrong
public void prune(double alpha) throws java.lang.Exception
alpha
- the cost-complexity parameter
java.lang.Exception
- if something goes wrong
public int prune(double[] alphas, double[] errors, Instances test) throws java.lang.Exception
alphas
- array to hold the generated alpha-values
errors
- array to hold the corresponding error estimates
test
- test set of that fold (to obtain error estimates)
java.lang.Exception
- if something goes wrong
public void modelErrors() throws java.lang.Exception
java.lang.Exception
- if something goes wrong
public void treeErrors() throws java.lang.Exception
java.lang.Exception
- if something goes wrong
public void calculateAlphas() throws java.lang.Exception
java.lang.Exception
- if something goes wrong
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
distributionForInstance
in class Classifier
instance
- the instance for which class probabilities are to be computed
java.lang.Exception
- if something goes wrong
public java.lang.String toString()
toString
in class java.lang.Object
public int numNodes()
public int numInnerNodes()
public int numLeaves()
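The node counts returned by numNodes(), numInnerNodes() and numLeaves() feed directly into cost-complexity pruning: in the textbook formulation from Breiman et al. (1984), the alpha of an inner node is the increase in error rate caused by collapsing its subtree into a leaf, divided by the number of leaves removed. The sketch below illustrates that formula on a tiny hand-built binary tree. The class `CartAlphaSketch`, its `Node` type and the field `errorsAsLeaf` are hypothetical and do not reflect SimpleCart's internal fields; whether SimpleCart computes alpha in exactly this way is an assumption.

```java
class CartAlphaSketch {

    /** Hypothetical binary tree node; not Weka's internal representation. */
    static final class Node {
        final Node left;
        final Node right;
        /** Training errors made if this node were collapsed into a leaf. */
        final double errorsAsLeaf;

        Node(double errorsAsLeaf) {
            this(errorsAsLeaf, null, null);
        }

        Node(double errorsAsLeaf, Node left, Node right) {
            this.errorsAsLeaf = errorsAsLeaf;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null;
        }
    }

    static int numLeaves(Node n) {
        return n.isLeaf() ? 1 : numLeaves(n.left) + numLeaves(n.right);
    }

    static int numInnerNodes(Node n) {
        return n.isLeaf() ? 0 : 1 + numInnerNodes(n.left) + numInnerNodes(n.right);
    }

    static int numNodes(Node n) {
        return numLeaves(n) + numInnerNodes(n);
    }

    /** Errors of the unpruned subtree: the sum of the errors at its leaves. */
    static double subtreeErrors(Node n) {
        return n.isLeaf() ? n.errorsAsLeaf : subtreeErrors(n.left) + subtreeErrors(n.right);
    }

    /** Textbook alpha: (R(t) - R(T_t)) / (leaves(T_t) - 1), with R as an error rate. */
    static double alpha(Node n, double totalInstances) {
        if (n.isLeaf()) {
            return Double.POSITIVE_INFINITY; // a leaf cannot be pruned any further
        }
        double errorDiff = (n.errorsAsLeaf - subtreeErrors(n)) / totalInstances;
        return errorDiff / (numLeaves(n) - 1);
    }

    public static void main(String[] args) {
        Node right = new Node(15, new Node(4), new Node(6));
        Node root = new Node(40, new Node(10), right);
        System.out.println("nodes=" + numNodes(root) + " inner=" + numInnerNodes(root)
                + " leaves=" + numLeaves(root));
        System.out.println("alpha(root)=" + alpha(root, 100));
        System.out.println("alpha(right)=" + alpha(right, 100));
    }
}
```

In this toy tree the right child has the smaller alpha, so it is the "weakest link" that a call such as prune(alpha) with a sufficiently large alpha would collapse first under the textbook scheme.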
public java.util.Enumeration listOptions()
listOptions
in interface OptionHandler
listOptions
in class RandomizableClassifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use the minimal cost-complexity pruning. (default: pruning is used).
-H Don't use the heuristic method for binary split. (default: the heuristic is used).
-A Use 1 SE rule to make pruning decision. (default no).
-C Percentage of training data size (0-1]. (default 1).
setOptions
in interface OptionHandler
setOptions
in class RandomizableClassifier
options
- the list of options as an array of strings
java.lang.Exception
- if an option is not supported
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class RandomizableClassifier
public java.util.Enumeration enumerateMeasures()
enumerateMeasures
in interface AdditionalMeasureProducer
public double measureTreeSize()
public double getMeasure(java.lang.String additionalMeasureName)
getMeasure
in interface AdditionalMeasureProducer
additionalMeasureName
- the name of the measure to query for its value
java.lang.IllegalArgumentException
- if the named measure is not supported
public java.lang.String minNumObjTipText()
public void setMinNumObj(double value)
value
- minimal number of instances at the terminal nodes
public double getMinNumObj()
public java.lang.String numFoldsPruningTipText()
public void setNumFoldsPruning(int value)
value
- number of folds in internal cross-validation
public int getNumFoldsPruning()
public java.lang.String usePruneTipText()
public void setUsePrune(boolean value)
value
- whether to use minimal cost-complexity pruning
public boolean getUsePrune()
public java.lang.String heuristicTipText()
public void setHeuristic(boolean value)
value
- whether to use heuristic search for nominal attributes in multi-class problems
public boolean getHeuristic()
public java.lang.String useOneSETipText()
public void setUseOneSE(boolean value)
value
- whether to use the 1SE rule to choose the final model
public boolean getUseOneSE()
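The 1SE rule referred to by setUseOneSE and getUseOneSE can be illustrated with a short sketch of the rule as it is usually stated for CART: instead of the candidate tree with the lowest cross-validated error, pick the most heavily pruned tree whose error is within one standard error of that minimum. The class `OneSERuleSketch`, the method `select` and the binomial standard-error estimate are assumptions made for illustration; SimpleCart's own implementation may differ in details such as tie-breaking.

```java
class OneSERuleSketch {

    /**
     * Picks a candidate tree from cross-validated error rates.
     *
     * @param errors       error rate per candidate, ordered from the largest
     *                     tree (smallest alpha) to the smallest tree
     * @param numInstances number of instances the error rates were estimated on
     * @param useOneSE     apply the one-standard-error rule if true
     * @return index of the selected candidate
     */
    static int select(double[] errors, int numInstances, boolean useOneSE) {
        // Candidate with the lowest estimated error (first one wins on ties).
        int best = 0;
        for (int i = 1; i < errors.length; i++) {
            if (errors[i] < errors[best]) {
                best = i;
            }
        }
        if (!useOneSE) {
            return best;
        }
        // Binomial standard error of the best error rate (an assumption here).
        double se = Math.sqrt(errors[best] * (1.0 - errors[best]) / numInstances);
        double threshold = errors[best] + se;
        // Walk back from the smallest tree; take the first one within the threshold.
        for (int i = errors.length - 1; i > best; i--) {
            if (errors[i] <= threshold) {
                return i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] errors = {0.30, 0.22, 0.20, 0.23, 0.35};
        System.out.println("minimum-error choice: " + select(errors, 100, false));
        System.out.println("1SE choice: " + select(errors, 100, true));
    }
}
```

With these illustrative numbers the minimum-error choice is candidate 2, while the 1SE rule moves one step further to the smaller candidate 3, trading a slightly higher estimated error for a simpler tree.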
public java.lang.String sizePerTipText()
public void setSizePer(double value)
value
- training set size, as a fraction of the data in (0, 1]
public double getSizePer()
public java.lang.String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class Classifier
public static void main(java.lang.String[] args)
args
- the options for the classifier