public class Evaluation extends java.lang.Object implements Summarizable, RevisionHandler
public static void main(String [] args) {
runClassifier(new FunkyClassifier(), args);
}
------------------------------------------------------------------
Example usage from within an application:
Instances trainInstances = ... instances got from somewhere
Instances testInstances = ... instances got from somewhere
Classifier scheme = ... scheme got from somewhere
Evaluation evaluation = new Evaluation(trainInstances);
evaluation.evaluateModel(scheme, testInstances);
System.out.println(evaluation.toSummaryString());
Constructor and Description |
---|
Evaluation(Instances data)
Initializes all the counters for the evaluation.
|
Evaluation(Instances data,
CostMatrix costMatrix)
Initializes all the counters for the evaluation and also takes a cost
matrix as parameter.
|
Modifier and Type | Method and Description |
---|---|
double |
areaUnderROC(int classIndex)
Returns the area under ROC for those predictions that have been collected
in the evaluateClassifier(Classifier, Instances) method.
|
double |
avgCost()
Gets the average cost, that is, total cost of misclassifications (incorrect
plus unclassified) over the total number of instances.
|
double[][] |
confusionMatrix()
Returns a copy of the confusion matrix.
|
double |
correct()
Gets the number of instances correctly classified (that is, for which a
correct prediction was made).
|
double |
correlationCoefficient()
Returns the correlation coefficient if the class is numeric.
|
void |
crossValidateModel(Classifier classifier,
Instances data,
int numFolds,
java.util.Random random,
java.lang.Object... forPredictionsPrinting)
Performs a (stratified if class is nominal) cross-validation for a
classifier on a set of instances.
|
void |
crossValidateModel(java.lang.String classifierString,
Instances data,
int numFolds,
java.lang.String[] options,
java.util.Random random)
Performs a (stratified if class is nominal) cross-validation for a
classifier on a set of instances.
|
boolean |
equals(java.lang.Object obj)
Tests whether the current evaluation object is equal to another evaluation
object.
|
double |
errorRate()
Returns the estimated error rate or the root mean squared error (if the
class is numeric).
|
double[] |
evaluateModel(Classifier classifier,
Instances data,
java.lang.Object... forPredictionsPrinting)
Evaluates the classifier on a given set of instances.
|
static java.lang.String |
evaluateModel(Classifier classifier,
java.lang.String[] options)
Evaluates a classifier with the options given in an array of strings.
|
static java.lang.String |
evaluateModel(java.lang.String classifierString,
java.lang.String[] options)
Evaluates a classifier with the options given in an array of strings.
|
double |
evaluateModelOnce(Classifier classifier,
Instance instance)
Evaluates the classifier on a single instance.
|
double |
evaluateModelOnce(double[] dist,
Instance instance)
Evaluates the supplied distribution on a single instance.
|
void |
evaluateModelOnce(double prediction,
Instance instance)
Evaluates the supplied prediction on a single instance.
|
double |
evaluateModelOnceAndRecordPrediction(Classifier classifier,
Instance instance)
Evaluates the classifier on a single instance and records the prediction
(if the class is nominal).
|
double |
evaluateModelOnceAndRecordPrediction(double[] dist,
Instance instance)
Evaluates the supplied distribution on a single instance.
|
double |
falseNegativeRate(int classIndex)
Calculate the false negative rate with respect to a particular class.
|
double |
falsePositiveRate(int classIndex)
Calculate the false positive rate with respect to a particular class.
|
double |
fMeasure(int classIndex)
Calculate the F-Measure with respect to a particular class.
|
double[] |
getClassPriors()
Get the current weighted class counts
|
java.lang.String |
getRevision()
Returns the revision string.
|
double |
incorrect()
Gets the number of instances incorrectly classified (that is, for which an
incorrect prediction was made).
|
double |
kappa()
Returns value of kappa statistic if class is nominal.
|
double |
KBInformation()
Return the total Kononenko & Bratko Information score in bits
|
double |
KBMeanInformation()
Return the Kononenko & Bratko Information score in bits per instance.
|
double |
KBRelativeInformation()
Return the Kononenko & Bratko Relative Information score
|
static void |
main(java.lang.String[] args)
A test method for this class.
|
double |
meanAbsoluteError()
Returns the mean absolute error.
|
double |
meanPriorAbsoluteError()
Returns the mean absolute error of the prior.
|
double |
numFalseNegatives(int classIndex)
Calculate number of false negatives with respect to a particular class.
|
double |
numFalsePositives(int classIndex)
Calculate number of false positives with respect to a particular class.
|
double |
numInstances()
Gets the number of test instances that had a known class value (actually
the sum of the weights of test instances with known class value).
|
double |
numTrueNegatives(int classIndex)
Calculate the number of true negatives with respect to a particular class.
|
double |
numTruePositives(int classIndex)
Calculate the number of true positives with respect to a particular class.
|
double |
pctCorrect()
Gets the percentage of instances correctly classified (that is, for which a
correct prediction was made).
|
double |
pctIncorrect()
Gets the percentage of instances incorrectly classified (that is, for which
an incorrect prediction was made).
|
double |
pctUnclassified()
Gets the percentage of instances not classified (that is, for which no
prediction was made by the classifier).
|
double |
precision(int classIndex)
Calculate the precision with respect to a particular class.
|
FastVector |
predictions()
Returns the predictions that have been collected.
|
static void |
printClassifications(Classifier classifier,
Instances train,
ConverterUtils.DataSource testSource,
int classIndex,
Range attributesToOutput,
boolean printDistribution,
java.lang.StringBuffer text)
Prints the predictions for the given dataset into a supplied StringBuffer
|
static void |
printClassifications(Classifier classifier,
Instances train,
ConverterUtils.DataSource testSource,
int classIndex,
Range attributesToOutput,
java.lang.StringBuffer predsText)
Prints the predictions for the given dataset into a String variable.
|
double |
priorEntropy()
Calculate the entropy of the prior distribution
|
double |
recall(int classIndex)
Calculate the recall with respect to a particular class.
|
double |
relativeAbsoluteError()
Returns the relative absolute error.
|
double |
rootMeanPriorSquaredError()
Returns the root mean prior squared error.
|
double |
rootMeanSquaredError()
Returns the root mean squared error.
|
double |
rootRelativeSquaredError()
Returns the root relative squared error if the class is numeric.
|
void |
setPriors(Instances train)
Sets the class prior probabilities
|
double |
SFEntropyGain()
Returns the total SF, which is the null model entropy minus the scheme
entropy.
|
double |
SFMeanEntropyGain()
Returns the SF per instance, which is the null model entropy minus the
scheme entropy, per instance.
|
double |
SFMeanPriorEntropy()
Returns the entropy per instance for the null model
|
double |
SFMeanSchemeEntropy()
Returns the entropy per instance for the scheme
|
double |
SFPriorEntropy()
Returns the total entropy for the null model
|
double |
SFSchemeEntropy()
Returns the total entropy for the scheme
|
java.lang.String |
toClassDetailsString()
Generates a breakdown of the accuracy for each class (with default title),
incorporating various information-retrieval statistics, such as true/false
positive rate, precision/recall/F-Measure.
|
java.lang.String |
toClassDetailsString(java.lang.String title)
Generates a breakdown of the accuracy for each class, incorporating various
information-retrieval statistics, such as true/false positive rate,
precision/recall/F-Measure.
|
java.lang.String |
toCumulativeMarginDistributionString()
Output the cumulative margin distribution as a string suitable for input
for gnuplot or similar package.
|
java.lang.String |
toMatrixString()
Calls toMatrixString() with a default title.
|
java.lang.String |
toMatrixString(java.lang.String title)
Outputs the performance statistics as a classification confusion matrix.
|
java.lang.String |
toSummaryString()
Calls toSummaryString() with no title and no complexity stats
|
java.lang.String |
toSummaryString(boolean printComplexityStatistics)
Calls toSummaryString() with a default title.
|
java.lang.String |
toSummaryString(java.lang.String title,
boolean printComplexityStatistics)
Outputs the performance statistics in summary form.
|
double |
totalCost()
Gets the total cost, that is, the cost of each prediction times the weight
of the instance, summed over all instances.
|
double |
trueNegativeRate(int classIndex)
Calculate the true negative rate with respect to a particular class.
|
double |
truePositiveRate(int classIndex)
Calculate the true positive rate with respect to a particular class.
|
double |
unclassified()
Gets the number of instances not classified (that is, for which no
prediction was made by the classifier).
|
void |
updatePriors(Instance instance)
Updates the class prior probabilities (when incrementally training)
|
void |
useNoPriors()
Disables the use of priors, e.g., in case of de-serialized schemes that
have no access to the original training set, but are evaluated on a test
set.
|
double |
weightedAreaUnderROC()
Calculates the weighted (by class size) AUC.
|
double |
weightedFalseNegativeRate()
Calculates the weighted (by class size) false negative rate.
|
double |
weightedFalsePositiveRate()
Calculates the weighted (by class size) false positive rate.
|
double |
weightedFMeasure()
Calculates the weighted (by class size) F-Measure.
|
double |
weightedPrecision()
Calculates the weighted (by class size) precision.
|
double |
weightedRecall()
Calculates the weighted (by class size) recall.
|
double |
weightedTrueNegativeRate()
Calculates the weighted (by class size) true negative rate.
|
double |
weightedTruePositiveRate()
Calculates the weighted (by class size) true positive rate.
|
static java.lang.String |
wekaStaticWrapper(Sourcable classifier,
java.lang.String className)
Wraps a static classifier in enough source to test using the weka class
libraries.
|
public Evaluation(Instances data) throws java.lang.Exception
Use useNoPriors() if the dataset is the test set and you can't initialize with the priors from the training set via setPriors(Instances).
data - set of training instances, to get some header information and prior class distribution information
java.lang.Exception - if the class is not defined
See also: useNoPriors(), setPriors(Instances)
public Evaluation(Instances data, CostMatrix costMatrix) throws java.lang.Exception
Use useNoPriors() if the dataset is the test set and you can't initialize with the priors from the training set via setPriors(Instances).
data - set of training instances, to get some header information and prior class distribution information
costMatrix - the cost matrix; if null, default costs will be used
java.lang.Exception - if cost matrix is not compatible with data, the class is not defined or the class is numeric
See also: useNoPriors(), setPriors(Instances)
public double areaUnderROC(int classIndex)
classIndex - the index of the class to consider as "positive"
public double weightedAreaUnderROC()
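As an illustration of what an area-under-ROC value measures for one "positive" class, the following minimal sketch computes it from scores with the rank-sum (Mann-Whitney) statistic. It is not how this class computes the value; the class name `AucSketch`, the method signature, and the input arrays are assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical helper, not part of the Evaluation class. */
final class AucSketch {

    /**
     * Area under the ROC curve via the rank-sum statistic.
     * scores[i] is the predicted probability of the "positive" class for
     * instance i; isPositive[i] tells whether its actual class is positive.
     * Tied scores receive the average of the ranks they span.
     */
    static double areaUnderROC(double[] scores, boolean[] isPositive) {
        int n = scores.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort instance indices by ascending score.
        Arrays.sort(order, (a, b) -> Double.compare(scores[a], scores[b]));

        double positiveRankSum = 0.0;
        long positives = 0;
        int i = 0;
        while (i < n) {
            int j = i;
            while (j + 1 < n && scores[order[j + 1]] == scores[order[i]]) {
                j++;
            }
            double averageRank = (i + j) / 2.0 + 1.0; // ranks are 1-based
            for (int k = i; k <= j; k++) {
                if (isPositive[order[k]]) {
                    positiveRankSum += averageRank;
                    positives++;
                }
            }
            i = j + 1;
        }
        long negatives = n - positives;
        if (positives == 0 || negatives == 0) {
            return Double.NaN; // undefined without both classes present
        }
        double u = positiveRankSum - positives * (positives + 1) / 2.0;
        return u / ((double) positives * negatives);
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.4, 0.6, 0.1};
        boolean[] isPositive = {true, true, false, false};
        System.out.println(areaUnderROC(scores, isPositive));
    }
}
```

A weighted variant in the spirit of weightedAreaUnderROC() would average the per-class values using the class sizes as weights.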
public double[][] confusionMatrix()
public void crossValidateModel(Classifier classifier, Instances data, int numFolds, java.util.Random random, java.lang.Object... forPredictionsPrinting) throws java.lang.Exception
classifier - the classifier with any options set
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
random - random number generator for randomization
forPredictionsPrinting - varargs parameter that, if supplied, is expected to hold a StringBuffer to print predictions to, a Range of attributes to output and a Boolean (true if the distribution is to be printed)
java.lang.Exception - if a classifier could not be generated successfully or the class is not defined
public void crossValidateModel(java.lang.String classifierString, Instances data, int numFolds, java.lang.String[] options, java.util.Random random) throws java.lang.Exception
classifierString - a string naming the class of the classifier
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
options - the options to the classifier. Any options accepted by the classifier will be removed from this array.
random - the random number generator for randomizing the data
java.lang.Exception - if a classifier could not be generated successfully or the class is not defined
public static java.lang.String evaluateModel(java.lang.String classifierString, java.lang.String[] options) throws java.lang.Exception
classifierString - class of machine learning classifier as a string
options - the array of strings containing the options
java.lang.Exception - if model could not be evaluated successfully
public static void main(java.lang.String[] args)
args - an array of command line arguments, the first of which must be the class name of a classifier
public static java.lang.String evaluateModel(Classifier classifier, java.lang.String[] options) throws java.lang.Exception
classifier - machine learning classifier
options - the array of strings containing the options
java.lang.Exception - if model could not be evaluated successfully
public double[] evaluateModel(Classifier classifier, Instances data, java.lang.Object... forPredictionsPrinting) throws java.lang.Exception
classifier - machine learning classifier
data - set of test instances for evaluation
forPredictionsPrinting - varargs parameter that, if supplied, is expected to hold a StringBuffer to print predictions to, a Range of attributes to output and a Boolean (true if the distribution is to be printed)
java.lang.Exception - if model could not be evaluated successfully
public double evaluateModelOnceAndRecordPrediction(Classifier classifier, Instance instance) throws java.lang.Exception
classifier - machine learning classifier
instance - the test instance to be classified
java.lang.Exception - if model could not be evaluated successfully or the data contains string attributes
public double evaluateModelOnce(Classifier classifier, Instance instance) throws java.lang.Exception
classifier - machine learning classifier
instance - the test instance to be classified
java.lang.Exception - if model could not be evaluated successfully or the data contains string attributes
public double evaluateModelOnce(double[] dist, Instance instance) throws java.lang.Exception
dist - the supplied distribution
instance - the test instance to be classified
java.lang.Exception - if model could not be evaluated successfully
public double evaluateModelOnceAndRecordPrediction(double[] dist, Instance instance) throws java.lang.Exception
dist - the supplied distribution
instance - the test instance to be classified
java.lang.Exception - if model could not be evaluated successfully
public void evaluateModelOnce(double prediction, Instance instance) throws java.lang.Exception
prediction - the supplied prediction
instance - the test instance to be classified
java.lang.Exception - if model could not be evaluated successfully
public FastVector predictions()
public static java.lang.String wekaStaticWrapper(Sourcable classifier, java.lang.String className) throws java.lang.Exception
classifier - a Sourcable Classifier
className - the name to give to the source code class
java.lang.Exception - if code-generation fails
public final double numInstances()
public final double incorrect()
public final double pctIncorrect()
public final double totalCost()
public final double avgCost()
public final double correct()
public final double pctCorrect()
public final double unclassified()
public final double pctUnclassified()
public final double errorRate()
public final double kappa()
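To make the kappa statistic concrete, here is a minimal sketch that derives it from a confusion matrix as (observed agreement - chance agreement) / (1 - chance agreement). The class `KappaSketch`, the `confusion[actual][predicted]` layout, and the choice to return 1 when chance agreement is already 1 are assumptions for this example, not a statement of how kappa() is implemented.

```java
/** Illustrative sketch only; hypothetical helper, not part of the Evaluation class. */
final class KappaSketch {

    /**
     * Cohen's kappa from a square confusion matrix, assumed to be laid out
     * as confusion[actual][predicted] with (possibly weighted) counts.
     */
    static double kappa(double[][] confusion) {
        int numClasses = confusion.length;
        double[] rowSum = new double[numClasses];
        double[] colSum = new double[numClasses];
        double total = 0.0;
        double diagonal = 0.0;
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                rowSum[i] += confusion[i][j];
                colSum[j] += confusion[i][j];
                total += confusion[i][j];
                if (i == j) {
                    diagonal += confusion[i][j];
                }
            }
        }
        double observed = diagonal / total;
        double chance = 0.0;
        for (int i = 0; i < numClasses; i++) {
            chance += rowSum[i] * colSum[i];
        }
        chance /= total * total;
        if (chance >= 1.0) {
            return 1.0; // assumption: treat the degenerate case as full agreement
        }
        return (observed - chance) / (1.0 - chance);
    }

    public static void main(String[] args) {
        double[][] confusion = {{20, 5}, {10, 15}};
        System.out.println(kappa(confusion));
    }
}
```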
public final double correlationCoefficient() throws java.lang.Exception
java.lang.Exception - if class is not numeric
public final double meanAbsoluteError()
public final double meanPriorAbsoluteError()
public final double relativeAbsoluteError() throws java.lang.Exception
java.lang.Exception - if it can't be computed
public final double rootMeanSquaredError()
public final double rootMeanPriorSquaredError()
public final double rootRelativeSquaredError()
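The numeric error measures listed above relate to each other in a simple way, which the following sketch illustrates for a numeric class. It assumes that the "prior" predictor always outputs one constant value (for instance the mean class value of the training data) and that the relative measures are reported as percentages; both points, and the class name `NumericErrorSketch`, are assumptions for the example rather than documented behavior.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical helper, not part of the Evaluation class. */
final class NumericErrorSketch {

    /** Mean of |predicted - actual| over all instances. */
    static double meanAbsoluteError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / actual.length;
    }

    /** Square root of the mean of (predicted - actual)^2. */
    static double rootMeanSquaredError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = predicted[i] - actual[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    /** Predictions of the assumed "prior" model: the same constant everywhere. */
    static double[] constantPredictions(double prior, int n) {
        double[] result = new double[n];
        Arrays.fill(result, prior);
        return result;
    }

    /** Mean absolute error relative to the prior model, in percent (assumption). */
    static double relativeAbsoluteError(double[] actual, double[] predicted, double prior) {
        double[] priorPredictions = constantPredictions(prior, actual.length);
        return 100.0 * meanAbsoluteError(actual, predicted)
                / meanAbsoluteError(actual, priorPredictions);
    }

    /** Root mean squared error relative to the prior model, in percent (assumption). */
    static double rootRelativeSquaredError(double[] actual, double[] predicted, double prior) {
        double[] priorPredictions = constantPredictions(prior, actual.length);
        return 100.0 * rootMeanSquaredError(actual, predicted)
                / rootMeanSquaredError(actual, priorPredictions);
    }

    public static void main(String[] args) {
        double[] actual = {1, 2, 3, 4};
        double[] predicted = {1, 2, 3, 6};
        double prior = 2.5; // assumed mean class value of the training data
        System.out.println(meanAbsoluteError(actual, predicted));
        System.out.println(rootMeanSquaredError(actual, predicted));
        System.out.println(relativeAbsoluteError(actual, predicted, prior));
        System.out.println(rootRelativeSquaredError(actual, predicted, prior));
    }
}
```

A relative error below 100 in this sketch means the scheme does better than always predicting the prior value.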
public final double priorEntropy() throws java.lang.Exception
java.lang.Exception - if the class is not nominal
public final double KBInformation() throws java.lang.Exception
java.lang.Exception - if the class is not nominal
public final double KBMeanInformation() throws java.lang.Exception
java.lang.Exception - if the class is not nominal
public final double KBRelativeInformation() throws java.lang.Exception
java.lang.Exception - if the class is not nominal
public final double SFPriorEntropy()
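For a nominal class, the entropy of a prior class distribution can be illustrated with the usual Shannon formula in bits. The sketch below is a generic calculation from class counts; the class `PriorEntropySketch` is hypothetical, and any smoothing or weighting this class may apply to its priors is not modeled here.

```java
/** Illustrative sketch only; hypothetical helper, not part of the Evaluation class. */
final class PriorEntropySketch {

    /**
     * Shannon entropy in bits of the distribution obtained by normalizing
     * the given (possibly weighted) class counts. Classes with a zero count
     * contribute nothing to the sum.
     */
    static double entropyBits(double[] classCounts) {
        double total = 0.0;
        for (double count : classCounts) {
            total += count;
        }
        double entropy = 0.0;
        for (double count : classCounts) {
            if (count > 0.0) {
                double p = count / total;
                entropy -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return entropy;
    }

    public static void main(String[] args) {
        System.out.println(entropyBits(new double[]{5, 5}));
        System.out.println(entropyBits(new double[]{9, 1}));
    }
}
```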
public final double SFMeanPriorEntropy()
public final double SFSchemeEntropy()
public final double SFMeanSchemeEntropy()
public final double SFEntropyGain()
public final double SFMeanEntropyGain()
public java.lang.String toCumulativeMarginDistributionString() throws java.lang.Exception
java.lang.Exception - if the class attribute is nominal
public java.lang.String toSummaryString()
Specified by: toSummaryString in interface Summarizable
public java.lang.String toSummaryString(boolean printComplexityStatistics)
printComplexityStatistics - if true, complexity statistics are returned as well
public java.lang.String toSummaryString(java.lang.String title, boolean printComplexityStatistics)
title - the title for the statistics
printComplexityStatistics - if true, complexity statistics are returned as well
public java.lang.String toMatrixString() throws java.lang.Exception
java.lang.Exception - if the class is numeric
public java.lang.String toMatrixString(java.lang.String title) throws java.lang.Exception
title - the title for the confusion matrix
java.lang.Exception - if the class is numeric
public java.lang.String toClassDetailsString() throws java.lang.Exception
java.lang.Exception - if class is not nominal
public java.lang.String toClassDetailsString(java.lang.String title) throws java.lang.Exception
title - the title to prepend the stats string with
java.lang.Exception - if class is not nominal
public double numTruePositives(int classIndex)
correctly classified positives
classIndex - the index of the class to consider as "positive"
public double truePositiveRate(int classIndex)
correctly classified positives / total positives
classIndex - the index of the class to consider as "positive"
public double weightedTruePositiveRate()
public double numTrueNegatives(int classIndex)
correctly classified negatives
classIndex - the index of the class to consider as "positive"
public double trueNegativeRate(int classIndex)
correctly classified negatives / total negatives
classIndex - the index of the class to consider as "positive"
public double weightedTrueNegativeRate()
public double numFalsePositives(int classIndex)
incorrectly classified negatives
classIndex - the index of the class to consider as "positive"
public double falsePositiveRate(int classIndex)
incorrectly classified negatives / total negatives
classIndex - the index of the class to consider as "positive"
public double weightedFalsePositiveRate()
public double numFalseNegatives(int classIndex)
incorrectly classified positives
classIndex - the index of the class to consider as "positive"
public double falseNegativeRate(int classIndex)
incorrectly classified positives / total positives
classIndex - the index of the class to consider as "positive"
public double weightedFalseNegativeRate()
public double recall(int classIndex)
correctly classified positives / total positives
(Which is also the same as the truePositiveRate.)
classIndex - the index of the class to consider as "positive"
public double weightedRecall()
public double precision(int classIndex)
correctly classified positives / total predicted as positive
classIndex - the index of the class to consider as "positive"
public double weightedPrecision()
public double fMeasure(int classIndex)
(2 * recall * precision) / (recall + precision)
classIndex - the index of the class to consider as "positive"
public double weightedFMeasure()
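The per-class ratios above can be worked through on a small confusion matrix. The sketch below assumes the matrix is laid out as `confusion[actual][predicted]`, returns 0 when a denominator is 0, and weights the per-class F-Measure by the number of actual instances of each class; these choices and the class name `ClassStatsSketch` are assumptions for illustration, not a description of this class's internals.

```java
/** Illustrative sketch only; hypothetical helper, not part of the Evaluation class. */
final class ClassStatsSketch {

    /** Instances of class c that were predicted as c. */
    static double truePositives(double[][] confusion, int c) {
        return confusion[c][c];
    }

    /** Instances of other classes that were predicted as c. */
    static double falsePositives(double[][] confusion, int c) {
        double sum = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            if (actual != c) {
                sum += confusion[actual][c];
            }
        }
        return sum;
    }

    /** Instances of class c that were predicted as some other class. */
    static double falseNegatives(double[][] confusion, int c) {
        double sum = 0.0;
        for (int predicted = 0; predicted < confusion.length; predicted++) {
            if (predicted != c) {
                sum += confusion[c][predicted];
            }
        }
        return sum;
    }

    /** correctly classified positives / total positives */
    static double recall(double[][] confusion, int c) {
        double tp = truePositives(confusion, c);
        double denominator = tp + falseNegatives(confusion, c);
        return denominator == 0.0 ? 0.0 : tp / denominator;
    }

    /** correctly classified positives / total predicted as positive */
    static double precision(double[][] confusion, int c) {
        double tp = truePositives(confusion, c);
        double denominator = tp + falsePositives(confusion, c);
        return denominator == 0.0 ? 0.0 : tp / denominator;
    }

    /** (2 * recall * precision) / (recall + precision) */
    static double fMeasure(double[][] confusion, int c) {
        double r = recall(confusion, c);
        double p = precision(confusion, c);
        return (r + p) == 0.0 ? 0.0 : 2.0 * r * p / (r + p);
    }

    /** Per-class F-Measure averaged with class sizes (row sums) as weights. */
    static double weightedFMeasure(double[][] confusion) {
        double total = 0.0;
        double weighted = 0.0;
        for (int c = 0; c < confusion.length; c++) {
            double classSize = truePositives(confusion, c) + falseNegatives(confusion, c);
            total += classSize;
            weighted += classSize * fMeasure(confusion, c);
        }
        return weighted / total;
    }

    public static void main(String[] args) {
        double[][] confusion = {{8, 2}, {1, 9}};
        System.out.println(recall(confusion, 0));
        System.out.println(precision(confusion, 0));
        System.out.println(fMeasure(confusion, 0));
        System.out.println(weightedFMeasure(confusion));
    }
}
```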
public void setPriors(Instances train) throws java.lang.Exception
train - the training instances used to determine the prior probabilities
java.lang.Exception - if the class attribute of the instances is not set
public double[] getClassPriors()
public void updatePriors(Instance instance) throws java.lang.Exception
instance - the new training instance seen
java.lang.Exception - if the class of the instance is not set
public void useNoPriors()
public boolean equals(java.lang.Object obj)
Overrides: equals in class java.lang.Object
obj - the object to compare against
public static void printClassifications(Classifier classifier, Instances train, ConverterUtils.DataSource testSource, int classIndex, Range attributesToOutput, java.lang.StringBuffer predsText) throws java.lang.Exception
classifier - the classifier to use
train - the training data
testSource - the test set
classIndex - the class index (1-based); if -1, it does not override the class index stored in the data file (by using the last attribute)
attributesToOutput - the indices of the attributes to output
java.lang.Exception - if test file cannot be opened
public static void printClassifications(Classifier classifier, Instances train, ConverterUtils.DataSource testSource, int classIndex, Range attributesToOutput, boolean printDistribution, java.lang.StringBuffer text) throws java.lang.Exception
classifier - the classifier to use
train - the training data
testSource - the test set
classIndex - the class index (1-based); if -1, it does not override the class index stored in the data file (by using the last attribute)
attributesToOutput - the indices of the attributes to output
printDistribution - prints the complete distribution for nominal classes, not just the predicted value
text - StringBuffer to hold the printed predictions
java.lang.Exception - if test file cannot be opened
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler