public class ComplementNaiveBayes extends Classifier implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler
@inproceedings{Rennie2003,
author = {Jason D. Rennie and Lawrence Shih and Jaime Teevan and David R. Karger},
booktitle = {ICML},
pages = {616-623},
publisher = {AAAI Press},
title = {Tackling the Poor Assumptions of Naive Bayes Text Classifiers},
year = {2003}
}
Valid options are:
-N Normalize the word weights for each class
-S Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
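The effect of the two options can be illustrated with a small standalone sketch. It does not use the Weka classes and is not Weka's source code. It assumes the complement weight definition from Rennie et al. (2003): the weight of word i for class c is the log of the smoothed frequency of word i in all classes other than c. The class name `CnbWeightSketch`, the method `complementWeights`, and the `counts` layout are hypothetical names chosen for illustration.

```java
public class CnbWeightSketch {

    /**
     * Computes complement word weights (assumed definition, after Rennie et al. 2003).
     *
     * counts[c][i] is the total frequency of word i in the training documents of class c.
     * smoothing plays the role of the -S option; normalize plays the role of -N.
     */
    public static double[][] complementWeights(double[][] counts, double smoothing, boolean normalize) {
        int numClasses = counts.length;
        int numWords = counts[0].length;
        double[][] weights = new double[numClasses][numWords];

        for (int c = 0; c < numClasses; c++) {
            // Accumulate word counts over every class except c (the complement of c).
            double[] complement = new double[numWords];
            double complementTotal = 0.0;
            for (int other = 0; other < numClasses; other++) {
                if (other == c) {
                    continue;
                }
                for (int i = 0; i < numWords; i++) {
                    complement[i] += counts[other][i];
                    complementTotal += counts[other][i];
                }
            }

            // Smoothed log estimate; with smoothing == 0 an unseen word yields log(0).
            double absSum = 0.0;
            for (int i = 0; i < numWords; i++) {
                weights[c][i] = Math.log((complement[i] + smoothing)
                        / (complementTotal + smoothing * numWords));
                absSum += Math.abs(weights[c][i]);
            }

            // Optional normalization: divide by the sum of absolute weights of the class.
            if (normalize && absSum > 0.0) {
                for (int i = 0; i < numWords; i++) {
                    weights[c][i] /= absSum;
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        double[][] counts = {{3, 0}, {0, 2}};
        double[][] weights = complementWeights(counts, 1.0, true);
        for (double[] row : weights) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

With the real classifier, the same two settings are exposed through `setSmoothingParameter(double)` and `setNormalizeWordWeights(boolean)`, or through the `-S` and `-N` flags passed to `setOptions`.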
| Constructor and Description |
|---|
| ComplementNaiveBayes() |
| Modifier and Type | Method and Description |
|---|---|
| void | buildClassifier(Instances instances)<br>Generates the classifier. |
| double | classifyInstance(Instance instance)<br>Classifies a given instance. |
| Capabilities | getCapabilities()<br>Returns default capabilities of the classifier. |
| boolean | getNormalizeWordWeights()<br>Returns true if the word weights for each class are to be normalized. |
| java.lang.String[] | getOptions()<br>Gets the current settings of the classifier. |
| java.lang.String | getRevision()<br>Returns the revision string. |
| double | getSmoothingParameter()<br>Gets the smoothing value to be used to avoid zero WordGivenClass probabilities. |
| TechnicalInformation | getTechnicalInformation()<br>Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| java.lang.String | globalInfo()<br>Returns a string describing this classifier. |
| java.util.Enumeration | listOptions()<br>Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] argv)<br>Main method for testing this class. |
| java.lang.String | normalizeWordWeightsTipText()<br>Returns the tip text for this property. |
| void | setNormalizeWordWeights(boolean doNormalize)<br>Sets whether the word weights for each class should be normalized. |
| void | setOptions(java.lang.String[] options)<br>Parses a given list of options. |
| void | setSmoothingParameter(double val)<br>Sets the smoothing value used to avoid zero WordGivenClass probabilities. |
| java.lang.String | smoothingParameterTipText()<br>Returns the tip text for this property. |
| java.lang.String | toString()<br>Prints out the internal model built by the classifier. |
Methods inherited from class Classifier: debugTipText, distributionForInstance, forName, getDebug, makeCopies, makeCopy, setDebug

public java.util.Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class Classifier

public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class Classifier

public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-N Normalize the word weights for each class
-S Smoothing value to avoid zero WordGivenClass probabilities (default=1.0).
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class Classifier
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported

public boolean getNormalizeWordWeights()

public void setNormalizeWordWeights(boolean doNormalize)
Parameters: doNormalize - whether the word weights are to be normalized

public java.lang.String normalizeWordWeightsTipText()

public double getSmoothingParameter()

public void setSmoothingParameter(double val)
Parameters: val - the new smoothing value

public java.lang.String smoothingParameterTipText()

public java.lang.String globalInfo()

public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler

public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class Classifier
See Also: Capabilities

public void buildClassifier(Instances instances) throws java.lang.Exception
Specified by: buildClassifier in class Classifier
Parameters: instances - set of instances serving as training data
Throws: java.lang.Exception - if the classifier has not been built successfully

public double classifyInstance(Instance instance) throws java.lang.Exception
The classification rule is:
MinC(forAllWords(ti*Wci))
where
ti is the frequency of word i in the given instance
Wci is the weight of word i in Class c.
For more information, see section 4.4 of the paper cited above in the classifier's description.
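The rule above can be sketched as a standalone method that picks the class with the smallest weighted sum. This is an illustration only, not the Weka source. The class name `CnbRuleSketch`, the method `classify`, and the array layout (`weights[c][i]` for Wci, `termFreqs[i]` for ti) are hypothetical, and the sketch returns an `int` class index whereas `classifyInstance` returns the index as a `double`.

```java
public class CnbRuleSketch {

    /**
     * Returns the index c that minimizes sum_i termFreqs[i] * weights[c][i].
     *
     * termFreqs[i] is the frequency of word i in the instance (ti).
     * weights[c][i] is the weight of word i in class c (Wci).
     */
    public static int classify(double[] termFreqs, double[][] weights) {
        int bestClass = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            double score = 0.0;
            for (int i = 0; i < termFreqs.length; i++) {
                score += termFreqs[i] * weights[c][i];
            }
            // Keep the class with the lowest score; ties go to the lower index.
            if (score < bestScore) {
                bestScore = score;
                bestClass = c;
            }
        }
        return bestClass;
    }

    public static void main(String[] args) {
        // Illustrative weights for two classes and two words (made-up values).
        double[][] weights = {
            {Math.log(0.25), Math.log(0.75)},
            {Math.log(0.8), Math.log(0.2)}
        };
        System.out.println(classify(new double[] {2, 0}, weights));
        System.out.println(classify(new double[] {0, 3}, weights));
    }
}
```

The minimum is taken rather than the maximum because, under the assumed definition, each weight measures how well a word fits the complement of the class, so a low sum means the instance fits the other classes poorly.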
Overrides: classifyInstance in class Classifier
Parameters: instance - the instance to classify
Throws: java.lang.Exception - if the classifier has not been built yet.

public java.lang.String toString()
Overrides: toString in class java.lang.Object

public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Classifier

public static void main(java.lang.String[] argv)
Parameters: argv - the options