public class Discretize extends Filter implements SupervisedFilter, OptionHandler, WeightedInstancesHandler, WeightedAttributesHandler, TechnicalInformationHandler
@inproceedings{Fayyad1993,
   author = {Usama M. Fayyad and Keki B. Irani},
   booktitle = {Thirteenth International Joint Conference on Artificial Intelligence},
   pages = {1022-1027},
   publisher = {Morgan Kaufmann Publishers},
   title = {Multi-interval discretization of continuous-valued attributes for classification learning},
   volume = {2},
   year = {1993}
}

@inproceedings{Kononenko1995,
   author = {Igor Kononenko},
   booktitle = {14th International Joint Conference on Artificial Intelligence},
   pages = {1034-1040},
   title = {On Biases in Estimating Multi-Valued Attributes},
   year = {1995},
   PS = {http://ai.fri.uni-lj.si/papers/kononenko95-ijcai.ps.gz}
}

Valid options are:
-R <col1,col2-col4,...> Specifies list of columns to Discretize. First and last are valid indexes. (default none)
-V Invert matching sense of column indexes.
-D Output binary attributes for discretized attributes.
-Y Use bin numbers rather than ranges for discretized attributes.
-E Use better encoding of split point for MDL.
-K Use Kononenko's MDL criterion.
-precision <integer> Precision for bin boundary labels. (default = 6 decimal places).
-spread-attribute-weight When generating binary attributes, spread weight of old attribute across new attributes. Do not give each new attribute the old weight.
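The -E and -K options both adjust an MDL-based test that decides whether a candidate cut point is worth keeping. As a rough illustration only, the following sketch implements the acceptance test as published by Fayyad and Irani (1993): a split is kept when its information gain exceeds (log2(N-1) + delta) / N. The class and method names (MdlSplitSketch, acceptSplit) are hypothetical, the inputs are per-class instance counts on each side of the cut, and neither the -E encoding nor Kononenko's criterion (-K) is modeled. The filter's actual implementation may differ in detail.

```java
final class MdlSplitSketch {

    /** Base-2 logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Shannon entropy (in bits) of a class-count vector. */
    static double entropy(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * log2(p);
            }
        }
        return h;
    }

    /** Number of classes that actually occur in a count vector. */
    static int classesPresent(int[] counts) {
        int k = 0;
        for (int c : counts) {
            if (c > 0) {
                k++;
            }
        }
        return k;
    }

    /**
     * Fayyad and Irani's MDL acceptance test for one candidate cut point.
     * left[i] and right[i] are the counts of class i on each side of the cut.
     */
    static boolean acceptSplit(int[] left, int[] right) {
        int[] all = new int[left.length];
        int nLeft = 0;
        int nRight = 0;
        for (int i = 0; i < left.length; i++) {
            all[i] = left[i] + right[i];
            nLeft += left[i];
            nRight += right[i];
        }
        if (nLeft == 0 || nRight == 0) {
            return false; // a cut that leaves one side empty is not a split
        }
        int n = nLeft + nRight;
        double hAll = entropy(all);
        double hLeft = entropy(left);
        double hRight = entropy(right);
        double gain = hAll - ((double) nLeft / n) * hLeft - ((double) nRight / n) * hRight;

        int kAll = classesPresent(all);
        int kLeft = classesPresent(left);
        int kRight = classesPresent(right);
        double delta = log2(Math.pow(3, kAll) - 2)
                - (kAll * hAll - kLeft * hLeft - kRight * hRight);

        return gain > (log2(n - 1) + delta) / n;
    }

    public static void main(String[] args) {
        // A cut that separates two classes perfectly versus one that tells us nothing.
        System.out.println(acceptSplit(new int[] {10, 0}, new int[] {0, 10}));
        System.out.println(acceptSplit(new int[] {5, 5}, new int[] {5, 5}));
    }
}
```

In the full method the test is applied recursively: each accepted cut splits the value range in two, and each half is searched again until no candidate passes.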
Constructor and Description |
---|
Discretize(): Constructor - initialises the filter |
Modifier and Type | Method and Description |
---|---|
java.lang.String | attributeIndicesTipText(): Returns the tip text for this property |
boolean | batchFinished(): Signifies that this batch of input to the filter is finished. |
java.lang.String | binRangePrecisionTipText(): Returns the tip text for this property |
java.lang.String | getAttributeIndices(): Gets the current range selection |
int | getBinRangePrecision(): Get the precision for bin boundaries. |
java.lang.String | getBinRangesString(int attributeIndex): Gets the bin ranges string for an attribute |
Capabilities | getCapabilities(): Returns the Capabilities of this filter. |
double[] | getCutPoints(int attributeIndex): Gets the cut points for an attribute |
boolean | getInvertSelection(): Gets whether the supplied columns are to be removed or kept |
boolean | getMakeBinary(): Gets whether binary attributes should be made for discretized ones. |
java.lang.String[] | getOptions(): Gets the current settings of the filter. |
java.lang.String | getRevision(): Returns the revision string. |
boolean | getSpreadAttributeWeight(): If true, when generating binary attributes, spread weight of old attribute across new attributes. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
boolean | getUseBetterEncoding(): Gets whether better encoding is to be used for MDL. |
boolean | getUseBinNumbers(): Gets whether bin numbers rather than ranges should be used for discretized attributes. |
boolean | getUseKononenko(): Gets whether Kononenko's MDL criterion is to be used. |
java.lang.String | globalInfo(): Returns a string describing this filter |
boolean | input(Instance instance): Input an instance for filtering. |
java.lang.String | invertSelectionTipText(): Returns the tip text for this property |
java.util.Enumeration<Option> | listOptions(): Gets an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
java.lang.String | makeBinaryTipText(): Returns the tip text for this property |
void | setAttributeIndices(java.lang.String rangeList): Sets which attributes are to be Discretized (only numeric attributes among the selection will be Discretized). |
void | setAttributeIndicesArray(int[] attributes): Sets which attributes are to be Discretized (only numeric attributes among the selection will be Discretized). |
void | setBinRangePrecision(int p): Set the precision for bin boundaries. |
boolean | setInputFormat(Instances instanceInfo): Sets the format of the input instances. |
void | setInvertSelection(boolean invert): Sets whether selected columns should be removed or kept. |
void | setMakeBinary(boolean makeBinary): Sets whether binary attributes should be made for discretized ones. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setSpreadAttributeWeight(boolean p): If true, when generating binary attributes, spread weight of old attribute across new attributes. |
void | setUseBetterEncoding(boolean useBetterEncoding): Sets whether better encoding is to be used for MDL. |
void | setUseBinNumbers(boolean useBinNumbers): Sets whether bin numbers rather than ranges should be used for discretized attributes. |
void | setUseKononenko(boolean useKon): Sets whether Kononenko's MDL criterion is to be used. |
java.lang.String | spreadAttributeWeightTipText(): Returns the tip text for this property |
java.lang.String | useBetterEncodingTipText(): Returns the tip text for this property |
java.lang.String | useBinNumbersTipText(): Returns the tip text for this property |
java.lang.String | useKononenkoTipText(): Returns the tip text for this property |
Methods inherited from class weka.filters.Filter: batchFilterFile, debugTipText, doNotCheckCapabilitiesTipText, filterFile, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputPeek, postExecution, preExecution, run, runFilter, setDebug, setDoNotCheckCapabilities, toString, useFilter, wekaStaticWrapper
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
makeCopy
public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class Filter
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-R <col1,col2-col4,...> Specifies list of columns to Discretize. First and last are valid indexes. (default none)
-V Invert matching sense of column indexes.
-D Output binary attributes for discretized attributes.
-Y Use bin numbers rather than ranges for discretized attributes.
-E Use better encoding of split point for MDL.
-K Use Kononenko's MDL criterion.
-precision <integer> Precision for bin boundary labels. (default = 6 decimal places).
-spread-attribute-weight When generating binary attributes, spread weight of old attribute across new attributes. Do not give each new attribute the old weight.
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class Filter
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported

public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class Filter
public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class Filter
See Also: Capabilities
public boolean setInputFormat(Instances instanceInfo) throws java.lang.Exception
Overrides: setInputFormat in class Filter
Parameters: instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored; only the structure is required).
Throws: java.lang.Exception - if the input format can't be set successfully

public boolean input(Instance instance)
public boolean batchFinished()
Overrides: batchFinished in class Filter
Throws: java.lang.IllegalStateException - if no input structure has been defined

public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public java.lang.String spreadAttributeWeightTipText()
public void setSpreadAttributeWeight(boolean p)
Parameters: p - whether weight is spread

public boolean getSpreadAttributeWeight()
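To make the effect of the spread-attribute-weight setting concrete, here is a small sketch of the arithmetic it implies. It assumes (this is not confirmed by the text above) that binarization produces one binary attribute per cut point and that spreading divides the old weight equally among them. SpreadWeightSketch and binaryAttributeWeights are hypothetical names, not part of the filter's API.

```java
final class SpreadWeightSketch {

    /**
     * Weights assigned to the binary attributes derived from one attribute.
     * Assumption: one binary attribute per cut point, equal share when spreading.
     */
    static double[] binaryAttributeWeights(double oldWeight, int numCutPoints, boolean spread) {
        if (numCutPoints < 1) {
            throw new IllegalArgumentException("need at least one cut point");
        }
        double[] weights = new double[numCutPoints];
        // Either divide the old weight across the new attributes or copy it to each.
        java.util.Arrays.fill(weights, spread ? oldWeight / numCutPoints : oldWeight);
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(binaryAttributeWeights(1.0, 4, true)));
        System.out.println(java.util.Arrays.toString(binaryAttributeWeights(1.0, 4, false)));
    }
}
```

With spreading enabled the total weight of the new attributes equals the weight of the attribute they replace; without it, the total grows with the number of cut points.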
public java.lang.String binRangePrecisionTipText()
public void setBinRangePrecision(int p)
Parameters: p - the precision for bin boundaries

public int getBinRangePrecision()
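The precision setting only affects how bin boundaries are written into the labels, not where the cut points fall. The sketch below shows one plausible way to build range labels from a sorted array of cut points at a given number of decimal places. The label layout ("(-inf-x]", "(x-y]", "(y-inf)", and "All" when there are no cut points) is an assumption made for illustration, and BinLabelSketch, fmt, and labels are hypothetical names.

```java
final class BinLabelSketch {

    /** Rounds to the given number of decimal places and drops trailing zeros. */
    static String fmt(double value, int precision) {
        return java.math.BigDecimal.valueOf(value)
                .setScale(precision, java.math.RoundingMode.HALF_UP)
                .stripTrailingZeros()
                .toPlainString();
    }

    /** Builds one label per bin from sorted cut points (assumed label layout). */
    static java.util.List<String> labels(double[] cutPoints, int precision) {
        java.util.List<String> out = new java.util.ArrayList<>();
        if (cutPoints == null || cutPoints.length == 0) {
            out.add("All"); // no cut points: a single bin covering everything
            return out;
        }
        out.add("(-inf-" + fmt(cutPoints[0], precision) + "]");
        for (int i = 1; i < cutPoints.length; i++) {
            out.add("(" + fmt(cutPoints[i - 1], precision) + "-" + fmt(cutPoints[i], precision) + "]");
        }
        out.add("(" + fmt(cutPoints[cutPoints.length - 1], precision) + "-inf)");
        return out;
    }

    public static void main(String[] args) {
        System.out.println(labels(new double[] {2.5, 4.123456789}, 3));
    }
}
```

Lowering the precision shortens the labels but can make two nearby boundaries print identically, which is worth checking when cut points are close together.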
public java.lang.String makeBinaryTipText()
public boolean getMakeBinary()
public void setMakeBinary(boolean makeBinary)
Parameters: makeBinary - if binary attributes are to be made

public java.lang.String useBinNumbersTipText()
public boolean getUseBinNumbers()
public void setUseBinNumbers(boolean useBinNumbers)
Parameters: useBinNumbers - if bin numbers should be used

public java.lang.String useKononenkoTipText()
public boolean getUseKononenko()
public void setUseKononenko(boolean useKon)
Parameters: useKon - true if Kononenko's MDL criterion is to be used

public java.lang.String useBetterEncodingTipText()
public boolean getUseBetterEncoding()
public void setUseBetterEncoding(boolean useBetterEncoding)
Parameters: useBetterEncoding - true if better encoding is to be used

public java.lang.String invertSelectionTipText()
public boolean getInvertSelection()
public void setInvertSelection(boolean invert)
Parameters: invert - the new invert setting

public java.lang.String attributeIndicesTipText()
public java.lang.String getAttributeIndices()
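The attribute selection can be given either as a range-list string, indexed from 1 because it usually comes from a user, or as an int array, indexed from 0 because it usually comes from a program. The following sketch shows how the two conventions relate by expanding a range list such as "1,3-4,last" into a sorted 0-based array. It is an illustrative parser only: RangeListSketch and its methods are hypothetical, support for the keywords first and last follows the -R option text, and rejecting a reversed range such as "4-2" is a choice made here, not documented behavior.

```java
final class RangeListSketch {

    /** Expands a 1-based range list into sorted, de-duplicated 0-based indices. */
    static int[] toZeroBased(String rangeList, int numAttributes) {
        java.util.TreeSet<Integer> selected = new java.util.TreeSet<>();
        for (String part : rangeList.split(",")) {
            String p = part.trim();
            if (p.isEmpty()) {
                continue;
            }
            int dash = p.indexOf('-');
            int lo;
            int hi;
            if (dash < 0) {
                lo = parseIndex(p, numAttributes);
                hi = lo;
            } else {
                lo = parseIndex(p.substring(0, dash).trim(), numAttributes);
                hi = parseIndex(p.substring(dash + 1).trim(), numAttributes);
            }
            if (lo > hi) {
                throw new IllegalArgumentException("Reversed range: " + p);
            }
            for (int i = lo; i <= hi; i++) {
                selected.add(i - 1); // shift from 1-based to 0-based
            }
        }
        int[] out = new int[selected.size()];
        int k = 0;
        for (int index : selected) {
            out[k++] = index;
        }
        return out;
    }

    /** Parses one 1-based index token, accepting "first" and "last". */
    static int parseIndex(String token, int numAttributes) {
        int index;
        if (token.equals("first")) {
            index = 1;
        } else if (token.equals("last")) {
            index = numAttributes;
        } else {
            try {
                index = Integer.parseInt(token);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid index: " + token, e);
            }
        }
        if (index < 1 || index > numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(toZeroBased("1,3-4,last", 6)));
    }
}
```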
public void setAttributeIndices(java.lang.String rangeList)
Parameters: rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
Throws: java.lang.IllegalArgumentException - if an invalid range list is supplied

public void setAttributeIndicesArray(int[] attributes)
Parameters: attributes - an array containing indexes of attributes to Discretize. Since the array will typically come from a program, attributes are indexed from 0.
Throws: java.lang.IllegalArgumentException - if an invalid set of ranges is supplied

public double[] getCutPoints(int attributeIndex)
Parameters: attributeIndex - the index (from 0) of the attribute to get the cut points of

public java.lang.String getBinRangesString(int attributeIndex)
Parameters: attributeIndex - the index (from 0) of the attribute to get the bin ranges string of

public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Filter
public static void main(java.lang.String[] argv)
Parameters: argv - should contain arguments to the filter: use -h for help
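Once cut points are available from getCutPoints(int), client code sometimes needs to map a raw numeric value to its bin without running the filter again. The sketch below shows one way to do that. It assumes that the cut points are sorted in ascending order, that a value equal to a cut point falls in the lower bin, and that a null or empty array means a single bin; none of these is stated in the text above, so check them against the filter's actual output. BinIndexSketch and binIndex are hypothetical names.

```java
final class BinIndexSketch {

    /**
     * Returns the 0-based bin for a value, given ascending cut points.
     * Assumption: a value equal to a cut point belongs to the lower bin.
     */
    static int binIndex(double value, double[] cutPoints) {
        if (cutPoints == null) {
            return 0; // assumed: no cut points means one bin
        }
        int bin = 0;
        while (bin < cutPoints.length && value > cutPoints[bin]) {
            bin++;
        }
        return bin;
    }

    public static void main(String[] args) {
        double[] cuts = {2.5, 4.0};
        System.out.println(binIndex(2.5, cuts));
        System.out.println(binIndex(3.0, cuts));
        System.out.println(binIndex(5.0, cuts));
    }
}
```

For long cut-point arrays a binary search would be faster, but the linear scan keeps the boundary convention easy to see.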