public class InterquartileRange extends SimpleBatchFilter implements WeightedAttributesHandler
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns on which to base outlier/extreme value detection. An instance is tagged as an outlier/extreme value if it qualifies as one in at least one of those attributes. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M Generates an additional attribute 'Offset' per Outlier/ExtremeValue pair that contains the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. Note: implicitly sets '-P'. (default: off)
Thanks to Dale for a few brainstorming sessions.
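As a rough illustration of how the outlier factor (-O), the extreme values factor (-E), and the -E-as-O flag could interact, the sketch below tags a single value against thresholds built from the quartiles and the IQR. The threshold rule (Q1 or Q3 plus or minus factor times IQR), the interpolating quartile estimate, and every name in it (`IqrTaggingSketch`, `Tag`, `quantile`, `classify`, `isOutlier`) are assumptions made for illustration; none of them is taken from this class's source.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the filter's actual implementation. */
final class IqrTaggingSketch {

    enum Tag { NORMAL, OUTLIER, EXTREME }

    /** Quantile by linear interpolation on a sorted copy (an assumed quartile rule). */
    static double quantile(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Tags x using assumed thresholds Q1 - f * IQR and Q3 + f * IQR. */
    static Tag classify(double x, double q1, double q3,
                        double outlierFactor, double extremeFactor) {
        double iqr = q3 - q1;
        if (x > q3 + extremeFactor * iqr || x < q1 - extremeFactor * iqr) {
            return Tag.EXTREME;
        }
        if (x > q3 + outlierFactor * iqr || x < q1 - outlierFactor * iqr) {
            return Tag.OUTLIER;
        }
        return Tag.NORMAL;
    }

    /** Mirrors the idea of -E-as-O: optionally count extreme values as outliers too. */
    static boolean isOutlier(Tag tag, boolean extremeValuesAsOutliers) {
        return tag == Tag.OUTLIER || (extremeValuesAsOutliers && tag == Tag.EXTREME);
    }
}
```

With the documented defaults, a call would pass an outlier factor of 3 and an extreme values factor of 2 * 3 = 6, for example `IqrTaggingSketch.classify(x, q1, q3, 3.0, 6.0)`.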
Modifier and Type | Class and Description |
---|---|
static class | InterquartileRange.ValueType: enum for obtaining the various determined IQR values. |

Modifier and Type | Field and Description |
---|---|
static int | NON_NUMERIC: indicator for non-numeric attributes. |

Constructor and Description |
---|
InterquartileRange() |

Modifier and Type | Method and Description |
---|---|
java.lang.String | attributeIndicesTipText(): Returns the tip text for this property. |
java.lang.String | detectionPerAttributeTipText(): Returns the tip text for this property. |
java.lang.String | extremeValuesAsOutliersTipText(): Returns the tip text for this property. |
java.lang.String | extremeValuesFactorTipText(): Returns the tip text for this property. |
java.lang.String | getAttributeIndices(): Gets the current range selection. |
Capabilities | getCapabilities(): Returns the Capabilities of this filter. |
boolean | getDetectionPerAttribute(): Gets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false"). |
boolean | getExtremeValuesAsOutliers(): Gets whether extreme values are also tagged as outliers. |
double | getExtremeValuesFactor(): Gets the factor for determining the thresholds for extreme values. |
java.lang.String[] | getOptions(): Gets the current settings of the filter. |
double | getOutlierFactor(): Gets the factor for determining the thresholds for outliers. |
boolean | getOutputOffsetMultiplier(): Gets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. |
java.lang.String | getRevision(): Returns the revision string. |
double[] | getValues(InterquartileRange.ValueType type): Returns the values for the specified type. |
java.lang.String | globalInfo(): Returns a string describing this filter. |
java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] args): Main method for testing this class. |
java.lang.String | outlierFactorTipText(): Returns the tip text for this property. |
java.lang.String | outputOffsetMultiplierTipText(): Returns the tip text for this property. |
void | setAttributeIndices(java.lang.String value): Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used). |
void | setAttributeIndicesArray(int[] value): Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used). |
void | setDetectionPerAttribute(boolean value): Sets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false"). |
void | setExtremeValuesAsOutliers(boolean value): Sets whether extreme values are also tagged as outliers. |
void | setExtremeValuesFactor(double value): Sets the factor for determining the thresholds for extreme values. |
void | setOptions(java.lang.String[] options): Parses a list of options for this object. |
void | setOutlierFactor(double value): Sets the factor for determining the thresholds for outliers. |
void | setOutputOffsetMultiplier(boolean value): Sets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. |
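
The two attribute-selection setters summarized above take the same selection in different index bases: the string form is 1-based (it typically comes from a user), while the array form is 0-based (it typically comes from a program). The sketch below shows one way such a range string could be expanded into 0-based indices. `AttributeRangeSketch`, `toZeroBased`, and `parseIndex` are hypothetical helpers written for this example, and the parsing rules are an assumption; the library's own range parsing may accept more syntax or behave differently on edge cases.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the library's range parser. */
final class AttributeRangeSketch {

    /** Expands a 1-based range list such as "first,3-4,last" into 0-based indices. */
    static int[] toZeroBased(String range, int numAttributes) {
        List<Integer> result = new ArrayList<>();
        for (String part : range.split(",")) {
            String[] bounds = part.trim().split("-", 2);
            int from = parseIndex(bounds[0], numAttributes);
            int to = bounds.length == 2 ? parseIndex(bounds[1], numAttributes) : from;
            if (from > to) {
                throw new IllegalArgumentException("Invalid range: " + part);
            }
            for (int i = from; i <= to; i++) {
                result.add(i - 1); // shift from 1-based (user) to 0-based (program)
            }
        }
        return result.stream().mapToInt(Integer::intValue).toArray();
    }

    /** Resolves "first", "last", or a 1-based number and validates its bounds. */
    private static int parseIndex(String token, int numAttributes) {
        String t = token.trim();
        int index;
        if (t.equals("first")) {
            index = 1;
        } else if (t.equals("last")) {
            index = numAttributes;
        } else {
            try {
                index = Integer.parseInt(t);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid index: " + token, e);
            }
        }
        if (index < 1 || index > numAttributes) {
            throw new IllegalArgumentException("Index out of bounds: " + token);
        }
        return index;
    }
}
```

Under these assumptions, the string "first,3-4,last" on a dataset with six attributes selects the same attributes as the array {0, 2, 3, 5}.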
Methods inherited from class SimpleBatchFilter: allowAccessToFullInputFormat, batchFinished, input, input
Methods inherited from class SimpleFilter: setInputFormat
Methods inherited from class Filter: batchFilterFile, debugTipText, doNotCheckCapabilitiesTipText, filterFile, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputPeek, postExecution, preExecution, run, runFilter, setDebug, setDoNotCheckCapabilities, toString, useFilter, wekaStaticWrapper
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
makeCopy
public static final int NON_NUMERIC
public java.lang.String globalInfo()
Specified by: globalInfo in class SimpleFilter
public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class Filter
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns on which to base outlier/extreme value detection. An instance is tagged as an outlier/extreme value if it qualifies as one in at least one of those attributes. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M Generates an additional attribute 'Offset' per Outlier/ExtremeValue pair that contains the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. Note: implicitly sets '-P'. (default: off)
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class Filter
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class Filter
public java.lang.String attributeIndicesTipText()
public java.lang.String getAttributeIndices()
public void setAttributeIndices(java.lang.String value)
Parameters: value - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
Throws: java.lang.IllegalArgumentException - if an invalid range list is supplied
public void setAttributeIndicesArray(int[] value)
Parameters: value - an array containing indexes of attributes to work on. Since the array will typically come from a program, attributes are indexed from 0.
Throws: java.lang.IllegalArgumentException - if an invalid set of ranges is supplied
public java.lang.String outlierFactorTipText()
public void setOutlierFactor(double value)
Parameters: value - the factor.
public double getOutlierFactor()
public java.lang.String extremeValuesFactorTipText()
public void setExtremeValuesFactor(double value)
Parameters: value - the factor.
public double getExtremeValuesFactor()
public java.lang.String extremeValuesAsOutliersTipText()
public void setExtremeValuesAsOutliers(boolean value)
Parameters: value - whether or not to tag extreme values also as outliers.
public boolean getExtremeValuesAsOutliers()
public java.lang.String detectionPerAttributeTipText()
public void setDetectionPerAttribute(boolean value)
Parameters: value - whether or not to generate indicator attribute pairs for each numeric attribute.
public boolean getDetectionPerAttribute()
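
When detection per attribute is off, a single indicator pair covers all selected attributes, and the option text says an instance is tagged if it qualifies in at least one of them. A minimal sketch of that "at least one" reduction follows. `IndicatorAggregationSketch` and `combine` are hypothetical names, and the boolean matrix is an assumed stand-in for per-attribute results; the filter's internal representation is not documented here.

```java
/** Illustrative sketch only; not the filter's actual implementation. */
final class IndicatorAggregationSketch {

    /**
     * Collapses per-attribute flags into one flag per instance.
     * perAttributeFlags[i][j] is true if instance i is flagged in attribute j.
     */
    static boolean[] combine(boolean[][] perAttributeFlags) {
        boolean[] combined = new boolean[perAttributeFlags.length];
        for (int i = 0; i < perAttributeFlags.length; i++) {
            for (boolean flagged : perAttributeFlags[i]) {
                if (flagged) {
                    combined[i] = true; // one flagged attribute is enough
                    break;
                }
            }
        }
        return combined;
    }
}
```

With detection per attribute on, the per-attribute flags would instead be kept as separate indicator attributes rather than reduced this way.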
public java.lang.String outputOffsetMultiplierTipText()
public void setOutputOffsetMultiplier(boolean value)
Parameters: value - whether or not to generate the additional attribute.
public boolean getOutputOffsetMultiplier()
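
The "Offset" attribute is described by the relation value = median + 'multiplier' * IQR. Solving that relation for the multiplier gives (value - median) / IQR, which the sketch below computes. `OffsetMultiplierSketch` and its methods are hypothetical names, and rejecting a zero IQR is an assumption of this example; how the filter itself handles that case is not stated in this documentation.

```java
/** Illustrative sketch only; not the filter's actual implementation. */
final class OffsetMultiplierSketch {

    /** Solves value = median + multiplier * iqr for the multiplier. */
    static double offsetMultiplier(double value, double median, double iqr) {
        if (iqr == 0.0) {
            // Assumption: treat a zero IQR as an error rather than guessing a result.
            throw new IllegalArgumentException("IQR must be non-zero");
        }
        return (value - median) / iqr;
    }

    /** Applies the documented relation in the forward direction. */
    static double reconstruct(double median, double multiplier, double iqr) {
        return median + multiplier * iqr;
    }
}
```

For example, with a median of 5 and an IQR of 4, the value 13 has a multiplier of (13 - 5) / 4 = 2, and a negative multiplier indicates a value below the median.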
public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class Filter
See Also: Capabilities
public double[] getValues(InterquartileRange.ValueType type)
Parameters: type - the type of values to return
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Filter
public static void main(java.lang.String[] args)
Parameters: args - should contain arguments to the filter: use -h for help