public class InterquartileRange extends SimpleBatchFilter implements WeightedAttributesHandler
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns to base outlier/extreme value detection on. An instance is tagged as an outlier/extreme value if it is one in at least one of those attributes. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M
Generates an additional attribute 'Offset' per Outlier/ExtremeValue
pair that contains the multiplier by which the value is off the median:
value = median + 'multiplier' * IQR
Note: implicitly sets '-P'. (default: off)
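
The -O and -E factors scale the IQR to form detection thresholds. The following is a minimal sketch of that idea, not the filter's actual code. It assumes the thresholds are measured outward from the first and third quartiles (Q1 - factor * IQR and Q3 + factor * IQR), which this page does not state explicitly. The class, enum, and method names are hypothetical, and the quartiles are passed in because the quartile estimation method is not described here.

```java
/** Hypothetical sketch of IQR-based tagging; not the Weka implementation. */
final class IqrThresholdSketch {

    enum Tag { NORMAL, OUTLIER, EXTREME }

    /**
     * Classifies a value given the first and third quartile.
     * Assumption: thresholds lie at factor * IQR beyond the quartiles.
     */
    static Tag classify(double x, double q1, double q3,
                        double outlierFactor, double extremeFactor) {
        double iqr = q3 - q1;
        // Check the wider (extreme value) band first.
        if (x > q3 + extremeFactor * iqr || x < q1 - extremeFactor * iqr) {
            return Tag.EXTREME;
        }
        if (x > q3 + outlierFactor * iqr || x < q1 - outlierFactor * iqr) {
            return Tag.OUTLIER;
        }
        return Tag.NORMAL;
    }

    public static void main(String[] args) {
        double q1 = 10.0, q3 = 20.0;                   // IQR = 10
        double outlierFactor = 3.0;                    // documented default for -O
        double extremeFactor = 2 * outlierFactor;      // documented default for -E
        System.out.println(classify(25.0, q1, q3, outlierFactor, extremeFactor));
        System.out.println(classify(55.0, q1, q3, outlierFactor, extremeFactor));
        System.out.println(classify(90.0, q1, q3, outlierFactor, extremeFactor));
    }
}
```

The sketch returns a single tag per value. With -E-as-O, a value tagged as an extreme value would also be tagged as an outlier, which would need two boolean flags rather than one enum.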
Thanks to Dale for a few brainstorming sessions.

| Modifier and Type | Class and Description |
|---|---|
| static class | InterquartileRange.ValueType - enum for obtaining the various determined IQR values. |

| Modifier and Type | Field and Description |
|---|---|
| static int | NON_NUMERIC - indicator for non-numeric attributes |

| Constructor and Description |
|---|
| InterquartileRange() |

| Modifier and Type | Method and Description |
|---|---|
| java.lang.String | attributeIndicesTipText() - Returns the tip text for this property. |
| java.lang.String | detectionPerAttributeTipText() - Returns the tip text for this property. |
| java.lang.String | extremeValuesAsOutliersTipText() - Returns the tip text for this property. |
| java.lang.String | extremeValuesFactorTipText() - Returns the tip text for this property. |
| java.lang.String | getAttributeIndices() - Gets the current range selection. |
| Capabilities | getCapabilities() - Returns the Capabilities of this filter. |
| boolean | getDetectionPerAttribute() - Gets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false"). |
| boolean | getExtremeValuesAsOutliers() - Gets whether extreme values are also tagged as outliers. |
| double | getExtremeValuesFactor() - Gets the factor for determining the thresholds for extreme values. |
| java.lang.String[] | getOptions() - Gets the current settings of the filter. |
| double | getOutlierFactor() - Gets the factor for determining the thresholds for outliers. |
| boolean | getOutputOffsetMultiplier() - Gets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. |
| java.lang.String | getRevision() - Returns the revision string. |
| double[] | getValues(InterquartileRange.ValueType type) - Returns the values for the specified type. |
| java.lang.String | globalInfo() - Returns a string describing this filter. |
| java.util.Enumeration<Option> | listOptions() - Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] args) - Main method for testing this class. |
| java.lang.String | outlierFactorTipText() - Returns the tip text for this property. |
| java.lang.String | outputOffsetMultiplierTipText() - Returns the tip text for this property. |
| void | setAttributeIndices(java.lang.String value) - Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used). |
| void | setAttributeIndicesArray(int[] value) - Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used). |
| void | setDetectionPerAttribute(boolean value) - Sets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false"). |
| void | setExtremeValuesAsOutliers(boolean value) - Sets whether extreme values are also tagged as outliers. |
| void | setExtremeValuesFactor(double value) - Sets the factor for determining the thresholds for extreme values. |
| void | setOptions(java.lang.String[] options) - Parses a list of options for this object. |
| void | setOutlierFactor(double value) - Sets the factor for determining the thresholds for outliers. |
| void | setOutputOffsetMultiplier(boolean value) - Sets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier by which the value is off the median: value = median + 'multiplier' * IQR. |

Inherited methods:
allowAccessToFullInputFormat, batchFinished, input, input
setInputFormat
batchFilterFile, debugTipText, doNotCheckCapabilitiesTipText, filterFile, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputPeek, postExecution, preExecution, run, runFilter, setDebug, setDoNotCheckCapabilities, toString, useFilter, wekaStaticWrapper
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
makeCopy

public static final int NON_NUMERIC

public java.lang.String globalInfo()
globalInfo in class SimpleFilter

public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
listOptions in class Filter

public void setOptions(java.lang.String[] options)
throws java.lang.Exception
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns to base outlier/extreme value detection on. An instance is tagged as an outlier/extreme value if it is one in at least one of those attributes. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M
Generates an additional attribute 'Offset' per Outlier/ExtremeValue
pair that contains the multiplier by which the value is off the median:
value = median + 'multiplier' * IQR
Note: implicitly sets '-P'. (default: off)
setOptions in interface OptionHandler
setOptions in class Filter
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported

public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class Filter

public java.lang.String attributeIndicesTipText()

public java.lang.String getAttributeIndices()

public void setAttributeIndices(java.lang.String value)
value - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
java.lang.IllegalArgumentException - if an invalid range list is supplied

public void setAttributeIndicesArray(int[] value)
value - an array containing indexes of attributes to work on. Since the array will typically come from a program, attributes are indexed from 0.
java.lang.IllegalArgumentException - if an invalid set of ranges is supplied

public java.lang.String outlierFactorTipText()
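
setAttributeIndices takes a 1-based range string, while setAttributeIndicesArray takes 0-based indexes. The sketch below illustrates how the two conventions relate by converting a range string such as "first,3-5,last" into 0-based indexes. It is a hypothetical standalone helper written for illustration and makes no claim about how the filter parses ranges internally; the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Weka: 1-based range string to 0-based indexes. */
final class RangeIndexSketch {

    /** Converts e.g. "first,3-5,last" (1-based, inclusive) into 0-based indexes. */
    static int[] toZeroBased(String range, int numAttributes) {
        List<Integer> indices = new ArrayList<>();
        for (String part : range.split(",")) {
            String[] bounds = part.trim().split("-");
            if (bounds.length < 1 || bounds.length > 2) {
                throw new IllegalArgumentException("Invalid range: " + part);
            }
            int from = parseIndex(bounds[0], numAttributes);
            int to = parseIndex(bounds[bounds.length - 1], numAttributes);
            if (from > to) {
                throw new IllegalArgumentException("Invalid range: " + part);
            }
            for (int i = from; i <= to; i++) {
                indices.add(i);
            }
        }
        int[] result = new int[indices.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = indices.get(i);
        }
        return result;
    }

    /** Parses one 1-based token ("first", "last", or a number) into a 0-based index. */
    private static int parseIndex(String token, int numAttributes) {
        String t = token.trim();
        int oneBased;
        if (t.equals("first")) {
            oneBased = 1;
        } else if (t.equals("last")) {
            oneBased = numAttributes;
        } else {
            // NumberFormatException is a subclass of IllegalArgumentException.
            oneBased = Integer.parseInt(t);
        }
        if (oneBased < 1 || oneBased > numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return oneBased - 1;
    }

    public static void main(String[] args) {
        // With 8 attributes, "last" is attribute 8, i.e. 0-based index 7.
        System.out.println(Arrays.toString(toZeroBased("first,3-5,last", 8)));
        // prints [0, 2, 3, 4, 7]
    }
}
```

The sketch does not remove duplicate indexes or sort the result; whether the filter does either is not stated on this page.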

public void setOutlierFactor(double value)
value - the factor.

public double getOutlierFactor()

public java.lang.String extremeValuesFactorTipText()

public void setExtremeValuesFactor(double value)
value - the factor.

public double getExtremeValuesFactor()

public java.lang.String extremeValuesAsOutliersTipText()

public void setExtremeValuesAsOutliers(boolean value)
value - whether or not to tag extreme values also as outliers.

public boolean getExtremeValuesAsOutliers()

public java.lang.String detectionPerAttributeTipText()

public void setDetectionPerAttribute(boolean value)
value - whether or not to generate indicator attribute pairs for each numeric attribute.

public boolean getDetectionPerAttribute()
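
The difference between a single indicator pair and per-attribute pairs can be sketched as follows. With a single pair, an instance is flagged if at least one selected attribute flags it; with per-attribute detection, each numeric attribute gets its own Outlier/ExtremeValue pair, plus one 'Offset' attribute per pair when the offset multiplier is enabled. The class and method names are hypothetical, and the attribute count is an illustration derived from the option descriptions, not taken from the filter's code.

```java
/** Hypothetical sketch of the indicator-attribute layout; not the Weka implementation. */
final class IndicatorAttributeSketch {

    /** Single-pair mode: flagged if any selected attribute flags the instance. */
    static boolean anyFlagged(boolean[] perAttributeFlags) {
        for (boolean flag : perAttributeFlags) {
            if (flag) {
                return true;
            }
        }
        return false;
    }

    /**
     * Number of attributes appended for the given settings.
     * The offset multiplier implicitly turns on per-attribute detection.
     */
    static int addedAttributes(int numNumericSelected, boolean perAttribute,
                               boolean outputOffsetMultiplier) {
        boolean effectivePerAttribute = perAttribute || outputOffsetMultiplier;
        int pairs = effectivePerAttribute ? numNumericSelected : 1;
        int attributesPerPair = outputOffsetMultiplier ? 3 : 2; // Outlier, ExtremeValue, Offset
        return pairs * attributesPerPair;
    }

    public static void main(String[] args) {
        System.out.println(anyFlagged(new boolean[] {false, true, false})); // true
        System.out.println(addedAttributes(4, false, false)); // 2
        System.out.println(addedAttributes(4, true, false));  // 8
        System.out.println(addedAttributes(4, false, true));  // 12
    }
}
```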

public java.lang.String outputOffsetMultiplierTipText()

public void setOutputOffsetMultiplier(boolean value)
value - whether or not to generate the additional attribute.

public boolean getOutputOffsetMultiplier()
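
The 'Offset' value follows from rearranging the documented relation value = median + 'multiplier' * IQR, which gives multiplier = (value - median) / IQR. A minimal worked sketch, with hypothetical class and method names:

```java
/** Hypothetical sketch of the offset multiplier; not the Weka implementation. */
final class OffsetMultiplierSketch {

    /** Solves value = median + multiplier * iqr for the multiplier. */
    static double multiplier(double value, double median, double iqr) {
        // Note: an IQR of 0 yields Infinity or NaN here; how the filter
        // handles that case is not described on this page.
        return (value - median) / iqr;
    }

    public static void main(String[] args) {
        double median = 15.0;
        double iqr = 10.0;
        double m = multiplier(55.0, median, iqr);
        System.out.println(m);                 // 4.0
        System.out.println(median + m * iqr);  // 55.0, the original value
    }
}
```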

public Capabilities getCapabilities()
getCapabilities in interface CapabilitiesHandler
getCapabilities in class Filter
See also: Capabilities

public double[] getValues(InterquartileRange.ValueType type)
type - the type of values to return

public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class Filter

public static void main(java.lang.String[] args)
args - should contain arguments to the filter: use -h for help