Class | Description |
---|---|
AbstractTimeSeries | An abstract instance filter that assumes instances form time-series data and merges attribute values in the current instance with the attribute values of some previous (or future) instance. |
Add | An instance filter that adds a new attribute to the dataset. |
AddCluster | A filter that adds a new nominal attribute representing the cluster assigned to each instance by the specified clustering algorithm. |
AddExpression | An instance filter that creates a new attribute by applying a mathematical expression to existing attributes. |
AddID | An instance filter that adds an ID attribute to the dataset. |
AddNoise | An instance filter that changes a percentage of a given attribute's values. |
AddValues | Adds the labels from the given list to an attribute if they are missing. |
Center | Centers all numeric attributes in the given dataset to have zero mean (apart from the class attribute, if set). |
ChangeDateFormat | Changes the date format used by a date attribute. |
ClassAssigner | A filter that can set and unset the class index. |
ClusterMembership | A filter that uses a density-based clusterer to generate cluster membership values; filtered instances are composed of these values plus the class attribute (if set in the input data). |
Copy | An instance filter that copies a range of attributes in the dataset. |
Discretize | An instance filter that discretizes a range of numeric attributes in the dataset into nominal attributes. |
FirstOrder | An instance filter that takes a range of N numeric attributes and replaces them with N-1 numeric attributes, whose values are the differences between consecutive attribute values from the original instance. |
InterquartileRange | A filter for detecting outliers and extreme values based on interquartile ranges. |
KernelFilter | Converts the given set of predictor variables into a kernel matrix. |
MakeIndicator | A filter that creates a new dataset with a boolean attribute replacing a nominal attribute. |
MathExpression | Modifies numeric attributes according to a given expression. |
MergeTwoValues | Merges two values of a nominal attribute into one value. |
MultiInstanceToPropositional | Converts a multi-instance dataset into a single-instance dataset so that Nominalize, Standardize, and other types of filters or transformations can be applied to the data for further preprocessing. Note: the first attribute of the converted dataset is a nominal attribute and refers to the bagId. |
NominalToBinary | Converts all nominal attributes into binary numeric attributes. |
NominalToString | Converts a nominal attribute to a string attribute. |
Normalize | Normalizes all numeric values in the given dataset (apart from the class attribute, if set). |
NumericCleaner | A filter that 'cleanses' the numeric data from values that are too small, too big or very close to a certain value (e.g., 0) and sets these values to a pre-defined default. |
NumericToBinary | Converts all numeric attributes into binary attributes (apart from the class attribute, if set): if the value of the numeric attribute is exactly zero, the value of the new attribute will be zero. |
NumericToNominal | A filter for turning numeric attributes into nominal ones. |
NumericTransform | Transforms numeric attributes using a given transformation method. |
Obfuscate | A simple instance filter that renames the relation, all attribute names and all nominal (and string) attribute values. |
PartitionedMultiFilter | A filter that applies filters on subsets of attributes and assembles the output into a new dataset. |
PKIDiscretize | Discretizes numeric attributes using equal frequency binning, where the number of bins is equal to the square root of the number of non-missing values. For more information, see: Ying Yang, Geoffrey I. |
PotentialClassIgnorer | This filter should be extended by other unsupervised attribute filters to allow processing of the class attribute if that's required. |
PrincipalComponents | Performs a principal components analysis and transformation of the data. Dimensionality reduction is accomplished by choosing enough eigenvectors to account for some percentage of the variance in the original data (default 0.95, i.e. 95%). Based on code of the attribute selection scheme 'PrincipalComponents' by Mark Hall and Gabi Schmidberger. |
PropositionalToMultiInstance | Converts a propositional dataset into a multi-instance dataset (with relational attribute). |
RandomProjection | Reduces the dimensionality of the data by projecting it onto a lower dimensional subspace using a random matrix with columns of unit length. |
RandomSubset | Chooses a random subset of attributes, either an absolute number or a percentage. |
RELAGGS | A propositionalization filter inspired by the RELAGGS algorithm. It processes all relational attributes that fall into the user-defined range (all others are skipped, i.e., not added to the output). |
Remove | A filter that removes a range of attributes from the dataset. |
RemoveType | Removes attributes of a given type. |
RemoveUseless | This filter removes attributes that do not vary at all or that vary too much. |
Reorder | A filter that generates output with a new order of the attributes. |
ReplaceMissingValues | Replaces all missing values for nominal and numeric attributes in a dataset with the modes and means from the training data. |
Standardize | Standardizes all numeric attributes in the given dataset to have zero mean and unit variance (apart from the class attribute, if set). |
StringToNominal | Converts a string attribute to a nominal attribute. |
StringToWordVector | Converts string attributes into a set of attributes representing word occurrence information (depending on the tokenizer) from the text contained in the strings. |
SwapValues | Swaps two values of a nominal attribute. |
TimeSeriesDelta | An instance filter that assumes instances form time-series data and replaces attribute values in the current instance with the difference between the current value and the equivalent attribute value of some previous (or future) instance. |
TimeSeriesTranslate | An instance filter that assumes instances form time-series data and replaces attribute values in the current instance with the equivalent attribute values of some previous (or future) instance. |
Wavelet | A filter for wavelet transformation. For more information see: Wikipedia (2004). |
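
To make a few of the descriptions above concrete, here is a minimal sketch in plain Java of the arithmetic described for FirstOrder, Center and Standardize. It does not use the library's classes or API: the class name `FilterSketches` and its method names are hypothetical, each method works on a bare `double[]` instead of a dataset, and the choice of the sample standard deviation (dividing by n-1) in `standardize` is an assumption that the table does not state.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of the library.
class FilterSketches {

    // FirstOrder: N values in, N-1 differences between consecutive values out.
    static double[] firstOrder(double[] values) {
        if (values.length < 2) {
            return new double[0];
        }
        double[] diffs = new double[values.length - 1];
        for (int i = 1; i < values.length; i++) {
            diffs[i - 1] = values[i] - values[i - 1];
        }
        return diffs;
    }

    // Center: subtract the column mean so the result has zero mean.
    static double[] center(double[] column) {
        double sum = 0.0;
        for (double v : column) {
            sum += v;
        }
        double mean = column.length == 0 ? 0.0 : sum / column.length;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = column[i] - mean;
        }
        return out;
    }

    // Standardize: center, then divide by the standard deviation.
    // Assumption: sample standard deviation (n-1 in the denominator).
    static double[] standardize(double[] column) {
        double[] centered = center(column);
        if (column.length < 2) {
            return centered; // no spread to scale by
        }
        double sumSq = 0.0;
        for (double v : centered) {
            sumSq += v * v;
        }
        double std = Math.sqrt(sumSq / (column.length - 1));
        if (std == 0.0) {
            return centered; // constant column: leave it centered
        }
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = centered[i] / std;
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [3.0, 5.0, 7.0]
        System.out.println(Arrays.toString(firstOrder(new double[] {1, 4, 9, 16})));
        // prints [-1.0, 0.0, 1.0]
        System.out.println(Arrays.toString(standardize(new double[] {2, 4, 6})));
    }
}
```

The real filters operate on whole datasets, skip the class attribute where the table says so, and compute their statistics from the training data; the sketch only shows the per-attribute calculation.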