public class SubspaceCluster extends ClusterGenerator
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S The seed for the random function (default 1).
-a <num> The number of attributes (default 1).
-c Class flag; if set, the cluster is listed in an extra attribute.
-b <range> The indices for boolean attributes.
-m <range> The indices for nominal attributes.
-C <cluster-definition> A cluster definition of class 'SubspaceClusterDefinition' (definition needs to be quoted to be recognized as a single argument).
Options specific to weka.datagenerators.clusterers.SubspaceClusterDefinition:
-A <range> Uses a random uniform distribution for the instances in the cluster.
-U <range> Generates totally uniformly distributed instances in the cluster.
-G <range> Uses a Gaussian distribution for the instances in the cluster.
-D <num>,<num> The attribute min/max (-A and -U) or mean/stddev (-G) for the cluster.
-N <num>..<num> The range of number of instances per cluster (default 1..50).
-I Uses integer instead of continuous values (default continuous).
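As a minimal sketch of how these flags fit together, the standalone Java snippet below assembles an argument array of the kind that setOptions or main would receive. The class name SubspaceClusterOptionsSketch, the helper buildOptions, and the concrete values inside the cluster definition string are illustrative assumptions and not part of Weka. The point is only that each -C definition travels as a single array element, which is why it has to be quoted on a shell command line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubspaceClusterOptionsSketch {

    // Hypothetical helper (not a Weka method): assembles generator options
    // using only flags that appear in the option list above.
    static String[] buildOptions(int numAttributes, int seed, boolean classFlag,
                                 List<String> clusterDefinitions) {
        List<String> opts = new ArrayList<>();
        opts.add("-a");
        opts.add(Integer.toString(numAttributes));
        opts.add("-S");
        opts.add(Integer.toString(seed));
        if (classFlag) {
            opts.add("-c");
        }
        // Each definition stays one element, so its inner spaces are preserved.
        for (String definition : clusterDefinitions) {
            opts.add("-C");
            opts.add(definition);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Assumed example values: uniform/random cluster on attribute 1,
        // min 0 and max 1, between 1 and 50 instances.
        String definition = "-A 1 -D 0,1 -N 1..50";
        String[] options = buildOptions(1, 1, true, Arrays.asList(definition));
        for (String option : options) {
            System.out.println(option);
        }
    }
}
```

With Weka on the classpath, an array built this way could be handed to setOptions(java.lang.String[]); whether the definition string above is accepted exactly as written is an assumption that should be checked against the installed version.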
Modifier and Type | Field | Description |
---|---|---|
static int | CONTINUOUS | cluster subtype: continuous |
static int | GAUSSIAN | cluster type: gaussian |
static int | INTEGER | cluster subtype: integer |
static Tag[] | TAGS_CLUSTERSUBTYPE | the tags for the cluster subtypes |
static Tag[] | TAGS_CLUSTERTYPE | the tags for the cluster types |
static int | TOTAL_UNIFORM | cluster type: total uniform |
static int | UNIFORM_RANDOM | cluster type: uniform/random |
Constructor | Description |
---|---|
SubspaceCluster() | Initializes the generator and sets the number of clusters to 0, since the user has to specify them explicitly. |
Modifier and Type | Method | Description |
---|---|---|
java.lang.String | booleanColsTipText() | Returns the tip text for this property. |
java.lang.String | clusterDefinitionsTipText() | Returns the tip text for this property. |
Instances | defineDataFormat() | Initializes the format for the dataset produced. |
Instance | generateExample() | Generates an example of the dataset. |
Instances | generateExamples() | Generates all examples of the dataset. |
java.lang.String | generateFinished() | Compiles documentation about the data generation after the generation process. |
java.lang.String | generateStart() | Compiles documentation about the data generation before the generation process. |
Range | getBooleanCols() | Returns the range of boolean attributes. |
ClusterDefinition[] | getClusterDefinitions() | Returns the currently set clusters. |
Range | getNominalCols() | Returns the range of nominal attributes. |
int[] | getNumValues() | Returns the array that stores the number of values for a nominal attribute. |
java.lang.String[] | getOptions() | Gets the current settings of the datagenerator. |
java.lang.String | getRevision() | Returns the revision string. |
boolean | getSingleModeFlag() | Gets the single mode flag. |
java.lang.String | globalInfo() | Returns a string describing this data generator. |
boolean | isBoolean(int index) | Returns true if the attribute is boolean. |
boolean | isNominal(int index) | Returns true if the attribute is nominal. |
java.util.Enumeration<Option> | listOptions() | Returns an enumeration describing the available options. |
static void | main(java.lang.String[] args) | Main method for testing this class. |
java.lang.String | nominalColsTipText() | Returns the tip text for this property. |
void | setBooleanCols(Range value) | Sets which attributes are boolean. |
void | setBooleanIndices(java.lang.String rangeList) | Sets which attributes are boolean. |
void | setClusterDefinitions(ClusterDefinition[] value) | Sets the clusters to use. |
void | setNominalCols(Range value) | Sets which attributes are nominal. |
void | setNominalIndices(java.lang.String rangeList) | Sets which attributes are nominal. |
void | setOptions(java.lang.String[] options) | Parses a list of options for this object. |
Methods inherited from class ClusterGenerator: classFlagTipText, getClassFlag, getNumAttributes, numAttributesTipText, setClassFlag, setNumAttributes
Methods inherited from class DataGenerator: debugTipText, defaultOutput, enumToVector, formatTipText, getDatasetFormat, getDebug, getEpilogue, getNumExamplesAct, getOutput, getPrologue, getRandom, getRelationName, getSeed, makeData, outputTipText, randomTipText, relationNameTipText, runDataGenerator, seedTipText, setDatasetFormat, setDebug, setOutput, setRandom, setRelationName, setSeed
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface OptionHandler: makeCopy
public static final int UNIFORM_RANDOM
public static final int TOTAL_UNIFORM
public static final int GAUSSIAN
public static final Tag[] TAGS_CLUSTERTYPE
public static final int CONTINUOUS
public static final int INTEGER
public static final Tag[] TAGS_CLUSTERSUBTYPE
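To illustrate how a cluster type and subtype could combine with the -D parameters, here is a hedged sketch using only java.util.Random. It is not Weka's implementation: the class ClusterValueSketch, its enums, and the sample method are hypothetical, the total uniform type is left out, and modelling the integer subtype by rounding is an assumption made for illustration.

```java
import java.util.Random;

public class ClusterValueSketch {

    // Illustrative stand-ins for the cluster type and subtype constants;
    // the numeric values of the Weka constants are not assumed here.
    enum Type { UNIFORM_RANDOM, GAUSSIAN }
    enum Subtype { CONTINUOUS, INTEGER }

    // Draws one attribute value. For UNIFORM_RANDOM, a and b are read as
    // min and max; for GAUSSIAN, a and b are read as mean and stddev.
    static double sample(Type type, Subtype subtype, double a, double b, Random rnd) {
        double value;
        if (type == Type.GAUSSIAN) {
            value = a + b * rnd.nextGaussian();
        } else {
            value = a + (b - a) * rnd.nextDouble();
        }
        // Assumption: the integer subtype is modelled by rounding.
        return subtype == Subtype.INTEGER ? Math.rint(value) : value;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1); // seed 1 mirrors the documented -S default
        System.out.println(sample(Type.UNIFORM_RANDOM, Subtype.CONTINUOUS, 0.0, 1.0, rnd));
        System.out.println(sample(Type.GAUSSIAN, Subtype.INTEGER, 10.0, 2.0, rnd));
    }
}
```

Passing the Random instance in, rather than creating it inside the method, keeps the sketch reproducible for a fixed seed.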
public SubspaceCluster()
public java.lang.String globalInfo()
public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class ClusterGenerator
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S The seed for the random function (default 1).
-a <num> The number of attributes (default 1).
-c Class flag; if set, the cluster is listed in an extra attribute.
-b <range> The indices for boolean attributes.
-m <range> The indices for nominal attributes.
-C <cluster-definition> A cluster definition of class 'SubspaceClusterDefinition' (definition needs to be quoted to be recognized as a single argument).
Options specific to weka.datagenerators.clusterers.SubspaceClusterDefinition:
-A <range> Uses a random uniform distribution for the instances in the cluster.
-U <range> Generates totally uniformly distributed instances in the cluster.
-G <range> Uses a Gaussian distribution for the instances in the cluster.
-D <num>,<num> The attribute min/max (-A and -U) or mean/stddev (-G) for the cluster.
-N <num>..<num> The range of number of instances per cluster (default 1..50).
-I Uses integer instead of continuous values (default continuous).
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class ClusterGenerator
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public void setBooleanIndices(java.lang.String rangeList)
Parameters: rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
Throws: java.lang.IllegalArgumentException - if an invalid range list is supplied
public void setBooleanCols(Range value)
Parameters: value - the range to use
public Range getBooleanCols()
public java.lang.String booleanColsTipText()
public void setNominalIndices(java.lang.String rangeList)
Parameters: rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
Throws: java.lang.IllegalArgumentException - if an invalid range list is supplied
public void setNominalCols(Range value)
Parameters: value - the range to use
public Range getNominalCols()
public java.lang.String nominalColsTipText()
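The range lists taken by setBooleanIndices and setNominalIndices are indexed from 1 because they usually come from a user. The sketch below shows one way such a string could be turned into 0-based attribute indices. RangeListSketch and toZeroBased are hypothetical names, this is not the Range class, and only comma-separated numbers and simple first-last spans are handled; any other syntax the real class may accept is outside this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeListSketch {

    // Hypothetical helper: converts a 1-based range list such as "1,3-5"
    // into 0-based indices, in the order the list gives them.
    static List<Integer> toZeroBased(String rangeList, int numAttributes) {
        List<Integer> indices = new ArrayList<>();
        for (String part : rangeList.split(",")) {
            String[] bounds = part.trim().split("-");
            if (bounds.length < 1 || bounds.length > 2) {
                throw new IllegalArgumentException("Invalid range list: " + rangeList);
            }
            int first;
            int last;
            try {
                first = Integer.parseInt(bounds[0].trim());
                last = bounds.length == 2 ? Integer.parseInt(bounds[1].trim()) : first;
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid range list: " + rangeList, e);
            }
            // User-facing indices start at 1 and may not exceed the attribute count.
            if (first < 1 || last < first || last > numAttributes) {
                throw new IllegalArgumentException("Invalid range list: " + rangeList);
            }
            for (int i = first; i <= last; i++) {
                indices.add(i - 1); // shift to 0-based for internal use
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(toZeroBased("1,3-5", 5));
    }
}
```

Rejecting an index of 0 mirrors the documented IllegalArgumentException for invalid range lists, although the exact conditions under which Weka throws it are not stated here.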
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class ClusterGenerator
See Also: DataGenerator.removeBlacklist(String[])
public ClusterDefinition[] getClusterDefinitions()
public void setClusterDefinitions(ClusterDefinition[] value) throws java.lang.Exception
Parameters: value - the clusters to use
Throws: java.lang.Exception - if clusters are not the correct class
public java.lang.String clusterDefinitionsTipText()
public boolean getSingleModeFlag()
getSingleModeFlag in class DataGenerator
public Instances defineDataFormat() throws java.lang.Exception
defineDataFormat in class DataGenerator
Throws: java.lang.Exception - data format could not be defined
See Also: DataGenerator.defaultRelationName()
public boolean isBoolean(int index)
Parameters: index - the index of the attribute
public boolean isNominal(int index)
Parameters: index - the index of the attribute
public int[] getNumValues()
public Instance generateExample() throws java.lang.Exception
generateExample in class DataGenerator
Throws: java.lang.Exception - if format not defined or generating
public Instances generateExamples() throws java.lang.Exception
generateExamples in class DataGenerator
Throws: java.lang.Exception - if format not defined
public java.lang.String generateFinished() throws java.lang.Exception
generateFinished in class DataGenerator
Throws: java.lang.Exception - no input structure has been defined
public java.lang.String generateStart()
generateStart in class DataGenerator
public java.lang.String getRevision()
public static void main(java.lang.String[] args)
args
- should contain arguments for the data producer: