public class SimpleKMeans extends RandomizableClusterer implements NumberOfClustersRequestable, WeightedInstancesHandler
-N <num> number of clusters. (default 2).
-V Display std. deviations for centroids.
-M Don't replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
-A <classname and options> Distance function to be used for instance comparison (default weka.core.EuclideanDistance)
-I <num> Maximum number of iterations.
-O Preserve order of instances.
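The -M option concerns mean/mode replacement of missing values. As a rough illustration of the numeric (mean) half of that idea, the following standalone sketch fills the missing entries of one numeric column with the mean of the observed entries. It does not use Weka: the class `MeanImputationSketch`, the method `replaceMissingWithMean`, and the use of `NaN` to mark a missing value are all assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; not the Weka implementation.
final class MeanImputationSketch {
    // Returns a copy of the column in which NaN entries (assumed to mark
    // missing values) are replaced by the mean of the observed entries.
    static double[] replaceMissingWithMean(double[] column) {
        double sum = 0.0;
        int observed = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                sum += v;
                observed++;
            }
        }
        double[] result = Arrays.copyOf(column, column.length);
        if (observed == 0) {
            return result; // no observed values, so there is no mean to use
        }
        double mean = sum / observed;
        for (int i = 0; i < result.length; i++) {
            if (Double.isNaN(result[i])) {
                result[i] = mean;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] column = {1.0, Double.NaN, 3.0};
        // prints [1.0, 2.0, 3.0]
        System.out.println(Arrays.toString(replaceMissingWithMean(column)));
    }
}
```

A nominal attribute would use the most frequent value (the mode) in place of the mean.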
See Also: RandomizableClusterer, Serialized Form

Constructor and Description |
---|
SimpleKMeans(): The default constructor. |
Modifier and Type | Method and Description |
---|---|
void | buildClusterer(Instances data): Generates a clusterer. |
int | clusterInstance(Instance instance): Classifies a given instance. |
java.lang.String | displayStdDevsTipText(): Returns the tip text for this property. |
java.lang.String | distanceFunctionTipText(): Returns the tip text for this property. |
java.lang.String | dontReplaceMissingValuesTipText(): Returns the tip text for this property. |
int[] | getAssignments(): Gets the assignments for each instance. |
Capabilities | getCapabilities(): Returns default capabilities of the clusterer. |
Instances | getClusterCentroids(): Gets the cluster centroids. |
int[][][] | getClusterNominalCounts(): Returns for each cluster the frequency counts for the values of each nominal attribute. |
int[] | getClusterSizes(): Gets the number of instances in each cluster. |
Instances | getClusterStandardDevs(): Gets the standard deviations of the numeric attributes in each cluster. |
boolean | getDisplayStdDevs(): Gets whether standard deviations and nominal counts should be displayed in the clustering output. |
DistanceFunction | getDistanceFunction(): Returns the distance function currently in use. |
boolean | getDontReplaceMissingValues(): Gets whether replacement of missing values is disabled. |
int | getMaxIterations(): Gets the maximum number of iterations to be executed. |
int | getNumClusters(): Gets the number of clusters to generate. |
java.lang.String[] | getOptions(): Gets the current settings of SimpleKMeans. |
boolean | getPreserveInstancesOrder(): Gets whether the order of instances must be preserved. |
java.lang.String | getRevision(): Returns the revision string. |
double | getSquaredError(): Gets the squared error for all clusters. |
java.lang.String | globalInfo(): Returns a string describing this clusterer. |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
java.lang.String | maxIterationsTipText(): Returns the tip text for this property. |
int | numberOfClusters(): Returns the number of clusters. |
java.lang.String | numClustersTipText(): Returns the tip text for this property. |
java.lang.String | preserveInstancesOrderTipText(): Returns the tip text for this property. |
void | setDisplayStdDevs(boolean stdD): Sets whether standard deviations and nominal counts should be displayed in the clustering output. |
void | setDistanceFunction(DistanceFunction df): Sets the distance function to use for instance comparison. |
void | setDontReplaceMissingValues(boolean r): Sets whether replacement of missing values is disabled. |
void | setMaxIterations(int n): Sets the maximum number of iterations to be executed. |
void | setNumClusters(int n): Sets the number of clusters to generate. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setPreserveInstancesOrder(boolean r): Sets whether the order of instances must be preserved. |
java.lang.String | toString(): Returns a string describing this clusterer. |
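The methods summarized above follow the usual k-means workflow: build the clusterer, then query centroids, assignments, and the squared error. The sketch below is a minimal standalone version of standard k-means (Lloyd's iteration) with squared Euclidean distance, written only to show how those quantities relate to each other. It is not Weka's implementation: `KMeansSketch`, `fit`, and the field names are hypothetical, and the initialization and empty-cluster handling are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical standalone k-means sketch; not the Weka implementation.
final class KMeansSketch {
    final double[][] centroids; // one row per cluster
    final int[] assignments;    // cluster index for each input point
    final double squaredError;  // sum of squared distances to assigned centroids

    private KMeansSketch(double[][] centroids, int[] assignments, double squaredError) {
        this.centroids = centroids;
        this.assignments = assignments;
        this.squaredError = squaredError;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Index of the centroid closest to the point (ties go to the lower index).
    static int nearest(double[][] centroids, double[] point) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(centroids[c], point) < squaredDistance(centroids[best], point)) {
                best = c;
            }
        }
        return best;
    }

    static KMeansSketch fit(double[][] points, int k, long seed, int maxIterations) {
        if (k < 1 || k > points.length || maxIterations < 1) {
            throw new IllegalArgumentException("need 1 <= k <= points.length and maxIterations >= 1");
        }
        // Assumption: start from k distinct input points chosen by a seeded shuffle.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed));
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[order.get(c)].clone();
        }

        int dims = points[0].length;
        int[] assignments = new int[points.length];
        Arrays.fill(assignments, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearest(centroids, points[i]);
                if (c != assignments[i]) {
                    assignments[i] = c;
                    changed = true;
                }
            }
            if (!changed) break; // converged: no point switched cluster

            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignments[i]]++;
                for (int d = 0; d < dims; d++) {
                    sums[assignments[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // assumption: empty cluster keeps its centroid
                for (int d = 0; d < dims; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }

        double error = 0.0;
        for (int i = 0; i < points.length; i++) {
            error += squaredDistance(points[i], centroids[assignments[i]]);
        }
        return new KMeansSketch(centroids, assignments, error);
    }
}
```

For the four points (0,0), (0,1), (10,10), (10,11) and k = 2, `KMeansSketch.fit(points, 2, 10L, 100)` puts the first two points in one cluster and the last two in the other, with a squared error of 1.0.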
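Methods inherited from class RandomizableClusterer: getSeed, seedTipText, setSeed
Methods inherited from class AbstractClusterer: distributionForInstance, forName, makeCopies, makeCopy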
public java.lang.String globalInfo()
public Capabilities getCapabilities()
getCapabilities in interface Clusterer
getCapabilities in interface CapabilitiesHandler
getCapabilities in class AbstractClusterer
See Also: Capabilities
public void buildClusterer(Instances data) throws java.lang.Exception
buildClusterer in interface Clusterer
buildClusterer in class AbstractClusterer
Parameters:
data - set of instances serving as training data
Throws:
java.lang.Exception - if the clusterer has not been generated successfully
public int clusterInstance(Instance instance) throws java.lang.Exception
clusterInstance in interface Clusterer
clusterInstance in class AbstractClusterer
Parameters:
instance - the instance to be assigned to a cluster
Throws:
java.lang.Exception - if instance could not be classified successfully
public int numberOfClusters() throws java.lang.Exception
numberOfClusters in interface Clusterer
numberOfClusters in class AbstractClusterer
Throws:
java.lang.Exception - if number of clusters could not be returned successfully
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class RandomizableClusterer
public java.lang.String numClustersTipText()
public void setNumClusters(int n) throws java.lang.Exception
setNumClusters in interface NumberOfClustersRequestable
Parameters:
n - the number of clusters to generate
Throws:
java.lang.Exception - if number of clusters is negative
public int getNumClusters()
public java.lang.String maxIterationsTipText()
public void setMaxIterations(int n) throws java.lang.Exception
Parameters:
n - the maximum number of iterations
Throws:
java.lang.Exception - if maximum number of iterations is smaller than 1
public int getMaxIterations()
public java.lang.String displayStdDevsTipText()
public void setDisplayStdDevs(boolean stdD)
Parameters:
stdD - true if std. devs and counts should be displayed
public boolean getDisplayStdDevs()
public java.lang.String dontReplaceMissingValuesTipText()
public void setDontReplaceMissingValues(boolean r)
Parameters:
r - true if missing values are not to be replaced
public boolean getDontReplaceMissingValues()
public java.lang.String distanceFunctionTipText()
public DistanceFunction getDistanceFunction()
public void setDistanceFunction(DistanceFunction df) throws java.lang.Exception
Parameters:
df - the new distance function to use
Throws:
java.lang.Exception - if instances cannot be processed
public java.lang.String preserveInstancesOrderTipText()
public void setPreserveInstancesOrder(boolean r)
Parameters:
r - true if the order of instances is to be preserved
public boolean getPreserveInstancesOrder()
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-N <num> number of clusters. (default 2).
-V Display std. deviations for centroids.
-M Don't replace missing values with mean/mode.
-S <num> Random number seed. (default 10)
-A <classname and options> Distance function to be used for instance comparison (default weka.core.EuclideanDistance)
-I <num> Maximum number of iterations.
-O Preserve order of instances.
setOptions in interface OptionHandler
setOptions in class RandomizableClusterer
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class RandomizableClusterer
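setOptions takes the option flags as an array of strings, with each flag and its value in separate elements. As a rough illustration, the standalone sketch below assembles such an array for the -N, -S, -I, -V and -O flags. The class `KMeansOptionsSketch` and the helper `buildOptions` are hypothetical and not part of Weka; only the flag names come from the option list above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; only the flag names are taken from the documented options.
final class KMeansOptionsSketch {
    static String[] buildOptions(int numClusters, int seed, int maxIterations,
                                 boolean displayStdDevs, boolean preserveOrder) {
        if (numClusters < 1 || maxIterations < 1) {
            throw new IllegalArgumentException("numClusters and maxIterations must be at least 1");
        }
        List<String> options = new ArrayList<>();
        options.add("-N"); // number of clusters
        options.add(Integer.toString(numClusters));
        options.add("-S"); // random number seed
        options.add(Integer.toString(seed));
        options.add("-I"); // maximum number of iterations
        options.add(Integer.toString(maxIterations));
        if (displayStdDevs) {
            options.add("-V"); // flag without a value
        }
        if (preserveOrder) {
            options.add("-O"); // flag without a value
        }
        return options.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints: -N 3 -S 10 -I 100 -O
        System.out.println(String.join(" ", buildOptions(3, 10, 100, false, true)));
    }
}
```

An array built this way has the shape that setOptions(java.lang.String[] options) expects, and getOptions() returns the current settings in the same array-of-strings form.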
public java.lang.String toString()
toString in class java.lang.Object
public Instances getClusterCentroids()
public Instances getClusterStandardDevs()
public int[][][] getClusterNominalCounts()
public double getSquaredError()
public int[] getClusterSizes()
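Cluster sizes and per-cluster standard deviations are summaries that can be derived from the cluster assignments. The standalone sketch below shows one way to compute them for a single numeric attribute. It is not Weka's code: `ClusterStatsSketch`, `sizes`, and `standardDevs` are hypothetical names, and the use of the sample standard deviation (dividing by n - 1, with 0 for clusters of fewer than two instances) is an assumption that may differ from what the class actually reports.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; not the Weka implementation.
final class ClusterStatsSketch {
    // Number of instances assigned to each of the k clusters.
    static int[] sizes(int[] assignments, int k) {
        int[] counts = new int[k];
        for (int a : assignments) {
            counts[a]++;
        }
        return counts;
    }

    // Sample standard deviation (n - 1 denominator, an assumption) of one
    // numeric attribute within each cluster.
    static double[] standardDevs(double[] values, int[] assignments, int k) {
        int[] n = sizes(assignments, k);
        double[] mean = new double[k];
        for (int i = 0; i < values.length; i++) {
            mean[assignments[i]] += values[i];
        }
        for (int c = 0; c < k; c++) {
            if (n[c] > 0) {
                mean[c] /= n[c];
            }
        }
        double[] sumSquares = new double[k];
        for (int i = 0; i < values.length; i++) {
            double diff = values[i] - mean[assignments[i]];
            sumSquares[assignments[i]] += diff * diff;
        }
        double[] sd = new double[k];
        for (int c = 0; c < k; c++) {
            sd[c] = n[c] > 1 ? Math.sqrt(sumSquares[c] / (n[c] - 1)) : 0.0;
        }
        return sd;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 3.0, 10.0, 10.0, 10.0};
        int[] assignments = {0, 0, 1, 1, 1};
        System.out.println(Arrays.toString(sizes(assignments, 2))); // prints [2, 3]
        System.out.println(Arrays.toString(standardDevs(values, assignments, 2)));
    }
}
```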
public int[] getAssignments() throws java.lang.Exception
Throws:
java.lang.Exception - if order of instances wasn't preserved or no assignments were made
public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class AbstractClusterer
public static void main(java.lang.String[] argv)
Parameters:
argv - should contain the following arguments: -t training file [-N number of clusters]