public class CentroidSketch
extends java.lang.Object
implements weka.core.TechnicalInformationHandler, java.io.Serializable
@article{Bahmani2012,
   author = {Bahman Bahmani and Benjamin Moseley and Andrea Vattani and Ravi Kumar and Sergei Vassilvitskii},
   journal = {Proceedings of the VLDB Endowment},
   pages = {622-633},
   title = {Scalable k-means++},
   year = {2012}
}
Constructor and Description |
---|
CentroidSketch(weka.core.Instances initialSketch, weka.core.NormalizableDistance distanceFunction, int size, int seed) Constructor. |
Modifier and Type | Method and Description |
---|---|
void | addReservoirToCurrentSketch() Add the reservoir to the current sketch. |
void | aggregateReservoir(WeightedReservoirSample toAggregate) Aggregate the supplied reservoir into our reservoir. |
double | distanceToSketch(weka.core.Instance toProcess) Computes the distance between the supplied instance and the current sketch. |
weka.core.Instances | getCurrentSketch() Get the current sketch as a set of instances. |
weka.core.NormalizableDistance | getDistanceFunction() Get the distance function being used. |
WeightedReservoirSample | getReservoirSample() Get the reservoir sample. |
weka.core.TechnicalInformation | getTechnicalInformation() |
java.lang.String | globalInfo() Overview information for this class. |
void | process(weka.core.Instance toProcess, boolean updateDistanceFunction) Processes an instance, updating the reservoir. |
void | resetReservoir() Clear the reservoir. |
void | setDistanceFunction(weka.core.NormalizableDistance distFunc) Set the distance function to use. |
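
As a rough illustration of what distanceToSketch might compute, the sketch below assumes the distance to the sketch is the distance to its nearest point, which is how k-means|| defines the distance from a point to the current set of centres. It uses plain double[] points and a hand-written Euclidean distance instead of weka.core.Instance and weka.core.NormalizableDistance; the class SketchDistanceDemo and its methods are hypothetical and are not part of this API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, not part of Weka: a sketch held as plain double[] points. */
class SketchDistanceDemo {

    /** Euclidean distance between two points of equal dimension. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Assumed semantics: distance from the point to the nearest point in the sketch. */
    static double distanceToSketch(List<double[]> sketch, double[] point) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] c : sketch) {
            best = Math.min(best, euclidean(c, point));
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> sketch = new ArrayList<>();
        sketch.add(new double[] {0.0, 0.0});
        sketch.add(new double[] {10.0, 0.0});
        // (3, 4) is 5 away from (0, 0) and sqrt(65) away from (10, 0).
        System.out.println(distanceToSketch(sketch, new double[] {3.0, 4.0})); // prints 5.0
    }
}
```

An empty sketch yields positive infinity in this sketch; how the real class behaves in that case is not stated on this page.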
public CentroidSketch(weka.core.Instances initialSketch, weka.core.NormalizableDistance distanceFunction, int size, int seed)
initialSketch - the initial starting point (typically one randomly chosen instance for the k-means|| algorithm)
distanceFunction - the distance function to use
size - the size of the reservoir (i.e. how many points to consider adding to the sketch at each iteration)
seed - the seed for random number generation
public java.lang.String globalInfo()
public void process(weka.core.Instance toProcess, boolean updateDistanceFunction)
toProcess - the instance to process
updateDistanceFunction - true if we should update the distance function with this instance (i.e. update the ranges for numeric attributes)
public double distanceToSketch(weka.core.Instance toProcess)
toProcess - the instance to process
public weka.core.NormalizableDistance getDistanceFunction()
public void setDistanceFunction(weka.core.NormalizableDistance distFunc)
distFunc - the distance function to use
public WeightedReservoirSample getReservoirSample()
public weka.core.Instances getCurrentSketch()
public void aggregateReservoir(WeightedReservoirSample toAggregate) throws java.lang.Exception
toAggregate - the reservoir sample to aggregate
java.lang.Exception - if the structure of the instances in the sample to aggregate does not match the structure of our sketch
public void resetReservoir()
public void addReservoirToCurrentSketch() throws java.lang.Exception
java.lang.Exception
public weka.core.TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface weka.core.TechnicalInformationHandler
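
The process, resetReservoir and addReservoirToCurrentSketch methods suggest a per-iteration cycle: offer every instance to a weighted reservoir, append the survivors to the sketch, then clear the reservoir. The following is a minimal sketch of that cycle in plain Java, under several assumptions: the reservoir uses the Efraimidis-Spirakis key scheme (key = u^(1/w)), the weight is the squared distance of the point to the initial sketch point, and points are plain double[] arrays. The class WeightedReservoirDemo and all of its members are hypothetical; this page does not describe how WeightedReservoirSample is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

/** Hypothetical illustration only; not the WeightedReservoirSample class referenced above. */
class WeightedReservoirDemo {

    /** A candidate point paired with the random key that ranks it. */
    static final class Entry {
        final double[] point;
        final double key;

        Entry(double[] point, double key) {
            this.point = point;
            this.key = key;
        }
    }

    private final int capacity;
    private final Random random;
    // Min-heap on key, so the weakest retained candidate is always at the head.
    private final PriorityQueue<Entry> heap =
        new PriorityQueue<>((a, b) -> Double.compare(a.key, b.key));

    WeightedReservoirDemo(int capacity, int seed) {
        this.capacity = capacity;
        this.random = new Random(seed);
    }

    /** Offers a point; larger weights make it more likely to be retained. */
    void process(double[] point, double weight) {
        if (weight <= 0.0 || capacity <= 0) {
            return; // zero-weight points (e.g. distance 0 to the sketch) are never sampled
        }
        double key = Math.pow(random.nextDouble(), 1.0 / weight);
        if (heap.size() < capacity) {
            heap.add(new Entry(point, key));
        } else if (key > heap.peek().key) {
            heap.poll(); // evict the weakest candidate
            heap.add(new Entry(point, key));
        }
    }

    /** Number of points currently held. */
    int size() {
        return heap.size();
    }

    /** Empties the reservoir. */
    void reset() {
        heap.clear();
    }

    /** Appends every retained point to the given sketch; does not clear the reservoir. */
    void addTo(List<double[]> sketch) {
        for (Entry e : heap) {
            sketch.add(e.point);
        }
    }

    public static void main(String[] args) {
        List<double[]> sketch = new ArrayList<>();
        sketch.add(new double[] {0.0}); // initial sketch: one starting point

        WeightedReservoirDemo reservoir = new WeightedReservoirDemo(2, 42);
        for (int i = 1; i <= 5; i++) {
            double[] p = {i};
            double weight = p[0] * p[0]; // squared distance to the single sketch point at 0
            reservoir.process(p, weight);
        }
        reservoir.addTo(sketch);
        reservoir.reset();
        System.out.println(sketch.size()); // prints 3
    }
}
```

Keeping addTo and reset separate mirrors the separate addReservoirToCurrentSketch and resetReservoir methods listed above; whether the real class clears the reservoir automatically after adding it is not stated here.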