public class Instances extends java.util.AbstractList<Instance> implements java.io.Serializable, RevisionHandler
Typical usage:
 import weka.core.Instances;
 import weka.core.converters.ConverterUtils.DataSource;
 ...
 
 // Read all the instances in the file (ARFF, CSV, XRFF, ...)
 DataSource source = new DataSource(filename);
 Instances instances = source.getDataSet();
 
 // Make the last attribute be the class
 instances.setClassIndex(instances.numAttributes() - 1);
 
 // Print header and instances.
 System.out.println("\nDataset:\n");
 System.out.println(instances);
 
 ...
 
 All methods that change a set of instances are safe, i.e., a change to one set of instances does not affect any other set of instances. All methods that change a dataset's attribute information clone the dataset before it is changed.
| Modifier and Type | Field and Description | 
|---|---|
| static java.lang.String | ARFF_DATA - The keyword used to denote the start of the ARFF data section. |
| static java.lang.String | ARFF_RELATION - The keyword used to denote the start of an ARFF header. |
| static java.lang.String | FILE_EXTENSION - The filename extension that should be used for ARFF files. |
| static java.lang.String | SERIALIZED_OBJ_FILE_EXTENSION - The filename extension that should be used for binary serialized files. |
| Constructor and Description | 
|---|
| Instances(Instances dataset) - Constructor copying all instances and references to the header information from the given set of instances. |
| Instances(Instances dataset, int capacity) - Constructor creating an empty set of instances. |
| Instances(Instances source, int first, int toCopy) - Creates a new set of instances by copying a subset of another set. |
| Instances(java.io.Reader reader) - Reads an ARFF file from a reader, and assigns a weight of one to each instance. |
| Instances(java.io.Reader reader, int capacity) - Deprecated. Instead of using this method in conjunction with the readInstance(Reader) method, one should use the ArffLoader or DataSource class. |
| Instances(java.lang.String name, java.util.ArrayList<Attribute> attInfo, int capacity) - Creates an empty set of instances. |
| Modifier and Type | Method and Description | 
|---|---|
| boolean | add(Instance instance) - Adds one instance to the end of the set. |
| void | add(int index, Instance instance) - Adds one instance at the given position in the list. |
| boolean | allAttributeWeightsIdentical() - Returns true if all attribute weights are the same and false otherwise. |
| boolean | allInstanceWeightsIdentical() - Returns true if all instance weights are the same and false otherwise. |
| Attribute | attribute(int index) - Returns an attribute. |
| Attribute | attribute(java.lang.String name) - Returns an attribute given its name. |
| AttributeStats | attributeStats(int index) - Calculates summary statistics on the values that appear in this set of instances for a specified attribute. |
| double[] | attributeToDoubleArray(int index) - Gets the value of all instances in this dataset for a particular attribute. |
| boolean | checkForAttributeType(int attType) - Checks for attributes of the given type in the dataset. |
| boolean | checkForStringAttributes() - Checks for string attributes in the dataset. |
| boolean | checkInstance(Instance instance) - Checks if the given instance is compatible with this dataset. |
| Attribute | classAttribute() - Returns the class attribute. |
| int | classIndex() - Returns the class attribute's index. |
| void | compactify() - Compactifies the set of instances. |
| void | delete() - Removes all instances from the set. |
| void | delete(int index) - Removes an instance at the given position from the set. |
| void | deleteAttributeAt(int position) - Deletes an attribute at the given position (0 to numAttributes() - 1). |
| void | deleteAttributeType(int attType) - Deletes all attributes of the given type in the dataset. |
| void | deleteStringAttributes() - Deletes all string attributes in the dataset. |
| void | deleteWithMissing(Attribute att) - Removes all instances with missing values for a particular attribute from the dataset. |
| void | deleteWithMissing(int attIndex) - Removes all instances with missing values for a particular attribute from the dataset. |
| void | deleteWithMissingClass() - Removes all instances with a missing class value from the dataset. |
| java.util.Enumeration<Attribute> | enumerateAttributes() - Returns an enumeration of all the attributes. |
| java.util.Enumeration<Instance> | enumerateInstances() - Returns an enumeration of all instances in the dataset. |
| boolean | equalHeaders(Instances dataset) - Checks if two headers are equivalent. |
| java.lang.String | equalHeadersMsg(Instances dataset) - Checks if two headers are equivalent. |
| Instance | firstInstance() - Returns the first instance in the set. |
| Instance | get(int index) - Returns the instance at the given position. |
| java.util.Random | getRandomNumberGenerator(long seed) - Returns a random number generator. |
| java.lang.String | getRevision() - Returns the revision string. |
| void | insertAttributeAt(Attribute att, int position) - Inserts an attribute at the given position (0 to numAttributes()) and sets all values to be missing. |
| Instance | instance(int index) - Returns the instance at the given position. |
| double | kthSmallestValue(Attribute att, int k) - Returns the kth-smallest attribute value of a numeric attribute. |
| double | kthSmallestValue(int attIndex, int k) - Returns the kth-smallest attribute value of a numeric attribute. |
| Instance | lastInstance() - Returns the last instance in the set. |
| static void | main(java.lang.String[] args) - Main method for this class. |
| double | meanOrMode(Attribute att) - Returns the mean (mode) for a numeric (nominal) attribute as a floating-point value. |
| double | meanOrMode(int attIndex) - Returns the mean (mode) for a numeric (nominal) attribute as a floating-point value. |
| static Instances | mergeInstances(Instances first, Instances second) - Merges two sets of Instances together. |
| int | numAttributes() - Returns the number of attributes. |
| int | numClasses() - Returns the number of class labels. |
| int | numDistinctValues(Attribute att) - Returns the number of distinct values of a given attribute. |
| int | numDistinctValues(int attIndex) - Returns the number of distinct values of a given attribute. |
| int | numInstances() - Returns the number of instances in the dataset. |
| void | randomize(java.util.Random random) - Shuffles the instances in the set so that they are ordered randomly. |
| boolean | readInstance(java.io.Reader reader) - Deprecated. Instead of using this method in conjunction with the readInstance(Reader) method, one should use the ArffLoader or DataSource class. |
| java.lang.String | relationName() - Returns the relation's name. |
| Instance | remove(int index) - Removes the instance at the given position. |
| void | renameAttribute(Attribute att, java.lang.String name) - Renames an attribute. |
| void | renameAttribute(int att, java.lang.String name) - Renames an attribute. |
| void | renameAttributeValue(Attribute att, java.lang.String val, java.lang.String name) - Renames the value of a nominal (or string) attribute value. |
| void | renameAttributeValue(int att, int val, java.lang.String name) - Renames the value of a nominal (or string) attribute value. |
| void | replaceAttributeAt(Attribute att, int position) - Replaces the attribute at the given position (0 to numAttributes()) with the given attribute and sets all its values to be missing. |
| Instances | resample(java.util.Random random) - Creates a new dataset of the same size as this dataset using random sampling with replacement. |
| Instances | resampleWithWeights(java.util.Random random) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the current instance weights. |
| Instances | resampleWithWeights(java.util.Random random, boolean representUsingWeights) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the current instance weights. |
| Instances | resampleWithWeights(java.util.Random random, boolean[] sampled) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the current instance weights. |
| Instances | resampleWithWeights(java.util.Random random, boolean[] sampled, boolean representUsingWeights) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the current instance weights. |
| Instances | resampleWithWeights(java.util.Random random, boolean[] sampled, boolean representUsingWeights, double sampleSize) - Creates a new dataset from this dataset using random sampling with replacement according to current instance weights. |
| Instances | resampleWithWeights(java.util.Random random, double[] weights) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the given weight vector. |
| Instances | resampleWithWeights(java.util.Random random, double[] weights, boolean[] sampled) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the given weight vector. |
| Instances | resampleWithWeights(java.util.Random random, double[] weights, boolean[] sampled, boolean representUsingWeights) - Creates a new dataset of the same size as this dataset using random sampling with replacement according to the given weight vector. |
| Instances | resampleWithWeights(java.util.Random random, double[] weights, boolean[] sampled, boolean representUsingWeights, double sampleSize) - Creates a new dataset from this dataset using random sampling with replacement according to the given weight vector. |
| Instance | set(int index, Instance instance) - Replaces the instance at the given position. |
| void | setAttributeWeight(Attribute att, double weight) - Sets the weight of an attribute. |
| void | setAttributeWeight(int att, double weight) - Sets the weight of an attribute. |
| void | setClass(Attribute att) - Sets the class attribute. |
| void | setClassIndex(int classIndex) - Sets the class index of the set. |
| void | setRelationName(java.lang.String newName) - Sets the relation's name. |
| int | size() - Returns the number of instances in the dataset. |
| void | sort(Attribute att) - Sorts the instances based on an attribute. |
| void | sort(int attIndex) - Sorts the instances based on an attribute. |
| void | stableSort(Attribute att) - Sorts the instances based on an attribute, using a stable sort. |
| void | stableSort(int attIndex) - Sorts the instances based on an attribute, using a stable sort. |
| void | stratify(int numFolds) - Stratifies a set of instances according to its class values if the class attribute is nominal (so that afterwards a stratified cross-validation can be performed). |
| Instances | stringFreeStructure() - Create a copy of the structure. |
| double | sumOfWeights() - Computes the sum of all the instances' weights. |
| void | swap(int i, int j) - Swaps two instances in the set. |
| static void | test(java.lang.String[] argv) - Method for testing this class. |
| Instances | testCV(int numFolds, int numFold) - Creates the test set for one fold of a cross-validation on the dataset. |
| java.lang.String | toString() - Returns the dataset as a string in ARFF format. |
| java.lang.String | toSummaryString() - Generates a string summarizing the set of instances. |
| Instances | trainCV(int numFolds, int numFold) - Creates the training set for one fold of a cross-validation on the dataset. |
| Instances | trainCV(int numFolds, int numFold, java.util.Random random) - Creates the training set for one fold of a cross-validation on the dataset. |
| double | variance(Attribute att) - Computes the variance for a numeric attribute. |
| double | variance(int attIndex) - Computes the variance for a numeric attribute. |
| double[] | variances() - Computes the variance for all numeric attributes simultaneously. |
Methods inherited from class java.util.AbstractList: addAll, clear, equals, hashCode, indexOf, iterator, lastIndexOf, listIterator, listIterator, subList
Methods inherited from class java.util.AbstractCollection: addAll, contains, containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray

public static final java.lang.String FILE_EXTENSION
public static final java.lang.String SERIALIZED_OBJ_FILE_EXTENSION
public static final java.lang.String ARFF_RELATION
public static final java.lang.String ARFF_DATA

public Instances(java.io.Reader reader) throws java.io.IOException
reader - the reader
java.io.IOException - if the ARFF file is not read successfully

@Deprecated
public Instances(java.io.Reader reader, int capacity) throws java.io.IOException
Deprecated. Instead of using this method in conjunction with the readInstance(Reader) method, one should use the ArffLoader or DataSource class.
reader - the reader
capacity - the capacity
java.lang.IllegalArgumentException - if the header is not read successfully or the capacity is negative.
java.io.IOException - if there is a problem with the reader.
See also: ArffLoader, ConverterUtils.DataSource

public Instances(Instances dataset)
dataset - the set to be copied

public Instances(Instances dataset, int capacity)
dataset - the instances from which the header information is to be taken
capacity - the capacity of the new dataset

public Instances(Instances source, int first, int toCopy)
source - the set of instances from which a subset is to be created
first - the index of the first instance to be copied
toCopy - the number of instances to be copied
java.lang.IllegalArgumentException - if first and toCopy are out of range

public Instances(java.lang.String name, java.util.ArrayList<Attribute> attInfo, int capacity)
name - the name of the relation
attInfo - the attribute information
capacity - the capacity of the set
java.lang.IllegalArgumentException - if attribute names are not unique

public Instances stringFreeStructure()

public boolean add(Instance instance)

public void add(int index, Instance instance)

public boolean allAttributeWeightsIdentical()

public boolean allInstanceWeightsIdentical()

public Attribute attribute(int index)
index - the attribute's index (index starts with 0)

public Attribute attribute(java.lang.String name)
name - the attribute's name

public boolean checkForAttributeType(int attType)
attType - the attribute type to look for

public boolean checkForStringAttributes()

public boolean checkInstance(Instance instance)
instance - the instance to check

public Attribute classAttribute()
UnassignedClassException - if the class is not set

public int classIndex()

public void compactify()

public void delete()

public void delete(int index)
index - the instance's position (index starts with 0)

public void deleteAttributeAt(int position)
position - the attribute's position (position starts with 0)
java.lang.IllegalArgumentException - if the given index is out of range or the class attribute is being deleted

public void deleteAttributeType(int attType)
attType - the attribute type to delete
java.lang.IllegalArgumentException - if attribute couldn't be successfully deleted (probably because it is the class attribute).

public void deleteStringAttributes()
java.lang.IllegalArgumentException - if string attribute couldn't be successfully deleted (probably because it is the class attribute).
See also: deleteAttributeType(int)

public void deleteWithMissing(int attIndex)
attIndex - the attribute's index (index starts with 0)

public void deleteWithMissing(Attribute att)
att - the attribute

public void deleteWithMissingClass()
UnassignedClassException - if class is not set

public java.util.Enumeration<Attribute> enumerateAttributes()

public java.util.Enumeration<Instance> enumerateInstances()

public java.lang.String equalHeadersMsg(Instances dataset)
dataset - another dataset

public boolean equalHeaders(Instances dataset)
dataset - another dataset

public Instance firstInstance()

public java.util.Random getRandomNumberGenerator(long seed)
seed - the given seed

public void insertAttributeAt(Attribute att, int position)
att - the attribute to be inserted
position - the attribute's position (position starts with 0)
java.lang.IllegalArgumentException - if the given index is out of range

public Instance instance(int index)
index - the instance's index (index starts with 0)

public Instance get(int index)

public double kthSmallestValue(Attribute att, int k)
att - the Attribute object
k - the value of k

public double kthSmallestValue(int attIndex, int k)
attIndex - the attribute's index
k - the value of k

public Instance lastInstance()

public double meanOrMode(int attIndex)
attIndex - the attribute's index (index starts with 0)

public double meanOrMode(Attribute att)
att - the attribute

public int numAttributes()

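The meanOrMode entries above return a single double for both numeric and nominal attributes. The following is a minimal, library-free sketch of that idea, not the library's implementation: the class name MeanOrModeSketch, its two helper methods, and the use of per-instance weights are all assumptions made for illustration, and the handling of missing values is left out.

```java
class MeanOrModeSketch {

    /** Weighted mean of numeric values; returns 0 if the total weight is not positive. */
    static double weightedMean(double[] values, double[] weights) {
        double sum = 0.0;
        double total = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += weights[i] * values[i];
            total += weights[i];
        }
        return total > 0 ? sum / total : 0.0;
    }

    /**
     * Index of the nominal label with the largest total weight, returned as a
     * double so that numeric and nominal results share one return type.
     * Ties are resolved in favour of the lowest label index.
     */
    static double weightedMode(int[] labelIndices, double[] weights, int numLabels) {
        double[] counts = new double[numLabels];
        for (int i = 0; i < labelIndices.length; i++) {
            counts[labelIndices[i]] += weights[i];
        }
        int best = 0;
        for (int j = 1; j < numLabels; j++) {
            if (counts[j] > counts[best]) {
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 1.0, 2.0};
        System.out.println(weightedMean(new double[] {1.0, 2.0, 4.0}, weights));
        System.out.println(weightedMode(new int[] {0, 1, 1}, weights, 2));
    }
}
```

For a nominal attribute the returned number is a label index, so a caller would map it back to a label before displaying it.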
public int numClasses()
UnassignedClassException - if the class is not set

public int numDistinctValues(int attIndex)
attIndex - the attribute (index starts with 0)

public int numDistinctValues(Attribute att)
att - the attribute

public int numInstances()

public int size()

public void randomize(java.util.Random random)
random - a random number generator

@Deprecated
public boolean readInstance(java.io.Reader reader) throws java.io.IOException
Deprecated. Instead of using this method in conjunction with the readInstance(Reader) method, one should use the ArffLoader or DataSource class.
reader - the reader
java.io.IOException - if the information is not read successfully
See also: ArffLoader, ConverterUtils.DataSource

public void replaceAttributeAt(Attribute att, int position)
att - the attribute to be inserted
position - the attribute's position (position starts with 0)
java.lang.IllegalArgumentException - if the given index is out of range

public java.lang.String relationName()

public Instance remove(int index)

public void renameAttribute(int att, java.lang.String name)
att - the attribute's index (index starts with 0)
name - the new name

public void setAttributeWeight(Attribute att, double weight)
att - the attribute
weight - the new weight

public void setAttributeWeight(int att, double weight)
att - the attribute's index (index starts with 0)
weight - the new weight

public void renameAttribute(Attribute att, java.lang.String name)
att - the attribute
name - the new name

public void renameAttributeValue(int att, int val, java.lang.String name)
att - the attribute's index (index starts with 0)
val - the value's index (index starts with 0)
name - the new name

public void renameAttributeValue(Attribute att, java.lang.String val, java.lang.String name)
att - the attribute
val - the value
name - the new name

public Instances resample(java.util.Random random)
random - a random number generator

public Instances resampleWithWeights(java.util.Random random)
random - a random number generator

public Instances resampleWithWeights(java.util.Random random, boolean[] sampled)
random - a random number generator
sampled - an array indicating what has been sampled

public Instances resampleWithWeights(java.util.Random random, boolean representUsingWeights)
random - a random number generator
representUsingWeights - if true, copies are represented using weights in resampled data

public Instances resampleWithWeights(java.util.Random random, boolean[] sampled, boolean representUsingWeights)
random - a random number generator
sampled - an array indicating what has been sampled
representUsingWeights - if true, copies are represented using weights in resampled data

public Instances resampleWithWeights(java.util.Random random, boolean[] sampled, boolean representUsingWeights, double sampleSize)
random - a random number generator
sampled - an array indicating what has been sampled, can be null
representUsingWeights - if true, copies are represented using weights in resampled data
sampleSize - size of the new dataset as a percentage of the size of this dataset
java.lang.IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.

public Instances resampleWithWeights(java.util.Random random, double[] weights)
random - a random number generator
weights - the weight vector
java.lang.IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.

public Instances resampleWithWeights(java.util.Random random, double[] weights, boolean[] sampled)
random - a random number generator
weights - the weight vector
sampled - an array indicating what has been sampled, can be null
java.lang.IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.

public Instances resampleWithWeights(java.util.Random random, double[] weights, boolean[] sampled, boolean representUsingWeights)
random - a random number generator
weights - the weight vector
sampled - an array indicating what has been sampled, can be null
representUsingWeights - if true, copies are represented using weights in resampled data
java.lang.IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.

public Instances resampleWithWeights(java.util.Random random, double[] weights, boolean[] sampled, boolean representUsingWeights, double sampleSize)
random - a random number generator
weights - the weight vector
sampled - an array indicating what has been sampled, can be null
representUsingWeights - if true, copies are represented using weights in resampled data
sampleSize - size of the new dataset as a percentage of the size of this dataset
java.lang.IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.

public Instance set(int index, Instance instance)

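The resampleWithWeights variants above draw instances with replacement in proportion to a weight vector and can record which instances were drawn. The sketch below illustrates that idea on plain indices, using simple cumulative-sum sampling; this technique was chosen for clarity and is not claimed to be what the library uses. The class WeightedResampleSketch and its methods are hypothetical, and reading representUsingWeights as "keep each drawn instance once, weighted by its draw count" is an assumption.

```java
import java.util.Arrays;
import java.util.Random;

class WeightedResampleSketch {

    /**
     * Draws weights.length indices with replacement; index i is chosen with
     * probability weights[i] / sum(weights). If sampled is non-null,
     * sampled[i] is set to true for every index that was drawn.
     */
    static int[] resampleIndices(Random random, double[] weights, boolean[] sampled) {
        int n = weights.length;
        double[] cumulative = new double[n];
        double total = 0.0;
        int lastPositive = -1;
        for (int i = 0; i < n; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            if (weights[i] > 0) {
                lastPositive = i;
            }
            total += weights[i];
            cumulative[i] = total;
        }
        if (lastPositive < 0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        int[] drawn = new int[n];
        for (int d = 0; d < n; d++) {
            double u = random.nextDouble() * total;
            // Advance to the first index whose cumulative weight exceeds u;
            // stopping at lastPositive keeps zero-weight indices from being chosen.
            int idx = 0;
            while (idx < lastPositive && cumulative[idx] <= u) {
                idx++;
            }
            drawn[d] = idx;
            if (sampled != null) {
                sampled[idx] = true;
            }
        }
        return drawn;
    }

    /** Assumed reading of representUsingWeights: one weight per index equal to its draw count. */
    static double[] drawCounts(int[] drawn, int n) {
        double[] counts = new double[n];
        for (int idx : drawn) {
            counts[idx] += 1.0;
        }
        return counts;
    }

    public static void main(String[] args) {
        boolean[] sampled = new boolean[4];
        int[] drawn = resampleIndices(new Random(42), new double[] {0.0, 1.0, 0.0, 3.0}, sampled);
        System.out.println(Arrays.toString(drawn));
        System.out.println(Arrays.toString(sampled));
        System.out.println(Arrays.toString(drawCounts(drawn, 4)));
    }
}
```

Indices left false in sampled were never drawn, so they can serve as an out-of-bag set for evaluation.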
public void setClass(Attribute att)
att - attribute to be the class

public void setClassIndex(int classIndex)
classIndex - the new class index (index starts with 0)
java.lang.IllegalArgumentException - if the class index is too big or < 0

public void setRelationName(java.lang.String newName)
newName - the new relation name.

public void sort(int attIndex)
attIndex - the attribute's index (index starts with 0)

public void sort(Attribute att)
att - the attribute

public void stableSort(int attIndex)
attIndex - the attribute's index (index starts with 0)

public void stableSort(Attribute att)
att - the attribute

public void stratify(int numFolds)
numFolds - the number of folds in the cross-validation
UnassignedClassException - if the class is not set

public double sumOfWeights()

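The difference between sort and stableSort above is the stability guarantee: rows with equal values for the sort attribute keep their original relative order. The sketch below shows what that means on a plain double[][] table rather than on an Instances object. StableSortSketch and stableSortByColumn are hypothetical names, and the sketch relies on java.util.Arrays.sort for object arrays, which the JDK documents as stable.

```java
import java.util.Arrays;
import java.util.Comparator;

class StableSortSketch {

    /**
     * Sorts rows in place by the value in column attIndex. Rows with equal
     * values in that column keep their original relative order.
     */
    static void stableSortByColumn(double[][] rows, int attIndex) {
        // Arrays.sort on an object array with a comparator is documented as stable.
        Arrays.sort(rows, Comparator.comparingDouble((double[] row) -> row[attIndex]));
    }

    public static void main(String[] args) {
        // Column 0 is the sort key; column 1 records the original row position.
        double[][] rows = {{2.0, 0.0}, {1.0, 1.0}, {2.0, 2.0}, {1.0, 3.0}};
        stableSortByColumn(rows, 0);
        for (double[] row : rows) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

After sorting, the two rows with key 1.0 appear in their original order (positions 1, then 3), followed by the two rows with key 2.0 (positions 0, then 2).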
public Instances testCV(int numFolds, int numFold)
numFolds - the number of folds in the cross-validation. Must be greater than 1.
numFold - 0 for the first fold, 1 for the second, ...
java.lang.IllegalArgumentException - if the number of folds is less than 2 or greater than the number of instances.

public java.lang.String toString()
Overrides: toString in class java.util.AbstractCollection<Instance>

public Instances trainCV(int numFolds, int numFold)
numFolds - the number of folds in the cross-validation. Must be greater than 1.
numFold - 0 for the first fold, 1 for the second, ...
java.lang.IllegalArgumentException - if the number of folds is less than 2 or greater than the number of instances.

public Instances trainCV(int numFolds, int numFold, java.util.Random random)
numFolds - the number of folds in the cross-validation. Must be greater than 1.
numFold - 0 for the first fold, 1 for the second, ...
random - the random number generator
java.lang.IllegalArgumentException - if the number of folds is less than 2 or greater than the number of instances.

public double[] variances()

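The numFolds and numFold arguments of testCV and trainCV above can be pictured as selecting one block of the dataset for testing and the rest for training. The sketch below computes such blocks on plain indices. It assumes contiguous blocks whose sizes differ by at most one, which is not stated in the text above; CvFoldSketch, testRange and trainIndices are hypothetical names, and the range check on numFold is an addition of this sketch.

```java
import java.util.Arrays;

class CvFoldSketch {

    /** Returns {first, count}: the start index and size of the test block for a 0-based fold. */
    static int[] testRange(int numInstances, int numFolds, int numFold) {
        if (numFolds < 2) {
            throw new IllegalArgumentException("Number of folds must be at least 2");
        }
        if (numFolds > numInstances) {
            throw new IllegalArgumentException("More folds than instances");
        }
        if (numFold < 0 || numFold >= numFolds) {
            throw new IllegalArgumentException("Fold index out of range");
        }
        int base = numInstances / numFolds;
        int remainder = numInstances % numFolds;
        // The first 'remainder' folds each receive one extra instance.
        int count = base + (numFold < remainder ? 1 : 0);
        int first = numFold * base + Math.min(numFold, remainder);
        return new int[] {first, count};
    }

    /** Indices of all instances outside the test block of the given fold. */
    static int[] trainIndices(int numInstances, int numFolds, int numFold) {
        int[] range = testRange(numInstances, numFolds, numFold);
        int[] train = new int[numInstances - range[1]];
        int next = 0;
        for (int i = 0; i < numInstances; i++) {
            if (i < range[0] || i >= range[0] + range[1]) {
                train[next++] = i;
            }
        }
        return train;
    }

    public static void main(String[] args) {
        for (int fold = 0; fold < 3; fold++) {
            System.out.println("test " + Arrays.toString(testRange(10, 3, fold))
                + " train " + Arrays.toString(trainIndices(10, 3, fold)));
        }
    }
}
```

With 10 instances and 3 folds, the test blocks are indices 0 to 3, 4 to 6, and 7 to 9, so every instance is tested exactly once across the folds. Block-based splitting of this kind only yields class-balanced folds if the data has been arranged for it beforehand, which is the purpose stratify describes.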
public double variance(int attIndex)
attIndex - the numeric attribute (index starts with 0)
java.lang.IllegalArgumentException - if the attribute is not numeric

public double variance(Attribute att)
att - the numeric attribute
java.lang.IllegalArgumentException - if the attribute is not numeric

public AttributeStats attributeStats(int index)
index - the index of the attribute to summarize (index starts with 0)

public double[] attributeToDoubleArray(int index)
index - the index of the attribute.

public java.lang.String toSummaryString()

public void swap(int i, int j)
i - the first instance's index (index starts with 0)
j - the second instance's index (index starts with 0)

public static Instances mergeInstances(Instances first, Instances second)
first - the first set of Instances
second - the second set of Instances
java.lang.IllegalArgumentException - if the datasets are not the same size

public static void test(java.lang.String[] argv)
argv - should contain one element: the name of an ARFF file

public static void main(java.lang.String[] args)
 weka.core.Instances help
 weka.core.Instances <filename>
 weka.core.Instances merge <filename1> <filename2>
 weka.core.Instances append <filename1> <filename2>
 weka.core.Instances headers <filename1> <filename2>
 weka.core.Instances randomize <seed> <filename>
args - the commandline parameters

public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler