public static class ArffLoader.ArffReader extends java.lang.Object implements RevisionHandler
Typical code for batch usage:

    BufferedReader reader = new BufferedReader(new FileReader("/some/where/file.arff"));
    ArffReader arff = new ArffReader(reader);
    Instances data = arff.getData();
    data.setClassIndex(data.numAttributes() - 1);

Typical code for incremental usage:

    BufferedReader reader = new BufferedReader(new FileReader("/some/where/file.arff"));
    ArffReader arff = new ArffReader(reader, 1000);
    Instances data = arff.getStructure();
    data.setClassIndex(data.numAttributes() - 1);
    Instance inst;
    while ((inst = arff.readInstance(data)) != null) {
      data.add(inst);
    }

Constructor and Description |
---|
ArffReader(java.io.Reader reader): Reads the data completely from the reader. |
ArffReader(java.io.Reader reader, Instances template, int lines, int capacity, boolean batch, java.lang.String... fieldSepAndEnclosures): Initializes the reader without reading the header according to the specified template. |
ArffReader(java.io.Reader reader, Instances template, int lines, int capacity, java.lang.String... fieldSepAndEnclosures): Initializes the reader without reading the header according to the specified template. |
ArffReader(java.io.Reader reader, Instances template, int lines, java.lang.String... fieldSepAndEnclosures): Reads the data without header according to the specified template. |
ArffReader(java.io.Reader reader, int capacity) |
ArffReader(java.io.Reader reader, int capacity, boolean batch): Reads only the header and reserves the specified space for instances. |

Modifier and Type | Method and Description |
---|---|
Instances | getData(): Returns the data that was read. |
int | getLineNo(): Returns the current line number. |
boolean | getRetainStringValues(): Get whether to retain the values of string attributes in memory (in the header) when reading incrementally. |
java.lang.String | getRevision(): Returns the revision string. |
Instances | getStructure(): Returns the header format. |
Instance | readInstance(Instances structure): Reads a single instance using the tokenizer and returns it. |
Instance | readInstance(Instances structure, boolean flag): Reads a single instance using the tokenizer and returns it. |
void | setRetainStringValues(boolean retain): Set whether to retain the values of string attributes in memory (in the header) when reading incrementally. |
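
public ArffReader(java.io.Reader reader) throws java.io.IOException
Reads the data completely from the reader. The data can be accessed via the getData() method.
Parameters:
reader - the reader to use
Throws:
java.io.IOException - if something goes wrong
See Also:
getData()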

public ArffReader(java.io.Reader reader, int capacity) throws java.io.IOException
Throws:
java.io.IOException

public ArffReader(java.io.Reader reader, int capacity, boolean batch) throws java.io.IOException
Reads only the header and reserves the specified space for instances. Further instances can be read via readInstance().
Parameters:
reader - the reader to use
capacity - the capacity of the new dataset
batch - true if reading in batch mode
Throws:
java.io.IOException - if something goes wrong
java.io.IOException - if a problem occurs
See Also:
getStructure(), readInstance(Instances)
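
The header-only constructor pairs with readInstance(Instances) when instances should be processed one at a time instead of being collected into a dataset. The following is a minimal sketch of that pattern, assuming Weka is on the classpath. The class name ArffStreamingSketch, the helper countInstances, the capacity value of 1 and the inline sample data are illustrative assumptions, not part of the documented API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader.ArffReader;

public class ArffStreamingSketch {

    /** Hypothetical helper: streams instances one by one and returns how many were read. */
    static int countInstances(Reader source) throws IOException {
        // batch = false: only the header is read here; rows are pulled later.
        // The capacity of 1 is an arbitrary choice, since no rows are stored.
        ArffReader arff = new ArffReader(source, 1, false);
        Instances structure = arff.getStructure();
        structure.setClassIndex(structure.numAttributes() - 1);

        int count = 0;
        Instance inst;
        // readInstance returns null once the input is exhausted.
        while ((inst = arff.readInstance(structure)) != null) {
            count++; // process inst here instead of adding it to a dataset
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data; any java.io.Reader can serve as the source.
        String arffText = "@relation demo\n"
            + "@attribute x numeric\n"
            + "@attribute y numeric\n"
            + "@attribute label {a,b}\n"
            + "@data\n"
            + "1.0,2.0,a\n"
            + "3.5,4.5,b\n";
        System.out.println(countInstances(new StringReader(arffText)));
    }
}
```

Because nothing is added to the dataset, memory use stays roughly constant regardless of how many rows the source contains.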

public ArffReader(java.io.Reader reader, Instances template, int lines, java.lang.String... fieldSepAndEnclosures) throws java.io.IOException
Reads the data without header according to the specified template. The data can be accessed via the getData() method.
Parameters:
reader - the reader to use
template - the template header
lines - the lines read so far
fieldSepAndEnclosures - an optional array of Strings containing the field separator and enclosures to use instead of the defaults. The first entry in the array is expected to be the single character field separator to use; the remaining entries (if any) are enclosure characters to use.
Throws:
java.io.IOException - if something goes wrong
See Also:
getData()
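
As an illustration of the template constructor, the sketch below reads header-less rows that use a semicolon as the field separator. It assumes Weka 3.7 or later on the classpath. The class name TemplateReadSketch, the helpers buildTemplate and readRows, the two numeric attributes, the value 0 for lines and the sample rows are all hypothetical choices made for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.Instances;
import weka.core.converters.ArffLoader.ArffReader;

public class TemplateReadSketch {

    /** Hypothetical template: an empty dataset header with two numeric attributes. */
    static Instances buildTemplate() {
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("x"));
        attributes.add(new Attribute("y"));
        return new Instances("demo", attributes, 0);
    }

    /** Reads all rows from a header-less source into a dataset shaped like the template. */
    static Instances readRows(Reader rows) throws IOException {
        // lines = 0 assumes no lines were consumed before this reader started.
        // ";" is the first (and only) entry of fieldSepAndEnclosures, so it is
        // the field separator; no custom enclosure characters are supplied.
        ArffReader arff = new ArffReader(rows, buildTemplate(), 0, ";");
        return arff.getData();
    }

    public static void main(String[] args) throws IOException {
        Instances data = readRows(new StringReader("1;2\n3;4\n"));
        System.out.println(data.numInstances());
    }
}
```

To override the enclosure characters as well, further entries would follow the separator, for example `new ArffReader(rows, template, 0, ";", "'")`.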

public ArffReader(java.io.Reader reader, Instances template, int lines, int capacity, java.lang.String... fieldSepAndEnclosures) throws java.io.IOException
Initializes the reader without reading the header according to the specified template. The data must be read via the readInstance() method.
Parameters:
reader - the reader to use
template - the template header
lines - the lines read so far
capacity - the capacity of the new dataset
fieldSepAndEnclosures - an optional array of Strings containing the field separator and enclosures to use instead of the defaults. The first entry in the array is expected to be the single character field separator to use; the remaining entries (if any) are enclosure characters to use.
Throws:
java.io.IOException - if something goes wrong
See Also:
getData()

public ArffReader(java.io.Reader reader, Instances template, int lines, int capacity, boolean batch, java.lang.String... fieldSepAndEnclosures) throws java.io.IOException
Initializes the reader without reading the header according to the specified template. The data must be read via the readInstance() method.
Parameters:
reader - the reader to use
template - the template header
lines - the lines read so far
capacity - the capacity of the new dataset
batch - true if the data is going to be read in batch mode
fieldSepAndEnclosures - an optional array of Strings containing the field separator and enclosures to use instead of the defaults. The first entry in the array is expected to be the single character field separator to use; the remaining entries (if any) are enclosure characters to use.
Throws:
java.io.IOException - if something goes wrong
See Also:
getData()

public int getLineNo()

public Instance readInstance(Instances structure) throws java.io.IOException
Parameters:
structure - the dataset header information, will get updated in case of string or relational attributes
Throws:
java.io.IOException - if the information is not read successfully

public Instance readInstance(Instances structure, boolean flag) throws java.io.IOException
Parameters:
structure - the dataset header information, will get updated in case of string or relational attributes
flag - if method should test for carriage return after each instance
Throws:
java.io.IOException - if the information is not read successfully

public Instances getStructure()
public Instances getData()

public void setRetainStringValues(boolean retain)
Parameters:
retain - true if string values are to be retained in memory when reading incrementally

public boolean getRetainStringValues()

public java.lang.String getRevision()
Returns the revision string.
Specified by:
getRevision in interface RevisionHandler