public class LatentSemanticAnalysis extends UnsupervisedAttributeEvaluator implements AttributeTransformer, OptionHandler
-N Normalize input data.
-R Rank approximation used in LSA. May be actual number of LSA attributes to include (if greater than 1) or a proportion of total singular values to account for (if between 0 and 1). A value less than or equal to zero means use all latent variables. (default = 0.95)
-A Maximum number of attributes to include in transformed attribute names. (-1 = include all)
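The three cases of -R described above can be sketched in plain Java. This is an illustrative sketch, not Weka's implementation: the class RankSketch and the helper resolveRank are hypothetical names. It assumes that coverage is measured on the plain cumulative sum of singular values sorted in descending order, and that a value of exactly 1 is treated as a count. The library may compute coverage differently.

```java
class RankSketch {
    /**
     * Hypothetical helper (not part of Weka): maps a -R value to a number of
     * latent attributes, given singular values sorted in descending order.
     */
    static int resolveRank(double rank, double[] singularValues) {
        int max = singularValues.length;
        if (rank <= 0) {
            return max; // zero or negative: use all latent variables
        }
        if (rank < 1) {
            // Assumption: coverage is the plain sum of singular values.
            double total = 0;
            for (double s : singularValues) {
                total += s;
            }
            double running = 0;
            for (int i = 0; i < max; i++) {
                running += singularValues[i];
                if (running / total >= rank) {
                    return i + 1; // smallest prefix reaching the coverage
                }
            }
            return max;
        }
        // Assumption: values >= 1 are a count, capped at what is available.
        return Math.min((int) rank, max);
    }

    public static void main(String[] args) {
        double[] sv = {5.0, 3.0, 1.5, 0.5};
        System.out.println(resolveRank(0.9, sv)); // prints 3
        System.out.println(resolveRank(2, sv));   // prints 2
        System.out.println(resolveRank(0, sv));   // prints 4
    }
}
```

With singular values {5.0, 3.0, 1.5, 0.5}, a rank of 0.9 keeps three latent attributes under these assumptions, because the first two cover only 80% of the total.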
Constructor and Description |
---|
LatentSemanticAnalysis() |
Modifier and Type | Method and Description |
---|---|
void | buildEvaluator(Instances data): Initializes the singular values/vectors and performs the analysis |
Instance | convertInstance(Instance instance): Transform an instance in original (unnormalized) format |
double | evaluateAttribute(int att): Evaluates the merit of a transformed attribute. |
Capabilities | getCapabilities(): Returns the capabilities of this evaluator. |
int | getMaximumAttributeNames(): Gets maximum number of attributes to include in transformed attribute names. |
boolean | getNormalize(): Gets whether or not input data is to be normalized |
java.lang.String[] | getOptions(): Gets the current settings of LatentSemanticAnalysis |
double | getRank(): Gets the desired matrix rank (or coverage proportion) for feature-space reduction |
java.lang.String | getRevision(): Returns the revision string. |
java.lang.String | globalInfo(): Returns a string describing this attribute transformer |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class |
java.lang.String | maximumAttributeNamesTipText(): Returns the tip text for this property |
java.lang.String | normalizeTipText(): Returns the tip text for this property |
java.lang.String | rankTipText(): Returns the tip text for this property |
void | setMaximumAttributeNames(int newMaxAttributes): Sets maximum number of attributes to include in transformed attribute names. |
void | setNormalize(boolean newNormalize): Set whether input data will be normalized. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setRank(double newRank): Sets the desired matrix rank (or coverage proportion) for feature-space reduction |
java.lang.String | toString(): Returns a description of this attribute transformer |
Instances | transformedData(Instances data): Transform the supplied data set (assumed to be the same format as the training data) |
Instances | transformedHeader(): Returns just the header for the transformed data (ie. |
Methods inherited from class ASEvaluation: clean, forName, makeCopies, postProcess
public java.lang.String globalInfo()
public java.util.Enumeration listOptions()
listOptions
in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-N Normalize input data.
-R Rank approximation used in LSA. May be actual number of LSA attributes to include (if greater than 1) or a proportion of total singular values to account for (if between 0 and 1). A value less than or equal to zero means use all latent variables. (default = 0.95)
-A Maximum number of attributes to include in transformed attribute names. (-1 = include all)
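To make the option format concrete, the following is a minimal stand-alone sketch of how an options array carrying these three flags could be parsed and written back. It does not use or reproduce Weka's parsing code: the class LsaOptionsSketch and its methods parse and toOptions are hypothetical, and the initial values for -N and -A are placeholders, since this page states a default only for -R.

```java
import java.util.ArrayList;
import java.util.List;

class LsaOptionsSketch {
    boolean normalize = false;   // -N (placeholder initial value, an assumption)
    double rank = 0.95;          // -R (default stated in the option description)
    int maxAttributeNames = -1;  // -A (placeholder: -1 = include all)

    /** Hypothetical parser for the -N, -R and -A flags. */
    static LsaOptionsSketch parse(String[] options) {
        LsaOptionsSketch o = new LsaOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-N":
                    o.normalize = true;
                    break;
                case "-R":
                    o.rank = Double.parseDouble(value(options, ++i, "-R"));
                    break;
                case "-A":
                    o.maxAttributeNames = Integer.parseInt(value(options, ++i, "-A"));
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return o;
    }

    /** Returns the argument following a flag, or fails if it is missing. */
    private static String value(String[] options, int i, String flag) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value for " + flag);
        }
        return options[i];
    }

    /** Writes the current settings back as an options array. */
    String[] toOptions() {
        List<String> out = new ArrayList<>();
        if (normalize) {
            out.add("-N");
        }
        out.add("-R");
        out.add(String.valueOf(rank));
        out.add("-A");
        out.add(String.valueOf(maxAttributeNames));
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        LsaOptionsSketch o = parse(new String[] {"-N", "-R", "0.8", "-A", "3"});
        System.out.println(String.join(" ", o.toOptions())); // prints -N -R 0.8 -A 3
    }
}
```

With the real class, the same array of strings would be passed to setOptions and read back with getOptions.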
setOptions
in interface OptionHandler
options
- the list of options as an array of strings
java.lang.Exception
- if an option is not supported
public java.lang.String normalizeTipText()
public void setNormalize(boolean newNormalize)
newNormalize
- true if input data is to be normalized
public boolean getNormalize()
public java.lang.String rankTipText()
public void setRank(double newRank)
newRank
- the desired rank (or coverage) for feature-space reduction
public double getRank()
public java.lang.String maximumAttributeNamesTipText()
public void setMaximumAttributeNames(int newMaxAttributes)
newMaxAttributes
- the maximum number of attributes
public int getMaximumAttributeNames()
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
public Capabilities getCapabilities()
getCapabilities
in interface CapabilitiesHandler
getCapabilities
in class ASEvaluation
Capabilities
public void buildEvaluator(Instances data) throws java.lang.Exception
buildEvaluator
in class ASEvaluation
data
- the instances to analyse/transform
java.lang.Exception
- if analysis fails
public Instances transformedHeader() throws java.lang.Exception
transformedHeader
in interface AttributeTransformer
java.lang.Exception
- if the header of the transformed data can't be determined.
public Instances transformedData(Instances data) throws java.lang.Exception
transformedData
in interface AttributeTransformer
java.lang.Exception
- if transformed data can't be returned
public double evaluateAttribute(int att) throws java.lang.Exception
evaluateAttribute
in interface AttributeEvaluator
att
- the attribute to be evaluated
java.lang.Exception
- if attribute can't be evaluated
public Instance convertInstance(Instance instance) throws java.lang.Exception
convertInstance
in interface AttributeTransformer
instance
- an instance in the original (unnormalized) format
java.lang.Exception
- if instance can't be transformed
public java.lang.String toString()
toString
in class java.lang.Object
public java.lang.String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class ASEvaluation
public static void main(java.lang.String[] argv)
argv
- should contain the command line arguments to the
evaluator/transformer (see AttributeSelection)
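For readers unfamiliar with how a single instance is usually mapped into a latent space, which is the job convertInstance performs here, the sketch below shows the textbook LSA "folding-in" projection: multiply the instance by the retained left singular vectors and divide by the corresponding singular values. This page does not describe the internals of this class, so the sketch is an assumption about the general technique and not a description of Weka's code. FoldInSketch and project are hypothetical names.

```java
import java.util.Arrays;

class FoldInSketch {
    /**
     * Textbook folding-in: result[j] = (sum over i of u[i][j] * x[i]) / sigma[j].
     * x is one instance with m attribute values, u is an m x k matrix whose
     * columns are the retained left singular vectors, and sigma holds the k
     * retained singular values (all assumed non-zero).
     */
    static double[] project(double[] x, double[][] u, double[] sigma) {
        int k = sigma.length;
        double[] out = new double[k];
        for (int j = 0; j < k; j++) {
            double dot = 0;
            for (int i = 0; i < x.length; i++) {
                dot += u[i][j] * x[i];
            }
            out[j] = dot / sigma[j];
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy values chosen so the arithmetic is easy to follow.
        double[][] u = {{1, 0}, {0, 1}, {0, 0}};
        double[] sigma = {2, 4};
        double[] x = {6, 8, 5};
        System.out.println(Arrays.toString(project(x, u, sigma))); // prints [3.0, 2.0]
    }
}
```

In this toy case the third attribute has no weight in either retained dimension, so it does not affect the projected instance.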