public class BayesianLogisticRegression extends Classifier implements OptionHandler, TechnicalInformationHandler
@techreport{Genkin2004,
  author = {Alexander Genkin and David D. Lewis and David Madigan},
  institution = {DIMACS},
  title = {Large-scale bayesian logistic regression for text categorization},
  year = {2004},
  URL = {http://www.stat.rutgers.edu/\~madigan/PAPERS/shortFat-v3a.pdf}
}
Modifier and Type | Field and Description |
---|---|
double[] | BetaVector: Array for storing coefficients of the Bayesian regression model. |
double | Change: Keeps track of the change in the value of the delta summation of r(i). |
int | ClassIndex: The class index from the training data. |
static int | CV_BASED |
double[] | Delta: Trust region radius. |
double[] | DeltaBeta: Array to store regression coefficient updates. |
double[] | DeltaR: Vector used to store the increments on R(i). |
double[] | DeltaUpdate: Trust region radius update. |
static int | GAUSSIAN: Distributions available. |
java.lang.String | HyperparameterRange: CV hyperparameter range. |
double[] | Hyperparameters: Array to store hyperparameter values for each feature. |
int | HyperparameterSelection: Hyperparameter selection method. |
double | HyperparameterValue: Best hyperparameter for the test phase. |
static double[] | InputHyperparameterValues: Set of values to be used as hyperparameter values during cross-validation. |
int | iterationCounter: Iteration counter. |
static int | LAPLACIAN |
static double[] | LogLikelihood: Log-likelihood values to be used to choose the best hyperparameter. |
Filter | m_Filter: Filter interface used to point to a weka.filters.unsupervised.attribute.Normalize object. |
int | m_seed: Seed for randomizing the instances before CV. |
int | maxIterations: Maximum number of iterations. |
static int | NORM_BASED: Methods for selecting the hyperparameter value. |
boolean | NormalizeData: Choose whether to normalize data or not. |
int | NumFolds: Number of folds for CV-based hyperparameter selection. |
int | PriorClass: Distribution prior class. |
double[] | R: R(i) = (BetaVector · x(i)) * y(i), the inner product of the coefficient vector with instance x(i), multiplied by its class label y(i). |
static int | SPECIFIC_VALUE |
static Tag[] | TAGS_HYPER_METHOD |
static Tag[] | TAGS_PRIOR |
double | Threshold: Threshold for binary classification of the probabilistic estimate. |
double | Tolerance: Tolerance used by the stopping criterion. |
Constructor and Description |
---|
BayesianLogisticRegression() |
Modifier and Type | Method and Description |
---|---|
static double | bigF(double r, double sigma): A convenience function that defines an upper bound (Delta>0) for values of r(i) reachable by updates in the trust region. |
void | buildClassifier(Instances data): (1) Set the data to the class attribute m_Instances. (2) Call the method initialize() to initialize the values. |
double | classifyInstance(Instance instance): Classifies the given instance using the Bayesian Logistic Regression function. |
static double | classSgn(double value): This method is used to mask the internal class labels. |
double | CVBasedHyperparameter(): Computes the best hyperparameter value by running cross-validation on the training data and computing the likelihood. |
java.lang.String | debugTipText(): Returns the tip text for this property. |
Capabilities | getCapabilities(): This method tests what kind of data this classifier can handle. |
java.lang.String | getHyperparameterRange(): Get the range of hyperparameter values to consider during CV-based selection. |
SelectedTag | getHyperparameterSelection(): Get the method used to select the hyperparameter. |
double | getHyperparameterValue(): Get the hyperparameter value. |
double | getLoglikeliHood(double[] betas, Instances instances) |
int | getMaxIterations(): Get the maximum number of iterations to perform. |
int | getNumFolds(): Return the number of folds for CV-based hyperparameter selection. |
java.lang.String[] | getOptions(): Gets the current settings of the Classifier. |
SelectedTag | getPriorClass(): Get the type of prior to use. |
java.lang.String | getRevision(): Returns the revision string. |
int | getSeed(): Get the seed for randomizing the instances for CV-based hyperparameter selection. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
double | getThreshold(): Return the threshold being used. |
double | getTolerance(): Get the tolerance value. |
java.lang.String | globalInfo() |
java.lang.String | hyperparameterRangeTipText(): Returns the tip text for this property. |
java.lang.String | hyperparameterSelectionTipText(): Returns the tip text for this property. |
java.lang.String | hyperparameterValueTipText(): Returns the tip text for this property. |
void | initialize(): (1) Initialize m_Beta[j] to 0. |
boolean | isDebug(): Returns true if debug is turned on. |
boolean | isNormalizeData(): Returns true if the data is to be normalized first. |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
static double | logisticLinkFunction(double r): This method computes the values for the logistic link function. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
java.lang.String | maxIterationsTipText(): Returns the tip text for this property. |
java.lang.String | normalizeDataTipText(): Returns the tip text for this property. |
double | normBasedHyperParameter(): This function computes the norm-based hyperparameters and stores them in m_Hyperparameters. |
java.lang.String | numFoldsTipText(): Returns the tip text for this property. |
java.lang.String | priorClassTipText(): Returns the tip text for this property. |
java.lang.String | seedTipText(): Returns the tip text for this property. |
void | setDebug(boolean debugMode): Set debugging mode. |
void | setHyperparameterRange(java.lang.String hyperparameterRange): Set the range of hyperparameter values to consider during CV-based selection. |
void | setHyperparameterSelection(SelectedTag newMethod): Set the method used to select the hyperparameter. |
void | setHyperparameterValue(double hyperparameterValue): Set the hyperparameter value. |
void | setMaxIterations(int maxIterations): Set the maximum number of iterations to perform. |
void | setNormalizeData(boolean normalizeData): Set whether to normalize the data or not. |
void | setNumFolds(int numFolds): Set the number of folds to use for CV-based hyperparameter selection. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setPriorClass(SelectedTag newMethod): Set the type of prior to use. |
void | setSeed(int seed): Set the seed for randomizing the instances for CV-based hyperparameter selection. |
void | setThreshold(double threshold): Set the threshold to use. |
void | setTolerance(double tolerance): Set the tolerance value. |
static double | sgn(double r): Sign for a given value. |
boolean | stoppingCriterion(): This method implements the stopping criterion function. |
java.lang.String | thresholdTipText(): Returns the tip text for this property. |
java.lang.String | toleranceTipText(): Returns the tip text for this property. |
java.lang.String | toString(): Outputs the Bayesian logistic regression model as a string. |
Methods inherited from class Classifier: distributionForInstance, forName, getDebug, makeCopies, makeCopy
public static double[] LogLikelihood
public static double[] InputHyperparameterValues
public boolean NormalizeData
public double Tolerance
public double Threshold
public static final int GAUSSIAN
public static final int LAPLACIAN
public static final Tag[] TAGS_PRIOR
public int PriorClass
public int NumFolds
public int m_seed
public static final int NORM_BASED
public static final int CV_BASED
public static final int SPECIFIC_VALUE
public static final Tag[] TAGS_HYPER_METHOD
public int HyperparameterSelection
public int ClassIndex
public double HyperparameterValue
public java.lang.String HyperparameterRange
public int maxIterations
public int iterationCounter
public double[] BetaVector
public double[] DeltaBeta
public double[] DeltaUpdate
public double[] Delta
public double[] Hyperparameters
public double[] R
public double[] DeltaR
public double Change
public Filter m_Filter
public java.lang.String globalInfo()
public void initialize() throws java.lang.Exception
(1) Initialize m_Beta[j] to 0. (2) Initialize m_DeltaUpdate[j].
java.lang.Exception
public Capabilities getCapabilities()
getCapabilities
in interface CapabilitiesHandler
getCapabilities
in class Classifier
Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
buildClassifier
in class Classifier
data
- training data
java.lang.Exception
- if classifier can't be built successfully.
public static double classSgn(double value)
value
- internal class label
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface TechnicalInformationHandler
public static double bigF(double r, double sigma)
public boolean stoppingCriterion()
public static double logisticLinkFunction(double r)
f(r)=exp(r)/(1+exp(r))
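The formula above can be evaluated directly, but `exp(r)` overflows for large positive `r`. The following is a minimal sketch of one common way to compute the same function stably; it is not taken from the Weka source, and the class name `LogisticLinkSketch` and method name `logisticLink` are hypothetical.

```java
// Illustrative sketch only; not the Weka implementation.
final class LogisticLinkSketch {

    /** Computes f(r) = exp(r) / (1 + exp(r)) without overflowing for large |r|. */
    static double logisticLink(double r) {
        if (r >= 0) {
            // Algebraically identical form 1 / (1 + exp(-r)); exp(-r) stays in (0, 1].
            return 1.0 / (1.0 + Math.exp(-r));
        }
        // For negative r, exp(r) lies in [0, 1), so the textbook form is safe.
        double e = Math.exp(r);
        return e / (1.0 + e);
    }

    public static void main(String[] args) {
        System.out.println(logisticLink(0.0));     // prints 0.5
        System.out.println(logisticLink(1000.0));  // prints 1.0
        System.out.println(logisticLink(-1000.0)); // prints 0.0
    }
}
```

Both branches compute the same mathematical function; the split only avoids the `Infinity / Infinity` result that the naive form produces for large positive `r`.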
public static double sgn(double r)
r
- the value whose sign is computed
public double normBasedHyperParameter()
public double classifyInstance(Instance instance) throws java.lang.Exception
classifyInstance
in class Classifier
instance
- the test instance
java.lang.Exception
- if classification can't be done successfully
public java.lang.String toString()
toString
in class java.lang.Object
public double CVBasedHyperparameter() throws java.lang.Exception
java.lang.Exception
public double getLoglikeliHood(double[] betas, Instances instances)
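This page gives no description for `getLoglikeliHood`. As a rough illustration of what a logistic-regression log-likelihood looks like, the sketch below uses the standard form with class labels coded as -1/+1, summing `-log(1 + exp(-y(i) * (beta . x(i))))` over the instances. This is an assumption about the quantity involved, not a transcription of the Weka method: the class `LogLikelihoodSketch` is hypothetical and takes plain arrays in place of `Instances`.

```java
// Illustrative sketch only; plain arrays stand in for weka.core.Instances.
final class LogLikelihoodSketch {

    /** Computes log(1 + exp(z)) without overflowing for large positive z. */
    static double log1pExp(double z) {
        return z > 0 ? z + Math.log1p(Math.exp(-z)) : Math.log1p(Math.exp(z));
    }

    /**
     * Log-likelihood of a logistic model with labels y[i] in {-1, +1}:
     * the sum over i of -log(1 + exp(-y[i] * (betas . x[i]))).
     */
    static double logLikelihood(double[] betas, double[][] x, double[] y) {
        double total = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < betas.length; j++) {
                dot += betas[j] * x[i][j];
            }
            total -= log1pExp(-y[i] * dot);
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {1.0, -2.0}};
        double[] y = {1.0, -1.0};
        // With all-zero coefficients every instance contributes -log(2).
        System.out.println(logLikelihood(new double[] {0.0, 0.0}, x, y));
        // Coefficients that separate the two instances give a higher value.
        System.out.println(logLikelihood(new double[] {0.0, 1.0}, x, y));
    }
}
```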
public java.util.Enumeration listOptions()
listOptions
in interface OptionHandler
listOptions
in class Classifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-D Show Debugging Output
-P <integer> Distribution of the Prior (1=Gaussian, 2=Laplacian) (default: 1=Gaussian)
-H <integer> Hyperparameter Selection Method (1=Norm-based, 2=CV-based, 3=specific value) (default: 1=Norm-based)
-V <double> Specified Hyperparameter Value (use in conjunction with -H 3) (default: 0.27)
-R <string> Hyperparameter Range (use in conjunction with -H 2) (format: R:start-end,multiplier OR L:val(1), val(2), ..., val(n)) (default: R:0.01-316,3.16)
-Tl <double> Tolerance Value (default: 0.0005)
-S <double> Threshold Value (default: 0.5)
-F <integer> Number Of Folds (use in conjunction with -H 2) (default: 2)
-I <integer> Max Number of Iterations (default: 100)
-N Normalize the data
-seed <number> Seed for randomizing instances order in CV-based hyperparameter selection (default: 1)
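To make the `-R` format concrete, here is a minimal sketch of how a range string might be expanded into candidate hyperparameter values. It assumes the `R:` form is a geometric sequence (start, start*multiplier, ... up to end), which is a reading of the option text and is not confirmed by this page; the class `HyperparameterRangeSketch` and its `expand` method are hypothetical and not part of the Weka API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and range semantics are assumptions.
final class HyperparameterRangeSketch {

    /** Expands "R:start-end,multiplier" or "L:v1,v2,...,vn" into candidate values. */
    static List<Double> expand(String spec) {
        String[] kindAndBody = spec.trim().split(":", 2);
        if (kindAndBody.length != 2) {
            throw new IllegalArgumentException("Expected R:... or L:..., got: " + spec);
        }
        String kind = kindAndBody[0].trim();
        String body = kindAndBody[1].trim();
        List<Double> values = new ArrayList<>();

        if (kind.equals("R")) {
            String[] rangeAndMultiplier = body.split(",");
            if (rangeAndMultiplier.length != 2) {
                throw new IllegalArgumentException("Expected start-end,multiplier: " + body);
            }
            // Naive split on '-': negative bounds and exponents such as 1e-3 are not handled.
            String[] bounds = rangeAndMultiplier[0].split("-");
            if (bounds.length != 2) {
                throw new IllegalArgumentException("Expected start-end: " + rangeAndMultiplier[0]);
            }
            double start = Double.parseDouble(bounds[0].trim());
            double end = Double.parseDouble(bounds[1].trim());
            double multiplier = Double.parseDouble(rangeAndMultiplier[1].trim());
            if (start <= 0.0 || multiplier <= 1.0) {
                // Otherwise the geometric sequence would never pass the end value.
                throw new IllegalArgumentException("Need start > 0 and multiplier > 1");
            }
            // Assumed semantics: multiply by the multiplier until the end is exceeded.
            for (double v = start; v <= end; v *= multiplier) {
                values.add(v);
            }
        } else if (kind.equals("L")) {
            for (String token : body.split(",")) {
                values.add(Double.parseDouble(token.trim()));
            }
        } else {
            throw new IllegalArgumentException("Unknown range type: " + kind);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(expand("R:1-100,10"));   // prints [1.0, 10.0, 100.0]
        System.out.println(expand("L:0.1, 1, 10")); // prints [0.1, 1.0, 10.0]
    }
}
```

Under this reading, the default `R:0.01-316,3.16` steps from 0.01 towards 316 in roughly half-decade increments, since 3.16 is close to the square root of 10.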
setOptions
in interface OptionHandler
setOptions
in class Classifier
options
- the list of options as an array of strings
java.lang.Exception
- if an option is not supported
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class Classifier
public static void main(java.lang.String[] argv)
argv
- the options
public java.lang.String debugTipText()
debugTipText
in class Classifier
public void setDebug(boolean debugMode)
setDebug
in class Classifier
debugMode
- true if debug output should be printed
public java.lang.String hyperparameterSelectionTipText()
public SelectedTag getHyperparameterSelection()
public void setHyperparameterSelection(SelectedTag newMethod)
newMethod
- the method used to set the hyperparameter
public java.lang.String priorClassTipText()
public void setPriorClass(SelectedTag newMethod)
newMethod
- the type of prior to use.
public SelectedTag getPriorClass()
public java.lang.String thresholdTipText()
public double getThreshold()
public void setThreshold(double threshold)
threshold
- the threshold to use
public java.lang.String toleranceTipText()
public double getTolerance()
public void setTolerance(double tolerance)
tolerance
- the tolerance value to use
public java.lang.String hyperparameterValueTipText()
public double getHyperparameterValue()
public void setHyperparameterValue(double hyperparameterValue)
hyperparameterValue
- the value of the hyperparameter
public java.lang.String numFoldsTipText()
public int getNumFolds()
public void setNumFolds(int numFolds)
numFolds
- number of folds to select
public java.lang.String seedTipText()
public void setSeed(int seed)
seed
- the seed to use
public int getSeed()
public java.lang.String maxIterationsTipText()
public int getMaxIterations()
public void setMaxIterations(int maxIterations)
maxIterations
- maximum number of iterations
public java.lang.String normalizeDataTipText()
public boolean isNormalizeData()
public void setNormalizeData(boolean normalizeData)
normalizeData
- true if data is to be normalized
public java.lang.String hyperparameterRangeTipText()
public java.lang.String getHyperparameterRange()
public void setHyperparameterRange(java.lang.String hyperparameterRange)
hyperparameterRange
- the range of hyperparameter values
public boolean isDebug()
public java.lang.String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class Classifier