public class FPGrowth extends AbstractAssociator implements AssociationRulesProducer, OptionHandler, TechnicalInformationHandler
@inproceedings{Han2000,
   author = {J. Han and J. Pei and Y. Yin},
   booktitle = {Proceedings of the 2000 ACM-SIGMOD International Conference on Management of Data},
   pages = {1-12},
   title = {Mining frequent patterns without candidate generation},
   year = {2000}
}

Valid options are:
-P <attribute index of positive value> Set the index of the attribute value to consider as 'positive' for binary attributes in normal dense instances. Index 2 is always used for sparse instances. (default = 2)
-I <max items> The maximum number of items to include in large item sets (and rules). (default = -1, i.e. no limit.)
-N <require number of rules> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum metric score of a rule. (default = 0.9)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-S Find all rules that meet the lower bound on minimum support and the minimum metric constraint. Turning this mode on will disable the iterative support reduction procedure to find the specified number of rules.
-transactions <comma separated list of attribute names> Only consider transactions that contain these items (default = no restriction)
-rules <comma separated list of attribute names> Only print rules that contain these items. (default = no restriction)
-use-or Use OR instead of AND for must contain list(s). Use in conjunction with -transactions and/or -rules
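The -T option ranks rules by confidence, lift, leverage or conviction. As a rough illustration, the sketch below computes the four metrics from raw transaction counts using their common textbook definitions. The class `RuleMetrics` and its method names are hypothetical and not part of Weka, and Weka's own implementation may treat edge cases (such as a confidence of exactly 1) differently.

```java
/** Hypothetical helper, not part of Weka: rule metrics for A => B from raw counts. */
final class RuleMetrics {
    // both        = transactions containing A and B
    // premise     = transactions containing A
    // consequence = transactions containing B
    // total       = all transactions

    static double confidence(int both, int premise) {
        return (double) both / premise;
    }

    static double lift(int both, int premise, int consequence, int total) {
        // Confidence divided by the baseline probability of the consequence.
        return confidence(both, premise) / ((double) consequence / total);
    }

    static double leverage(int both, int premise, int consequence, int total) {
        // Observed joint probability minus the joint probability expected under independence.
        return (double) both / total
            - ((double) premise / total) * ((double) consequence / total);
    }

    static double conviction(int both, int premise, int consequence, int total) {
        // P(not B) / (1 - confidence); evaluates to Infinity when confidence is 1.
        double pNotB = 1.0 - (double) consequence / total;
        return pNotB / (1.0 - confidence(both, premise));
    }

    public static void main(String[] args) {
        // 100 transactions: 40 contain A, 50 contain B, 30 contain both.
        System.out.println("confidence = " + confidence(30, 40));         // 0.75
        System.out.println("lift       = " + lift(30, 40, 50, 100));       // 1.5
        System.out.println("leverage   = " + leverage(30, 40, 50, 100));   // approximately 0.1
        System.out.println("conviction = " + conviction(30, 40, 50, 100)); // 2.0
    }
}
```

With the default metric (confidence) and the default -C threshold of 0.9, the example rule above would be rejected, since its confidence is only 0.75.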
Constructor and Description |
---|
FPGrowth() - Construct a new FPGrowth object. |
Modifier and Type | Method and Description |
---|---|
void | buildAssociations(Instances data) - Generates all large item sets with a minimum support and, from these, all association rules that meet a minimum metric score. |
boolean | canProduceRules() - Returns true if this AssociationRulesProducer can actually produce rules. |
java.lang.String | deltaTipText() - Returns the tip text for this property. |
java.lang.String | findAllRulesForSupportLevelTipText() - Tip text for this property suitable for displaying in the GUI. |
static java.util.List<AssociationRule> | generateRulesBruteForce(weka.associations.FPGrowth.FrequentItemSets largeItemSets, DefaultAssociationRule.METRIC_TYPE metricToUse, double metricThreshold, int upperBoundMinSuppAsInstances, int lowerBoundMinSuppAsInstances, int totalTransactions) - Generate all association rules, from the supplied frequent item sets, that meet a given minimum metric threshold. |
AssociationRules | getAssociationRules() - Gets the list of mined association rules. |
Capabilities | getCapabilities() - Returns default capabilities of the associator. |
double | getDelta() - Get the value of delta. |
boolean | getFindAllRulesForSupportLevel() - Get whether all rules meeting the lower bound on min support and the minimum metric threshold are to be found. |
double | getLowerBoundMinSupport() - Get the value of lowerBoundMinSupport. |
int | getMaxNumberOfItems() - Gets the maximum number of items to be included in large item sets. |
SelectedTag | getMetricType() - Get the metric type to use. |
double | getMinMetric() - Get the minimum metric score (minConfidence). |
int | getNumRulesToFind() - Get the number of rules to find. |
java.lang.String[] | getOptions() - Gets the current settings of the associator. |
int | getPositiveIndex() - Get the index of the attribute value to consider as positive for binary attributes in normal dense instances. |
java.lang.String | getRevision() - Returns the revision string. |
java.lang.String[] | getRuleMetricNames() - Gets a list of the names of the metrics output for each rule. |
java.lang.String | getRulesMustContain() - Get the comma separated list of items that rules must contain in order to be output. |
TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
java.lang.String | getTransactionsMustContain() - Gets the comma separated list of items that transactions must contain in order to be considered for large item sets and rules. |
double | getUpperBoundMinSupport() - Get the value of upperBoundMinSupport. |
boolean | getUseORForMustContainList() - Gets whether OR is to be used rather than AND when considering must contain lists. |
java.lang.String | globalInfo() - Returns a string describing this associator. |
java.lang.String | graph(weka.associations.FPGrowth.FPTreeRoot tree) - Assemble a dot graph representation of the FP-tree. |
java.util.Enumeration<Option> | listOptions() - Returns an enumeration describing the available options. |
java.lang.String | lowerBoundMinSupportTipText() - Returns the tip text for this property. |
static void | main(java.lang.String[] args) - Main method. |
java.lang.String | maxNumberOfItemsTipText() - Tip text for this property suitable for displaying in the GUI. |
java.lang.String | metricTypeTipText() - Tip text for this property suitable for displaying in the GUI. |
java.lang.String | minMetricTipText() - Returns the tip text for this property. |
java.lang.String | numRulesToFindTipText() - Tip text for this property suitable for displaying in the GUI. |
java.lang.String | positiveIndexTipText() - Tip text for this property suitable for displaying in the GUI. |
static java.util.List<AssociationRule> | pruneRules(java.util.List<AssociationRule> rulesToPrune, java.util.ArrayList<Item> itemsToConsider, boolean useOr) |
void | resetOptions() - Reset all options to their default values. |
java.lang.String | rulesMustContainTipText() - Returns the tip text for this property. |
void | setDelta(double v) - Set the value of delta. |
void | setFindAllRulesForSupportLevel(boolean s) - If true, turn off the iterative support reduction method of finding the requested number of rules that meet the minimum support and metric thresholds, and just return all the rules that meet the lower bound on minimum support and the minimum metric. |
void | setLowerBoundMinSupport(double v) - Set the value of lowerBoundMinSupport. |
void | setMaxNumberOfItems(int max) - Set the maximum number of items to include in large item sets. |
void | setMetricType(SelectedTag d) - Set the metric type to use. |
void | setMinMetric(double v) - Set the minimum metric score (minConfidence). |
void | setNumRulesToFind(int numR) - Set the desired number of rules to find. |
void | setOffDiskReportingFrequency(int freq) - Set how often to report progress when the data is being read incrementally off the disk rather than loaded into memory. |
void | setOptions(java.lang.String[] options) - Parses a given list of options. |
void | setPositiveIndex(int index) - Set the index of the attribute value to consider as positive for binary attributes in normal dense instances. |
void | setRulesMustContain(java.lang.String list) - Set the comma separated list of items that rules must contain in order to be output. |
void | setTransactionsMustContain(java.lang.String list) - Set the comma separated list of items that transactions must contain in order to be considered for large item sets and rules. |
void | setUpperBoundMinSupport(double v) - Set the value of upperBoundMinSupport. |
void | setUseORForMustContainList(boolean b) - Set whether to use OR rather than AND when considering must contain lists. |
java.lang.String | toString() - Output the association rules. |
java.lang.String | transactionsMustContainTipText() - Returns the tip text for this property. |
java.lang.String | upperBoundMinSupportTipText() - Returns the tip text for this property. |
java.lang.String | useORForMustContainListTipText() - Returns the tip text for this property. |
doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, makeCopy, postExecution, preExecution, run, runAssociator, setDoNotCheckCapabilities
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
makeCopy
public static java.util.List<AssociationRule> generateRulesBruteForce(weka.associations.FPGrowth.FrequentItemSets largeItemSets, DefaultAssociationRule.METRIC_TYPE metricToUse, double metricThreshold, int upperBoundMinSuppAsInstances, int lowerBoundMinSuppAsInstances, int totalTransactions)
largeItemSets - the set of frequent item sets
metricToUse - the metric to use
metricThreshold - the threshold value that a rule must meet
upperBoundMinSuppAsInstances - the upper bound on the support in order to accept the rule
lowerBoundMinSuppAsInstances - the lower bound on the support in order to accept the rule
totalTransactions - the total number of transactions in the data
public static java.util.List<AssociationRule> pruneRules(java.util.List<AssociationRule> rulesToPrune, java.util.ArrayList<Item> itemsToConsider, boolean useOr)
public Capabilities getCapabilities()
getCapabilities
in interface Associator
getCapabilities
in interface CapabilitiesHandler
getCapabilities
in class AbstractAssociator
Capabilities
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface TechnicalInformationHandler
public void resetOptions()
public java.lang.String positiveIndexTipText()
public void setPositiveIndex(int index)
index - the index to use for positive values in binary attributes.
public int getPositiveIndex()
public void setNumRulesToFind(int numR)
numR - the number of rules to find.
public int getNumRulesToFind()
public java.lang.String numRulesToFindTipText()
public void setMetricType(SelectedTag d)
d - the metric type
public void setMaxNumberOfItems(int max)
max - the maximum number of items to include in large item sets.
public int getMaxNumberOfItems()
public java.lang.String maxNumberOfItemsTipText()
public SelectedTag getMetricType()
public java.lang.String metricTypeTipText()
public java.lang.String minMetricTipText()
public double getMinMetric()
public void setMinMetric(double v)
v - Value to assign to minConfidence.
public java.lang.String transactionsMustContainTipText()
public void setTransactionsMustContain(java.lang.String list)
list - a comma separated list of items (empty string indicates no restriction on the transactions).
public java.lang.String getTransactionsMustContain()
public java.lang.String rulesMustContainTipText()
public void setRulesMustContain(java.lang.String list)
list - a comma separated list of items (empty string indicates no restriction on the rules).
public java.lang.String getRulesMustContain()
public java.lang.String useORForMustContainListTipText()
public void setUseORForMustContainList(boolean b)
b - true if OR should be used instead of AND when considering transaction and rules must contain lists.
public boolean getUseORForMustContainList()
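The must contain lists and the OR/AND switch described above can be pictured with a small filter. This is only a sketch of the documented behaviour: the class `MustContainFilter` is hypothetical, it assumes items are matched by exact name, and it does not reproduce how Weka matches attribute names internally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of AND versus OR semantics for a must contain list. */
final class MustContainFilter {

    /** Parses a comma separated list into a set of trimmed item names. */
    static Set<String> parseList(String list) {
        Set<String> items = new HashSet<>();
        for (String part : list.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                items.add(name);
            }
        }
        return items;
    }

    /** Returns true if the item set satisfies the must contain list. */
    static boolean passes(Set<String> itemSet, Set<String> mustContain, boolean useOr) {
        if (mustContain.isEmpty()) {
            return true; // an empty list means no restriction
        }
        if (useOr) {
            // OR: at least one listed item has to be present.
            for (String item : mustContain) {
                if (itemSet.contains(item)) {
                    return true;
                }
            }
            return false;
        }
        // AND: every listed item has to be present.
        return itemSet.containsAll(mustContain);
    }

    public static void main(String[] args) {
        Set<String> mustContain = parseList("bread, milk");
        Set<String> transaction = new HashSet<>(Arrays.asList("bread", "eggs"));
        System.out.println(passes(transaction, mustContain, false)); // false: milk is missing
        System.out.println(passes(transaction, mustContain, true));  // true: bread is present
    }
}
```

The same predicate could be applied either to transactions (as with -transactions) or to the items of a finished rule (as with -rules).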
public java.lang.String deltaTipText()
public double getDelta()
public void setDelta(double v)
v - Value to assign to delta.
public java.lang.String lowerBoundMinSupportTipText()
public double getLowerBoundMinSupport()
public void setLowerBoundMinSupport(double v)
v - Value to assign to lowerBoundMinSupport.
public java.lang.String upperBoundMinSupportTipText()
public double getUpperBoundMinSupport()
public void setUpperBoundMinSupport(double v)
v - Value to assign to upperBoundMinSupport.
public java.lang.String findAllRulesForSupportLevelTipText()
public void setFindAllRulesForSupportLevel(boolean s)
s - true if all rules meeting the lower bound on the support and minimum metric thresholds are to be found.
public boolean getFindAllRulesForSupportLevel()
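The interplay of the upper bound, lower bound, delta and required number of rules can be sketched as a simple loop: start at the upper bound, lower the minimum support by delta while too few rules are found, and never go below the lower bound. The code below is an assumption-laden sketch of that description, not Weka's actual implementation; `SupportSchedule` and the `rulesAt` callback (a stand-in for a full mining pass) are hypothetical.

```java
import java.util.function.DoubleToIntFunction;

/** Hypothetical sketch of the iterative support reduction described by -U, -M, -D, -N and -S. */
final class SupportSchedule {

    /**
     * Returns the minimum support at which the search stops.
     *
     * @param rulesAt stand-in for a mining pass: number of rules found at a given support
     */
    static double findSupport(double upper, double lower, double delta,
                              int numRules, boolean findAll, DoubleToIntFunction rulesAt) {
        if (findAll) {
            // -S: skip the iteration and mine once at the lower bound.
            return lower;
        }
        double support = upper;
        // The small tolerance guards against floating-point drift from repeated subtraction.
        while (rulesAt.applyAsInt(support) < numRules && support - delta >= lower - 1e-9) {
            support -= delta;
        }
        return support;
    }

    public static void main(String[] args) {
        // Pretend that enough rules only appear once support drops to about 0.5.
        DoubleToIntFunction rulesAt = s -> s <= 0.5 + 1e-9 ? 12 : 3;
        double stop = findSupport(1.0, 0.1, 0.05, 10, false, rulesAt);
        System.out.printf("search stops near support %.2f%n", stop);
    }
}
```

In this sketch a smaller delta means more mining passes but a stopping point closer to the highest support that still yields the requested number of rules.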
public void setOffDiskReportingFrequency(int freq)
freq - the frequency to print progress.
public AssociationRules getAssociationRules()
getAssociationRules
in interface AssociationRulesProducer
public java.lang.String[] getRuleMetricNames()
getRuleMetricNames
in interface AssociationRulesProducer
public boolean canProduceRules()
canProduceRules
in interface AssociationRulesProducer
public java.util.Enumeration<Option> listOptions()
listOptions
in interface OptionHandler
listOptions
in class AbstractAssociator
public void setOptions(java.lang.String[] options) throws java.lang.Exception
setOptions
in interface OptionHandler
setOptions
in class AbstractAssociator
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class AbstractAssociator
public void buildAssociations(Instances data) throws java.lang.Exception
buildAssociations
in interface Associator
data - the instances to be used for generating the associations
java.lang.Exception - if rules can't be built successfully
public java.lang.String toString()
toString
in class java.lang.Object
public java.lang.String graph(weka.associations.FPGrowth.FPTreeRoot tree)
tree - the root of the FP-tree
public java.lang.String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class AbstractAssociator
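To give a feel for what a dot representation of an FP-tree, as produced by graph above, might contain, the following sketch builds a small FP-tree (a prefix tree over transactions whose items are ordered by descending frequency) and prints it in Graphviz dot syntax. Everything here is hypothetical: `FpTreeDot`, its `Node` type and the label format are illustrative only, the header table and node links of a full FP-tree are omitted, and the output is not meant to match Weka's.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: build a minimal FP-tree and render it as a dot graph. */
final class FpTreeDot {

    static final class Node {
        final String item;
        int count;
        final Map<String, Node> children = new TreeMap<>();
        Node(String item) { this.item = item; }
    }

    /** Builds the tree from transactions, keeping items that occur at least minCount times. */
    static Node build(List<List<String>> transactions, int minCount) {
        Map<String, Integer> freq = new HashMap<>();
        for (List<String> t : transactions) {
            for (String item : new HashSet<>(t)) {
                freq.merge(item, 1, Integer::sum);
            }
        }
        Node root = new Node("root");
        for (List<String> t : transactions) {
            List<String> items = new ArrayList<>();
            for (String item : new TreeSet<>(t)) { // alphabetical order, duplicates removed
                if (freq.get(item) >= minCount) {
                    items.add(item);
                }
            }
            // Stable sort by descending frequency; ties stay in alphabetical order.
            items.sort((a, b) -> freq.get(b) - freq.get(a));
            Node current = root;
            for (String item : items) {
                current = current.children.computeIfAbsent(item, Node::new);
                current.count++;
            }
        }
        return root;
    }

    /** Renders the tree in dot syntax; item names are assumed to need no escaping. */
    static String toDot(Node root) {
        StringBuilder sb = new StringBuilder("digraph FPTree {\n");
        emit(root, new int[] {0}, sb);
        return sb.append("}\n").toString();
    }

    private static int emit(Node node, int[] nextId, StringBuilder sb) {
        int id = nextId[0]++;
        sb.append("  N").append(id).append(" [label=\"")
          .append(node.item).append(" (").append(node.count).append(")\"]\n");
        for (Node child : node.children.values()) {
            int childId = emit(child, nextId, sb);
            sb.append("  N").append(id).append(" -> N").append(childId).append("\n");
        }
        return id;
    }

    public static void main(String[] args) {
        List<List<String>> data = List.of(
            List.of("a", "b"), List.of("a", "c"), List.of("a", "b", "c"), List.of("b"));
        System.out.print(toDot(build(data, 2)));
    }
}
```

The printed text can be passed to a Graphviz tool such as `dot` to draw the tree.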
public static void main(java.lang.String[] args)
args - the command-line options