@KFStep(name="SubstringLabeler", category="Tools", toolTipText="Label instances according to substring matches in String attributes The user can specify the attributes to match against and associated label to create by defining \'match\' rules. A new attribute is appended to the data to contain the label. Rules are applied in order when processing instances, and the label associated with the first matching rule is applied. Non-matching instances can either receive a missing value for the label attribute or be \'consumed\' (i.e. they are not output).", iconPath="weka/gui/knowledgeflow/icons/DefaultFilter.gif") public class SubstringLabeler extends BaseStep
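The "first matching rule wins" behaviour described in the tool tip can be illustrated with a small standalone sketch. This is not Weka's implementation: the class `FirstMatchLabeler` and its methods are hypothetical names invented for illustration, and it assumes plain case-sensitive substring matching against a single string value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical illustration of ordered substring rules; not part of Weka. */
final class FirstMatchLabeler {
    // Rules kept in insertion order: substring to look for -> label to assign.
    private final Map<String, String> rules = new LinkedHashMap<>();

    /** Appends a rule; rules added earlier take precedence over later ones. */
    FirstMatchLabeler addRule(String substring, String label) {
        rules.put(substring, label);
        return this;
    }

    /** Returns the label of the first rule whose substring occurs in value, or empty if no rule matches. */
    Optional<String> label(String value) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (value.contains(rule.getKey())) {
                return Optional.of(rule.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        FirstMatchLabeler labeler = new FirstMatchLabeler()
            .addRule("error", "problem")
            .addRule("warn", "caution");
        // Both rules match this value; the earlier rule ("error") decides the label.
        System.out.println(labeler.label("warn: disk error")); // prints Optional[problem]
        // No rule matches, so there is no label.
        System.out.println(labeler.label("all good"));         // prints Optional.empty
    }
}
```

An empty result in this sketch corresponds to a non-matching instance, which the step either outputs with a missing label or consumes.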
Constructor and Description |
---|
SubstringLabeler() |
Modifier and Type | Method and Description |
---|---|
boolean | getConsumeNonMatching(): Get whether instances that do not match any of the rules should be "consumed" rather than output with a missing value set for the new attribute. |
java.lang.String | getCustomEditorForStep(): Return the fully qualified name of a custom editor component (JComponent) to use for editing the properties of the step. |
java.util.List<java.lang.String> | getIncomingConnectionTypes(): Get a list of incoming connection types that this step can accept. |
java.lang.String | getMatchAttributeName(): Get the name of the new attribute that is created to indicate the match. |
java.lang.String | getMatchDetails(): Get the internally encoded list of match rules. |
boolean | getNominalBinary(): Get whether the new attribute created should be a nominal binary attribute rather than a numeric binary attribute. |
java.util.List<java.lang.String> | getOutgoingConnectionTypes(): Get a list of outgoing connection types that this step can produce. |
Instances | outputStructureForConnectionType(java.lang.String connectionName): If possible, get the output structure for the named connection type as a header-only set of instances. |
void | processIncoming(Data data): Process an incoming data payload (if the step accepts incoming connections). |
void | setConsumeNonMatching(boolean consume): Set whether instances that do not match any of the rules should be "consumed" rather than output with a missing value set for the new attribute. |
void | setMatchAttributeName(java.lang.String name): Set the name of the new attribute that is created to indicate the match. |
void | setMatchDetails(java.lang.String details): Set the internally encoded list of match rules. |
void | setNominalBinary(boolean nom): Set whether the new attribute created should be a nominal binary attribute rather than a numeric binary attribute. |
void | stepInit(): Initialize the step. |
Methods inherited from class BaseStep: environmentSubstitute, getDefaultSettings, getInteractiveViewers, getInteractiveViewersImpls, getName, getStepManager, globalInfo, isResourceIntensive, isStopRequested, outputStructureForConnectionType, setName, setStepIsResourceIntensive, setStepManager, setStepMustRunSingleThreaded, start, stepMustRunSingleThreaded, stop
@ProgrammaticProperty public void setMatchDetails(java.lang.String details)
details - the list of match rules
public java.lang.String getMatchDetails()
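The match-details property carries a whole list of rules in one `String`. The page does not document the encoding, so the sketch below only illustrates the general idea of round-tripping a rule list through a single string. The class `RuleCodec`, the delimiters `;;` and `=>`, and the two-field rule layout are all hypothetical and should not be read as the format Weka uses.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical codec for a rule list; the format here is NOT Weka's internal encoding. */
final class RuleCodec {
    // Made-up delimiters chosen only for this sketch.
    static final String RULE_SEP = ";;";
    static final String FIELD_SEP = "=>";

    /** Joins (substring, label) pairs into a single string, preserving rule order. */
    static String encode(List<Map.Entry<String, String>> rules) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, String> rule : rules) {
            parts.add(rule.getKey() + FIELD_SEP + rule.getValue());
        }
        return String.join(RULE_SEP, parts);
    }

    /** Splits an encoded string back into ordered (substring, label) pairs. */
    static List<Map.Entry<String, String>> decode(String details) {
        List<Map.Entry<String, String>> rules = new ArrayList<>();
        if (details.isEmpty()) {
            return rules;
        }
        for (String part : details.split(Pattern.quote(RULE_SEP))) {
            // Limit 2 keeps an empty label as an empty string rather than dropping it.
            String[] fields = part.split(Pattern.quote(FIELD_SEP), 2);
            rules.add(new SimpleImmutableEntry<>(fields[0], fields[1]));
        }
        return rules;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> rules = new ArrayList<>();
        rules.add(new SimpleImmutableEntry<>("error", "problem"));
        rules.add(new SimpleImmutableEntry<>("warn", "caution"));
        String encoded = encode(rules);
        System.out.println(encoded);                       // prints error=>problem;;warn=>caution
        System.out.println(decode(encoded).equals(rules)); // prints true
    }
}
```

A delimiter-based scheme like this one breaks if a substring or label itself contains a delimiter, which is one reason such properties are usually written by an editor component rather than by hand.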
@OptionMetadata(displayName="Make a nominal binary attribute", description="Whether to encode the new attribute as nominal when it is binary (as opposed to numeric)", displayOrder=1) public void setNominalBinary(boolean nom)
nom - true if the attribute should be a nominal binary one
public boolean getNominalBinary()
@OptionMetadata(displayName="Consume non matching instances", description="Instances that do not match any rules will be consumed, rather than being output with a missing value for the new attribute", displayOrder=2) public void setConsumeNonMatching(boolean consume)
consume - true if non matching instances should be consumed by the component.
public boolean getConsumeNonMatching()
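The effect of the consume option on a stream of rows can be sketched without Weka. The class `NonMatchingPolicy` and its `apply` method are hypothetical, a single substring rule stands in for the full rule list, and `null` stands in for a missing value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of "consume" versus "missing value"; not part of Weka. */
final class NonMatchingPolicy {
    /**
     * Pairs each row with a label. Matching rows get the label; non-matching rows
     * are dropped when consumeNonMatching is true, otherwise kept with a null label.
     */
    static List<String[]> apply(List<String> rows, String substring, String label,
                                boolean consumeNonMatching) {
        List<String[]> out = new ArrayList<>();
        for (String row : rows) {
            if (row.contains(substring)) {
                out.add(new String[] {row, label});
            } else if (!consumeNonMatching) {
                out.add(new String[] {row, null}); // null stands in for a missing value
            }
            // Otherwise the row is consumed: nothing is emitted for it.
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("spam offer", "hello", "spam again");
        System.out.println(apply(rows, "spam", "junk", false).size()); // prints 3
        System.out.println(apply(rows, "spam", "junk", true).size());  // prints 2
    }
}
```

With consumption enabled, downstream steps see fewer rows than were received, so it suits filtering; the missing-value variant preserves the row count.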
@OptionMetadata(displayName="Name of the new attribute", description="Name to give the new attribute", displayOrder=0) public void setMatchAttributeName(java.lang.String name)
name - the name of the new attribute
public java.lang.String getMatchAttributeName()
public void stepInit() throws WekaException
WekaException - if a problem occurs
public java.util.List<java.lang.String> getIncomingConnectionTypes()
public java.util.List<java.lang.String> getOutgoingConnectionTypes()
public void processIncoming(Data data) throws WekaException
processIncoming in interface BaseStepExtender
processIncoming in interface Step
processIncoming in class BaseStep
data - the data to process
WekaException - if a problem occurs
public Instances outputStructureForConnectionType(java.lang.String connectionName) throws WekaException
outputStructureForConnectionType in interface Step
outputStructureForConnectionType in class BaseStep
connectionName - the name of the connection type to get the output structure for
WekaException - if a problem occurs
public java.lang.String getCustomEditorForStep()
getCustomEditorForStep in interface Step
getCustomEditorForStep in class BaseStep