public class PairedDataHelper<P>
extends java.lang.Object
implements java.io.Serializable
A helper class that Step implementations can use when processing paired data (e.g. train and test sets). It has the concept of a primary and a secondary connection/data type, where the secondary connection/data for a given set number typically needs to be processed using a result generated from the corresponding primary connection/data. This class ensures that the secondary connection/data is only processed once the primary has completed. Users of this helper need to provide an implementation of the PairedProcessor inner interface: processPrimary() is called to process the primary data/connection (and return a result), and processSecondary() is called to deal with the secondary connection/data. The result of execution on a particular primary data set number can be retrieved by calling getIndexedPrimaryResult(), passing in the set number of the primary result to retrieve.
This class also provides an arbitrary storage mechanism for additional results beyond the primary type of result. It also takes care of invoking processing() and finished() on the client step's StepManager.

    public class MyFunkyStep extends BaseStep
      implements PairedDataHelper.PairedProcessor<MyFunkyMainResult> {
      ...
      protected PairedDataHelper<MyFunkyMainResult> m_helper;
      ...
      public void stepInit() {
        m_helper = new PairedDataHelper<MyFunkyMainResult>(this, this,
          StepManager.[CON_WHATEVER_YOUR_PRIMARY_CONNECTION_IS],
          StepManager.[CON_WHATEVER_YOUR_SECONDARY_CONNECTION_IS]);
        ...
      }

      public void processIncoming(Data data) throws WekaException {
        // delegate to our helper to handle primary/secondary synchronization
        // issues
        m_helper.process(data);
      }

      public MyFunkyMainResult processPrimary(Integer setNum, Integer maxSetNum,
        Data data, PairedDataHelper<MyFunkyMainResult> helper) throws WekaException {
        SomeDataTypeToProcess someData = data.getPrimaryPayload();
        MyFunkyMainResult processor = new MyFunkyMainResult();
        // do some processing using MyFunkyMainResult and SomeDataTypeToProcess
        ...
        // output some data to downstream steps if necessary
        ...
        return processor;
      }

      public void processSecondary(Integer setNum, Integer maxSetNum,
        Data data, PairedDataHelper<MyFunkyMainResult> helper) throws WekaException {
        SomeDataTypeToProcess someData = data.getPrimaryPayload();
        // get the MyFunkyMainResult for this set number
        MyFunkyMainResult result = helper.getIndexedPrimaryResult(setNum);
        // do some stuff with the result and the secondary data
        ...
        // output some data to downstream steps if necessary
      }
    }
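The ordering guarantee described above (secondary data for a set is held back until the primary result for that set exists) can be modelled in a few lines. This is a minimal, single-threaded sketch under stated assumptions, not Weka's implementation: the class `PairingSketch` and all of its methods are hypothetical, and plain strings stand in for the `Data` objects, connection types and `StepManager` that the real helper works with.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT part of Weka: models "secondary waits for primary".
class PairingSketch {
  // Primary results indexed by set number.
  private final Map<Integer, String> primaryResults = new HashMap<>();
  // Secondary payloads that arrived before their primary result was ready.
  private final Map<Integer, String> pendingSecondary = new HashMap<>();
  // Records the order in which processing actually happened.
  private final List<String> log = new ArrayList<>();

  void onPrimary(int setNum, String payload) {
    primaryResults.put(setNum, "model(" + payload + ")");
    log.add("primary " + setNum);
    // Release a parked secondary payload for this set, if there is one.
    String waiting = pendingSecondary.remove(setNum);
    if (waiting != null) {
      runSecondary(setNum, waiting);
    }
  }

  void onSecondary(int setNum, String payload) {
    if (primaryResults.containsKey(setNum)) {
      runSecondary(setNum, payload);
    } else {
      // Primary not done yet: park the payload instead of processing it.
      pendingSecondary.put(setNum, payload);
    }
  }

  private void runSecondary(int setNum, String payload) {
    log.add("secondary " + setNum + " using " + primaryResults.get(setNum));
  }

  List<String> log() {
    return log;
  }

  static List<String> demo() {
    PairingSketch s = new PairingSketch();
    s.onSecondary(1, "test1"); // arrives early, so it is parked
    s.onPrimary(1, "train1");  // produces the result and releases set 1
    s.onPrimary(2, "train2");
    s.onSecondary(2, "test2"); // primary already done, runs immediately
    return s.log();
  }
}
```

In the sketch, the secondary payload for set 1 arrives first but is only processed after the primary result for set 1 has been stored, which is the behaviour the helper is described as providing.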
Modifier and Type | Class and Description |
---|---|
static interface | PairedDataHelper.PairedProcessor<P>: Interface for processors of paired data to implement. |
Constructor and Description |
---|
PairedDataHelper(Step owner, PairedDataHelper.PairedProcessor processor, java.lang.String primaryConType, java.lang.String secondaryConType): Constructor |
Modifier and Type | Method and Description |
---|---|
void | addIndexedValueToNamedStore(java.lang.String storeName, java.lang.Integer index, java.lang.Object value): Adds a value to a named store with the given index. |
void | createNamedIndexedStore(java.lang.String name): Create an indexed store with a given name. |
P | getIndexedPrimaryResult(int index): Retrieve the primary result corresponding to a given set number. |
<T> T | getIndexedValueFromNamedStore(java.lang.String storeName, java.lang.Integer index): Gets an indexed value from a named store. |
boolean | isFinished(): Return true if there is no further processing to be done. |
void | process(Data data): Initiate routing and processing for a particular data object. |
void | reset(): Reset the helper. |
public PairedDataHelper(Step owner, PairedDataHelper.PairedProcessor processor, java.lang.String primaryConType, java.lang.String secondaryConType)
Parameters:
owner - the owner step
processor - the PairedProcessor implementation
primaryConType - the primary connection type
secondaryConType - the secondary connection type

public void process(Data data) throws WekaException
Parameters:
data - the data object to process
Throws:
WekaException - if a problem occurs

public P getIndexedPrimaryResult(int index)
Parameters:
index - the set number of the result to get

public void reset()

public boolean isFinished()

public void createNamedIndexedStore(java.lang.String name)
Parameters:
name - the name of the store to create

public <T> T getIndexedValueFromNamedStore(java.lang.String storeName, java.lang.Integer index)
Type Parameters:
T - the type of the value
Parameters:
storeName - the name of the store to retrieve from
index - the index of the value to get

public void addIndexedValueToNamedStore(java.lang.String storeName, java.lang.Integer index, java.lang.Object value)
Parameters:
storeName - the name of the store to add to
index - the index to associate with the value
value - the value to store
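
The three named-store methods are easiest to read together. The sketch below models them with a map of maps so that it can run without Weka on the classpath. The class `NamedIndexedStoreSketch` is hypothetical and only mirrors the method names and signatures listed above. Its handling of edge cases is an assumption made for illustration (adding to a store that was never created creates it, and a missing store or index yields null); the real class may behave differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT part of Weka: mirrors the named-store method
// signatures documented above, backed by a map of maps.
class NamedIndexedStoreSketch {
  private final Map<String, Map<Integer, Object>> stores = new HashMap<>();

  void createNamedIndexedStore(String name) {
    stores.putIfAbsent(name, new HashMap<>());
  }

  void addIndexedValueToNamedStore(String storeName, Integer index, Object value) {
    // Assumption: a store that does not exist yet is created on demand.
    stores.computeIfAbsent(storeName, k -> new HashMap<>()).put(index, value);
  }

  @SuppressWarnings("unchecked")
  <T> T getIndexedValueFromNamedStore(String storeName, Integer index) {
    Map<Integer, Object> store = stores.get(storeName);
    // Assumption: a missing store or index yields null rather than an exception.
    return store == null ? null : (T) store.get(index);
  }

  static double demo() {
    NamedIndexedStoreSketch helper = new NamedIndexedStoreSketch();
    // "trainingTime" is an illustrative store name, keyed by set number.
    helper.createNamedIndexedStore("trainingTime");
    helper.addIndexedValueToNamedStore("trainingTime", 1, 0.75);
    // The type parameter T is inferred from the assignment target.
    Double seconds = helper.getIndexedValueFromNamedStore("trainingTime", 1);
    return seconds;
  }
}
```

In a step, a store like this would typically be filled in processPrimary() and read back in processSecondary() using the same set number, alongside the primary result returned by getIndexedPrimaryResult().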