emolib.classifier
Class Classifier

java.lang.Object
  extended by emolib.util.proc.TextDataProcessor
      extended by emolib.classifier.Classifier
All Implemented Interfaces:
Configurable, DataProcessor
Direct Known Subclasses:
ARNReduced, BernoulliNB, FiveIntervalsKE, HierarchicalARNReduced, KNearestNeighbour, Logistic, LSA, MultinomialNB, NaiveBayes, NearestCentroid, OrdinalLogReg, RiskLogReg, RiskWeightedNaiveBayes, SupportVectorMachine, ThreeIntervalsKE, WekaMultinomialNB

public abstract class Classifier
extends TextDataProcessor

The Classifier abstract class defines the basic methods that any EmoLib classifier should implement in order to provide an affective label.

The Classifier establishes the boundary (interface) between the feature-wise world and the knowledge-wise world. Knowledge may be incorporated into a classifier either by hard-coding it explicitly in the classifier's definition (the expert-based heuristic approach) or by learning it automatically from training data (the data-driven machine learning approach).

Author:
Alexandre Trilla (atrilla@salle.url.edu)

Constructor Summary
Classifier()
          Main constructor of the Classifier.
 
Method Summary
 void applyClassification(TextData inputTextDataObject)
          Method to perform the classification process.
abstract  java.lang.String getCategory(FeatureBox inputFeatures)
          The function that decides the most appropriate emotional category.
 Data getData()
          Obtains the TextData from the previous module, processes it and makes it available to the rest of the text processing chain.
 java.util.ArrayList<java.lang.String> getListOfExampleCategories()
          Retrieves the list of training example categories.
 java.util.ArrayList<FeatureBox> getListOfExampleFeatures()
          Retrieves the list of training example features.
 void initialize()
          Method to initialize the Classifier.
 void inputTrainingExample(FeatureBox features, java.lang.String cat)
          Method to input training data into the classifier.
abstract  void load(java.lang.String path)
          Generic function to load a previously saved classifier.
 void newProperties(PropertySheet ps)
          This method is called when this configurable component has new data.
 void register(java.lang.String name, Registry registry)
          Register my properties.
 void resetExamples()
          Method to reset the classifier and flush the training examples.
abstract  void save(java.lang.String path)
          Generic method to save the fully fledged classifier into a given file path.
 void train()
          Method to train the classifier.
abstract  void trainingProcedure()
          Generic training procedure.
 
Methods inherited from class emolib.util.proc.TextDataProcessor
flush, getName, getPredecessor, setPredecessor, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Classifier

public Classifier()
Main constructor of the Classifier.

Method Detail

register

public void register(java.lang.String name,
                     Registry registry)
              throws PropertyException
Description copied from interface: Configurable
Register my properties. This method is called once early in the lifetime of the component, shortly after the component is constructed. The component should register any configuration properties that it needs. If this configurable extends another configurable, super.register should also be called.

Specified by:
register in interface Configurable
Overrides:
register in class TextDataProcessor
Parameters:
name - the name of the component
registry - the registry for this component
Throws:
PropertyException

newProperties

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component has new data. The component should first validate the data. If the data is bad, the component should reject it; because the method is declared void, a problem is reported through PropertyException rather than a return value. If the data is good, the component should record the data internally.

Specified by:
newProperties in interface Configurable
Overrides:
newProperties in class TextDataProcessor
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.

getData

public Data getData()
             throws DataProcessingException
Obtains the TextData from the previous module, processes it and makes it available to the rest of the text processing chain.

Specified by:
getData in interface DataProcessor
Specified by:
getData in class TextDataProcessor
Returns:
The next available Data object, or null if no Data object is available.
Throws:
DataProcessingException - If there is a processing error.

initialize

public void initialize()
Method to initialize the Classifier.

Specified by:
initialize in interface DataProcessor
Overrides:
initialize in class TextDataProcessor

applyClassification

public void applyClassification(TextData inputTextDataObject)
Method to perform the classification process. This is the method that is called externally (by the processing chain) and that wraps all the work. Since the target application of the whole system is emotional speech synthesis, the analysis is required at the document, paragraph and sentence levels. The abstract methods of this class supply the basic classification functionality it needs. Beware that the textual features are not yet considered.

Parameters:
inputTextDataObject - The TextData object to process.

getCategory

public abstract java.lang.String getCategory(FeatureBox inputFeatures)
The function that decides the most appropriate emotional category. This is required for any classifier. The classifier in question must have run its training algorithm beforehand in order to provide the required prediction.

Parameters:
inputFeatures - The input emotional features.
Returns:
The most appropriate emotional category.
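
As a rough illustration of the expert-based side of this contract, the following sketch hard-codes a valence threshold into getCategory. Nothing in it is EmoLib code: SketchFeatures is a hypothetical stand-in for FeatureBox (whose accessors are not shown on this page), HeuristicBase mirrors only the getCategory signature, and the margin parameter, the zero-centred valence scale and the category labels are all assumptions made for the example.

```java
// Hypothetical stand-in for FeatureBox; the real class and its accessors may differ.
final class SketchFeatures {
    final double valence;     // assumed scale: negative = unpleasant, positive = pleasant
    final double activation;  // carried along but unused by this simple heuristic

    SketchFeatures(double valence, double activation) {
        this.valence = valence;
        this.activation = activation;
    }
}

// Mirrors only the getCategory contract of Classifier (not the real base class).
abstract class HeuristicBase {
    abstract String getCategory(SketchFeatures inputFeatures);
}

// Expert-based heuristic: the "knowledge" is a hard-coded margin around zero valence.
final class ValenceThresholdSketch extends HeuristicBase {
    private final double margin;

    ValenceThresholdSketch(double margin) {
        this.margin = margin;
    }

    @Override
    String getCategory(SketchFeatures inputFeatures) {
        if (inputFeatures.valence > margin) {
            return "positive";
        }
        if (inputFeatures.valence < -margin) {
            return "negative";
        }
        return "neutral";
    }
}
```

A classifier of this kind needs no training examples, so it can answer getCategory immediately after construction.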

inputTrainingExample

public void inputTrainingExample(FeatureBox features,
                                 java.lang.String cat)
Method to input training data into the classifier. The training examples are fed to the classifier one by one. It is up to the classifier in question to treat them appropriately.

Parameters:
features - The input emotional features.
cat - The category of the input example.

train

public void train()
Method to train the classifier. Only trainable classifiers will implement the body of this method. Classifiers that rely on an expert knowledge base will implement an empty method.


trainingProcedure

public abstract void trainingProcedure()
Generic training procedure. It trains the classifier in question with the input training examples.
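
The training workflow described above (feed examples one by one, then train, then query) might look roughly like the sketch below. It is not EmoLib's implementation: TrainableSketch is a hypothetical base class that only mimics the method names of Classifier, plain double[] vectors stand in for FeatureBox, and the delegation from train() to trainingProcedure() is an assumption, since this page does not state how the two relate. The centroid-based learner is a generic textbook technique and should not be read as the code of the NearestCentroid subclass.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class mimicking the training-related methods of Classifier.
abstract class TrainableSketch {
    private final ArrayList<double[]> exampleFeatures = new ArrayList<>();
    private final ArrayList<String> exampleCategories = new ArrayList<>();

    // Examples arrive one at a time and are simply stored.
    void inputTrainingExample(double[] features, String cat) {
        exampleFeatures.add(features.clone());
        exampleCategories.add(cat);
    }

    ArrayList<double[]> getListOfExampleFeatures() { return exampleFeatures; }
    ArrayList<String> getListOfExampleCategories() { return exampleCategories; }

    void resetExamples() {
        exampleFeatures.clear();
        exampleCategories.clear();
    }

    // Assumption: train() delegates to the subclass-specific procedure.
    void train() { trainingProcedure(); }

    abstract void trainingProcedure();
    abstract String getCategory(double[] inputFeatures);
}

// Data-driven learner: one mean vector (centroid) per category.
// Assumes every feature vector has the same length.
final class CentroidSketch extends TrainableSketch {
    private final Map<String, double[]> centroids = new HashMap<>();

    @Override
    void trainingProcedure() {
        centroids.clear();
        Map<String, Integer> counts = new HashMap<>();
        ArrayList<double[]> feats = getListOfExampleFeatures();
        ArrayList<String> cats = getListOfExampleCategories();
        for (int i = 0; i < feats.size(); i++) {
            double[] x = feats.get(i);
            double[] sum = centroids.computeIfAbsent(cats.get(i), k -> new double[x.length]);
            for (int d = 0; d < x.length; d++) {
                sum[d] += x[d];
            }
            counts.merge(cats.get(i), 1, Integer::sum);
        }
        // Turn the per-category sums into means.
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            int n = counts.get(e.getKey());
            double[] c = e.getValue();
            for (int d = 0; d < c.length; d++) {
                c[d] /= n;
            }
        }
    }

    @Override
    String getCategory(double[] inputFeatures) {
        if (centroids.isEmpty()) {
            throw new IllegalStateException("train() must be called before getCategory()");
        }
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double[] c = e.getValue();
            double dist = 0.0;  // squared Euclidean distance to this centroid
            for (int d = 0; d < c.length; d++) {
                double diff = inputFeatures[d] - c[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

In this sketch, calling resetExamples() discards the stored examples but leaves the learned centroids in place until train() is called again; whether EmoLib classifiers behave the same way is not specified here.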


getListOfExampleFeatures

public java.util.ArrayList<FeatureBox> getListOfExampleFeatures()
Retrieves the list of training example features.

Returns:
The list of training example features.

getListOfExampleCategories

public java.util.ArrayList<java.lang.String> getListOfExampleCategories()
Retrieves the list of training example categories.

Returns:
The list of training example categories.

resetExamples

public void resetExamples()
Method to reset the classifier and flush the training examples. This method only makes sense if the classifier in question is trainable and already has some training examples.


save

public abstract void save(java.lang.String path)
Generic method to save the fully fledged classifier into a given file path. A plain text format (such as XML) is recommended for saving the classifier's configuration, since it can be read directly.

Parameters:
path - The file path to save the classifier.

load

public abstract void load(java.lang.String path)
Generic function to load a previously saved classifier. This function should be consistent with the design followed in the saving procedure.

Parameters:
path - The path of the file which contains the previously saved classifier.
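
One possible way to keep save and load consistent with each other, using a plain-text XML file as recommended, is sketched below. The class PersistableThresholdSketch and its single margin parameter are hypothetical and do not correspond to any EmoLib classifier; java.util.Properties is used only as a convenient standard-library XML writer. Because the abstract signatures on this page declare no checked exceptions, the sketch wraps I/O failures in UncheckedIOException, which is a design assumption rather than documented EmoLib behaviour.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical classifier state with a single learned or configured parameter.
final class PersistableThresholdSketch {
    private static final String KEY_MARGIN = "margin";
    private double margin;

    PersistableThresholdSketch(double margin) {
        this.margin = margin;
    }

    double getMargin() {
        return margin;
    }

    // Writes the state as a small, human-readable XML properties file.
    void save(String path) {
        Properties props = new Properties();
        props.setProperty(KEY_MARGIN, Double.toString(margin));
        try (OutputStream out = Files.newOutputStream(Paths.get(path))) {
            props.storeToXML(out, "threshold classifier sketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads back exactly the keys that save() wrote.
    void load(String path) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            props.loadFromXML(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String stored = props.getProperty(KEY_MARGIN);
        if (stored == null) {
            throw new IllegalStateException("Missing '" + KEY_MARGIN + "' in " + path);
        }
        margin = Double.parseDouble(stored);
    }
}
```

Sharing the key constant between the two methods is what keeps the loading side consistent with the design followed when saving.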