emolib.classifier.machinelearning
Class MultinomialNB

java.lang.Object
  extended by emolib.util.proc.TextDataProcessor
      extended by emolib.classifier.Classifier
          extended by emolib.classifier.machinelearning.MultinomialNB
All Implemented Interfaces:
Configurable, DataProcessor

public class MultinomialNB
extends Classifier

The MultinomialNB class is a Multinomial Naive Bayes (MNB) classifier.

It is a probabilistic generative approach that builds a language model assuming conditional independence among the features. This assumption does not hold for text data, so the probability estimates of this oversimplified model are of low quality. Even so, its classification decisions (based on Bayes' decision rule) are surprisingly good. The MNB combines efficiency (it has optimal time performance) with good accuracy.

The MultinomialNB follows the implementation described in (Manning, et al., 2008). The same term weighting schemes as the ones used in the ARN-R are considered.
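
As a rough illustration of Bayes' decision rule under the conditional independence assumption, the following is a minimal, self-contained sketch in the spirit of the textbook algorithm in (Manning, et al., 2008). It is not the emolib implementation: the class name MnbSketch, its methods, and the add-one smoothing are assumptions made for this example, and feature selection and term weighting are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative multinomial Naive Bayes; hypothetical, not part of emolib. */
final class MnbSketch {
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    private final Map<String, Integer> tokensPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Accumulates the counts of one labelled, tokenized example. */
    void addExample(String category, String[] tokens) {
        docsPerClass.merge(category, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
                termCounts.computeIfAbsent(category, c -> new HashMap<>());
        for (String t : tokens) {
            counts.merge(t, 1, Integer::sum);
            tokensPerClass.merge(category, 1, Integer::sum);
            vocabulary.add(t);
        }
    }

    /** Returns the category with the highest log-posterior, or null if untrained. */
    String classify(String[] tokens) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docsPerClass.keySet()) {
            // Log prior estimated from document counts.
            double score = Math.log((double) docsPerClass.get(category) / totalDocs);
            Map<String, Integer> counts = termCounts.get(category);
            int classTokens = tokensPerClass.getOrDefault(category, 0);
            for (String t : tokens) {
                if (!vocabulary.contains(t)) {
                    continue; // terms never seen in training are ignored
                }
                // Add-one (Laplace) smoothed conditional probability of the term.
                int c = counts.getOrDefault(t, 0);
                score += Math.log((c + 1.0) / (classTokens + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
}
```

Summing log-probabilities instead of multiplying probabilities avoids floating-point underflow on long inputs and does not change which category scores highest.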

--
(Manning, et al., 2008) Manning, C. D., Raghavan, P. and Schütze, H., "An Introduction to Information Retrieval", 2008.

Author:
Alexandre Trilla (atrilla@salle.url.edu)
See Also:
ARNReduced

Constructor Summary
MultinomialNB()
          Main constructor of this Multinomial Naive Bayes classifier.
 
Method Summary
 java.lang.String getCategory(FeatureBox inputFeatures)
          The function that decides the most appropriate emotional category.
 void load(java.lang.String path)
          Generic function to load a previously saved classifier.
 void resetExamples()
          Method to reset the classifier and flush the training examples.
 void save(java.lang.String path)
          Generic method to save the fully fledged classifier into a given file path.
 void setChi2(boolean chi, int numF)
          Set the Chi square feature selection.
 void setCOF(boolean cof)
          Method to consider bigram frequencies.
 void setEmotionDims(boolean emodims)
          Method to consider emotion dimensions.
 void setMI(boolean mi, int numF)
          Set the Mutual Information feature selection.
 void setNegation(boolean neg)
          Method to consider negations.
 void setPOS(boolean pos)
          Method to consider POS tags.
 void setStemming(boolean stems)
          Method to consider stems.
 void setSynonyms(boolean syns)
          Method to consider synonyms.
 void setTF(boolean tf, int numF)
          Set the Term Frequency feature selection.
 void trainingProcedure()
          Training method based on the algorithm in (Manning, et al., 2008).
 
Methods inherited from class emolib.classifier.Classifier
applyClassification, getData, getListOfExampleCategories, getListOfExampleFeatures, initialize, inputTrainingExample, newProperties, register, train
 
Methods inherited from class emolib.util.proc.TextDataProcessor
flush, getName, getPredecessor, setPredecessor, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

MultinomialNB

public MultinomialNB()
Main constructor of this Multinomial Naive Bayes classifier.

Method Detail

getCategory

public java.lang.String getCategory(FeatureBox inputFeatures)
Description copied from class: Classifier
The function that decides the most appropriate emotional category. This is required for any classifier. The classifier in question has to previously run any training algorithm in order to provide the required prediction.

Specified by:
getCategory in class Classifier
Parameters:
inputFeatures - The input emotional features.
Returns:
The most appropriate emotional category.

trainingProcedure

public void trainingProcedure()
Training method based on the algorithm in (Manning, et al., 2008). Unlike that algorithm, document counts are approximated by the sum of term frequencies.

Specified by:
trainingProcedure in class Classifier
See Also:
Classifier.trainingProcedure()
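
One possible reading of the approximation noted for trainingProcedure is that each category prior is estimated from the total term frequency mass of the category rather than from its number of training documents. The sketch below illustrates that reading only. The class PriorSketch, the method priorsFromTermFrequencies, and the nested-map input layout are hypothetical, and the actual emolib code may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative prior estimation; hypothetical, not part of emolib. */
final class PriorSketch {
    /**
     * Estimates P(category) as the share of all term occurrences that
     * belong to the category, in place of the share of documents.
     *
     * @param termFreqs category -> (term -> frequency in that category)
     */
    static Map<String, Double> priorsFromTermFrequencies(
            Map<String, Map<String, Integer>> termFreqs) {
        Map<String, Long> mass = new HashMap<>();
        long total = 0;
        for (Map.Entry<String, Map<String, Integer>> e : termFreqs.entrySet()) {
            long sum = 0;
            for (int f : e.getValue().values()) {
                sum += f; // sum of term frequencies for this category
            }
            mass.put(e.getKey(), sum);
            total += sum;
        }
        Map<String, Double> priors = new HashMap<>();
        for (Map.Entry<String, Long> e : mass.entrySet()) {
            // Guard against an empty training set.
            priors.put(e.getKey(), total == 0 ? 0.0 : (double) e.getValue() / total);
        }
        return priors;
    }
}
```

Under this reading, categories whose training examples are longer receive larger priors than they would from document counts, which matters when example lengths vary widely across categories.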

setMI

public void setMI(boolean mi,
                  int numF)
Set the Mutual Information feature selection.

Parameters:
mi - The Mutual Information flag.
numF - The number of relevant features desired.

setChi2

public void setChi2(boolean chi,
                    int numF)
Set the Chi square feature selection.

Parameters:
chi - The Chi2 flag.
numF - The number of relevant features desired.

setTF

public void setTF(boolean tf,
                  int numF)
Set the Term Frequency feature selection.

Parameters:
tf - The Term Frequency flag.
numF - The number of relevant features desired.

setCOF

public void setCOF(boolean cof)
Method to consider bigram frequencies.

Parameters:
cof - The COF flag.

setPOS

public void setPOS(boolean pos)
Method to consider POS tags.

Parameters:
pos - The POS flag.

setStemming

public void setStemming(boolean stems)
Method to consider stems.

Parameters:
stems - The stemming flag.

setSynonyms

public void setSynonyms(boolean syns)
Method to consider synonyms.

Parameters:
syns - The synonyms flag.

setEmotionDims

public void setEmotionDims(boolean emodims)
Method to consider emotion dimensions.

Parameters:
emodims - The emotion dimensions flag.

setNegation

public void setNegation(boolean neg)
Method to consider negations.

Parameters:
neg - The negation flag.

save

public void save(java.lang.String path)
Description copied from class: Classifier
Generic method to save the fully fledged classifier into a given file path. It is recommended to use a plain text format (such as XML) to save the classifier's configuration, since it is directly human-readable.

Specified by:
save in class Classifier
Parameters:
path - The file path to save the classifier.

load

public void load(java.lang.String path)
Description copied from class: Classifier
Generic function to load a previously saved classifier. This function should be consistent with the design followed in the saving procedure.

Specified by:
load in class Classifier
Parameters:
path - The path of the file which contains the previously saved classifier.

resetExamples

public void resetExamples()
Description copied from class: Classifier
Method to reset the classifier and flush the training examples. This method only makes sense if the classifier in question is trainable and already has some training examples.

Overrides:
resetExamples in class Classifier