emolib.stemmer.snowball
Class GenericSnowballStemmer

java.lang.Object
  extended by emolib.util.proc.TextDataProcessor
      extended by emolib.stemmer.Stemmer
          extended by emolib.stemmer.snowball.GenericSnowballStemmer
All Implemented Interfaces:
Configurable, DataProcessor

public class GenericSnowballStemmer
extends Stemmer

The GenericSnowballStemmer class performs the stemming process using the Snowball library.

This class accepts two parameters through the Configuration Manager: "language", which selects the Snowball stemming algorithm and thereby the language in use, and "iterations", which sets the number of stemming passes performed on each incoming word. Each pass removes one suffix, starting at the end of the word and working towards the beginning.
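
The effect of the "iterations" parameter can be illustrated with a minimal, self-contained sketch. It uses neither the Snowball library nor any EmoLib class: the class IterativeStemSketch, its toy suffix list, and the minimum stem length are hypothetical stand-ins chosen for illustration, and the real result depends on the algorithm selected by "language".

```java
class IterativeStemSketch {
    // Hypothetical toy suffix list; a real Snowball algorithm applies
    // language-specific rules rather than a flat list like this one.
    private static final String[] SUFFIXES = {"ness", "ful", "ing", "ly", "s"};

    /** Strips at most one suffix from the end of the word. */
    static String stripOneSuffix(String word) {
        for (String suffix : SUFFIXES) {
            // Assumed guard: keep at least three characters of the stem.
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Applies the single-suffix step up to `iterations` times, stopping early if nothing changes. */
    static String stem(String word, int iterations) {
        String current = word;
        for (int i = 0; i < iterations; i++) {
            String next = stripOneSuffix(current);
            if (next.equals(current)) {
                break; // no suffix left to remove
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(stem("hopefulness", 1)); // prints "hopeful"
        System.out.println(stem("hopefulness", 2)); // prints "hope"
    }
}
```

With this toy rule set, one iteration removes only the last suffix, while two iterations also remove the suffix before it; further iterations leave the word unchanged.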

Only words that may carry affective content are stemmed. This choice reflects the indexing goal that stemming pursues in Information Retrieval (IR). For more details, see the article "Snowball: A language for stemming algorithms".

Author:
Alexandre Trilla (atrilla@salle.url.edu)

Field Summary
static java.lang.String PROP_ITERATIONS
          The name of the property indicating the number of stemming iterations.
static java.lang.String PROP_LANGUAGE
          The name of the property indicating the language of this Stemmer.
 
Constructor Summary
GenericSnowballStemmer()
          Main constructor of the GenericSnowballStemmer.
 
Method Summary
 void applyStemming(TextData inputTextDataObject)
          Method to perform the stemming process.
 void initialize()
          Method to initialize the GenericSnowballStemmer.
 void newProperties(PropertySheet ps)
          This method is called when this configurable component has new data.
 void register(java.lang.String name, Registry registry)
          Register my properties.
 
Methods inherited from class emolib.stemmer.Stemmer
getData
 
Methods inherited from class emolib.util.proc.TextDataProcessor
flush, getName, getPredecessor, setPredecessor, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PROP_LANGUAGE

public static final java.lang.String PROP_LANGUAGE
The name of the property indicating the language of this Stemmer.

See Also:
Constant Field Values

PROP_ITERATIONS

public static final java.lang.String PROP_ITERATIONS
The name of the property indicating the number of stemming iterations.

See Also:
Constant Field Values

Constructor Detail

GenericSnowballStemmer

public GenericSnowballStemmer()
Main constructor of the GenericSnowballStemmer.

Method Detail

register

public void register(java.lang.String name,
                     Registry registry)
              throws PropertyException
Description copied from interface: Configurable
Registers the properties of this component. This method is called once, early in the life of the component, shortly after the component is constructed. The component should register every configuration property that it needs. If this configurable extends another configurable, super.register should also be called.

Specified by:
register in interface Configurable
Overrides:
register in class Stemmer
Parameters:
name - the name of the component
registry - the registry for this component
Throws:
PropertyException

newProperties

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component has new data. The component should first validate the data. If the data is bad, the component should signal the problem with a PropertyException. If the data is good, the component should record the data internally.

Specified by:
newProperties in interface Configurable
Overrides:
newProperties in class Stemmer
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.
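
The validate-then-record pattern described for newProperties can be sketched as follows. This is only an illustration under stated assumptions, not EmoLib's implementation: a plain Map stands in for PropertySheet, IllegalArgumentException stands in for PropertyException, and the class StemmerConfigSketch, the key strings "language" and "iterations", the set of accepted languages, and the default of one iteration are all hypothetical.

```java
import java.util.Map;
import java.util.Set;

class StemmerConfigSketch {
    // Assumed language names, for illustration only.
    private static final Set<String> KNOWN_LANGUAGES = Set.of("english", "spanish");

    private String language;
    private int iterations;

    /** Validates every new value first and records them only if all are good. */
    void newProperties(Map<String, String> props) {
        String lang = props.get("language");
        if (lang == null || !KNOWN_LANGUAGES.contains(lang)) {
            throw new IllegalArgumentException("unsupported language: " + lang);
        }

        int iters;
        try {
            // Assumed default of one iteration when the property is absent.
            iters = Integer.parseInt(props.getOrDefault("iterations", "1"));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("iterations must be an integer", e);
        }
        if (iters < 1) {
            throw new IllegalArgumentException("iterations must be at least 1");
        }

        // All checks passed: record the data internally.
        this.language = lang;
        this.iterations = iters;
    }

    String getLanguage() {
        return language;
    }

    int getIterations() {
        return iterations;
    }
}
```

Because nothing is assigned until every check has passed, a rejected property set leaves the previously recorded configuration untouched.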

initialize

public void initialize()
Method to initialize the GenericSnowballStemmer.

Specified by:
initialize in interface DataProcessor
Overrides:
initialize in class Stemmer

applyStemming

public void applyStemming(TextData inputTextDataObject)
Method to perform the stemming process.

Specified by:
applyStemming in class Stemmer
Parameters:
inputTextDataObject - The TextData object to process.