java.lang.Object
  emolib.util.proc.TextDataProcessor
    emolib.pos.POSTagger
public abstract class POSTagger
The POSTagger abstract class defines the general structure of the Part-Of-Speech (POS) tagging process, which determines the grammatical function of each word in a sentence.
Nouns, adjectives and verbs match the same regular expression, so a simple tokeniser cannot infer their function. These word classes must be disambiguated according to the context in which they appear.
In order to overcome this problem, two approaches can be presented:
Any well-coded POS tagger should label each incoming word as a noun, verb or adjective according to the results obtained.
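The idea of disambiguating by context can be illustrated with a minimal, self-contained sketch. It does not use the emolib API: `SketchWord`, `SketchTagger` and `ContextRuleTagger` are hypothetical stand-ins that only mirror the shape of an abstract tagger with an `applyPOSTagging` method, and the two context rules (a word after a determiner is a noun, a word after "to" is a verb) are toy assumptions, not the approach of any real emolib tagger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a word carrying a POS label; not part of emolib.
final class SketchWord {
    final String text;
    String tag = "UNKNOWN"; // NOUN, VERB, OTHER or UNKNOWN in this sketch

    SketchWord(String text) {
        this.text = text;
    }
}

// Hypothetical stand-in that mirrors the shape of the abstract POSTagger.
abstract class SketchTagger {
    abstract void applyPOSTagging(List<SketchWord> sentence);
}

// Toy context-based tagger: the label of a word depends on the word before it.
final class ContextRuleTagger extends SketchTagger {
    private static final List<String> DETERMINERS = Arrays.asList("the", "a", "an");

    @Override
    void applyPOSTagging(List<SketchWord> sentence) {
        String previous = "";
        for (SketchWord word : sentence) {
            String lower = word.text.toLowerCase();
            if (DETERMINERS.contains(lower) || lower.equals("to")) {
                word.tag = "OTHER";      // function words are not disambiguated here
            } else if (DETERMINERS.contains(previous)) {
                word.tag = "NOUN";       // assumed rule: determiner + word => noun
            } else if (previous.equals("to")) {
                word.tag = "VERB";       // assumed rule: "to" + word => verb
            }
            previous = lower;
        }
    }

    // Convenience helper: split on whitespace, tag, and return the labels in order.
    static List<String> tags(String sentence) {
        List<SketchWord> words = new ArrayList<>();
        for (String token : sentence.trim().split("\\s+")) {
            words.add(new SketchWord(token));
        }
        new ContextRuleTagger().applyPOSTagging(words);
        List<String> result = new ArrayList<>();
        for (SketchWord word : words) {
            result.add(word.tag);
        }
        return result;
    }
}
```

With these rules, `ContextRuleTagger.tags("the run")` labels "run" as a noun while `ContextRuleTagger.tags("to run")` labels the same token as a verb, which is the kind of ambiguity a tokeniser alone cannot resolve.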
| Constructor Summary | |
|---|---|
| POSTagger() | Main constructor of the POSTagger. |
| Method Summary | | |
|---|---|---|
| abstract void | applyPOSTagging(TextData inputTextDataObject) | Method to perform the POS tagging process. |
| Data | getData() | Obtains the TextData from the previous module, processes it and makes it available to the rest of the text processing chain. |
| void | initialize() | Method to initialize the POSTagger. |
| void | newProperties(PropertySheet ps) | This method is called when this configurable component has new data. |
| void | register(java.lang.String name, Registry registry) | Register my properties. |
| Methods inherited from class emolib.util.proc.TextDataProcessor | 
|---|
| flush, getName, getPredecessor, setPredecessor, toString | 
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait | 
| Constructor Detail | 
|---|
public POSTagger()
| Method Detail | 
|---|
public void register(java.lang.String name,
                     Registry registry)
              throws PropertyException
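Description copied from interface: Configurable
Register my properties.
Specified by: register in interface Configurable
Overrides: register in class TextDataProcessor
Parameters:
name - the name of the component
registry - the registry for this component
Throws:
PropertyException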
public void newProperties(PropertySheet ps)
                   throws PropertyException
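Description copied from interface: Configurable
This method is called when this configurable component has new data.
Specified by: newProperties in interface Configurable
Overrides: newProperties in class TextDataProcessor
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.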
public Data getData()
             throws DataProcessingException
Specified by: getData in interface DataProcessor
Overrides: getData in class TextDataProcessor
Throws:
DataProcessingException - If there is a processing error.
public void initialize()
Specified by: initialize in interface DataProcessor
Overrides: initialize in class TextDataProcessor
public abstract void applyPOSTagging(TextData inputTextDataObject)
Parameters:
inputTextDataObject - The TextData object to process.
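The relationship between getData() and applyPOSTagging(...) described above (fetch the data from the previous module, tag it, pass it on) can be sketched as a template method. The code below is a self-contained illustration under stated assumptions, not the emolib implementation: `ChainData`, `ChainProcessor`, `FixedSource`, `ChainTagger` and `AllNounTagger` are hypothetical stand-ins, and whether the real getData() is structured this way is not confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the data passed along the chain; not emolib's TextData.
final class ChainData {
    final List<String> words = new ArrayList<>();
    final List<String> tags = new ArrayList<>();
}

// Hypothetical stand-in for a chain module that knows its predecessor.
abstract class ChainProcessor {
    private ChainProcessor predecessor;

    void setPredecessor(ChainProcessor predecessor) {
        this.predecessor = predecessor;
    }

    ChainProcessor getPredecessor() {
        return predecessor;
    }

    abstract ChainData getData();
}

// Hypothetical first module of the chain: emits a fixed list of words.
final class FixedSource extends ChainProcessor {
    private final String[] words;

    FixedSource(String... words) {
        this.words = words;
    }

    @Override
    ChainData getData() {
        ChainData data = new ChainData();
        for (String word : words) {
            data.words.add(word);
        }
        return data;
    }
}

// Template method: getData() pulls from the predecessor, then delegates the tagging.
abstract class ChainTagger extends ChainProcessor {
    @Override
    final ChainData getData() {
        ChainData data = getPredecessor().getData(); // obtain data from the previous module
        if (data != null) {
            applyPOSTagging(data);                   // subclass-specific tagging step
        }
        return data;                                 // hand the result to the next module
    }

    abstract void applyPOSTagging(ChainData data);
}

// Deliberately trivial concrete tagger (assumption): labels every word as a noun.
final class AllNounTagger extends ChainTagger {
    @Override
    void applyPOSTagging(ChainData data) {
        data.tags.clear();
        for (int i = 0; i < data.words.size(); i++) {
            data.tags.add("NOUN");
        }
    }
}
```

A concrete tagger in this sketch only has to implement the tagging step; wiring is done by calling `setPredecessor(...)` on the tagger and then `getData()` on the last module of the chain.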