emolib.wsd
Class WordSenseDisambiguator

java.lang.Object
  extended by emolib.util.proc.TextDataProcessor
      extended by emolib.wsd.WordSenseDisambiguator
All Implemented Interfaces:
Configurable, DataProcessor
Direct Known Subclasses:
OpenThesWSD, SimLibWSD

public abstract class WordSenseDisambiguator
extends TextDataProcessor

The WordSenseDisambiguator abstract class defines the general structure for performing Word Sense Disambiguation (WSD), the process that determines the correct sense of a polysemous word in its context.

A rich variety of techniques have been researched, from dictionary-based methods that use the knowledge encoded in lexical resources, to supervised machine learning methods in which a classifier is trained for each distinct word on a corpus of manually sense-annotated examples, to completely unsupervised methods that cluster occurrences of words, thereby inducing word senses. Among these, supervised learning approaches have been the most successful algorithms to date.
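As an illustration of the dictionary-based family mentioned above, the following is a minimal sketch of a simplified Lesk-style overlap heuristic. It is not taken from emolib or its subclasses: the class SimplifiedLesk, its methods, and the toy glosses for "bank" are all hypothetical, and stop words are deliberately not filtered to keep the sketch short.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of emolib: picks the sense whose gloss
// shares the most word tokens with the surrounding context.
final class SimplifiedLesk {

    // Lowercases the text and splits it into a set of word tokens.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
    }

    // Returns the key of the sense with the largest gloss/context overlap.
    // On a tie, the sense that appears first in the map is kept.
    static String disambiguate(String context, Map<String, String> glossesBySense) {
        Set<String> contextWords = tokens(context);
        String best = null;
        int bestOverlap = -1;
        for (Map.Entry<String, String> sense : glossesBySense.entrySet()) {
            Set<String> shared = tokens(sense.getValue());
            shared.retainAll(contextWords); // keep only words also found in the context
            if (shared.size() > bestOverlap) {
                bestOverlap = shared.size();
                best = sense.getKey();
            }
        }
        return best;
    }

    // Toy sense inventory for "bank"; the glosses are invented for illustration.
    static Map<String, String> bankSenses() {
        Map<String, String> senses = new LinkedHashMap<>();
        senses.put("bank#finance", "institution that accepts deposits and lends money");
        senses.put("bank#river", "sloping land beside a river or stream");
        return senses;
    }
}
```

With these toy glosses, a context that mentions "river" and "stream" overlaps most with the second gloss, while a context that mentions "money" overlaps most with the first. A real dictionary-based disambiguator would draw its glosses from a lexical resource and would normally filter function words before counting overlaps.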

Author:
Alexandre Trilla (atrilla@salle.url.edu)

Constructor Summary
WordSenseDisambiguator()
          Main constructor of the WordSenseDisambiguator.
 
Method Summary
abstract  void applyWSD(TextData inputTextDataObject)
          Method to perform the word sense disambiguation process.
 Data getData()
          Obtains the TextData from the previous module, processes it and makes it available to the rest of the text processing chain.
 void initialize()
          Method to initialize the WordSenseDisambiguator.
 void newProperties(PropertySheet ps)
          This method is called when this configurable component has new data.
 void register(java.lang.String name, Registry registry)
          Register my properties.
 
Methods inherited from class emolib.util.proc.TextDataProcessor
flush, getName, getPredecessor, setPredecessor, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

WordSenseDisambiguator

public WordSenseDisambiguator()
Main constructor of the WordSenseDisambiguator.

Method Detail

register

public void register(java.lang.String name,
                     Registry registry)
              throws PropertyException
Description copied from interface: Configurable
Register my properties. This method is called once, early in the lifetime of the component, shortly after the component is constructed. The component should register any configuration properties that it needs. If this configurable extends another configurable, super.register should also be called.

Specified by:
register in interface Configurable
Overrides:
register in class TextDataProcessor
Parameters:
name - the name of the component
registry - the registry for this component
Throws:
PropertyException
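The rule about calling super.register can be pictured with a small, self-contained sketch. None of the types below are the real emolib or configuration classes: SketchRegistry, SketchBaseProcessor, SketchWsdProcessor and the property names "predecessor" and "language" are hypothetical stand-ins, chosen only to show that a subclass keeps its parent's properties by delegating first.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a property registry; NOT the real Registry class.
final class SketchRegistry {
    final Set<String> names = new LinkedHashSet<>();

    void register(String propertyName) {
        names.add(propertyName);
    }
}

// Plays the role of the parent configurable.
class SketchBaseProcessor {
    String name;

    void register(String name, SketchRegistry registry) {
        this.name = name;
        registry.register("predecessor"); // made-up property name
    }
}

// Plays the role of a configurable that extends another configurable.
class SketchWsdProcessor extends SketchBaseProcessor {
    @Override
    void register(String name, SketchRegistry registry) {
        super.register(name, registry); // keep the parent's properties
        registry.register("language");  // made-up property name
    }
}
```

If the subclass skipped the super.register call in this sketch, the registry would only know about "language" and the parent's property would be lost.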

newProperties

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component has new data. The component should first validate the data. If it is bad, the component should return false. If the data is good, the component should record the data internally and return true.

Specified by:
newProperties in interface Configurable
Overrides:
newProperties in class TextDataProcessor
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.

getData

public Data getData()
             throws DataProcessingException
Obtains the TextData from the previous module, processes it and makes it available to the rest of the text processing chain.

Specified by:
getData in interface DataProcessor
Specified by:
getData in class TextDataProcessor
Returns:
The next available Data object, or null if no Data object is available.
Throws:
DataProcessingException - If there is a processing error.
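The pull-style flow described for getData (obtain data from the previous module, process it, hand it on) might look roughly like the sketch below. This is an assumption about the general pattern, not the actual emolib implementation: every type here (SketchData, SketchTextData, SketchProcessor, SketchSource, SketchDisambiguator, FirstSenseDisambiguator) is a hypothetical stand-in, and the "#1" sense label is a placeholder.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Stand-ins for illustration only; these are NOT the emolib types.
interface SketchData {}

final class SketchTextData implements SketchData {
    final String word;
    String sense; // set by the disambiguator

    SketchTextData(String word) { this.word = word; }
}

interface SketchProcessor {
    SketchData getData();
}

// Hypothetical previous module: hands out queued items, then null.
final class SketchSource implements SketchProcessor {
    private final Deque<SketchData> items = new ArrayDeque<>();

    SketchSource(SketchData... data) { items.addAll(Arrays.asList(data)); }

    @Override
    public SketchData getData() { return items.poll(); }
}

abstract class SketchDisambiguator implements SketchProcessor {
    private final SketchProcessor predecessor;

    SketchDisambiguator(SketchProcessor predecessor) { this.predecessor = predecessor; }

    // Pulls from the previous module, disambiguates text data in place,
    // and returns the same object to the next module in the chain.
    @Override
    public SketchData getData() {
        SketchData input = predecessor.getData();
        if (input instanceof SketchTextData) {
            applyWSD((SketchTextData) input);
        }
        return input; // null when the predecessor has nothing left
    }

    abstract void applyWSD(SketchTextData inputTextDataObject);
}

// Toy subclass: tags every word with a placeholder first sense.
final class FirstSenseDisambiguator extends SketchDisambiguator {
    FirstSenseDisambiguator(SketchProcessor predecessor) { super(predecessor); }

    @Override
    void applyWSD(SketchTextData inputTextDataObject) {
        inputTextDataObject.sense = inputTextDataObject.word + "#1";
    }
}
```

In this sketch the abstract class owns the chaining logic and a subclass only supplies applyWSD, which mirrors the split between getData and the abstract applyWSD method documented on this page.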

initialize

public void initialize()
Method to initialize the WordSenseDisambiguator.

Specified by:
initialize in interface DataProcessor
Overrides:
initialize in class TextDataProcessor

applyWSD

public abstract void applyWSD(TextData inputTextDataObject)
Method to perform the word sense disambiguation process.

Parameters:
inputTextDataObject - The TextData object to process.