java.lang.Object
  emolib.util.proc.TextDataProcessor
    emolib.tokenizer.Tokenizer
      emolib.tokenizer.lexer.english.EnglishLexer
public class EnglishLexer
Inherits the common methods and functions from the Tokenizer and implements an English lexical analyzer with JavaCC.
The tokens used by this lexer correspond to the English grammatical tokens proposed by David García. If other tagging guidelines are desired, refer to [Santorini, 1995] or to the tagset proposed by the EAGLES group, which is used by the FreeLing project.
Nouns, adjectives and verbs, that is, the tokens that do not match any of the given tags, are considered to have affective meaning.
No syntax is implemented in this lexer, so the input is treated as a bag of words. Tokens are defined so that they never contain a space; a token with embedded whitespace could cause the modules that follow to fail.
 --
 [Santorini, 1995] Santorini, B., "Part-of-Speech Tagging Guidelines for the Penn
 Treebank Project", (3rd revision, 2nd printing). Technical Report, Department of
 Computer and Information Science, University of Pennsylvania, 1995.
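
The bag-of-words behaviour described above can be illustrated with a small standalone sketch. It is not the JavaCC grammar that EnglishLexer is generated from: the class name `BagOfWordsSketch`, the method names and the short function-word list are all hypothetical and stand in for the real tag definitions. It only shows the two properties stated in the description: tokens never contain whitespace, and any token that does not match a known tag is treated as possibly affective.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class BagOfWordsSketch {

    // Hypothetical stand-in for the grammatical tags: a token listed here
    // "matches a tag" and is therefore not considered affective.
    private static final Set<String> FUNCTION_WORDS = new HashSet<>(
            Arrays.asList("the", "a", "an", "is", "and", "of", "to"));

    // Splits on runs of whitespace, so no returned token contains a space.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            // Drop punctuation, keep letters, digits and apostrophes.
            String word = piece.replaceAll("[^\\p{L}\\p{Nd}']", "")
                               .toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    // A token that matches none of the known tags may carry affect.
    public static boolean mayCarryAffect(String token) {
        return !FUNCTION_WORDS.contains(token);
    }

    public static void main(String[] args) {
        for (String token : tokenize("The story is happy")) {
            System.out.println(token + " -> " + mayCarryAffect(token));
        }
    }
}
```

Word order plays no role in the classification, which is what treating the input as a bag of words means here.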
 
| Field Summary | |
|---|---|
|  Token | jj_nt - Next token. | 
|  Token | token - Current token. | 
|  EnglishLexerTokenManager | token_source - Generated Token Manager. | 
| Fields inherited from class emolib.tokenizer.Tokenizer | 
|---|
| negation, negativeModifier1, negativeModifier2, negativeModifier3, positiveModifier1, positiveModifier2, positiveModifier3, PROP_NEGATION, PROP_NEGATIVE_MODIFIER_1, PROP_NEGATIVE_MODIFIER_2, PROP_NEGATIVE_MODIFIER_3, PROP_POSITIVE_MODIFIER_1, PROP_POSITIVE_MODIFIER_2, PROP_POSITIVE_MODIFIER_3 | 
| Constructor Summary | |
|---|---|
| EnglishLexer() - No-argument constructor needed by the configuration manager to perform the instantiation. | |
| EnglishLexer(EnglishLexerTokenManager tm) - Constructor with generated Token Manager. | |
| EnglishLexer(java.io.InputStream stream) - Constructor with InputStream. | |
| EnglishLexer(java.io.InputStream stream, java.lang.String encoding) - Constructor with InputStream and supplied encoding. | |
| EnglishLexer(java.io.Reader stream) - Constructor. | |
| Method Summary | |
|---|---|
|  void | disable_tracing() - Disable tracing. | 
|  void | enable_tracing() - Enable tracing. | 
|  ParseException | generateParseException() - Generate ParseException. | 
|  Tokenizer | getNew(java.lang.String initialization) - Function to obtain a new initialized instance of the Tokenizer. | 
|  Token | getNextToken() - Get the next Token. | 
|  Token | getToken(int index) - Get the specific Token. | 
|  void | parseEnglishGrammar() | 
|  void | parseGrammar() - Method to parse the incoming text with the well-defined grammar. | 
|  void | ReInit(EnglishLexerTokenManager tm) - Reinitialise. | 
|  void | ReInit(java.io.InputStream stream) - Reinitialise. | 
|  void | ReInit(java.io.InputStream stream, java.lang.String encoding) - Reinitialise. | 
|  void | ReInit(java.io.Reader stream) - Reinitialise. | 
| Methods inherited from class emolib.tokenizer.Tokenizer | 
|---|
| fillConfigurationValues, getData, getPossibleEmotionalContent, getWord, getWordClass, getWordModifierValue, initialize, inputData, newProperties, putModifierValue, putWord, putWordClass, register, setPossibleEmotionalContent | 
| Methods inherited from class emolib.util.proc.TextDataProcessor | 
|---|
| flush, getName, getPredecessor, setPredecessor, toString | 
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait | 
| Field Detail | 
|---|
public EnglishLexerTokenManager token_source
public Token token
public Token jj_nt
| Constructor Detail | 
|---|
public EnglishLexer()
public EnglishLexer(java.io.InputStream stream)
public EnglishLexer(java.io.InputStream stream,
                    java.lang.String encoding)
public EnglishLexer(java.io.Reader stream)
public EnglishLexer(EnglishLexerTokenManager tm)
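
The constructors above accept either a java.io.Reader or a java.io.InputStream with an optional encoding name. The sketch below uses only the Java standard library to prepare both kinds of input from a String. The class name `LexerInputSketch` and its helpers are hypothetical. The calls to EnglishLexer appear only in comments because they need EmoLib on the classpath, and they rely solely on the signatures listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class LexerInputSketch {

    // Character input: no encoding is involved.
    public static Reader asReader(String text) {
        return new StringReader(text);
    }

    // Byte input: the encoding used here must match the one given to the lexer.
    public static InputStream asStream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Helper that drains a Reader, used here only to check the round trip.
    public static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "I am very happy today";
        // With EmoLib available, the inputs would be handed over like this
        // (assumed usage, based only on the constructor signatures):
        //   EnglishLexer fromReader = new EnglishLexer(asReader(text));
        //   EnglishLexer fromStream = new EnglishLexer(asStream(text), "UTF-8");
        System.out.println(readAll(asReader(text)));
        System.out.println(readAll(
                new InputStreamReader(asStream(text), StandardCharsets.UTF_8)));
    }
}
```

Passing the encoding explicitly avoids depending on the platform default charset when the text contains non-ASCII characters.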
| Method Detail | 
|---|
public Tokenizer getNew(java.lang.String initialization)
Function to obtain a new initialized instance of the Tokenizer.
getNew in class Tokenizer
Parameters: initialization - The string to initialize the new Tokenizer.
public void parseGrammar()
                  throws java.lang.Exception
Method to parse the incoming text with the well-defined grammar.
parseGrammar in class Tokenizer
Throws: java.lang.Exception - If a ParseException occurs.
public final void parseEnglishGrammar()
                               throws ParseException
Throws: ParseException
public void ReInit(java.io.InputStream stream)
public void ReInit(java.io.InputStream stream,
                   java.lang.String encoding)
public void ReInit(java.io.Reader stream)
public void ReInit(EnglishLexerTokenManager tm)
public final Token getNextToken()
public final Token getToken(int index)
public ParseException generateParseException()
public final void enable_tracing()
public final void disable_tracing()