SpanishLexer (EmoLib)

emolib.tokenizer.lexer.spanish
Class SpanishLexer

java.lang.Object
  emolib.util.proc.TextDataProcessor
      emolib.tokenizer.Tokenizer
          emolib.tokenizer.lexer.spanish.SpanishLexer

All Implemented Interfaces: SpanishLexerConstants, Configurable, DataProcessor

public class SpanishLexer
extends Tokenizer
implements SpanishLexerConstants

Inherits the common methods and functions from the Tokenizer and implements a Spanish lexical analyzer with JavaCC.

Spanish is the language targeted by default in EmoLib. The tokens used by this lexer correspond to the Spanish grammatical tokens proposed by David García. If other tagging guidelines are desired, refer to [Santorini, 1995] or to the tagset proposed by the EAGLES group, which is used by the FreeLing project.

Nouns, adjectives and verbs, i.e. the tokens that do not match any of the given tags, are considered to have affective meaning.

This lexer implements no syntactic analysis; the text is treated as a bag of words instead.
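
The bag-of-words idea described above can be sketched in plain Java. This is a minimal illustration and not EmoLib code: the class `BagOfWordsSketch`, the method `possiblyAffectiveWords` and the tiny `FUNCTION_WORDS` set are hypothetical stand-ins for the lexer's tag set. Words that match none of the listed function words are kept as candidates for affective meaning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration; not part of EmoLib.
class BagOfWordsSketch {

    // Small hand-picked stand-in for the function-word tags (articles, prepositions, ...).
    static final Set<String> FUNCTION_WORDS = new HashSet<>(Arrays.asList(
            "el", "la", "los", "las", "un", "una", "de", "en", "y", "o", "que", "no", "muy"));

    // Returns the words that match none of the function-word entries, in input order.
    static List<String> possiblyAffectiveWords(String text) {
        List<String> result = new ArrayList<>();
        // Split on runs of non-letters; \p{L} is Unicode-aware, so accented letters stay in words.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!word.isEmpty() && !FUNCTION_WORDS.contains(word)) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [casa, vieja, perro, alegre]
        System.out.println(possiblyAffectiveWords("La casa vieja y el perro alegre."));
    }
}
```

Word order is preserved only for readability; a bag of words carries no syntactic structure, so nothing downstream should rely on it.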

--
[Santorini, 1995] Santorini, B., "Part-of-Speech Tagging Guidelines for the Penn Treebank Project", (3rd revision, 2nd printing). Technical Report, Department of Computer and Information Science, University of Pennsylvania, 1995.

Author: David García, Alexandre Trilla (atrilla@salle.url.edu)

Field Summary
`Token`	`jj_nt` Next token.
`Token`	`token` Current token.
`SpanishLexerTokenManager`	`token_source` Generated Token Manager.

Fields inherited from class emolib.tokenizer.Tokenizer
`negation, negativeModifier1, negativeModifier2, negativeModifier3, positiveModifier1, positiveModifier2, positiveModifier3, PROP_NEGATION, PROP_NEGATIVE_MODIFIER_1, PROP_NEGATIVE_MODIFIER_2, PROP_NEGATIVE_MODIFIER_3, PROP_POSITIVE_MODIFIER_1, PROP_POSITIVE_MODIFIER_2, PROP_POSITIVE_MODIFIER_3`

Fields inherited from interface emolib.tokenizer.lexer.spanish.SpanishLexerConstants
ADVERBIO_AFIRMACION, ADVERBIO_CUANTITATIVO_NEG_1, ADVERBIO_CUANTITATIVO_NEG_2, ADVERBIO_CUANTITATIVO_NEG_3, ADVERBIO_CUANTITATIVO_POS_1, ADVERBIO_CUANTITATIVO_POS_2, ADVERBIO_CUANTITATIVO_POS_3, ADVERBIO_LUGAR, ADVERBIO_MODO, ADVERBIO_NEGACION, ADVERBIO_PROBABILIDAD, ADVERBIO_TIEMPO, ARTICULO_DETERMINADO, ARTICULO_FUSION, ARTICULO_INDETERMINADO, BLANK, CONJUNCION_ADVERSATIVA, CONJUNCION_CAUSAL, CONJUNCION_COPULATIVA, CONJUNCION_DISYUNTIVA, CONJUNCION_FINAL, CONJUNCION_TEMPORAL, DEFAULT, DEMOSTRATIVO, DIGITO, EOF, ESPECIAL, ESPECIFICACION, EXCLAMATIVA, FIN_FRASE, INDEFINIDO_CUANTITATIVO, INDEFINIDO_DISTRIBUTIVO, INTERROGATIVA, LETRA, NUMERAL, OTRO, POSESIVO_1, POSESIVO_2, POSESIVO_3, PREPOSICION, PRONOMBRE_1, PRONOMBRE_2, PRONOMBRE_3, PRONOMBRE_REL, SALTO_CR, SALTO_CRLF, SALTO_LF, SIMBOLO_NEUTRO, TAB, tokenImage

Constructor Summary
`SpanishLexer()` No-argument constructor needed by the configuration manager to perform the instantiation.
`SpanishLexer(java.io.InputStream stream)` Constructor with InputStream.
`SpanishLexer(java.io.InputStream stream, java.lang.String encoding)` Constructor with InputStream and supplied encoding.
`SpanishLexer(java.io.Reader stream)` Constructor with Reader.
`SpanishLexer(SpanishLexerTokenManager tm)` Constructor with generated Token Manager.

Method Summary
`void`	`disable_tracing()` Disable tracing.
`void`	`enable_tracing()` Enable tracing.
`ParseException`	`generateParseException()` Generate ParseException.
`Tokenizer`	`getNew(java.lang.String initialization)` Function to obtain a new initialized instance of the Tokenizer.
`Token`	`getNextToken()` Get the next Token.
`Token`	`getToken(int index)` Get the specific Token.
`void`	`parseGrammar()` Method to parse the incoming text with the well-defined grammar.
`void`	`parseSpanishGrammar()`
`void`	`ReInit(java.io.InputStream stream)` Reinitialise.
`void`	`ReInit(java.io.InputStream stream, java.lang.String encoding)` Reinitialise.
`void`	`ReInit(java.io.Reader stream)` Reinitialise.
`void`	`ReInit(SpanishLexerTokenManager tm)` Reinitialise.

Methods inherited from class emolib.tokenizer.Tokenizer
`fillConfigurationValues, getData, getPossibleEmotionalContent, getWord, getWordClass, getWordModifierValue, initialize, inputData, newProperties, putModifierValue, putWord, putWordClass, register, setPossibleEmotionalContent`

Methods inherited from class emolib.util.proc.TextDataProcessor
`flush, getName, getPredecessor, setPredecessor, toString`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait`

Field Detail

token_source

public SpanishLexerTokenManager token_source

Generated Token Manager.

token

public Token token

Current token.

jj_nt

public Token jj_nt

Next token.

Constructor Detail

SpanishLexer

public SpanishLexer()

No-argument constructor needed by the configuration manager to perform the instantiation.

SpanishLexer

public SpanishLexer(java.io.InputStream stream)

Constructor with InputStream.

SpanishLexer

public SpanishLexer(java.io.InputStream stream,
                    java.lang.String encoding)

Constructor with InputStream and supplied encoding.
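
The encoding argument matters for Spanish text because accented letters and ñ occupy more than one byte in UTF-8. The following sketch uses only the Java standard library and does not call SpanishLexer; the class `EncodingSketch` and the method `readAll` are hypothetical names. It assumes that the encoding parameter names the charset used to decode the stream, which this page does not state explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of EmoLib.
class EncodingSketch {

    // Decodes the whole stream with the named charset and returns the text.
    static String readAll(InputStream stream, String encoding) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(stream, encoding)) {
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for unknown charset names.
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String original = "ma\u00f1ana"; // "manana" written with n-tilde
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in restores the word.
        System.out.println(readAll(new ByteArrayInputStream(utf8), "UTF-8").equals(original));

        // Decoding the same bytes as ISO-8859-1 yields 7 characters instead of 6.
        System.out.println(readAll(new ByteArrayInputStream(utf8), "ISO-8859-1").length());
    }
}
```

A mismatched charset would split the accented letter into two unrelated characters, so a word-matching lexer would no longer recognise the word.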

SpanishLexer

public SpanishLexer(java.io.Reader stream)

Constructor with Reader.

SpanishLexer

public SpanishLexer(SpanishLexerTokenManager tm)

Constructor with generated Token Manager.

Method Detail

getNew

public Tokenizer getNew(java.lang.String initialization)

Description copied from class: Tokenizer

Function to obtain a new initialized instance of the Tokenizer. The real (not abstract) tokenizers should override this function.

Specified by: getNew in class Tokenizer

Parameters: initialization - The string to initialize the new Tokenizer.
Returns: The new Tokenizer.

parseGrammar

public void parseGrammar()
                  throws java.lang.Exception

Description copied from class: Tokenizer

Method to parse the incoming text with the well-defined grammar.

Specified by: parseGrammar in class Tokenizer

Throws: java.lang.Exception - If a ParseException occurs.

parseSpanishGrammar

public final void parseSpanishGrammar()
                               throws ParseException

Throws: ParseException

ReInit

public void ReInit(java.io.InputStream stream)

Reinitialise.

ReInit

public void ReInit(java.io.InputStream stream,
                   java.lang.String encoding)

Reinitialise.

ReInit

public void ReInit(java.io.Reader stream)

Reinitialise.

ReInit

public void ReInit(SpanishLexerTokenManager tm)

Reinitialise.

getNextToken

public final Token getNextToken()

Get the next Token.

getToken

public final Token getToken(int index)

Get the specific Token.
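
The difference between getNextToken and getToken can be sketched with a simplified stand-in. Parsers generated by JavaCC typically keep tokens in a linked chain, where getNextToken consumes one token and getToken(index) looks ahead without consuming. The code below is an assumption about that general pattern and not the code generated for SpanishLexer; `LookaheadSketch` and `MiniToken` are hypothetical names, and an iterator over words stands in for the token manager.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical illustration of a lookahead token chain; not part of EmoLib.
class LookaheadSketch {

    // Stand-in for a generated Token: an image plus a link to the following token.
    static final class MiniToken {
        final String image;
        MiniToken next;

        MiniToken(String image) {
            this.image = image;
        }
    }

    private final Iterator<String> source;       // stand-in for the token manager
    private MiniToken token = new MiniToken(""); // current token; empty before the first read

    LookaheadSketch(String... words) {
        this.source = Arrays.asList(words).iterator();
    }

    // Fetches a fresh token from the source, or "<EOF>" once the input is exhausted.
    private MiniToken fetch() {
        return new MiniToken(source.hasNext() ? source.next() : "<EOF>");
    }

    // Consumes one token and makes it the current token.
    MiniToken getNextToken() {
        if (token.next == null) {
            token.next = fetch();
        }
        token = token.next;
        return token;
    }

    // Looks `index` tokens ahead of the current token without consuming; 0 is the current token.
    MiniToken getToken(int index) {
        MiniToken t = token;
        for (int i = 0; i < index; i++) {
            if (t.next == null) {
                t.next = fetch();
            }
            t = t.next;
        }
        return t;
    }

    public static void main(String[] args) {
        LookaheadSketch lexer = new LookaheadSketch("la", "casa", "vieja");
        System.out.println(lexer.getToken(2).image);    // lookahead only
        System.out.println(lexer.getNextToken().image); // consumes the first token
        System.out.println(lexer.getToken(1).image);    // one ahead of the current token
    }
}
```

Tokens fetched during lookahead stay in the chain, so a later getNextToken returns the same objects instead of reading the input again.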

generateParseException

public ParseException generateParseException()

Generate ParseException.

enable_tracing

public final void enable_tracing()

Enable tracing.

disable_tracing

public final void disable_tracing()

Disable tracing.
