java.lang.Object
  emolib.util.eval.CorpusVocabularyStatistics
public class CorpusVocabularyStatistics
The CorpusVocabularyStatistics class performs a vocabulary analysis on the input text file.
It outputs the total vocabulary size (the size of the training corpus), the vocabulary size at several frequency cut-offs (the number of words with a frequency over 20, 15, 10, 5 and 3), and the number of observed bigrams with respect to the number of possible events.
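The statistics described above can be illustrated with a small standalone sketch. This is not the emolib implementation: the class name `VocabularyStatsSketch`, its helper methods, the lower-cased whitespace tokenization, and the reading of "possible events" as the square of the vocabulary size are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of emolib: shows one way to compute the
// kind of statistics the class description mentions.
public class VocabularyStatsSketch {

    // Assumption: tokens are lower-cased, whitespace-separated words.
    public static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Counts how often each distinct word occurs.
    public static Map<String, Integer> countWords(String[] tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Number of distinct words whose frequency is strictly over the threshold.
    public static long vocabularyOver(Map<String, Integer> counts, int threshold) {
        return counts.values().stream().filter(c -> c > threshold).count();
    }

    // Number of distinct adjacent word pairs observed in the token sequence.
    public static int distinctBigrams(String[] tokens) {
        Set<String> bigrams = new HashSet<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            bigrams.add(tokens[i] + " " + tokens[i + 1]);
        }
        return bigrams.size();
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("the cat sat on the mat the cat ran");
        Map<String, Integer> counts = countWords(tokens);

        System.out.println("Tokens: " + tokens.length);
        System.out.println("Total vocabulary size: " + counts.size());
        for (int threshold : new int[] {20, 15, 10, 5, 3}) {
            System.out.println("Words with frequency over " + threshold + ": "
                    + vocabularyOver(counts, threshold));
        }

        // Assumption: every ordered pair of vocabulary words is a possible event.
        long possible = (long) counts.size() * counts.size();
        int observed = distinctBigrams(tokens);
        System.out.println("Observed bigrams: " + observed + " of " + possible
                + " possible (" + (double) observed / possible + ")");
    }
}
```

On a realistic corpus the ratio of observed to possible bigrams is usually very small, which is why such a figure is useful for judging how sparse a bigram model trained on the corpus would be.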
| Constructor Summary | |
|---|---|
| CorpusVocabularyStatistics(): Void constructor. | |
| Method Summary | |
|---|---|
|  FeatureBox | getFeatures(java.lang.String text): Extracts the features from the given text. | 
| static void | main(java.lang.String[] args) | 
|  void | printSynopsis(): Prints the synopsis. | 
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail | 
|---|
public CorpusVocabularyStatistics()
| Method Detail | 
|---|
public void printSynopsis()
Prints the synopsis.
public FeatureBox getFeatures(java.lang.String text)
Extracts the features from the given text.
Parameters: text - The given text.
public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws: java.lang.Exception