java.lang.Object
  emolib.util.proc.TextDataProcessor
    emolib.splitter.SentenceSplitter
      emolib.splitter.bdt.SentenceSplitterBDT
public class SentenceSplitterBDT
The SentenceSplitterBDT class performs sentence segmentation by means of a hand-crafted Binary Decision Tree (BDT).
The decision tree for sentence boundary detection is inspired by the one described in (Reichel and Pfitzinger, 2006). Because the tokens handled here are independent (they do not end with a punctuation mark), the tree has been adapted rather than kept the same. Refer to the article for more details.
 [Diagram: decision tree implementation, with YES vs. NO branches]
 Every input sentence must be terminated by a dot, an exclamation mark or a question mark.
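
 The following is a minimal sketch of how a binary decision tree can mark sentence boundaries over independent tokens, where each punctuation mark is a token of its own. It is not the EmoLib implementation and does not reproduce the tree from the cited article: the class `BdtSplitSketch`, the `Node` type, and the two questions asked by the tree (is the current token a terminator, and is the next token also a terminator) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

/** Illustrative sketch only; hypothetical names, not EmoLib code. */
final class BdtSplitSketch {

    /** A tree node: either a question with YES/NO children, or a leaf holding a decision. */
    static final class Node {
        final BiPredicate<List<String>, Integer> question; // null for a leaf
        final Node yes;
        final Node no;
        final boolean boundary; // only meaningful for a leaf

        Node(BiPredicate<List<String>, Integer> question, Node yes, Node no) {
            this.question = question;
            this.yes = yes;
            this.no = no;
            this.boundary = false;
        }

        Node(boolean boundary) {
            this.question = null;
            this.yes = null;
            this.no = null;
            this.boundary = boundary;
        }

        /** Walks from this node down to a leaf and returns the leaf's decision. */
        boolean decide(List<String> tokens, int i) {
            if (question == null) {
                return boundary;
            }
            return (question.test(tokens, i) ? yes : no).decide(tokens, i);
        }
    }

    /** The three sentence delimiters required by the class description. */
    static boolean isTerminator(String token) {
        return token.equals(".") || token.equals("!") || token.equals("?");
    }

    // Hypothetical two-question tree (an assumption of this sketch).
    static final Node TREE = new Node(
        (tokens, i) -> isTerminator(tokens.get(i)),
        new Node(
            (tokens, i) -> i + 1 < tokens.size() && isTerminator(tokens.get(i + 1)),
            new Node(false),  // YES: another terminator follows, so wait for the last one
            new Node(true)),  // NO: this terminator closes the sentence
        new Node(false));     // NO: an ordinary token is never a boundary

    /** Groups tokens into sentences, closing one wherever the tree reports a boundary. */
    static List<List<String>> split(List<String> tokens) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            current.add(tokens.get(i));
            if (TREE.decide(tokens, i)) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            sentences.add(current); // sketch choice: keep trailing tokens that lack a terminator
        }
        return sentences;
    }

    public static void main(String[] args) {
        System.out.println(split(Arrays.asList(
            "Hello", "world", ".", "Really", "?", "!", "Yes", "!")));
    }
}
```

 With the token list used in `main`, the sketch yields three sentences, and the consecutive "?" and "!" tokens stay in the same sentence. Keeping unterminated trailing tokens as a final sentence is a choice made by this sketch, since the class itself requires every sentence to be delimited.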
 --
 (Reichel and Pfitzinger, 2006) Reichel, U.D. and Pfitzinger, H.R.,
 "Text Preprocessing for Speech Synthesis",
 In Proc. TC-Star Speech to Speech Translation Workshop, pp. 207-212, 2006.
 
| Constructor Summary | |
|---|---|
| SentenceSplitterBDT() | Main constructor of the SentenceSplitterBDT. |
| Method Summary | | |
|---|---|---|
| void | applySentenceSplitting(TextData inputTextDataObject) | Method to perform the sentence segmentation process. |
| void | initialize() | Method to initialize the SentenceSplitterBDT. |
| void | newProperties(PropertySheet ps) | This method is called when this configurable component has new data. |
| void | register(java.lang.String name, Registry registry) | Register my properties. |
| Methods inherited from class emolib.splitter.SentenceSplitter | 
|---|
| getData | 
| Methods inherited from class emolib.util.proc.TextDataProcessor | 
|---|
| flush, getName, getPredecessor, setPredecessor, toString | 
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait | 
| Constructor Detail | 
|---|
public SentenceSplitterBDT()
| Method Detail | 
|---|
public void register(java.lang.String name,
                     Registry registry)
              throws PropertyException
Description copied from interface: Configurable
Register my properties.
register in interface Configurable
register in class SentenceSplitter
Parameters:
name - the name of the component
registry - the registry for this component
Throws:
PropertyException

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component has new data.
newProperties in interface Configurable
newProperties in class SentenceSplitter
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.

public void initialize()
Method to initialize the SentenceSplitterBDT.
initialize in interface DataProcessor
initialize in class SentenceSplitter

public void applySentenceSplitting(TextData inputTextDataObject)
Method to perform the sentence segmentation process.
applySentenceSplitting in class SentenceSplitter
Parameters:
inputTextDataObject - The TextData object to process.