public class HyphenatedPTB extends Object
Constructor and Description |
---|
HyphenatedPTB() |
Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
static int | tokenLengthCheckingForHyphenatedTerms(String lowerCasedString) There is a fixed list of hyphenated words that are not to be split (hyphenatedWordsLookup). Words formed with hyphenated affixes are also kept together; some made-up examples: chronic-itis (1 suffix), mega-huge (1 prefix), e-game-fest (1 prefix and 1 suffix), salon-o-torium (1 suffix that contains 2 hyphens), urban-esque-wise (2 suffixes). |
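
The affix rules summarized above can be illustrated with a minimal sketch. It is not the implementation of this class. The class name `HyphenSketch`, the method `keepTogetherLength`, and the contents of `LOOKUP`, `PREFIXES`, and `SUFFIXES` are all hypothetical. The return contract is also an assumption: the sketch returns the length of the leading run of characters that would be kept as one token.

```java
import java.util.Arrays;
import java.util.List;

public class HyphenSketch {
    // Hypothetical tables for illustration only; not the data used by HyphenatedPTB.
    static final List<String> LOOKUP = Arrays.asList("x-ray", "follow-up");
    static final List<String> PREFIXES = Arrays.asList("mega-", "e-");
    static final List<String> SUFFIXES =
            Arrays.asList("-o-torium", "-itis", "-fest", "-esque", "-wise");

    /** Assumed contract: length of the leading run to keep as a single token. */
    static int keepTogetherLength(String lowerCased) {
        // Take the leading run of letters and hyphens as the candidate term.
        int end = 0;
        while (end < lowerCased.length()
                && (Character.isLetter(lowerCased.charAt(end))
                    || lowerCased.charAt(end) == '-')) {
            end++;
        }
        String candidate = lowerCased.substring(0, end);
        int firstHyphen = candidate.indexOf('-');
        // No hyphen at all, or a term on the fixed list: keep the whole run.
        if (firstHyphen < 0 || LOOKUP.contains(candidate)) {
            return end;
        }
        // Repeatedly strip known prefixes from the front and suffixes from the back.
        String core = candidate;
        boolean stripped = true;
        while (stripped) {
            stripped = false;
            for (String p : PREFIXES) {
                if (core.startsWith(p)) {
                    core = core.substring(p.length());
                    stripped = true;
                }
            }
            for (String s : SUFFIXES) {
                if (core.endsWith(s)) {
                    core = core.substring(0, core.length() - s.length());
                    stripped = true;
                }
            }
        }
        // If only a hyphen-free core remains, every hyphen belonged to an affix.
        boolean keepWhole = !core.isEmpty() && core.indexOf('-') < 0;
        return keepWhole ? end : firstHyphen;
    }

    public static void main(String[] args) {
        String[] samples = {"chronic-itis", "mega-huge", "e-game-fest",
                            "salon-o-torium", "urban-esque-wise", "blue-green"};
        for (String w : samples) {
            System.out.println(w + " -> " + keepTogetherLength(w));
        }
    }
}
```

Under these assumed tables, each of the five made-up examples is kept whole, so the sketch returns its full length (for instance 14 for salon-o-torium). A term with no known affix, such as blue-green, is cut at its first hyphen, so the sketch returns 4.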
public static void main(String[] args)
public static int tokenLengthCheckingForHyphenatedTerms(String lowerCasedString)
lowerCasedString - because of "-o-torium", input might contain more than 1 hyphen.

Copyright © 2012-2017 The Apache Software Foundation. All Rights Reserved.