public final class OverlapJCasTermAnnotator extends AbstractJCasTermAnnotator
Modifier and Type | Field and Description |
---|---|
private int | _consecutiveSkipMax |
private Logger | _logger |
private int | _totalSkipMax |
private static java.lang.String | CONS_SKIP_PRP_KEY: specifies the number of consecutive non-comma tokens that can be skipped |
private static java.lang.String | TOTAL_SKIP_PRP_KEY: specifies the number of total tokens that can be skipped |
_minimumLookupSpan, PARAM_EXC_TAGS_PRP, PARAM_MIN_SPAN_PRP, PARAM_WINDOW_ANNOT_PRP
DICTIONARY_DESCRIPTOR_KEY
Constructor and Description |
---|
OverlapJCasTermAnnotator() |
Modifier and Type | Method and Description |
---|---|
private static java.lang.String[] | fastSplit(java.lang.String line, int tokenCount) |
void | findTerms(RareWordDictionary dictionary, java.util.List<FastLookupToken> allTokens, java.util.List<java.lang.Integer> lookupTokenIndices, CollectionMap<TextSpan,java.lang.Long,? extends java.util.Collection<java.lang.Long>> termsFromDictionary): Given a dictionary, tokens, and lookup token indices, populate a terms collection with discovered terms |
private static TextSpan | getOverlapTerm(java.util.List<FastLookupToken> allTokens, int lookupTokenIndex, RareWordTerm rareWordHit, int consecutiveSkipMax, int totalSkipMax): Check to see if a given term overlaps a set of tokens |
void | initialize(UimaContext uimaContext): Set the number of consecutive and total tokens that can be skipped (optional). |
getAnnotationsInWindow, getDictionaries, isWindowOk, parseInt, process, processWindow
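The findTerms summary above describes a loop over lookup token indices that fills a collection of discovered terms. The following is a minimal sketch of that flow, not the cTAKES implementation: the Entry class, the plain Map used as a dictionary, and the "begin-end" string used as a span key are hypothetical stand-ins for RareWordDictionary, RareWordTerm, TextSpan and CollectionMap. For simplicity it compares tokens contiguously, whereas the overlap annotator is documented as tolerating skipped tokens.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FindTermsSketch {
    /** Hypothetical dictionary entry: term tokens, position of the rare word within them, concept id. */
    static final class Entry {
        final List<String> tokens;
        final int rareWordIndex;
        final long conceptId;

        Entry(List<String> tokens, int rareWordIndex, long conceptId) {
            this.tokens = tokens;
            this.rareWordIndex = rareWordIndex;
            this.conceptId = conceptId;
        }
    }

    /** Returns a map from "beginToken-endToken" (end exclusive) to the concept ids found for that span. */
    static Map<String, List<Long>> findTerms(Map<String, List<Entry>> dictionary,
                                             List<String> allTokens,
                                             List<Integer> lookupTokenIndices) {
        Map<String, List<Long>> found = new HashMap<>();
        for (int lookupIndex : lookupTokenIndices) {
            // The dictionary is keyed by the rare word of each term.
            List<Entry> hits = dictionary.get(allTokens.get(lookupIndex));
            if (hits == null) {
                continue;
            }
            for (Entry hit : hits) {
                int begin = lookupIndex - hit.rareWordIndex;
                int end = begin + hit.tokens.size();
                if (begin < 0 || end > allTokens.size()) {
                    continue; // the term would run outside the window
                }
                // Contiguous comparison only; skipped tokens are not handled in this sketch.
                if (allTokens.subList(begin, end).equals(hit.tokens)) {
                    found.computeIfAbsent(begin + "-" + end, k -> new ArrayList<>())
                         .add(hit.conceptId);
                }
            }
        }
        return found;
    }
}
```

Keying the dictionary by a single rare word keeps each lookup to one map access per token, and the full term is only compared when that word is present.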
private final Logger _logger
private int _consecutiveSkipMax
private int _totalSkipMax
private static final java.lang.String CONS_SKIP_PRP_KEY
private static final java.lang.String TOTAL_SKIP_PRP_KEY
public void initialize(UimaContext uimaContext) throws ResourceInitializationException
Overrides: initialize in class AbstractJCasTermAnnotator
Throws: ResourceInitializationException
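Because both skip limits are optional, initialization has to fall back to defaults when a setting is absent. The sketch below illustrates that pattern under stated assumptions: a plain Map stands in for UimaContext, and the key strings and default values (2 and 4) are hypothetical, since this page does not list the values of CONS_SKIP_PRP_KEY and TOTAL_SKIP_PRP_KEY or the actual defaults.

```java
import java.util.Map;

final class SkipSettingsSketch {
    // Hypothetical key names; the real constant values are not shown on this page.
    static final String CONS_SKIP_KEY = "consecutiveSkips";
    static final String TOTAL_SKIP_KEY = "totalTokenSkips";

    final int consecutiveSkipMax;
    final int totalSkipMax;

    SkipSettingsSketch(Map<String, Object> settings) {
        // Hypothetical defaults, used when a setting is missing or unparseable.
        consecutiveSkipMax = readInt(settings, CONS_SKIP_KEY, 2);
        totalSkipMax = readInt(settings, TOTAL_SKIP_KEY, 4);
    }

    /** Reads an optional integer setting, returning the fallback if it is absent or malformed. */
    static int readInt(Map<String, Object> settings, String key, int fallback) {
        Object value = settings.get(key);
        if (value == null) {
            return fallback;
        }
        if (value instanceof Integer) {
            return (Integer) value;
        }
        try {
            return Integer.parseInt(value.toString().trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```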
public void findTerms(RareWordDictionary dictionary, java.util.List<FastLookupToken> allTokens, java.util.List<java.lang.Integer> lookupTokenIndices, CollectionMap<TextSpan,java.lang.Long,? extends java.util.Collection<java.lang.Long>> termsFromDictionary)
dictionary -
allTokens -
lookupTokenIndices -
termsFromDictionary -
private static TextSpan getOverlapTerm(java.util.List<FastLookupToken> allTokens, int lookupTokenIndex, RareWordTerm rareWordHit, int consecutiveSkipMax, int totalSkipMax)
allTokens - all tokens in a window
lookupTokenIndex - index of rare word in the window of all tokens
rareWordHit - some possible term
private static java.lang.String[] fastSplit(java.lang.String line, int tokenCount)
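To illustrate how consecutiveSkipMax and totalSkipMax could bound an overlap match, here is a simplified sketch. It is not the getOverlapTerm implementation: it uses plain strings instead of FastLookupToken and RareWordTerm, it only scans forward from a window token that equals the first term token, and it assumes that a skipped comma counts toward the total limit but not the consecutive limit, which is one reading of "consecutive non-comma tokens".

```java
import java.util.List;

final class OverlapMatchSketch {
    /**
     * Tries to match all term tokens in order, starting at window index start.
     * Returns the window index of the last matched term token, or -1 if there is no match.
     */
    static int matchForward(List<String> window, int start, List<String> term,
                            int consecutiveSkipMax, int totalSkipMax) {
        if (term.isEmpty() || start < 0 || start >= window.size()
                || !window.get(start).equals(term.get(0))) {
            return -1;
        }
        int termIndex = 1;        // next term token to find
        int consecutiveSkips = 0; // non-comma tokens skipped since the last match
        int totalSkips = 0;       // all tokens skipped so far
        int lastMatch = start;
        for (int i = start + 1; i < window.size() && termIndex < term.size(); i++) {
            String token = window.get(i);
            if (token.equals(term.get(termIndex))) {
                termIndex++;
                consecutiveSkips = 0;
                lastMatch = i;
                continue;
            }
            totalSkips++;
            if (!token.equals(",")) {
                consecutiveSkips++; // assumption: commas do not count as consecutive skips
            }
            if (consecutiveSkips > consecutiveSkipMax || totalSkips > totalSkipMax) {
                return -1;
            }
        }
        return termIndex == term.size() ? lastMatch : -1;
    }
}
```

With the window "pain in the lower abdomen" and the term "pain abdomen", three tokens must be skipped in a row, so the match succeeds only when consecutiveSkipMax is at least 3 and totalSkipMax is at least 3.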