public class SentenceDetector
extends JCasAnnotator_ImplBase
Modifier and Type | Field and Description |
---|---|
private Logger | logger |
private java.lang.String | NEWLINE |
static java.lang.String | PARAM_SD_MODEL_FILE |
static java.lang.String | PARAM_SEGMENTS_TO_SKIP Value is "SegmentsToSkip". |
static java.lang.String | SD_MODEL_FILE_PARAM |
private opennlp.tools.sentdetect.SentenceModel | sdmodel |
private java.lang.String | sdModelPath |
private SentenceDetectorCtakes | sentenceDetector |
private java.lang.String[] | skipSegmentsArray |
private java.util.Set<java.lang.String> | skipSegmentsSet |
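
The field names suggest that the value configured under PARAM_SEGMENTS_TO_SKIP is read as a string array (skipSegmentsArray) and copied into a set (skipSegmentsSet) for fast membership checks. The following is a minimal sketch under that assumption, not the cTAKES implementation; the class SkipSegmentsSketch and both of its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; not the cTAKES implementation. */
final class SkipSegmentsSketch {

    /** Copies the configured segment IDs into a set; a missing parameter yields an empty set. */
    static Set<String> toSkipSet(String[] skipSegmentsArray) {
        Set<String> skipSegmentsSet = new HashSet<>();
        if (skipSegmentsArray != null) {
            skipSegmentsSet.addAll(Arrays.asList(skipSegmentsArray));
        }
        return skipSegmentsSet;
    }

    /** True when the segment with the given ID should not be sentence-split. */
    static boolean shouldSkip(Set<String> skipSegmentsSet, String segmentId) {
        return skipSegmentsSet.contains(segmentId);
    }
}
```

A set removes duplicate IDs and makes the per-segment check constant time, which is presumably why the class keeps both fields.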
Constructor and Description |
---|
SentenceDetector() |
Modifier and Type | Method and Description |
---|---|
protected int | annotateRange(JCas jcas, java.lang.String text, Segment section, int sentenceCount) Detect sentences within a section of the text and add annotations to the CAS. |
static AnalysisEngineDescription | createAnnotatorDescription() |
static java.io.File | getFileInExistingDir(java.lang.String fn) |
static java.io.File | getReadableFile(java.lang.String fn) |
void | initialize(UimaContext aContext) |
static void | main(java.lang.String[] args) Train a new sentence detector from the training data in the first file and write the model to the second file. The training data file is expected to have one sentence per line. |
static int | parseInt(java.lang.String s, Logger log) |
void | process(JCas jcas) Entry point for processing. |
static void | usage(Logger log) |
public static final java.lang.String PARAM_SEGMENTS_TO_SKIP
private java.lang.String[] skipSegmentsArray
private java.util.Set<java.lang.String> skipSegmentsSet
public static final java.lang.String PARAM_SD_MODEL_FILE
public static final java.lang.String SD_MODEL_FILE_PARAM
private java.lang.String sdModelPath
private opennlp.tools.sentdetect.SentenceModel sdmodel
private SentenceDetectorCtakes sentenceDetector
private java.lang.String NEWLINE
private Logger logger
public void initialize(UimaContext aContext) throws ResourceInitializationException
Throws: ResourceInitializationException
public void process(JCas jcas) throws AnalysisEngineProcessException
Throws: AnalysisEngineProcessException
protected int annotateRange(JCas jcas, java.lang.String text, Segment section, int sentenceCount)
Parameters:
jcas - view of the CAS containing the text to run the sentence detector against
text - the document text
section - the section this sentence is in
sentenceCount - the number of sentences already added to the CAS (if processing one section at a time)
Returns: the sum of sentenceCount and the number of Sentence annotations added to the CAS for this section
Throws: AnnotatorProcessException
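
The counting contract of annotateRange (take the running sentenceCount, add one per sentence found in the section, return the new total) can be illustrated without UIMA or OpenNLP. The sketch below is an assumption-laden stand-in: it treats every non-blank line as a sentence instead of consulting a trained model, and the class AnnotateRangeSketch, the Span type, and the out list are hypothetical replacements for the JCas and Segment arguments.

```java
import java.util.List;

/** Hypothetical sketch of the counting contract; the real annotator uses an OpenNLP model. */
final class AnnotateRangeSketch {

    /** Character offsets of one detected sentence, relative to the whole document. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Stand-in splitter: treats each non-blank line inside [sectionBegin, sectionEnd)
     * as one sentence, appends its offsets to out, and returns the updated running count.
     */
    static int annotateRange(String text, int sectionBegin, int sectionEnd,
                             int sentenceCount, List<Span> out) {
        int lineStart = sectionBegin;
        for (int i = sectionBegin; i <= sectionEnd; i++) {
            // The section end acts as a final boundary, so charAt is never called past the text.
            boolean atBoundary = (i == sectionEnd) || text.charAt(i) == '\n';
            if (!atBoundary) {
                continue;
            }
            if (!text.substring(lineStart, i).trim().isEmpty()) {
                out.add(new Span(lineStart, i));
                sentenceCount++;
            }
            lineStart = i + 1;
        }
        return sentenceCount;
    }
}
```

Calling the method once per section and passing each return value in as the next call's sentenceCount keeps sentence numbering continuous across the document.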
public static AnalysisEngineDescription createAnnotatorDescription() throws ResourceInitializationException
Throws: ResourceInitializationException
public static void main(java.lang.String[] args) throws java.io.IOException
Parameters:
args - training_data_filename name_of_model_to_create iters? cutoff?
Throws: java.io.IOException
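
The argument list above has two required values and two optional ones (the trailing `?`). A minimal sketch of reading such a list follows. The class TrainArgsSketch is hypothetical, and because this page does not state the defaults used for iters and cutoff, the sketch takes them as parameters instead of guessing.

```java
/** Hypothetical sketch of reading the documented argument list; not the cTAKES code. */
final class TrainArgsSketch {
    final String trainingDataFile;
    final String modelFile;
    final int iters;
    final int cutoff;

    private TrainArgsSketch(String trainingDataFile, String modelFile, int iters, int cutoff) {
        this.trainingDataFile = trainingDataFile;
        this.modelFile = modelFile;
        this.iters = iters;
        this.cutoff = cutoff;
    }

    /**
     * Expects 2 to 4 arguments: training data file, model file, then optional
     * iteration count and cutoff. Defaults are supplied by the caller (assumption).
     */
    static TrainArgsSketch parse(String[] args, int defaultIters, int defaultCutoff) {
        if (args == null || args.length < 2 || args.length > 4) {
            throw new IllegalArgumentException(
                "expected: training_data_filename name_of_model_to_create iters? cutoff?");
        }
        int iters = args.length > 2 ? Integer.parseInt(args[2]) : defaultIters;
        int cutoff = args.length > 3 ? Integer.parseInt(args[3]) : defaultCutoff;
        return new TrainArgsSketch(args[0], args[1], iters, cutoff);
    }
}
```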
public static void usage(Logger log)
public static int parseInt(java.lang.String s, Logger log)
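
This page gives no description for parseInt, so its behavior on bad input is not documented here. One plausible shape for a helper with this signature is to log the problem and rethrow, as sketched below. This is an assumption, and java.util.logging.Logger stands in for whichever Logger type the class actually uses.

```java
import java.util.logging.Logger;

/** Hypothetical sketch; the real method's failure behavior is not documented on this page. */
final class ParseIntSketch {

    /** Parses s as a decimal integer; on failure, logs the offending value and rethrows. */
    static int parseInt(String s, Logger log) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            log.severe("Unable to parse '" + s + "' as an integer.");
            throw e;
        }
    }
}
```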
public static java.io.File getReadableFile(java.lang.String fn) throws java.io.IOException
Throws: java.io.IOException
public static java.io.File getFileInExistingDir(java.lang.String fn) throws java.io.IOException
Throws: java.io.IOException
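
Neither file helper is described on this page. Going only by their names and signatures, getReadableFile would validate an input path (such as the training data) and getFileInExistingDir would validate an output path (such as the model to create). The sketch below shows one way such checks could be written with java.io.File. The class FileChecksSketch, the exact conditions, and the error messages are assumptions.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical sketch based on the method names only; not the cTAKES code. */
final class FileChecksSketch {

    /** Returns the file if it exists as a regular, readable file; otherwise throws. */
    static File getReadableFile(String fn) throws IOException {
        File f = new File(fn);
        if (!f.isFile() || !f.canRead()) {
            throw new FileNotFoundException("Not a readable file: " + fn);
        }
        return f;
    }

    /** Returns the file (which may not exist yet) if its parent directory exists; otherwise throws. */
    static File getFileInExistingDir(String fn) throws IOException {
        File f = new File(fn).getAbsoluteFile();
        File dir = f.getParentFile();
        if (dir == null || !dir.isDirectory()) {
            throw new IOException("Directory does not exist for: " + fn);
        }
        return f;
    }
}
```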