@PipeBitInfo(name="Sentence Detector", description="Annotates Sentences based upon an OpenNLP model.", dependencies=SECTION, products=SENTENCE)
public class SentenceDetector extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
Modifier and Type | Field and Description |
---|---|
static String | PARAM_SD_MODEL_FILE |
static String | PARAM_SEGMENTS_TO_SKIP Value is "SegmentsToSkip". |
static String | SD_MODEL_FILE_PARAM |
Constructor and Description |
---|
SentenceDetector() |
Modifier and Type | Method and Description |
---|---|
protected int | annotateRange(org.apache.uima.jcas.JCas jcas, String text, Segment section, int sentenceCount) Detect sentences within a section of the text and add annotations to the CAS. |
static org.apache.uima.analysis_engine.AnalysisEngineDescription | createAnnotatorDescription() |
static File | getFileInExistingDir(String fn) |
static File | getReadableFile(String fn) |
void | initialize(org.apache.uima.UimaContext aContext) |
static void | main(String[] args) Train a new sentence detector from the training data in the first file and write the model to the second file. The training data file is expected to have one sentence per line. |
static int | parseInt(String s, org.apache.log4j.Logger log) |
void | process(org.apache.uima.jcas.JCas jcas) Entry point for processing. |
static void | usage(org.apache.log4j.Logger log) |
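Methods inherited from class org.apache.uima.analysis_component.JCasAnnotator_ImplBase: getRequiredCasInterface, process
Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase: getCasInstancesRequired, hasNext, next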
public static final String PARAM_SEGMENTS_TO_SKIP
public static final String PARAM_SD_MODEL_FILE
public static final String SD_MODEL_FILE_PARAM
public void initialize(org.apache.uima.UimaContext aContext) throws org.apache.uima.resource.ResourceInitializationException
Specified by: initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides: initialize in class org.apache.uima.fit.component.JCasAnnotator_ImplBase
Throws: org.apache.uima.resource.ResourceInitializationException
public void process(org.apache.uima.jcas.JCas jcas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
protected int annotateRange(org.apache.uima.jcas.JCas jcas, String text, Segment section, int sentenceCount)
Parameters:
jcas - view of the CAS containing the text to run the sentence detector against
text - the document text
section - the section this sentence is in
sentenceCount - the number of sentences already added to the CAS (if processing one section at a time)
Returns: the sum of sentenceCount and the number of Sentence annotations added to the CAS for this section
Throws: org.apache.uima.analysis_engine.annotator.AnnotatorProcessException
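The counting contract of annotateRange (accept the running sentenceCount, add Sentence annotations for one section, return the updated total) can be sketched without UIMA. The following is a minimal illustration, not the cTAKES implementation: the class AnnotateRangeSketch, the int[] span layout, the List standing in for the CAS, and the naive split on '.', '!' and '?' (in place of the OpenNLP model) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SentenceDetector.annotateRange; not the cTAKES code.
class AnnotateRangeSketch {

    /**
     * Splits text[sectionBegin, sectionEnd) after '.', '!' or '?' (a naive rule
     * used here instead of the OpenNLP model) and appends one
     * {begin, end, sentenceNumber} span per sentence to cas.
     * Returns sentenceCount plus the number of spans added for this section.
     */
    static int annotateRange(List<int[]> cas, String text,
                             int sectionBegin, int sectionEnd, int sentenceCount) {
        int start = sectionBegin;
        for (int i = sectionBegin; i < sectionEnd; i++) {
            char c = text.charAt(i);
            boolean lastChar = (i == sectionEnd - 1);
            if (c == '.' || c == '!' || c == '?' || lastChar) {
                // Skip leading whitespace so a span starts on a visible character.
                while (start <= i && Character.isWhitespace(text.charAt(start))) {
                    start++;
                }
                if (start <= i) {
                    cas.add(new int[] {start, i + 1, sentenceCount});
                    sentenceCount++;
                }
                start = i + 1;
            }
        }
        return sentenceCount;
    }

    public static void main(String[] args) {
        String text = "Chest pain. No fever! Plan: rest";
        int boundary = 11; // hypothetical boundary between two sections
        List<int[]> cas = new ArrayList<>();
        // Process one section at a time, threading the running count through.
        int count = annotateRange(cas, text, 0, boundary, 0);
        count = annotateRange(cas, text, boundary, text.length(), count);
        System.out.println(count + " sentences");
    }
}
```

Threading the count through successive calls keeps sentence numbers unique across sections, which is the reason the method both takes and returns a count.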
public static org.apache.uima.analysis_engine.AnalysisEngineDescription createAnnotatorDescription() throws org.apache.uima.resource.ResourceInitializationException
Throws: org.apache.uima.resource.ResourceInitializationException
public static void main(String[] args) throws IOException
Parameters:
args - training_data_filename name_of_model_to_create iters? cutoff?
Throws: IOException
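The argument pattern for main is two required file names followed by two optional integers (iters and cutoff). One way such a command line could be handled is sketched below. The class TrainArgsSketch and its default values are hypothetical: this page does not state what the real main uses when iters or cutoff is omitted.

```java
// Hypothetical sketch of parsing
// "training_data_filename name_of_model_to_create iters? cutoff?".
class TrainArgsSketch {
    // Assumed defaults for illustration only; the source does not state them.
    static final int DEFAULT_ITERS = 100;
    static final int DEFAULT_CUTOFF = 5;

    final String trainingFile;
    final String modelFile;
    final int iters;
    final int cutoff;

    TrainArgsSketch(String trainingFile, String modelFile, int iters, int cutoff) {
        this.trainingFile = trainingFile;
        this.modelFile = modelFile;
        this.iters = iters;
        this.cutoff = cutoff;
    }

    /** Accepts 2 to 4 arguments; the trailing two are optional integers. */
    static TrainArgsSketch parse(String[] args) {
        if (args.length < 2 || args.length > 4) {
            throw new IllegalArgumentException(
                "usage: training_data_filename name_of_model_to_create iters? cutoff?");
        }
        // Integer.parseInt throws NumberFormatException on non-numeric input.
        int iters = args.length > 2 ? Integer.parseInt(args[2]) : DEFAULT_ITERS;
        int cutoff = args.length > 3 ? Integer.parseInt(args[3]) : DEFAULT_CUTOFF;
        return new TrainArgsSketch(args[0], args[1], iters, cutoff);
    }

    public static void main(String[] args) {
        // File names here are placeholders.
        TrainArgsSketch a = parse(new String[] {"sentences.txt", "sd-model.bin", "150"});
        System.out.println(a.trainingFile + " -> " + a.modelFile
            + ", iters=" + a.iters + ", cutoff=" + a.cutoff);
    }
}
```

The real class exposes parseInt(String, Logger) and usage(Logger), which suggests it logs a message on bad input instead of throwing; the sketch throws to stay dependency-free.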
public static void usage(org.apache.log4j.Logger log)
public static int parseInt(String s, org.apache.log4j.Logger log)
public static File getReadableFile(String fn) throws IOException
Throws: IOException
public static File getFileInExistingDir(String fn) throws IOException
Throws: IOException
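This page gives no description for getReadableFile or getFileInExistingDir, so the sketch below only illustrates the behavior their names and signatures suggest: validate a path up front and throw IOException if the file cannot be read or the target directory does not exist. The class FileChecksSketch and the exact checks and messages are assumptions, not the cTAKES code.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of the two file helpers, inferred from their names only.
class FileChecksSketch {

    /** Returns the file if it exists and is readable; otherwise throws IOException. */
    static File getReadableFile(String fn) throws IOException {
        File f = new File(fn);
        if (!f.isFile() || !f.canRead()) {
            throw new IOException("Unable to read from file " + f.getAbsolutePath());
        }
        return f;
    }

    /** Returns the file if its parent directory exists; the file itself need not exist. */
    static File getFileInExistingDir(String fn) throws IOException {
        // Resolve to an absolute path so a bare file name still has a parent.
        File f = new File(fn).getAbsoluteFile();
        File dir = f.getParentFile();
        if (dir == null || !dir.isDirectory()) {
            throw new IOException("Directory not found for " + f.getPath());
        }
        return f;
    }

    public static void main(String[] args) throws IOException {
        File training = File.createTempFile("training", ".txt");
        training.deleteOnExit();
        // Input must already exist; output only needs an existing directory.
        File in = getReadableFile(training.getPath());
        File out = getFileInExistingDir(new File(training.getParentFile(), "sd-model.bin").getPath());
        System.out.println("read " + in.getName() + ", write " + out.getName());
    }
}
```

Checking the input file and the output directory before training starts lets a long-running job fail early on a bad path.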
Copyright © 2012-2017 The Apache Software Foundation. All Rights Reserved.