SentenceDetector

java.lang.Object
- JCasAnnotator_ImplBase
- - org.apache.ctakes.core.ae.SentenceDetector

public class SentenceDetector
extends JCasAnnotator_ImplBase

Wraps the OpenNLP sentence detector in a UIMA annotator.

Field Summary

Fields
Modifier and Type	Field and Description
`private Logger`	`logger`
`private java.lang.String`	`NEWLINE`
`static java.lang.String`	`PARAM_SD_MODEL_FILE`
`static java.lang.String`	`PARAM_SEGMENTS_TO_SKIP` Value is "SegmentsToSkip".
`static java.lang.String`	`SD_MODEL_FILE_PARAM`
`private opennlp.tools.sentdetect.SentenceModel`	`sdmodel`
`private java.lang.String`	`sdModelPath`
`private SentenceDetectorCtakes`	`sentenceDetector`
`private java.lang.String[]`	`skipSegmentsArray`
`private java.util.Set<java.lang.String>`	`skipSegmentsSet`

Constructor Summary

Constructors
Constructor and Description
`SentenceDetector()`

Method Summary

Methods
Modifier and Type	Method and Description
`protected int`	`annotateRange(JCas jcas, java.lang.String text, Segment section, int sentenceCount)` Detect sentences within a section of the text and add annotations to the CAS.
`static AnalysisEngineDescription`	`createAnnotatorDescription()`
`static java.io.File`	`getFileInExistingDir(java.lang.String fn)`
`static java.io.File`	`getReadableFile(java.lang.String fn)`
`void`	`initialize(UimaContext aContext)`
`static void`	`main(java.lang.String[] args)` Train a new sentence detector from the training data in the first file and write the model to the second file. The training data file is expected to have one sentence per line.
`static int`	`parseInt(java.lang.String s, Logger log)`
`void`	`process(JCas jcas)` Entry point for processing.
`static void`	`usage(Logger log)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - PARAM_SEGMENTS_TO_SKIP
```
public static final java.lang.String PARAM_SEGMENTS_TO_SKIP
```
    Value is "SegmentsToSkip". This parameter specifies which sections to skip. It is optional, multi-valued, and of type String.
    
    See Also:
    Constant Field Values
  - skipSegmentsArray
```
private java.lang.String[] skipSegmentsArray
```
  - skipSegmentsSet
```
private java.util.Set<java.lang.String> skipSegmentsSet
```
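The relationship between the multi-valued `PARAM_SEGMENTS_TO_SKIP` parameter, `skipSegmentsArray`, and `skipSegmentsSet` can be sketched as follows. This is a minimal illustration, not the class's actual code: the helper class `SegmentSkipSketch`, its method names, and the use of plain strings as segment ids are all assumptions made for the example, and how the annotator actually reads the parameter from its `UimaContext` is not shown in this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of org.apache.ctakes.core.ae.SentenceDetector.
class SegmentSkipSketch {

    /**
     * Builds a lookup set from the multi-valued parameter value.
     * A null array is treated as "parameter not set", since the parameter is optional.
     */
    static Set<String> toSkipSet(String[] skipSegmentsArray) {
        if (skipSegmentsArray == null) {
            return Collections.emptySet();
        }
        return new HashSet<>(Arrays.asList(skipSegmentsArray));
    }

    /** Returns, in their original order, the segment ids that are not in the skip set. */
    static List<String> segmentsToProcess(List<String> segmentIds, Set<String> skipSegmentsSet) {
        List<String> kept = new ArrayList<>();
        for (String id : segmentIds) {
            if (!skipSegmentsSet.contains(id)) {
                kept.add(id);
            }
        }
        return kept;
    }
}
```

Converting the array to a set once keeps the per-segment check a constant-time lookup instead of a scan over the array.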
  - PARAM_SD_MODEL_FILE
```
public static final java.lang.String PARAM_SD_MODEL_FILE
```
    See Also:
    Constant Field Values
  - SD_MODEL_FILE_PARAM
```
public static final java.lang.String SD_MODEL_FILE_PARAM
```
    See Also:
    Constant Field Values
  - sdModelPath
```
private java.lang.String sdModelPath
```
  - sdmodel
```
private opennlp.tools.sentdetect.SentenceModel sdmodel
```
  - sentenceDetector
```
private SentenceDetectorCtakes sentenceDetector
```
  - NEWLINE
```
private java.lang.String NEWLINE
```
  - logger
```
private Logger logger
```
- Constructor Detail
  - SentenceDetector
```
public SentenceDetector()
```
- Method Detail
  - initialize
```
public void initialize(UimaContext aContext)
                throws ResourceInitializationException
```
    Throws:
    
    ResourceInitializationException
  - process
```
public void process(JCas jcas)
             throws AnalysisEngineProcessException
```
    Entry point for processing.
    
    Throws:
    
    AnalysisEngineProcessException
  - annotateRange
```
protected int annotateRange(JCas jcas,
                java.lang.String text,
                Segment section,
                int sentenceCount)
```
    Detects sentences within a section of the text and adds annotations to the CAS. Uses the OpenNLP sentence detector, then forces sentences to end at end-of-line characters, splitting any sentence that spans a line break into multiple sentences. Sentences are also trimmed, and any sentence that consists only of white space is ignored.
    
    Parameters:
    jcas - view of the CAS containing the text to run sentence detector against
    text - the document text
    section - the section this sentence is in
    sentenceCount - the number of sentences added already to the CAS (if processing one section at a time)
    
    Returns:
    the sum of sentenceCount and the number of Sentence annotations added to the CAS for this section
    
    Throws:
    
    AnnotatorProcessException
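The post-processing described for `annotateRange` (split at end-of-line characters, trim, drop whitespace-only results) can be sketched in plain Java. This is an illustration under stated assumptions, not the actual cTAKES implementation: the class `SentenceSpanSketch` and its methods are hypothetical, spans are represented as `{begin, end}` offset pairs rather than `Sentence` annotations, and both `\n` and `\r` are assumed to count as end-of-line characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of org.apache.ctakes.core.ae.SentenceDetector.
class SentenceSpanSketch {

    /**
     * Takes one detected sentence span [start, end) and returns the spans that
     * remain after splitting at end-of-line characters, trimming whitespace,
     * and dropping pieces that are empty or whitespace-only.
     */
    static List<int[]> splitAtNewlines(String text, int start, int end) {
        List<int[]> spans = new ArrayList<>();
        int pieceStart = start;
        for (int i = start; i <= end; i++) {
            // i == end closes the final piece; the short-circuit avoids charAt(end).
            if (i == end || text.charAt(i) == '\n' || text.charAt(i) == '\r') {
                addTrimmed(text, pieceStart, i, spans);
                pieceStart = i + 1;
            }
        }
        return spans;
    }

    /** Shrinks [begin, end) past surrounding whitespace and keeps it only if non-empty. */
    private static void addTrimmed(String text, int begin, int end, List<int[]> out) {
        while (begin < end && Character.isWhitespace(text.charAt(begin))) {
            begin++;
        }
        while (end > begin && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        if (begin < end) {
            out.add(new int[] {begin, end});
        }
    }
}
```

In this sketch the returned count of spans plays the role of the number of annotations added for the section; adding it to the incoming `sentenceCount` would give the running total that the method is documented to return.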
  - createAnnotatorDescription
```
public static AnalysisEngineDescription createAnnotatorDescription()
                                                            throws ResourceInitializationException
```
    Throws:
    
    ResourceInitializationException
  - main
```
public static void main(java.lang.String[] args)
                 throws java.io.IOException
```
    Train a new sentence detector from the training data in the first file and write the model to the second file.
    The training data file is expected to have one sentence per line.
    
    Parameters:
    args - training_data_filename name_of_model_to_create iters? cutoff? (the trailing ? marks iters and cutoff as optional)
    
    Throws:
    
    java.io.IOException
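The argument layout `training_data_filename name_of_model_to_create iters? cutoff?` can be handled along the following lines. This is only a sketch: the class `TrainArgsSketch` is hypothetical, and the default values used when `iters` or `cutoff` are omitted are placeholders chosen for the example, since this page does not state what the real `main` falls back to.

```java
// Hypothetical helper; not part of org.apache.ctakes.core.ae.SentenceDetector.
class TrainArgsSketch {
    // Placeholder defaults (assumptions); the real defaults are not documented here.
    static final int DEFAULT_ITERS = 100;
    static final int DEFAULT_CUTOFF = 5;

    final String trainingFile; // one sentence per line
    final String modelFile;    // where the trained model would be written
    final int iters;
    final int cutoff;

    private TrainArgsSketch(String trainingFile, String modelFile, int iters, int cutoff) {
        this.trainingFile = trainingFile;
        this.modelFile = modelFile;
        this.iters = iters;
        this.cutoff = cutoff;
    }

    /** Parses two required file names followed by up to two optional integers. */
    static TrainArgsSketch parse(String[] args) {
        if (args.length < 2 || args.length > 4) {
            throw new IllegalArgumentException(
                "expected: training_data_filename name_of_model_to_create iters? cutoff?");
        }
        // Integer.parseInt throws NumberFormatException on non-numeric input.
        int iters = args.length > 2 ? Integer.parseInt(args[2]) : DEFAULT_ITERS;
        int cutoff = args.length > 3 ? Integer.parseInt(args[3]) : DEFAULT_CUTOFF;
        return new TrainArgsSketch(args[0], args[1], iters, cutoff);
    }
}
```

For example, `TrainArgsSketch.parse(new String[] {"train.txt", "sd.model", "50"})` would use 50 iterations and the placeholder cutoff; the file names here are likewise illustrative.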
  - usage
```
public static void usage(Logger log)
```
  - parseInt
```
public static int parseInt(java.lang.String s,
           Logger log)
```
  - getReadableFile
```
public static java.io.File getReadableFile(java.lang.String fn)
                                    throws java.io.IOException
```
    Throws:
    
    java.io.IOException
  - getFileInExistingDir
```
public static java.io.File getFileInExistingDir(java.lang.String fn)
                                         throws java.io.IOException
```
    Throws:
    
    java.io.IOException
