SentenceDetectorCtakes (Apache cTAKES 4.0.0 API)

java.lang.Object
- org.apache.ctakes.core.sentence.SentenceDetectorCtakes

```
public class SentenceDetectorCtakes
extends Object
```
A sentence detector for splitting up raw text into sentences.
A maximum entropy model is used to evaluate the characters ".", "!", and "?" in a string to determine if they signify the end of a sentence.
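
The description suggests two stages: locate each candidate character, then let the model decide whether it ends a sentence. The first stage could be sketched in plain Java as below. `EosCandidateSketch` and `candidatePositions` are hypothetical names used for illustration only; they are not part of cTAKES or OpenNLP, and the real scanning is delegated to the `EndOfSentenceScanner` passed to the constructor.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of cTAKES or OpenNLP. */
class EosCandidateSketch {

    /** Returns the index of every '.', '!' and '?' in the text, in order. */
    static List<Integer> candidatePositions(String text) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // The period after "Dr" is a candidate too; deciding that it is
        // not a sentence end is the job of the model, not of this scan.
        System.out.println(candidatePositions("Pt seen by Dr. Smith. Stable!"));
    }
}
```

Every candidate is reported, including the one inside an abbreviation, which is why a classification step is needed afterwards.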

See Also:

in OpenNLP 1.5

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`NO_SPLIT` Constant indicates no sentence split.
`static String`	`SPLIT` Constant indicates a sentence split.
`protected boolean`	`useTokenEnd`

Constructor Summary

Constructors
Constructor and Description
`SentenceDetectorCtakes(opennlp.tools.ml.model.MaxentModel model, opennlp.tools.sentdetect.DefaultSDContextGenerator cg, opennlp.tools.sentdetect.EndOfSentenceScanner eoss)` Initializes the current instance.

Method Summary

Modifier and Type	Method and Description
`double[]`	`getSentenceProbabilities()` Returns the probabilities associated with the most recent call to sentDetect().
`protected boolean`	`isAcceptableBreak(String s, int fromIndex, int candidateIndex)` Lets subclasses prevent an overzealous (read: poorly trained) model from flagging obvious non-breaks as breaks, based on a boolean test of each break's acceptability.
`static void`	`main(String[] args)` Trains a new sentence detection model.
`String[]`	`sentDetect(String s)` Detect sentences in a String.
`int[]`	`sentPosDetect(String s)` Detect the position of the first words of sentences in a String.
`static opennlp.tools.sentdetect.SentenceModel`	`train(String languageCode, opennlp.tools.util.ObjectStream<opennlp.tools.sentdetect.SentenceSample> samples, boolean useTokenEnd, opennlp.tools.dictionary.Dictionary abbreviations)`
`static opennlp.tools.sentdetect.SentenceModel`	`train(String languageCode, opennlp.tools.util.ObjectStream<opennlp.tools.sentdetect.SentenceSample> samples, boolean useTokenEnd, opennlp.tools.dictionary.Dictionary abbreviations, int cutoff, int iterations)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail
- SPLIT
```
public static final String SPLIT
```
  Constant indicates a sentence split.
  
  See Also:
  
  Constant Field Values
- NO_SPLIT
```
public static final String NO_SPLIT
```
  Constant indicates no sentence split.
  
  See Also:
  
  Constant Field Values
- useTokenEnd
```
protected boolean useTokenEnd
```

Constructor Detail

SentenceDetectorCtakes

public SentenceDetectorCtakes(opennlp.tools.ml.model.MaxentModel model,
                              opennlp.tools.sentdetect.DefaultSDContextGenerator cg,
                              opennlp.tools.sentdetect.EndOfSentenceScanner eoss)

Initializes the current instance.

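Parameters:

model - the MaxentModel used to evaluate candidate sentence endings

cg - the context generator

eoss - the end-of-sentence scanner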

Method Detail

sentDetect
```
public String[] sentDetect(String s)
```
Detect sentences in a String.

Parameters:

s - The string to be processed.

Returns:

A string array containing individual sentences as elements.

sentPosDetect
```
public int[] sentPosDetect(String s)
```
Detect the position of the first words of sentences in a String.

Parameters:

s - The string to be processed.

Returns:

An integer array containing the end index of every sentence.

See Also:

SentenceDetectorME#sentPosDetect(String)
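
To relate the returned positions to the sentences themselves, here is a minimal sketch of cutting a string at a list of boundary offsets. It assumes each offset marks where one sentence ends and the next begins, and it trims surrounding whitespace from each piece; both are assumptions of this sketch, not documented behavior. `BoundarySplitSketch` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of cTAKES or OpenNLP. */
class BoundarySplitSketch {

    /**
     * Cuts the text at each boundary offset. Any text left after the
     * last boundary becomes a final sentence. Empty pieces are dropped.
     */
    static String[] split(String text, int[] boundaries) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int end : boundaries) {
            String piece = text.substring(start, end).trim();
            if (!piece.isEmpty()) {
                sentences.add(piece);
            }
            start = end;
        }
        String tail = text.substring(start).trim();
        if (!tail.isEmpty()) {
            sentences.add(tail);
        }
        return sentences.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String sentence : split("Pt stable. No fever.", new int[] {10})) {
            System.out.println(sentence);
        }
    }
}
```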

getSentenceProbabilities
```
public double[] getSentenceProbabilities()
```
Returns the probabilities associated with the most recent call to sentDetect().

Returns:

The probability for each sentence returned by the most recent call to sentDetect. If not applicable, an empty array is returned.
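
One possible use of these probabilities is to flag sentences the model was unsure about. The sketch below assumes the two arrays line up index by index, which the description implies but does not spell out. `LowConfidenceSketch`, `lowConfidence`, and the threshold parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of cTAKES or OpenNLP. */
class LowConfidenceSketch {

    /** Returns the sentences whose probability is below the threshold. */
    static List<String> lowConfidence(String[] sentences,
                                      double[] probabilities,
                                      double threshold) {
        List<String> flagged = new ArrayList<>();
        // Guard against mismatched lengths, for example when the
        // probabilities array is the empty "not applicable" result.
        int n = Math.min(sentences.length, probabilities.length);
        for (int i = 0; i < n; i++) {
            if (probabilities[i] < threshold) {
                flagged.add(sentences[i]);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        String[] sentences = {"Pt stable.", "No fever."};
        double[] probabilities = {0.98, 0.55};
        System.out.println(lowConfidence(sentences, probabilities, 0.6));
    }
}
```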

isAcceptableBreak
```
protected boolean isAcceptableBreak(String s,
                                    int fromIndex,
                                    int candidateIndex)
```
Lets subclasses prevent an overzealous (read: poorly trained) model from flagging obvious non-breaks as breaks, based on a boolean test of each break's acceptability.
The implementation here always returns true, so the MaxentModel's outcome is taken as is.

Parameters:

s - the string in which the break occurred.

fromIndex - the start of the segment currently being evaluated

candidateIndex - the index of the candidate sentence ending

Returns:

true if the break is acceptable
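
As an illustration of the kind of check a subclass might apply, the sketch below rejects a break when the token ending at the candidate is a known abbreviation. It is written as a standalone static method with the same parameter list so that it runs without cTAKES on the classpath. The abbreviation list is made up for the example, and the sketch assumes `candidateIndex` is the index of the candidate character itself.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; not part of cTAKES or OpenNLP. */
class AbbreviationBreakSketch {

    // Hypothetical abbreviation list, chosen only for this example.
    private static final Set<String> ABBREVIATIONS =
            new HashSet<>(Arrays.asList("Dr.", "Mr.", "Mrs.", "vs."));

    /** Returns false when the token ending at candidateIndex is a listed abbreviation. */
    static boolean isAcceptableBreak(String s, int fromIndex, int candidateIndex) {
        // Walk back to the start of the token, without passing fromIndex.
        int tokenStart = candidateIndex;
        while (tokenStart > fromIndex
                && !Character.isWhitespace(s.charAt(tokenStart - 1))) {
            tokenStart--;
        }
        // Assumption: candidateIndex points at the candidate character,
        // so the token includes it.
        String token = s.substring(tokenStart, candidateIndex + 1);
        return !ABBREVIATIONS.contains(token);
    }

    public static void main(String[] args) {
        String text = "Pt seen by Dr. Smith. Stable!";
        System.out.println(isAcceptableBreak(text, 0, 13)); // candidate after "Dr"
        System.out.println(isAcceptableBreak(text, 0, 20)); // candidate after "Smith"
    }
}
```

In a real subclass the same logic would go into an override of the protected method, leaving the model's decision untouched whenever the check returns true.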

train

public static opennlp.tools.sentdetect.SentenceModel train(String languageCode,
                                                           opennlp.tools.util.ObjectStream<opennlp.tools.sentdetect.SentenceSample> samples,
                                                           boolean useTokenEnd,
                                                           opennlp.tools.dictionary.Dictionary abbreviations)
                                                    throws IOException

Throws:

IOException

train

public static opennlp.tools.sentdetect.SentenceModel train(String languageCode,
                                                           opennlp.tools.util.ObjectStream<opennlp.tools.sentdetect.SentenceSample> samples,
                                                           boolean useTokenEnd,
                                                           opennlp.tools.dictionary.Dictionary abbreviations,
                                                           int cutoff,
                                                           int iterations)
                                                    throws IOException

Throws:

IOException

main
```
public static void main(String[] args)
                 throws IOException
```
Trains a new sentence detection model.

Usage: opennlp.tools.sentdetect.SentenceDetectorME data_file new_model_name (iterations cutoff)?

Parameters:

args - the command-line arguments, in the order shown in the usage line

Throws:

IOException
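
The usage line reads as two required arguments followed by an optional pair. A minimal sketch of parsing that shape is shown below. `TrainArgsSketch` is a hypothetical helper, and the default values for iterations and cutoff are placeholders chosen for the example, not values confirmed by this page.

```java
/** Illustrative sketch only; not part of cTAKES or OpenNLP. */
class TrainArgsSketch {

    // Placeholder defaults; the real defaults are not stated in this documentation.
    static final int DEFAULT_ITERATIONS = 100;
    static final int DEFAULT_CUTOFF = 5;

    final String dataFile;
    final String modelName;
    final int iterations;
    final int cutoff;

    private TrainArgsSketch(String dataFile, String modelName, int iterations, int cutoff) {
        this.dataFile = dataFile;
        this.modelName = modelName;
        this.iterations = iterations;
        this.cutoff = cutoff;
    }

    /** Parses: data_file new_model_name (iterations cutoff)? */
    static TrainArgsSketch parse(String[] args) {
        // Either both optional values are present or neither is.
        if (args.length != 2 && args.length != 4) {
            throw new IllegalArgumentException(
                    "Usage: data_file new_model_name (iterations cutoff)?");
        }
        int iterations = DEFAULT_ITERATIONS;
        int cutoff = DEFAULT_CUTOFF;
        if (args.length == 4) {
            iterations = Integer.parseInt(args[2]);
            cutoff = Integer.parseInt(args[3]);
        }
        return new TrainArgsSketch(args[0], args[1], iterations, cutoff);
    }

    public static void main(String[] args) {
        TrainArgsSketch parsed = parse(new String[] {"train.txt", "sd.bin", "50", "3"});
        System.out.println(parsed.dataFile + " " + parsed.modelName
                + " " + parsed.iterations + " " + parsed.cutoff);
    }
}
```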
