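This example shows how to use BreakIterator.getSentenceInstance() to split a paragraph into the sentences that make it up. To obtain a BreakIterator instance, we call the getSentenceInstance() factory method and pass it the locale.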
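In the count(BreakIterator bi, String source) method, we walk the boundaries reported by the iterator and extract each sentence of the paragraph as the substring between two consecutive boundaries. The paragraph text itself is stored in the paragraph variable.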
package org.nhooo.example.text;

import java.text.BreakIterator;
import java.util.Locale;

public class BreakSentenceExample {
    public static void main(String[] args) {
        String paragraph =
            "Line boundary analysis determines where a text " +
            "string can be broken when line-wrapping. The " +
            "mechanism correctly handles punctuation and " +
            "hyphenated words. Actual line breaking needs to " +
            "also consider the available line width and is " +
            "handled by higher-level software. ";

        // Sentence iterator for the US English locale
        BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);

        int sentences = count(iterator, paragraph);
        System.out.println("Number of sentences: " + sentences);
    }

    private static int count(BreakIterator bi, String source) {
        int counter = 0;
        bi.setText(source);

        // first() returns the first boundary; next() returns DONE after the last one
        int lastIndex = bi.first();
        while (lastIndex != BreakIterator.DONE) {
            int firstIndex = lastIndex;
            lastIndex = bi.next();

            if (lastIndex != BreakIterator.DONE) {
                // The text between two consecutive boundaries is one sentence
                String sentence = source.substring(firstIndex, lastIndex);
                System.out.println("sentence = " + sentence);
                counter++;
            }
        }
        return counter;
    }
}
The program prints the following result to the console:
sentence = Line boundary analysis determines where a text string can be broken when line-wrapping.
sentence = The mechanism correctly handles punctuation and hyphenated words.
sentence = Actual line breaking needs to also consider the available line width and is handled by higher-level software.
Number of sentences: 3
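Each substring produced by count keeps the whitespace that follows the sentence, because the boundary falls just before the start of the next sentence. If the sentences are needed as values rather than printed, the same loop can be written as a small helper that trims and collects them. The following is only a sketch: the class name SentenceSplitter and the method split are hypothetical names chosen for illustration, and the for loop is just a more compact form of the first()/next() iteration used above.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceSplitter {

    // Hypothetical helper (not part of the original example):
    // returns the sentences of `source` with surrounding whitespace removed.
    static List<String> split(String source, Locale locale) {
        BreakIterator bi = BreakIterator.getSentenceInstance(locale);
        bi.setText(source);

        List<String> sentences = new ArrayList<>();
        int start = bi.first();
        // Advance boundary by boundary until next() reports DONE
        for (int end = bi.next(); end != BreakIterator.DONE; start = end, end = bi.next()) {
            String sentence = source.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "The mechanism correctly handles punctuation and "
                + "hyphenated words. Actual line breaking needs to "
                + "also consider the available line width and is "
                + "handled by higher-level software. ";
        for (String sentence : split(text, Locale.US)) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```

The brackets in main make it visible that the trailing space has been trimmed. Returning a list also leaves it to the caller to decide whether to count, print, or further process the sentences.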