Interface | Description |
---|---|
ITokenizer | Splits input characters into tokens. |

Class | Description |
---|---|
ExtendedWhitespaceTokenizer | A tokenizer separating input characters on whitespace, but capable of extracting more complex tokens, such as URLs, e-mail addresses and sentence delimiters. |
ExtendedWhitespaceTokenizerImpl | |
TokenTypeUtils | Utility methods for working with ITokenizer attributes. |

Lexical analysis utilities.
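
To illustrate the general idea behind a whitespace tokenizer that also recognizes more complex tokens, the following is a minimal, self-contained sketch. It is not the implementation of ExtendedWhitespaceTokenizer and does not use the ITokenizer interface: the class name `SimpleWhitespaceTokenizer`, the `Type` enum, the `Token` class and both regular expressions are hypothetical and deliberately simplified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical tokenizer, not part of this package. */
final class SimpleWhitespaceTokenizer {
    /** Hypothetical token categories; not the library's own type constants. */
    enum Type { TERM, URL, EMAIL, SENTENCE_DELIMITER }

    static final class Token {
        final String text;
        final Type type;

        Token(String text, Type type) {
            this.text = text;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + ":" + text;
        }
    }

    // Simplified patterns (assumptions); real URL and e-mail grammars are broader.
    private static final Pattern URL =
        Pattern.compile("(?i)(https?://|www\\.)\\S+");
    private static final Pattern EMAIL =
        Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        // Split on runs of whitespace first, then classify each chunk.
        for (String chunk : input.trim().split("\\s+")) {
            if (chunk.isEmpty()) {
                continue; // empty or blank input
            }
            // Peel trailing '.', '!' or '?' off as a sentence delimiter.
            int end = chunk.length();
            while (end > 0 && ".!?".indexOf(chunk.charAt(end - 1)) >= 0) {
                end--;
            }
            String body = chunk.substring(0, end);
            String tail = chunk.substring(end);
            if (!body.isEmpty()) {
                tokens.add(new Token(body, classify(body)));
            }
            if (!tail.isEmpty()) {
                tokens.add(new Token(tail, Type.SENTENCE_DELIMITER));
            }
        }
        return tokens;
    }

    private static Type classify(String s) {
        if (URL.matcher(s).matches()) {
            return Type.URL;
        }
        if (EMAIL.matcher(s).matches()) {
            return Type.EMAIL;
        }
        return Type.TERM;
    }
}
```

Calling `SimpleWhitespaceTokenizer.tokenize("Mail bob@example.com or see http://example.org/docs now.")` yields seven tokens: four plain terms, one e-mail address, one URL and a final sentence delimiter. Classifying whole whitespace-separated chunks keeps the sketch short, at the cost of mishandling cases such as abbreviations or punctuation inside a URL.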