public final class PreprocessingContext extends Object
Modifier and Type | Class and Description |
---|---|
static class | PreprocessingContext.AllFields: Information about all fields processed for the input documents. |
class | PreprocessingContext.AllLabels: Information about words and phrases that might be good cluster label candidates. |
class | PreprocessingContext.AllPhrases: Information about all frequently appearing sequences of words found in the input documents. |
class | PreprocessingContext.AllStems: Information about all unique stems found in the input documents. |
class | PreprocessingContext.AllTokens: Information about all tokens of the input documents. |
class | PreprocessingContext.AllWords: Information about all unique words found in the input documents. |
Modifier and Type | Field and Description |
---|---|
PreprocessingContext.AllFields | allFields: Information about all fields processed for the input documents. |
PreprocessingContext.AllLabels | allLabels: Information about words and phrases that might be good cluster label candidates. |
PreprocessingContext.AllPhrases | allPhrases: Information about all frequently appearing sequences of words found in the input documents. |
PreprocessingContext.AllStems | allStems: Information about all unique stems found in the input documents. |
PreprocessingContext.AllTokens | allTokens: Information about all tokens of the input documents. |
PreprocessingContext.AllWords | allWords: Information about all unique words found in the input documents. |
List<Document> | documents: A list of documents to process. |
LanguageModel | language: Language model to be used. |
String | query: Query used to perform processing, may be null. |
Constructor and Description |
---|
PreprocessingContext(LanguageModel languageModel, List<Document> documents, String query): Creates a preprocessing context for the provided documents and the provided languageModel. |
Modifier and Type | Method and Description |
---|---|
boolean | hasLabels(): Returns true if this context contains any label candidates. |
boolean | hasWords(): Returns true if this context contains any words. |
char[] | intern(MutableCharArray chs): Return a unique char buffer representing a given character sequence. |
void | preprocessingFinished(): This method should be invoked after all preprocessing contributors have been executed to release temporary data structures. |
static int[] | toFieldIndexes(byte b): Convert the selected bits in a byte to an array of indexes. |
String | toString() |
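public final String query
Query used to perform processing, may be null.

public final LanguageModel language
Language model to be used.

public final PreprocessingContext.AllTokens allTokens
Information about all tokens of the input documents.

public final PreprocessingContext.AllFields allFields
Information about all fields processed for the input documents.

public final PreprocessingContext.AllWords allWords
Information about all unique words found in the input documents.

public final PreprocessingContext.AllStems allStems
Information about all unique stems found in the input documents.

public PreprocessingContext.AllPhrases allPhrases
Information about all frequently appearing sequences of words found in the input documents.

public final PreprocessingContext.AllLabels allLabels
Information about words and phrases that might be good cluster label candidates.

public PreprocessingContext(LanguageModel languageModel, List<Document> documents, String query)
Creates a preprocessing context for the provided documents and the provided languageModel.

public boolean hasWords()
Returns true if this context contains any words.

public boolean hasLabels()
Returns true if this context contains any label candidates.

public static int[] toFieldIndexes(byte b)
Convert the selected bits in a byte to an array of indexes.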
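
To illustrate what converting "the selected bits in a byte to an array of indexes" could look like, here is a minimal, self-contained sketch. It is not the library's implementation: the class name FieldIndexesSketch is hypothetical, and the choice of numbering bits from the least significant one (index 0) is an assumption made for this example.

```java
import java.util.Arrays;

// Hypothetical stand-in class; not part of the documented API.
class FieldIndexesSketch {
    /** Returns the positions of the bits set in b, assuming index 0 is the least significant bit. */
    static int[] toFieldIndexes(byte b) {
        int bits = b & 0xFF; // treat the byte as an unsigned 8-bit value
        int[] result = new int[Integer.bitCount(bits)];
        int next = 0;
        for (int i = 0; i < 8; i++) {
            if ((bits & (1 << i)) != 0) {
                result[next++] = i;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 0b0000_0101 has bits 0 and 2 set.
        System.out.println(Arrays.toString(toFieldIndexes((byte) 0b0000_0101))); // prints [0, 2]
    }
}
```

Under this reading, a byte acts as a compact set of up to eight field flags, and the returned array lists which of them are switched on.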
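
public void preprocessingFinished()
This method should be invoked after all preprocessing contributors have been executed to release temporary data structures.

public char[] intern(MutableCharArray chs)
Return a unique char buffer representing a given character sequence.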
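
The idea of returning one unique char buffer per character sequence can be sketched with a simple pool. This is only an illustration under stated assumptions: CharBufferPoolSketch is a hypothetical class, it accepts a plain CharSequence rather than MutableCharArray, and it uses a HashMap keyed by String, which may differ from how the library stores its buffers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in class; not part of the documented API.
class CharBufferPoolSketch {
    // Maps each distinct character sequence to its single shared buffer.
    private final Map<String, char[]> pool = new HashMap<>();

    /** Returns the same char[] instance for every sequence with equal content. */
    char[] intern(CharSequence chs) {
        // The first request for a given content creates the canonical buffer.
        return pool.computeIfAbsent(chs.toString(), String::toCharArray);
    }

    public static void main(String[] args) {
        CharBufferPoolSketch ctx = new CharBufferPoolSketch();
        char[] a = ctx.intern("cluster");
        char[] b = ctx.intern(new StringBuilder("clus").append("ter"));
        System.out.println(a == b); // prints true
    }
}
```

Sharing one buffer per distinct sequence lets callers compare buffers by reference and avoids keeping duplicate copies of the same text.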