@Bindable public abstract class MultipageSearchEngine extends SearchEngineBase
A base class for IDocumentSources wrapping external search engines with remote or network-based interfaces. This class implements helper methods for concurrent querying of search services that limit the number of search results returned in one request.
See Also: SimpleSearchEngine
Modifier and Type | Class and Description |
---|---|
protected class | MultipageSearchEngine.SearchEngineResponseCallable: An implementation of Callable that increments page request count statistics before the actual search is made. |
static class | MultipageSearchEngine.SearchMode: Search mode for data source components that implement parallel requests to some search service. |
protected static class | MultipageSearchEngine.SearchRange: A single result window to fetch. |
Modifier and Type | Field and Description |
---|---|
MultipageSearchEngine.SearchMode | searchMode: Search mode defines how fetchers returned from createFetcher(org.carrot2.source.MultipageSearchEngine.SearchRange) are called. |
Fields inherited from class SearchEngineBase: compressed, documents, POSTPROCESSING, query, results, resultsTotal, SERVICE, start, statistics
Constructor and Description |
---|
MultipageSearchEngine() |
Modifier and Type | Method and Description |
---|---|
protected void | collectDocuments(Collection<Document> collector, SearchEngineResponse[] responses): Collects documents from an array of search engine's responses. |
protected abstract Callable<SearchEngineResponse> | createFetcher(MultipageSearchEngine.SearchRange bucket): Subclasses should override this method and return a Callable instance that fetches search results in the given range. |
protected void | process(MultipageSearchEngineMetadata metadata, ExecutorService executor): Run a request against the search engine's API, setting documents to the set of returned documents. |
protected SearchEngineResponse[] | runQuery(String query, int start, int results, MultipageSearchEngineMetadata metadata, ExecutorService executor): This method implements the logic of querying a typical search engine. |
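Methods inherited from class SearchEngineBase: afterFetch, clean, urlEncode
Methods inherited from class ProcessingComponentBase: afterProcessing, beforeProcessing, dispose, getContext, getSharedExecutor, init, process
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface IProcessingComponent: afterProcessing, beforeProcessing, dispose, init, process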
@Processing @Input @Attribute(key="search-mode") @Level(value=ADVANCED) @Label(value="Search Mode") @Group(value="Data source paging") public MultipageSearchEngine.SearchMode searchMode
Search mode defines how fetchers returned from createFetcher(org.carrot2.source.MultipageSearchEngine.SearchRange) are called.
See Also: MultipageSearchEngine.SearchMode
protected void process(MultipageSearchEngineMetadata metadata, ExecutorService executor) throws ProcessingException
Run a request against the search engine's API, setting documents to the set of returned documents.
Throws: ProcessingException
protected abstract Callable<SearchEngineResponse> createFetcher(MultipageSearchEngine.SearchRange bucket)
Subclasses should override this method and return a Callable instance that fetches search results in the given range.
Note the query (if any is required) should be passed at the concrete class level. We are not concerned with it here.
Parameters: bucket - The search range to fetch.
protected final void collectDocuments(Collection<Document> collector, SearchEngineResponse[] responses)
Collects documents from an array of search engine's responses.
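The createFetcher and collectDocuments contracts described above can be illustrated with a minimal, standalone sketch. It does not use the Carrot2 classes: `FetcherSketch`, `Range`, `Response`, and `fetchAll` are hypothetical stand-ins, documents are plain strings, and the "fetch" only fabricates identifiers instead of calling a remote service. It is meant to show the shape of the pattern (query held by the concrete class, one Callable per result window, responses merged afterwards), not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical, standalone stand-ins; none of these types come from Carrot2.
class FetcherSketch {
    /** Stand-in for MultipageSearchEngine.SearchRange: one result window. */
    static final class Range {
        final int start;
        final int results;
        Range(int start, int results) { this.start = start; this.results = results; }
    }

    /** Stand-in for SearchEngineResponse: the documents of one page. */
    static final class Response {
        final List<String> documents;
        Response(List<String> documents) { this.documents = documents; }
    }

    /** The query is kept at the concrete class level, not passed with the range. */
    private final String query;

    FetcherSketch(String query) { this.query = query; }

    /** Returns a Callable producing the given window; a real fetcher would call the remote service here. */
    Callable<Response> createFetcher(Range bucket) {
        return () -> {
            List<String> page = new ArrayList<>();
            for (int i = bucket.start; i < bucket.start + bucket.results; i++) {
                page.add(query + "#" + i); // fake document identifier
            }
            return new Response(page);
        };
    }

    /** Appends the documents of every non-null response to the collector, in array order. */
    static void collectDocuments(Collection<String> collector, Response[] responses) {
        for (Response response : responses) {
            if (response != null) {
                collector.addAll(response.documents);
            }
        }
    }

    /** Calls one fetcher per range, one after another, then merges the pages. */
    static List<String> fetchAll(FetcherSketch engine, Range... ranges) {
        Response[] responses = new Response[ranges.length];
        for (int i = 0; i < ranges.length; i++) {
            try {
                responses[i] = engine.createFetcher(ranges[i]).call();
            } catch (Exception e) {
                throw new IllegalStateException("Fetching a page failed", e);
            }
        }
        List<String> collector = new ArrayList<>();
        collectDocuments(collector, responses);
        return collector;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(new FetcherSketch("q"), new Range(0, 2), new Range(2, 1)));
    }
}
```

Skipping null responses in the sketch is a design assumption made here so that a missing page does not abort the merge; the source does not say how the real method treats them.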
protected final SearchEngineResponse[] runQuery(String query, int start, int results, MultipageSearchEngineMetadata metadata, ExecutorService executor) throws ProcessingException
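This method implements the logic of querying a typical search engine. Page requests are run through the provided ExecutorService.
Throws: ProcessingException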
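As a rough illustration of querying a service that caps the number of results per request, the sketch below splits a requested window into page-sized ranges and fetches them concurrently on an ExecutorService. Everything in it is an assumption made for the example: `PagedQuerySketch`, `splitIntoRanges`, `fakeFetch`, and the `pageSize` parameter are hypothetical (the real method takes a MultipageSearchEngineMetadata, whose contents are not described here), and the behaviour of the different search modes is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical, standalone sketch; it does not use or reproduce Carrot2 code.
class PagedQuerySketch {
    /** Splits [start, start + results) into windows of at most pageSize; each int[] is {start, count}. */
    static List<int[]> splitIntoRanges(int start, int results, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int offset = start; offset < start + results; offset += pageSize) {
            ranges.add(new int[] { offset, Math.min(pageSize, start + results - offset) });
        }
        return ranges;
    }

    /** Stand-in for a remote call: fabricates one document identifier per requested result. */
    static List<String> fakeFetch(String query, int start, int count) {
        List<String> page = new ArrayList<>();
        for (int i = start; i < start + count; i++) {
            page.add(query + "#" + i);
        }
        return page;
    }

    /** Runs one fetcher per window on the executor and returns the pages in request order. */
    static List<List<String>> runQuery(String query, int start, int results,
                                       int pageSize, ExecutorService executor) {
        List<Callable<List<String>>> fetchers = new ArrayList<>();
        for (int[] range : splitIntoRanges(start, results, pageSize)) {
            fetchers.add(() -> fakeFetch(query, range[0], range[1]));
        }
        List<List<String>> pages = new ArrayList<>();
        try {
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<List<String>> future : executor.invokeAll(fetchers)) {
                pages.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A page request failed", e.getCause());
        }
        return pages;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            System.out.println(runQuery("q", 10, 5, 2, executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

The sketch wraps failures in an unchecked IllegalStateException only to stay self-contained; the real method is declared to throw ProcessingException.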