Returns the top N matching results sorted by score, using block-quality
optimizations to skip blocks of documents that can’t contribute to the top
N. The whoosh.searching.Searcher.search() method uses this type of
collector by default or when you specify a limit.

UnlimitedCollector

Returns all matching results sorted by score. The
whoosh.searching.Searcher.search() method uses this type of collector
when you specify limit=None or you specify a limit equal to or greater
than the number of documents in the searcher.

SortingCollector

Returns all matching results sorted by a whoosh.sorting.Facet
object. The whoosh.searching.Searcher.search() method uses this type
of collector when you use the sortedby parameter.

Here’s an example of a simple collector that, instead of remembering the
matched documents, just counts the number of matches:
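A minimal sketch of such a counting collector follows. To keep it runnable without an index, it subclasses a hypothetical StandInCollector that only mimics the prepare() and collect() hooks described in this section; in real code you would presumably subclass the library's own Collector class and pass the collector to search_with_collector() instead of driving it by hand.

```python
class StandInCollector:
    """Hypothetical stand-in for the collector base class (not Whoosh code)."""

    def prepare(self, top_searcher, q, context):
        # The base class records these attributes for subclasses to use.
        self.top_searcher = top_searcher
        self.q = q
        self.context = context

    def collect(self, sub_docnum):
        raise NotImplementedError


class CountingCollector(StandInCollector):
    def prepare(self, top_searcher, q, context):
        # Always call the superclass's prepare() before doing set-up work.
        super().prepare(top_searcher, q, context)
        self.count = 0

    def collect(self, sub_docnum):
        # Count the match instead of remembering it; no sorting key is needed.
        self.count += 1
        return None


# Drive the collector by hand with made-up matches (assumption: a real
# searcher would call prepare() and collect() for us).
counter = CountingCollector()
counter.prepare(top_searcher=None, q="hypothetical query", context=None)
for sub_docnum in [0, 3, 7]:
    counter.collect(sub_docnum)
print(counter.count)
```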

There are also several wrapping collectors that extend or modify the
functionality of other collectors. The whoosh.searching.Searcher.search()
method uses many of these when you specify various parameters.

NOTE: collectors are not designed to be reentrant or thread-safe. It is
generally a good idea to create a new collector for each search.

This method is called for every matched document. It should do the
work of adding a matched document to the results, and it should return
an object to use as a “sorting key” for the given document (such as the
document’s score, a key generated by a facet, or just None). Subclasses
must implement this method.

If you want the score for the current document, use
self.matcher.score().

Overriding methods should add the current document offset
(self.offset) to the sub_docnum to get the top-level document
number for the matching document to add to results.

Parameters:

sub_docnum – the document number of the current match within the
current sub-searcher. You must add self.offset to this number
to get the document’s top-level document number.
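The offset arithmetic can be illustrated with a small stand-alone sketch. DocnumCollector and the segments list are hypothetical names used only for illustration; the sketch assumes each sub-searcher numbers its documents from zero and that the offset is set before its matches are collected.

```python
class DocnumCollector:
    """Hypothetical collector that records top-level document numbers."""

    def __init__(self):
        self.offset = 0    # offset of the current sub-searcher
        self.docnums = []  # top-level document numbers collected so far

    def collect(self, sub_docnum):
        # sub_docnum is relative to the current sub-searcher, so add the
        # offset to get the top-level document number.
        self.docnums.append(self.offset + sub_docnum)
        return None  # this sketch does not produce a sorting key


collector = DocnumCollector()
# Two made-up sub-searchers as (offset, matching sub_docnums) pairs.
segments = [(0, [1, 4]), (100, [0, 4])]
for offset, sub_matches in segments:
    collector.offset = offset
    for sub_docnum in sub_matches:
        collector.collect(sub_docnum)
print(collector.docnums)  # [1, 4, 100, 104]
```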

This method calls Collector.matches() and then, for each
matched document, calls Collector.collect(). Subclasses that
want to intervene between finding matches and adding them to the
collection (for example, to filter out certain documents) can override
this method.
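As a rough illustration of that kind of override, the sketch below filters out excluded documents between matching and collecting. FilteringCollector, excluded_docnums, and current_matches are hypothetical; only the method names matches(), collect(), and collect_matches() come from the description above, and the matching itself is faked with plain lists.

```python
class FilteringCollector:
    """Hypothetical collector; the matching logic is a stand-in."""

    def __init__(self, excluded_docnums):
        self.excluded = set(excluded_docnums)  # top-level docnums to drop
        self.offset = 0
        self.current_matches = []
        self.collected = []

    def matches(self):
        # Stand-in for the real match iterator of the current sub-searcher.
        return iter(self.current_matches)

    def collect(self, sub_docnum):
        self.collected.append(self.offset + sub_docnum)

    def collect_matches(self):
        # Intervene between finding a match and collecting it.
        for sub_docnum in self.matches():
            if self.offset + sub_docnum in self.excluded:
                continue
            self.collect(sub_docnum)


collector = FilteringCollector(excluded_docnums={2, 101})
for offset, sub_matches in [(0, [1, 2, 3]), (100, [0, 1])]:
    collector.offset = offset
    collector.current_matches = sub_matches
    collector.collect_matches()
print(collector.collected)  # [1, 3, 100]
```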

Returns True if the collector naturally computes the exact number of
matching documents. Collectors that use block optimizations will return
False since they might skip blocks containing matching documents.

Note that if this method returns False you can still call count(),
but it means that method might have to do more work to calculate the
number of matching documents.
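To see why block optimizations make the count inexact, consider the following simplified sketch of top-N collection with block skipping. It is not the library's implementation: the function name and data layout are assumptions, and the per-block maximum is computed on the fly here, whereas a real index would presumably store a precomputed block quality.

```python
import heapq


def top_n_with_block_skipping(blocks, n):
    """Return (top n (docnum, score) pairs, number of documents examined)."""
    heap = []  # min-heap of (score, docnum); heap[0] is the weakest kept hit
    seen = 0
    for block in blocks:
        # Assumption: the best score in a block is known without reading it.
        max_score = max(score for _, score in block)
        if len(heap) == n and max_score <= heap[0][0]:
            continue  # nothing in this block can enter the top n
        for docnum, score in block:
            seen += 1
            if len(heap) < n:
                heapq.heappush(heap, (score, docnum))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, docnum))
    ranked = sorted(heap, reverse=True)
    return [(docnum, score) for score, docnum in ranked], seen


blocks = [
    [(0, 0.9), (1, 0.8)],
    [(2, 0.3), (3, 0.2)],   # skipped: its best score cannot beat 0.8
    [(4, 0.95), (5, 0.1)],
]
top, seen = top_n_with_block_skipping(blocks, n=2)
total_matches = sum(len(block) for block in blocks)
print(top)                  # [(4, 0.95), (0, 0.9)]
print(seen, total_matches)  # 4 6
```

Because the second block is never read, the collector has only examined four of the six matching documents, so an exact count would require extra work.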

Subclasses can override this to perform set-up work, but
they should still call the superclass’s method because it sets several
necessary attributes on the collector object:

self.top_searcher

The top-level searcher.

self.q

The query object.

self.context

context.needs_current controls whether a wrapping collector
requires that this collector’s matcher be in a valid state at every
call to collect(). If this is False, the collector is free
to use faster methods that don’t necessarily keep the matcher
updated, such as matcher.all_ids().

Returns a sorting key for the current match. This should return the
same value returned by Collector.collect(), but without the side
effect of adding the current document to the results.

If the collector has been prepared with context.needs_current=True,
this method can use self.matcher to get information, for example
the score. Otherwise, it should only use the provided sub_docnum,
since the matcher may be in an inconsistent state.

keyfacet – a whoosh.sorting.Facet to use for collapsing.
All but the top N documents that share a key will be eliminated
from the results.

limit – the maximum number of documents to keep for each key.

order – an optional whoosh.sorting.Facet to use
to determine the “top” document(s) to keep when collapsing. The
default (order=None) uses the results order (e.g. the
highest score in a scored search).
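The effect of keyfacet and limit can be sketched as a post-processing step over results that are already in their final order. This is only an illustration of the semantics, not how the collector works internally; collapse(), key_of, and the sample data are hypothetical.

```python
from collections import defaultdict


def collapse(ordered_docnums, key_of, limit=1):
    """Keep at most `limit` documents per key, preserving result order."""
    kept_per_key = defaultdict(int)
    kept = []
    for docnum in ordered_docnums:
        key = key_of(docnum)
        if kept_per_key[key] < limit:
            kept_per_key[key] += 1
            kept.append(docnum)
    return kept


# Made-up results, best first, with a made-up collapsing key per document.
ordered = [10, 11, 12, 13]
keys = {10: "a", 11: "a", 12: "b", 13: "a"}

top_one = collapse(ordered, keys.get, limit=1)
top_two = collapse(ordered, keys.get, limit=2)
print(top_one)  # [10, 12]
print(top_two)  # [10, 11, 12]
```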

A collector that raises a TimeLimit exception if the search
does not complete within a certain number of seconds:

    from whoosh import collectors

    # mysearcher and myquery are assumed to exist already
    uc = collectors.UnlimitedCollector()
    tlc = collectors.TimeLimitedCollector(uc, timelimit=5.8)
    try:
        mysearcher.search_with_collector(myquery, tlc)
    except collectors.TimeLimit:
        print("The search ran out of time!")

    # We can still get partial results from the collector
    print(tlc.results())

IMPORTANT: On Unix systems (systems where signal.SIGALRM is defined), the
code uses signals to stop searching immediately when the time limit is
reached. Windows does not support this functionality, so there the
search only checks the elapsed time between found documents. If a matcher
is slow, the search can therefore exceed the time limit.

Parameters:

child – the collector to wrap.

timelimit – the maximum amount of time (in seconds) to
allow for searching. If the search takes longer than this, it will
raise a TimeLimit exception.

greedy – if True, the collector will finish adding the most
recent hit before raising the TimeLimit exception.

use_alarm – if True (the default), the collector will try to
use signal.SIGALRM (on UNIX).
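The polling behaviour (checking the time between found documents) and the greedy option can be sketched without any index at all. The classes below are hypothetical stand-ins, not the library's code, and a fake clock is injected so the outcome is deterministic.

```python
import time


class TimeLimit(Exception):
    """Stand-in for the exception raised when the time limit is exceeded."""


class PollingTimeLimiter:
    """Hypothetical sketch: checks elapsed time once per found document."""

    def __init__(self, timelimit, greedy=False, clock=time.monotonic):
        self.timelimit = timelimit
        self.greedy = greedy
        self.clock = clock
        self.results = []

    def run(self, hits):
        start = self.clock()
        for hit in hits:
            expired = self.clock() - start > self.timelimit
            if expired and not self.greedy:
                raise TimeLimit()
            self.results.append(hit)
            if expired:
                # Greedy: the most recent hit was added before raising.
                raise TimeLimit()


def fake_clock(ticks):
    # Returns a clock function that replays the given timestamps in order.
    it = iter(ticks)
    return lambda: next(it)


outcomes = {}
for greedy in (False, True):
    limiter = PollingTimeLimiter(5, greedy=greedy,
                                 clock=fake_clock([0, 1, 2, 7]))
    try:
        limiter.run(["a", "b", "c", "d"])
    except TimeLimit:
        pass
    # Partial results remain available after the exception.
    outcomes[greedy] = limiter.results
print(outcomes)  # {False: ['a', 'b'], True: ['a', 'b', 'c']}
```

Note that the check only happens when a hit arrives, which is why a slow matcher can overrun the limit in this mode.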