Populating a collection

The data is gathered. For example, if it is a web collection, the web sites will be crawled to download all HTML files and other documents.

All "binary" documents are filtered to extract plain text. For example, PDF files will be processed to extract the text.

The documents will be indexed: word lists and other information will be processed into Funnelback indexes. The index is then used to answer user queries.
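As a rough illustration of what "word lists" means here, the following is a minimal sketch of an inverted index built from already-extracted text. It is not Funnelback's index format or API: the function names `build_index` and `search`, the sample documents, and the simple tokenizer are all assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


# Hypothetical documents, standing in for gathered and filtered content.
docs = {
    "a.html": "Funnelback crawls web sites",
    "b.pdf": "Text extracted from PDF files",
}
index = build_index(docs)
matches = search(index, "pdf text")
```

The point of the sketch is that queries are answered from the prebuilt word lists rather than by rescanning the documents, which is why indexing is done once, ahead of query time.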

All of this work occurs in an offline area to avoid disrupting the current live view, which is being used for query processing. If the update process completes successfully, the live and offline views are swapped, making the new indexes available for querying.
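The live/offline swap can be sketched as follows. This is an illustration of the general pattern only, not Funnelback's actual update code: the directory layout (`live` and `offline` folders under one collection folder), the `index.txt` file, and the helper names `run_update` and `swap_views` are assumptions made for this example.

```python
import os
import tempfile


def write_index(view_dir, content):
    """Write a stand-in index file into the given view directory."""
    with open(os.path.join(view_dir, "index.txt"), "w") as f:
        f.write(content)


def read_live(collection_dir):
    """Read the index that queries would currently be served from."""
    with open(os.path.join(collection_dir, "live", "index.txt")) as f:
        return f.read()


def swap_views(collection_dir):
    """Exchange the live and offline directories using a temporary name."""
    live = os.path.join(collection_dir, "live")
    offline = os.path.join(collection_dir, "offline")
    tmp = os.path.join(collection_dir, "swap_tmp")
    os.rename(live, tmp)
    os.rename(offline, live)
    os.rename(tmp, offline)


def run_update(collection_dir, build):
    """Build into the offline view; swap only if the build succeeds."""
    offline = os.path.join(collection_dir, "offline")
    try:
        build(offline)
    except Exception:
        return False  # the live view is left untouched
    swap_views(collection_dir)
    return True


# Hypothetical collection with an existing live index.
collection = tempfile.mkdtemp()
for view in ("live", "offline"):
    os.mkdir(os.path.join(collection, view))
write_index(os.path.join(collection, "live"), "old index")

ok = run_update(collection, lambda offline: write_index(offline, "new index"))
```

After a successful run, `read_live(collection)` returns the new index and the previous index sits in the offline directory. If `build` raises, nothing is swapped and queries keep using the old live view. Note that the three renames in this sketch are not atomic as a group; a real system would need to handle a failure between them.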

Manage collections

For details on how to manage Funnelback collections, see the following: