Doodle 0.6.5 review

Doodle is a tool to quickly search the documents on a computer. Doodle builds an index using meta-data contained in the documents and allows fast searches on the resulting database.

The Doodle project uses libextractor to obtain meta-data from various file formats. The database used by doodle is a suffix tree, which makes lookups fast. Doodle also supports approximate searches.

Here are some key features of "Doodle":
A web interface
Ordering of search results
Spidering (indexing the Internet or websites)

Using

First the doodle database needs to be created. The simplest way to create the database is to run doodle with the -b option on the directories that are to be indexed. For example:

$ doodle -b $HOME

This will create the doodle database under ~/.doodle.
After creating the doodle database, you can search it. For example:

$ doodle keyword

Full text search

You can achieve a (limited) form of full-text search with doodle. For that, the dictionary-based plaintext extractors from libextractor are used. To use them, pass the option -B LANG to doodle when building the database.

LANG is a two-letter language code that selects the dictionary. The languages currently available are en, es, fr, it and no. Words and sentences found in the dictionary for the selected language will then be added to the index.

While libextractor attempts to avoid full-text extraction for certain known binary formats, it may still find words in non-text files. Running with this option will dramatically increase the size of the index and the time it takes to build the index.

Note that changing the options used to build a database will not (!) result in doodle re-indexing files that were previously processed with other options. The only way to force doodle to re-index files with different options is to either touch the files (change the modification timestamp) or delete the old database and start from scratch.
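
The two ways of forcing a re-index can be sketched with standard shell tools. This is only a minimal sketch: the variable DOCS and the helper refresh_mtimes are hypothetical names introduced for illustration, and the database location is assumed to be the default ~/.doodle mentioned above. The doodle call itself is left as a comment, since the options to pass depend on your setup.

```shell
#!/bin/sh
# Hypothetical directory that was indexed earlier; adjust to your setup.
DOCS="${DOCS:-$HOME/Documents}"

# Option 1: update the modification timestamp of every regular file
# below the given directory, so the files look changed on the next build.
refresh_mtimes() {
    find "$1" -type f -exec touch {} +
}

# Usage (commented out so nothing is modified by accident):
# refresh_mtimes "$DOCS"

# Option 2: remove the old database and start from scratch.
# Assumes the default location ~/.doodle; check the path before deleting.
# rm -r "$HOME/.doodle"

# Afterwards, rebuild the database with doodle -b and the new options.
```

Touching files keeps the existing database but changes timestamps that other tools (backups, build systems) may rely on. Deleting the database leaves the files alone but means the whole tree is indexed again.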

What's New in This Release:
This release fixes one minor bug in which doodled did not react properly to signals.
It also eliminates database verification in the doodled startup phase, since it takes too long.
Out-of-process execution of libextractor is enabled because it is more resilient.
A translation to Swedish was added.