DAWG

String data in a DAWG (Directed Acyclic Word Graph) may take
200x less memory than in a standard Python dict or list, while
the raw lookup speed is comparable. A DAWG may even be faster than
the built-in dict for some operations. It also provides fast
advanced methods such as prefix search.
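
As a rough illustration of what prefix search returns, the sketch below finds every stored key that starts with a given prefix. It is pure Python and uses a sorted list with `bisect` rather than a DAWG, so it says nothing about how this package implements the operation; the function name `keys_with_prefix` is hypothetical and not part of the package's API.

```python
from bisect import bisect_left


def keys_with_prefix(sorted_keys, prefix):
    """Return all keys in sorted_keys (a sorted list of str) that start with prefix."""
    # Keys sharing a prefix are contiguous in sorted order, so
    # binary-search for the first candidate and scan forward.
    start = bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        result.append(key)
    return result


words = sorted(["bar", "foo", "foobar", "food", "fox"])
print(keys_with_prefix(words, "foo"))  # ['foo', 'foobar', 'food']
```

A sorted list keeps every key in full, whereas a DAWG shares common prefixes and suffixes between keys, which is where the memory savings described above come from.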

If you find a bug in the C++ part, please report it to the original
bug tracker.

How the source code is organized

There are 4 folders in the repository:

bench - benchmarks & benchmark data;

lib - the original unmodified dawgdic C++ library and
a customized version of the libb64 library. They are bundled
for easier distribution; if something has to be fixed in these
libraries, consider fixing it in the original repositories;

Authors & Contributors

License

Wrapper code is licensed under the MIT License.
The bundled dawgdic C++ library is licensed under the BSD license.
libb64 is in the Public Domain.

0.3.2 (2012-09-24)

prefixes method for finding all prefixes of a given key.
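
To make the semantics concrete, here is a minimal pure-Python sketch of finding all stored keys that are prefixes of a given key. It walks a plain nested-dict trie rather than a real DAWG, and the names `build_trie`, `END` and this standalone `prefixes` function are hypothetical illustrations, not the wrapper's implementation.

```python
END = ""  # marker key; never collides with a single-character edge label


def build_trie(keys):
    """Build a nested-dict trie; END marks nodes where a complete key ends."""
    root = {}
    for key in keys:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def prefixes(trie, key):
    """Return every stored key that is a prefix of `key`, shortest first."""
    found = []
    node = trie
    if END in node:  # the empty string was stored as a key
        found.append("")
    for i, ch in enumerate(key):
        node = node.get(ch)
        if node is None:  # no stored key continues along this path
            break
        if END in node:
            found.append(key[: i + 1])
    return found


trie = build_trie(["f", "foo", "foobar", "bar"])
print(prefixes(trie, "foobarz"))  # ['f', 'foo', 'foobar']
```

The walk visits at most one node per character of the query, so the cost depends on the length of the key rather than on the number of stored keys.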

0.3.1 (2012-09-20)

bundled dawgdic C++ library is updated to the latest version.

0.3 (2012-09-13)

similar_keys, similar_items and similar_item_values methods
for more permissive lookups (they may be useful e.g. for umlaut handling);

load method returns self;

Python 3.3 support.
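
The idea behind the permissive lookups in this release can be sketched in pure Python: given a table of allowed character replacements, generate the variants of a key and keep those that are actually stored. This is only an assumption-laden illustration. The standalone function `similar_keys` below, its signature and the single-character `replaces` mapping are hypothetical and do not reproduce the package's API or its algorithm.

```python
def similar_keys(keys, key, replaces):
    """Return variants of `key` present in `keys`.

    `replaces` maps a character to one alternative character that may
    optionally be substituted for it (hypothetical format).
    """
    variants = [""]
    for ch in key:
        options = [ch]
        if ch in replaces:
            options.append(replaces[ch])
        # Extend every partial variant with every allowed character.
        variants = [v + o for v in variants for o in options]
    return [v for v in variants if v in keys]


stored = {"uber", "über", "ubar"}
print(similar_keys(stored, "uber", {"u": "ü"}))  # ['uber', 'über']
```

This brute-force version enumerates every variant before filtering, which grows exponentially with the number of replaceable characters; it is meant to show the expected results, not an efficient approach.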

0.2 (2012-09-08)

Greatly improved memory usage for DAWGs loaded with load method.

There is currently a bug somewhere in the wrapper, so DAWGs loaded with
the read() method and unpickled DAWGs use 3x-4x the memory of DAWGs
loaded with the load() method. load() is fixed in this release, but
the other methods are not.