Routing words to /dev/null

Menu

After covering the GADDAG Data Structure it seems like a good idea to follow up on the structure it’s based on, the Trie. The Trie is a very special case structure where that’s optimized for prefix searches and was first proposed in 1960 by Edward Fredkin, which has the advantage of using less memory than a GADDAG but has the downside where it’s only efficient for one purpose. It just so happens though that purpose is very common on websites: autocompletion.

A Trie for any one word looks like a linked list, as seen here:

No matter how complex a Trie gets, every word will only appear once and it will appear like it does above somewhere in the tree.

Based on the example above, if you wanted to know every word that contained the prefix “CA” you can simply walk to the node “C” from the root, then down to “A”, and then crawl the rest of the tree to grab all the words that a hanging off of that node.

A Trie has a very fast prefix search time of O(M), where M is the number of letters in the prefix, and it has return time of O(M+N) where N is the number of nodes hanging off of that prefix. This makes the absolute worst case time for a Trie search O(N) where N is the number of nodes in the Trie, which happens when you search for an empty string as the prefix.

The worst case time can be avoided in situations like autocompletes where you will likely not return every possible word hanging off the the prefix, but you will instead only return the top N results. This is shown in the code below, and drastically reduces the time needed to return when dealing with large tries and short prefixes, for example when you want to autocomplete against a dictionary and the user just entered their first keystroke.

Here is a sample of a Trie implemented in .NET 4.5, I’m licensing it under the MIT License so it is free to use however you wish. This implementation does not include optimizations like using binary arrays to store characters or node compression (which would turn this into a radix tree), it was written to be simple to follow for those that want to walk through the operation of a Trie.