On constants in algorithmic complexity, and comparing apples to oranges

Can an O(n!) algorithm be faster than an O(1) algorithm? Yes, of course it can.

So I came across an interesting and almost trivial programming puzzle: given an arbitrary word and a dictionary of words, find all anagrams of the word present in the dictionary. At first I didn't
care to think about the best solution to the problem, and just wrote a trivial, brute-force solution as a proof of concept:
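The original listing is missing from this copy; reconstructed from the brute-force branch of "solution 5" shown later (with the function name following the ana5/ana7 pattern of the later listings), it would have looked roughly like this. For a self-contained snippet the dictionary is passed in as a set instead of being read from words.txt inline:

```python
from itertools import permutations

def ana1(target, words):
    # `words` is the dictionary as a set of strings; the original
    # presumably loaded it with set(line.strip() for line in open('words.txt'))
    result = set()
    # generate every permutation of the target's letters and keep
    # the ones that are actual dictionary words
    for perm in permutations(target):
        perm_str = ''.join(perm)
        if perm_str in words:
            result.add(perm_str)
    return list(result)
```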

Neat and straightforward... and with the obvious complexity of O(n!), where n is the length of the given word. This
complexity is a natural consequence of generating all permutations of the word and checking each one for its presence in the dictionary. It also has an O(m) memory space complexity, where m is the size of the
dictionary. I'm using the set type to hold the result here since permutations may be non-unique (e.g. transposing the first and second 'a' in the word 'anagram' yields the same string).

To illustrate what this means, a 5-letter word has 5! = 120 permutations to check. An 8-letter word has 40,320 and a 10-letter word has 3,628,800 permutations to check.

Ok... so this is basically bad. What can we do about it? How about turning the problem upside-down and, for each word in the dictionary, checking if it's an anagram of the target word. This is surely
an O(m) operation, where m is the number of words in the dictionary:
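That listing is also missing here; judging from the Counter-based branch of the later hybrid version, it was presumably something along these lines (again taking the dictionary as a parameter to keep the snippet self-contained):

```python
from collections import Counter

def ana2(target, words):
    result = []
    c = Counter(target)  # letter -> count signature of the target
    for word in words:
        # two words are anagrams iff their letter counts match
        if Counter(word) == c:
            result.append(word)
    return result
```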

Again, a clear and obvious solution. It makes a single pass through the dictionary, and if the size of the dictionary is constant, we may even stretch our estimation of the complexity and call this
an O(1) algorithm, as the number of operations in the loop doesn't depend significantly on the target word length. By the way, this is an obvious candidate for a list comprehension,
so I can write it more Pythonically as:
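The comprehension version presumably compressed the same loop into one expression, something like:

```python
from collections import Counter

def ana3(target, words):
    c = Counter(target)
    # same single pass over the dictionary, as a list comprehension
    return [word for word in words if Counter(word) == c]
```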

Both of these solutions are also essentially O(1) in memory space complexity. But... does any of this actually help? Maybe.

The thing is: if our target word is short, the total number of operations we execute in the O(n!) case is actually pretty low for a modern CPU's capabilities. Since in most cases the words we use actually
are short, this should be taken into account when solving this particular problem. To illustrate the point, here's a comparison of the number of operations depending on the target word length,
with an assumed dictionary size of 22,000 words:

word length    O(n!) solution    O(1) solution
5              120               22,000
8              40,320            22,000
10             3,628,800         22,000

The O(1) solution actually has a huge constant attached to it, since in every single case we will pass through all of the entries in the dictionary, no matter what. This fixed cost of every
iteration is the "constant" in the complexity estimation. Writing just "O(1)" doesn't tell the whole story. In practice, we're interested in the fact that this algorithm is simply slow,
even if it behaves nicely and executes in constant-bounded time. It's more informative to write C + O(1), where C, the constant cost per iteration (and this algorithm has only a "single" iteration),
is very large in this case. Here is the table of timings for a 5-letter target word and 100 iterations:

solution 1     1.7 s
solution 2    31.0 s
solution 3    30.7 s

Ouch. Our clever O(1) algorithm seems to be rather sluggish. Also note that in this case, using a list comprehension doesn't help us at all. On the other hand, our O(n!) algorithm is actually
K + O(n!), but its constant cost per iteration K is so low that we don't care: it basically stands for whatever operations are involved in comparing the two strings.

Ok, so let's be a little less dumb and realize that for two words to be anagrams, they need to have the same length, and adjust the algorithm accordingly:
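This listing is missing too; the length-check variant presumably matched the dictionary branch of "solution 5" below:

```python
from collections import Counter

def ana4(target, words):
    result = []
    c = Counter(target)
    for word in words:
        if len(word) != len(target):
            continue  # different lengths can never be anagrams; skip cheaply
        if Counter(word) == c:
            result.append(word)
    return result
```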

In this case, we have significantly reduced the constant cost of the average iteration, but the nature of the algorithm has remained the same.

Clearly there's a cutoff point at which the number of operations performed by the O(n!) solution becomes significantly lower than for the other ones, so we can pick and mix our algorithms for the respective
cases. To choose correctly, we need to look at both the word length and the dictionary size:

```python
from collections import Counter
from itertools import permutations
from math import factorial

def ana5(target):
    result = set()
    words = set([line.strip() for line in open('words.txt').readlines()])
    if factorial(len(target)) > len(words):
        # use the O(m) algorithm WRT the size of the dictionary
        c = Counter(target)
        for word in words:
            if len(word) != len(target):
                continue
            if Counter(word) == c:
                result.add(word)
    else:
        # use the O(n!) algorithm WRT the word length
        for perm in permutations(target):
            perm_str = ''.join(perm)
            if perm_str in words:
                result.add(perm_str)
    return list(result)
```

Call this "solution 5."

This gives us a nicely bounded worst-case performance, which is best illustrated by running the algorithm through a list of words of varying sizes, each of which has exactly two permutations in
the dictionary (the words in my example were chosen randomly):
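The measurements themselves didn't survive in this copy, but a small harness along these lines could produce them (the helper name is hypothetical, and it assumes the anaN functions take the dictionary as a parameter):

```python
import timeit

def bench(func, target, words, number=100):
    # time `number` calls of func(target, words) and return total seconds
    return timeit.timeit(lambda: func(target, words), number=number)
```

For example, bench(ana4, 'listen', words) would report the total time for 100 calls with that target word.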

So, yeah, programming puzzles are even more interesting to solve when performance is a consideration.

Update: By popular request, here's one way to do it "properly". The algorithms above are toys for illustrating a point about the importance of constants in
algorithm complexity, not actually something that should be used in production.

Here's how to do it better: index all the words in the dictionary (put them in a hash map) by their "signature", which records which letters
they are made of and how many times each letter occurs. Thus, the search for anagrams of the target word boils down to a single lookup in the
hash table:
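The indexing code ("solution 6") is missing from this copy; based on the explanation below and the "solution 7" listing, it likely built the index inside the function on every call, roughly like this (a caveat not in the original: on modern Python, Counter's repr breaks tied counts by insertion order, so two anagrams with tied letter counts can get different keys; ''.join(sorted(word)) is a more robust signature):

```python
from collections import Counter

def ana6(target, words):
    # rebuild the signature -> words index from scratch on every call;
    # this rebuild is the hugely expensive constant part
    cmap = {}
    for word in words:
        c = str(Counter(word))
        try:
            cmap[c].append(word)
        except KeyError:
            cmap[c] = [word]
    # a single lookup answers the query
    try:
        return cmap[str(Counter(target))]
    except KeyError:
        return []
```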

In the above code, I'm using str(Counter(...)) because Counter objects are not hashable (they cannot be keys in a dict) but their string representations, of course,
are. HOWEVER: this particular code has a monstrously expensive constant part while the map (cmap in the above code) is being built. It takes around 76 seconds to
do 100 iterations of the above solution! In real, production code, if at all possible, this word index would be created only once:

```python
from collections import Counter

# index all the words in the dictionary only once
cmap = {}
for line in open('words.txt').readlines():
    word = line.strip()
    c = str(Counter(word))
    try:
        cmap[c].append(word)
    except KeyError:
        cmap[c] = [word]

# use the index
def ana7(target):
    global cmap
    try:
        return cmap[str(Counter(target))]
    except KeyError:
        return []
```

We'll call this "solution 7" and, fittingly, it has a perfect O(1) performance with a trivial cost: 100 calls of ana7() with absolutely any argument take around 0.003 s on this
machine. It has an O(m) memory space complexity, where m is the size of the dictionary, and is a good example of the typical tradeoff between
performance and memory. If you actually need to find anagrams, use this algorithm, not the toy ones.