My coworkers presented a silly programming interview style question to
me the other day: given a list of words, find the largest set of words from
that list that all have the same hash value. Everyone was playing around
with a different language, and someone made the claim that it couldn't be done
efficiently in bash. Rising to the challenge, I rolled up my sleeves and
started playing around.

The first trick was to figure out how to write the hash function in bash.
bash has functions, but they can only return an exit status in the range 0-255.
There are a couple of different ways to work around that, but I opted to return the value
in a global variable. We also want to iterate through the letters of the word
and want to take great care not to invoke another process while doing so (so
while read letter; do math; done < <(grep -o . <<<$word) is out of the question).
Instead, we will use a for loop with bash expansions to iterate over each
character. Finally, we will use bash 4.0 associative arrays to map a letter to
its corresponding index (for computing hash values).

    # We will return into this variable.
    declare -i HASH_RESULT

    function kr1 {
        local word=$1
        HASH_RESULT=0
        for ((i=0; i < ${#word}; i++)); do
            local letter=${word:$i:1}
            ((HASH_RESULT += letter_value[$letter]))
        done
    }
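
The function relies on a letter_value array that is not shown in this excerpt. Here is a minimal, self-contained sketch of how it could be populated and used. The mapping (a=1 through z=26) is an assumption made for illustration, since the text only says each letter maps to "its corresponding index".

```bash
#!/usr/bin/env bash
# Requires bash 4.0+ for associative arrays.

# Assumption: letters map to 1-based alphabet positions (a=1 ... z=26).
declare -A letter_value
index=0
for letter in {a..z}; do
    letter_value[$letter]=$((++index))
done

# The hash is "returned" through this global, because a bash function's
# return status is limited to 0-255.
declare -i HASH_RESULT

function kr1 {
    local word=$1
    local i letter
    HASH_RESULT=0
    # Walk the word with substring expansion; no external process is spawned.
    for ((i = 0; i < ${#word}; i++)); do
        letter=${word:i:1}
        ((HASH_RESULT += letter_value[$letter]))
    done
}

kr1 "abc"
echo "abc hashes to $HASH_RESULT"
```

Because the result lives in a global, the caller has to read HASH_RESULT before the next call to kr1 overwrites it.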

Full program source below [1]. With the hash function implemented, it is fairly
straightforward to finish the rest of the program:
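
As a rough sketch of that remaining step (not the program from the footnote), grouping words by hash and keeping track of the biggest group might look like this. The names largest_group, BEST_HASH and BEST_WORDS, the sample words, and the a=1 through z=26 letter mapping are all illustrative assumptions.

```bash
#!/usr/bin/env bash
# Sketch only: group words by hash value and report the largest group.

# Assumption: a=1 ... z=26.
declare -A letter_value
index=0
for letter in {a..z}; do
    letter_value[$letter]=$((++index))
done

declare -i HASH_RESULT
function kr1 {
    local word=$1 i letter
    HASH_RESULT=0
    for ((i = 0; i < ${#word}; i++)); do
        letter=${word:i:1}
        ((HASH_RESULT += letter_value[$letter]))
    done
}

# largest_group WORD...
# Sets BEST_HASH and BEST_WORDS (hypothetical names) for the biggest group.
function largest_group {
    local -A groups=() counts=()
    local word
    local -i best_count=0
    BEST_HASH=-1
    BEST_WORDS=""
    for word in "$@"; do
        kr1 "$word"
        # Append the word to its hash bucket, space separated.
        groups[$HASH_RESULT]+="${groups[$HASH_RESULT]:+ }$word"
        counts[$HASH_RESULT]=$(( ${counts[$HASH_RESULT]:-0} + 1 ))
        # Strictly greater: on a tie, the group that got there first wins.
        if (( counts[$HASH_RESULT] > best_count )); then
            best_count=${counts[$HASH_RESULT]}
            BEST_HASH=$HASH_RESULT
        fi
    done
    BEST_WORDS=${groups[$BEST_HASH]}
}

largest_group abc ab cab ba bca dog
echo "hash $BEST_HASH: $BEST_WORDS"
```

Everything here stays inside the bash process: the buckets are associative arrays keyed by hash value, so no sort or uniq is needed.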

At this point it became interesting. My bash solution outperformed all the
other bash solutions by a fair margin, but I wanted to see if I could do better.
I ran it under a profiler and saw that it was spending all its time in many
nested layers of execute_command.

This gave me the idea to try inlining the function call. Quickly prototyping a
variation using an inlined function call, I ran some trials (and collected statistics
with my favorite tool, histogram.py [2]):
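
To make "inlining" concrete, here is a small sketch in which the body of the hash function is pasted directly into the loop over words, so no function is called per word. It is not the prototype I timed. The sample words, the hash_of array and the a=1 through z=26 mapping are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: the hash loop is inlined instead of calling a function.

# Assumption: a=1 ... z=26.
declare -A letter_value
index=0
for letter in {a..z}; do
    letter_value[$letter]=$((++index))
done

words=(abc cab dog)      # hypothetical sample input
declare -A hash_of       # hypothetical: word -> hash value

for word in "${words[@]}"; do
    # Inlined hash computation: same logic as the function, minus the call.
    hash=0
    for ((i = 0; i < ${#word}; i++)); do
        letter=${word:i:1}
        ((hash += letter_value[$letter]))
    done
    hash_of[$word]=$hash
done

for word in "${words[@]}"; do
    echo "$word ${hash_of[$word]}"
done
```

The trade-off is readability: the logic is identical, but the loop body now carries the hashing details that the function used to hide.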

At this point we run again under a profiler and notice something interesting: at the
first level where the runtime of an execute_command call isn't dominated by another
recursive call to execute_command, the function eval_arith_for_expr consumes
a large portion of the time.

Furthermore, we see that a large portion of the rest of the time is eventually spent
in expand_word_list_internal:

These observations lead us to another technique - we will use only single-character
variable names to try to optimize for these two functions. Running again with all of
these optimizations, we get a huge performance improvement:
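
For illustration, this is what an inlined hash loop looks like once every name is shortened to one character. It is a sketch rather than the final program, the sample words and the a=1 through z=26 mapping are assumptions, and whether short names help on your bash version is something to measure rather than take on faith.

```bash
#!/usr/bin/env bash
# Sketch only: inlined hash loop with single-character variable names.
# v: letter -> value, n: counter, w: word, h: hash, i: index, l: letter,
# r: word -> hash (all names are illustrative).

declare -A v
n=0
for l in {a..z}; do
    v[$l]=$((++n))       # assumption: a=1 ... z=26
done

declare -A r
for w in abc cab dog; do
    h=0
    for ((i = 0; i < ${#w}; i++)); do
        l=${w:i:1}
        ((h += v[$l]))
    done
    r[$w]=$h
done

echo "${r[abc]} ${r[cab]} ${r[dog]}"
```

The comment block at the top matters more than usual here, since the names no longer document themselves.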

We can take this further, but I think I'm going to quit here for now - I improved
performance by almost 50% by using a profiler and some bash-fu. Final program
below [3]. One final note -
for the love of all that is holy, don't write performant programs in bash!