20120504

Using memoization to speed up your code

Memoization is a technique to remember the output of a pure function, in the mathematical sense. That is, for a given input it will always produce the same output. By remembering, you can serve the output right away if you have seen the input before and you may omit (costly) computation steps.

This post has an emphasis on Perl, since that is a language I use quite often at the time, but in fact the technique is applicable to programming in general.

As you can see, this is a very ineffecient approach, because the subtree of fib(3) appears twice in this call graph. Imagine what it would look like for fib(100), where fib(3) is computed billions and billions of times, again and again. So the idea with memoization is to compute fib(3) once, and serve its outcome immediately for every future call to fib(3). You could call it a cache, but since this is a specific caching technique on a subroutine level I prefer to stick with the term memoization.

And I'm sure you can find an implementation for any other programming language. I found this interesting article on how to accomplish memoization in Javascript.

To me, the Perl version excels because of its ultimate simplicity. One call to memoize() and your running time is only 2% of what it was before (this actually happened to me). A script with an exponential running time might turn into something that looks linear (or at least less exponential).

To demonstrate the effects on the fib function, consider the following numbers. n is the argument passed to fib, as defined above. The two columns show the time spent by the CPU (in user space, not wall-clock time). You can see the dramatic improvements by adding a simple line to your program:

n

No memoization (s)

Memoization (s)

10

0.00

0.01

20

0.01

0.01

30

1.24

0.01

40

147

0.01

50

18485

0.01

The results speak for themselves. The case n=10 looks a bit strange, going from 0.00 to 0.01 seconds. This is due to the overhead of loading the Memoize module, which takes a bit of time. But for larger n this cost is totally worth it, from 5 hours down to almost nothing.

This may look impressive, but there's no such thing as a free lunch. As always with optimization there's a trade-off between time (CPU cycles) and space (memory). Memoization has little use when you hit the limits of your RAM because you chose to memoize every single function that moves. So you'd better make a wise choice for which functions you want to memoize.

Not all functions are suitable for memoization, and you can find a list of caveats in the Perl documentation. Although it's a Perl site, the caveats are actually language-independent. Summarized, these are:

the behavior only depends on its own parameters;

there are no side-effects;

the output is not modified by the caller;

the function is not too simple.

And even when a function passes these criteria, it doesn't automatically mean that it's suitable for memoization. You should ask the following three questions:

How often is this function called?

What is the size of the input domain for this function?

What is the output size for this function?

Remembering the output of a function which is called once during the execution of a script is not very useful, since there is exactly zero reuse. But even a function which is called multiple times might not be useful. And how do you know how often a function is called? Simple: by measurement. Run your code through a profiler. For Perl I can recommend the NYTProf profiler:

$ perl -d:NYTProf fibonacci.pl
$ nytprofhtml

From the HTML it generates you can easily check for each function how often it was executed. But the number of calls is just one figure, you also need to know the domain of your input parameters. For example, if you know that one parameter has 400 possible values and the other one 2, that makes 800 possible inputs. If you observe that a function is called 4000 times, you know that there are 5 calls per input on average. In that case it might be worthwhile to memoize. If it is less than 800 times, you should investigate the distribution of your inputs. In case there is a lot of inequality then you're likely to win some time by memoizing the function. Contrarily, if (almost) all inputs are unique it makes no sense to waste precious memory.

Additionally, you should also take the actual size of the input and output into consideration, because this is what will end up in memory. After all, a boolean occupies less RAM than a huge array of data. If the output is a large data structure, you should consider to memoize smaller parts of the solution and use these parts to assemble the final solution for each call. For example, suppose you have a function which returns a list of Fibonacci numbers, starting from the mth number until the nth, it is often sufficient to only store the output of fib and build the list on every call.

To conclude, applying memoization has the potential to bring you huge performance benefits with minimal effort. But it's not a silver bullet, you should be aware under which conditions it is applicable and be aware of the trade-offs you're about to make.