Software and thoughts by Emmanuel Goossaert

Cache debugging in Python

While debugging code, one of the most important factors is speed of iteration. The faster the program can be run and either succeed or fail, the faster the code can be debugged. In most cases, one only waits a few seconds before the program fails and returns an error code or raises an exception, which then guides the debugging process. However, in some other cases, the program to debug has to go through a very long initialization phase, during which some data has to be created or preprocessed before the program can really start. It is not rare for such initialization phases to take minutes, if not hours. In those cases, since testing a bug fix requires restarting the program, and restarting the program can take hours, reaching the bug will take hours too! In the end, you only get a few debugging iterations in a day, which is definitely not a good way to get work done.

In this article, I present a technique for debugging such cases while programming in Python. I call it “cache debugging”.

Since the issue is the initialization phase with the preprocessing of the data, the obvious solution is to somehow cache the results of this step, and the next time the program starts, simply load them instead of recomputing them. Well, I have tried various things, including using the pickle module, using the shelve module, and even serializing the data with JSON. All these solutions ended up taking far too long to load and did not really speed up the iteration cycle. Then I did some research, and I stumbled upon this question on Stack Overflow. The selected answer, by Peter Lyons, is a real pearl.
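For comparison, the disk-caching approach mentioned above could be sketched as follows. This is only an illustration of the idea, not the code I actually used: the helper name `load_or_compute` and its parameters are hypothetical. With large data, the `pickle.load()` call is where the time goes.

```python
import os
import pickle


def load_or_compute(cache_path, compute):
    """Return the data cached in cache_path, or compute and cache it.

    `cache_path` and `compute` are hypothetical names for this sketch:
    `compute` is any zero-argument callable doing the slow initialization.
    """
    if os.path.exists(cache_path):
        # Cache hit: deserialize the result of a previous run.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Cache miss: run the slow initialization once, then store the result.
    data = compute()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
    return data
```

The data still has to be read from disk and deserialized on every start, which is why this does not help much when the preprocessed data is large.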

All that is needed is a wrapper, which performs the initialization phase, keeps the processed data in memory, and then uses Python’s reload() function (a built-in in Python 2, importlib.reload() in Python 3) to reload the module under test whenever necessary.

Mind = Blown.

I modified the code a bit to make it easier to use, and that gives me the following.
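A minimal sketch of such a wrapper is given here, written for Python 3 with importlib.reload(). The function name `debug_loop`, the `run` entry point, and the `wait` and `max_rounds` parameters are assumptions made for this sketch, and so is the `initialize()` function in the usage comment. They are not taken from Peter's answer.

```python
import importlib
import traceback


def debug_loop(module, data, entry_point="run", wait=input, max_rounds=None):
    """Run module.<entry_point>(data) again and again, reloading in between.

    `data` is built once by the caller and stays in memory across reloads,
    so the slow initialization is never repeated. All names here are
    hypothetical; `wait` and `max_rounds` exist mainly to allow testing.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        try:
            # Look the function up on each pass so the reloaded code is used.
            getattr(module, entry_point)(data)
        except Exception:
            # Show the error, but keep the wrapper (and the data) alive.
            traceback.print_exc()
        wait("Edit the module, then press Enter to reload and rerun... ")
        try:
            module = importlib.reload(module)
        except Exception:
            # For instance a SyntaxError in the edited file: report it and
            # keep the previously loaded code until the next round.
            traceback.print_exc()
    return module


# Possible contents of wrapper.py, assuming mainprogram.py defines
# initialize() and run(data) (both hypothetical names):
#
#     import mainprogram
#     data = mainprogram.initialize()   # slow, executed only once
#     debug_loop(mainprogram, data)
```

One thing to keep in mind with this design: the cached data is created by the old version of the module, so if the data consists of instances of classes defined in mainprogram.py, those instances keep referring to the old class definitions after a reload.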

So you just run the wrapper.py file, which loads the initialization data from mainprogram.py, and executes the method to test. Then, when a bug occurs, you get the exception message thanks to the traceback module. All you have to do at this point is edit your mainprogram.py file to fix the bug, and then press any key in the window that is running wrapper.py in order to reload the code and test your fix. This trick saved me a lot of time. Simple and elegant. Thank you, Peter!