Debugging compilers with optimization fuel

Today I would like to describe how I pin down compiler bugs, specifically bugs tickled by optimizations, using a neat feature of Hoopl called optimization fuel. Unfortunately, this isn’t a particularly Googleable term, so hopefully this post will get some Google juice too. Optimization fuel was originally introduced by David Whalley in 1994 in the paper “Automatic isolation of compiler errors”. The basic idea is that the number of optimizations the compiler performs can be capped by a budget, the fuel. When we suspect the optimizer is misbehaving, we binary search for the maximum amount of fuel we can give the compiler before it introduces the bug. We can then inspect the offending optimization and fix the bug. Optimization fuel is a feature of the new code generator, and is only available if you pass -fuse-new-codegen to GHC.
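To make the idea concrete, here is a minimal sketch in Python rather than anything taken from GHC or Hoopl. Every name in it (`optimize`, `max_safe_fuel`, `works_with`, the toy rewrites) is hypothetical, and the "program" is just an integer whose value stands in for its meaning. It assumes the bug is monotonic: once enough fuel is supplied to trigger it, more fuel never hides it again.

```python
from typing import Callable, List

# Hypothetical stand-in for one optimization step on a toy "program".
Rewrite = Callable[[int], int]


def optimize(program: int, rewrites: List[Rewrite], fuel: int) -> int:
    """Apply rewrites in order, spending one unit of fuel per rewrite."""
    for rewrite in rewrites:
        if fuel == 0:
            break  # out of fuel: the remaining rewrites are skipped
        program = rewrite(program)
        fuel -= 1
    return program


def max_safe_fuel(works_with: Callable[[int], bool], upper: int) -> int:
    """Largest fuel value for which the result is still correct.

    Assumes works_with(0) is True and works_with(upper) is False.
    """
    lo, hi = 0, upper  # invariant: works_with(lo) and not works_with(hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works_with(mid):
            lo = mid
        else:
            hi = mid
    return lo


def sound(program: int) -> int:
    return program  # preserves the toy program's meaning


def unsound(program: int) -> int:
    return program + 1  # silently changes the meaning: the "bug"


# Made-up pipeline: the faulty rewrite is the 710th one to run.
rewrites = [sound] * 709 + [unsound] + [sound] * 290


def works_with(fuel: int) -> bool:
    return optimize(42, rewrites, fuel) == 42


print(max_safe_fuel(works_with, len(rewrites)))
```

In a real session `works_with` would rebuild and test the program with the given fuel limit, which is far more expensive than this toy check, so the logarithmic number of probes is what makes the search practical.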

The bug

The bug shows up when I attempt to build GHC itself with the new code generator. Building GHC is a great way to ferret out bugs, since it contains so much code that it manages to cover a lot of cases:

Viewing the culprit

How do we convince GHC to tell us what optimization it did with the 710th bit of fuel? My favorite method is to dump out the optimized C-- from both runs, and then do a diff. We can dump the C-- to a file using -ddump-opt-cmm -ddump-to-file, and then diff reveals:

Seems not: the variable is used in MO_S_Rem_W32, which is no good. We conclude that the bug is in an optimization pass, rather than in the register allocator failing to handle a case that our optimization is now tickling.

Fixing the bug

With this information, we can also extract the program fragment that caused this bug:

This looks like it should be an unobjectionable case of dead assignment elimination coupled with liveness analysis, but for some reason, the backwards facts are not being propagated properly. In fact, the problem is that I attempted to optimize the Hoopl dataflow function, and got it wrong. (Fixpoint analysis is tricky!) After reverting my changes, the unsound optimization goes away. Phew.
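For readers who have not met the technique, the following is a rough Python sketch of dead assignment elimination driven by a backwards liveness fact over a single straight-line block. It is not Hoopl's API and not the GHC pass: the `Instr` representation, the function name, and the example block are all made up for illustration, and it assumes assignments have no side effects.

```python
from typing import List, Set, Tuple

# Hypothetical instruction form: (assigned variable, variables it reads).
Instr = Tuple[str, Tuple[str, ...]]


def eliminate_dead_assignments(block: List[Instr],
                               live_out: Set[str]) -> List[Instr]:
    """Drop assignments whose result is never read afterwards.

    live_out is the backwards fact flowing in from the block's
    successors: the variables still needed after the block ends.
    """
    live = set(live_out)
    kept: List[Instr] = []
    for dest, uses in reversed(block):  # walk the block backwards
        if dest in live:
            live.discard(dest)  # this assignment defines dest...
            live.update(uses)   # ...and needs its operands live before it
            kept.append((dest, uses))
        # otherwise dest is dead here, so the assignment is dropped
    kept.reverse()
    return kept


block = [
    ("a", ("x",)),
    ("b", ("a",)),
    ("c", ("x",)),  # c is never read, so this one is genuinely dead
    ("r", ("b",)),
]

print(eliminate_dead_assignments(block, {"r"}))
print(eliminate_dead_assignments(block, set()))
```

The second call shows why the incoming fact matters: if the liveness information from later code fails to arrive (here, an empty `live_out`), every assignment looks dead and is deleted, including ones that later code still reads.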

7 Responses to “Debugging compilers with optimization fuel”

Do you have a sense for how much of the time a fuel-based compiler debugging session goes awry because the search finds an optimization that merely enables the buggy one, instead of actually being the buggy one?

I did an experiment where I did a binary search on GCC versions in order to find the first one that introduced a given bug. This turned out to be not such a great idea because very often, the result of the binary search contained an incidental change that exposed a previously introduced bug.

John: I’ve done serious fuel-based debugging several times and have never had it lead me astray, but the sample size is rather small. As nominolo points out, if essential “optimizations” are skipped, fuel will cause the program to stop working (this was the case for the new code generator before I made essential optimizations fuel-invariant).

Even if a particular optimization only unmasks an earlier introduced bug, it can still be useful. In particular, optimizations in one block usually affect only that block, so I can massively narrow down the search space.

nominolo: Oh really? At the very least, the C-- passes are the only ones that refer to it as “fuel.”

In the case of a panic, which GHC itself can detect is a bug, it is awfully nice to have GHC do the binary search itself. We did this in Quick C-- and it was a great boon. (It was a little more useful for us because our compiler driver and regression-test driver were the same driver, so we could isolate any bug-inducing fault, not only panic-inducing faults.)

Yeah. I think it would not be too difficult to add this feature to our test drivers, but I’m not really sure how I would put it into GHC itself (our options passing machinery is not exactly modular, though you could probably make it work?)