57 Small Programs that Crash Compilers

It’s not clear how many people enjoy looking at programs that make compilers crash — but this post is for them (and me). Our paper on producing reduced test cases for compiler bugs contained a large table of results for crash bugs. Below are all of C-Reduce’s reduced programs for those bugs.

Can we conclude anything just by looking at these? It’s hard to say… many of these C fragments are not obviously hard to compile — to see the problem we would need to know the details of the translation to a particular compiler’s intermediate representation.

In general, we don’t know of any way to make these programs much smaller. In other words, C-Reduce already implements most of the tricks we can think of. It will always be the case that an experienced compiler developer who understands a particular bug will be able to produce considerably better reduced test cases than C-Reduce can. Our goal, rather, is to create tests that a naive user cannot improve very much. So if you see interesting opportunities to improve these test cases, we’d love to hear about them. The current version of C-Reduce does not implement optimizations such as constant folding, constant propagation, and loop peeling. We haven’t seen much need for these, though.
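For readers who have not met those three optimizations, here is a rough hand-written sketch of what they would do to a tiny fragment. The function names and the fragment itself are made up for illustration. A real reducer would only keep such a rewrite if the compiler still crashed afterwards, which this sketch cannot show; it only checks that the rewrites preserve the computed value.

```c
#include <assert.h>

/* Hypothetical fragment as a reducer might leave it. */
int original_case(void)
{
    int a = 2;
    int b = a * 3 + 4;          /* only constants feed into b */
    int sum = 0;
    for (int i = 0; i < 3; i++)
        sum += b + i;
    return sum;
}

/* Constant propagation replaces a with 2, then constant folding
   evaluates 2 * 3 + 4 to 10, so a and b disappear. */
int folded_case(void)
{
    int sum = 0;
    for (int i = 0; i < 3; i++)
        sum += 10 + i;
    return sum;
}

/* Loop peeling pulls the first iteration (i == 0) out of the loop. */
int peeled_case(void)
{
    int sum = 10 + 0;           /* peeled iteration */
    for (int i = 1; i < 3; i++)
        sum += 10 + i;
    return sum;
}

/* Sanity check that the hand transformations agree with the original. */
void check_variants(void)
{
    assert(folded_case() == original_case());
    assert(peeled_case() == original_case());
}
```

Calling `check_variants()` from a small `main` is enough to confirm the three versions agree. None of this is C-Reduce output.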

The reduced programs are verbatim tool output; there are definitely some formatting warts.

In general a company wants to know that a significant number of customers (or one big one) are affected by a bug before putting resources into fixing it.

The exception is when a particular engineer at a company gets ahold of Csmith (or a similar tool), runs it internally, and uses peer pressure or other mechanisms to get people to fix bugs. I’ve heard that this has happened at a couple of places, but not at Microsoft as far as I know.

This crashed the BASIC interpreter of the Wang 3300 computer (circa 1972), as the RETURN would pop the “next token” address off of a stack and not check to see if it was valid (it was 1 past the end of line 30). It would then try to interpret garbage and get lost in the weeds.

Wang Labs later created the Wang 2200 computer, which was popular for its time and purpose. The interpreter was a complete rewrite, as the 3300 was a general purpose computer and the 2200 was a microcoded machine dedicated to running BASIC. The 2200 interpreter suffered the exact same error. I always wondered if the same people wrote the two interpreters, or if it is just an obvious pothole to fall into.

Hi bcs, C5 looks like the only one where running a code formatting tool didn’t work (C-Reduce runs both GNU indent and astyle at various places).

Looking at C5, you can also see that the function renaming pass failed. My guess is that it’s a buffer overrun bug — these tend to be the ones that are sensitive to things like string length and whitespace.

I’m not sure why indent/astyle fail to produce nicely formatted test cases sometimes. I’m guessing they fail to do a full parse/unparse and rather try to remember some of the original formatting, which is a bad idea in this case.

When I was in college in the 80s I asked one of my profs who was talking about proving correctness in a program, “What if the compiler has a bug?” With a completely straight face he said, “Compilers don’t have bugs.” I think he believed it, too.

CSX321: You are correct in your assumption. My Former Bitch Supervisor From Hell(tm) used to claim that program bugs were like roaches. If you found one, there were at least nine others. All programs had bugs, she stated.
We encountered a situation where the Microsoft C compiler had generated the wrong machine code. Since she couldn’t read assembler, she thought we were BSing her and that the problem was with us. We tried to remind her of her dictum that all programs had bugs and this was one of them. She countered with “It’s a compiler, not a program.”
She did not buy it when we tried to tell her that a compiler was a program.
In the end we just found a different way to code the routine to avoid the bug.

Humberto, the question of what (if anything) we can conclude from these is a good one.

I think the answer is that we can learn a few simple things, but not a lot beyond that. We can conclude that usually, not a lot of code is needed to trigger a crash. We can conclude that loops, structs, and math are almost always present in crash-bug triggers, but non-loop control flow is rarely needed.