Compilers: can’t live with ’em, can’t live without ’em, at least not if you write code for a living. Compilers are great at taking your hand-crafted, human-readable program, translating it into machine code and, in the process, optimizing it so it runs as efficiently as possible. Sometimes, though, as new research from MIT points out, compilers go too far in their zeal to optimize and remove code that they shouldn’t, which can make the system or application more vulnerable.

Four researchers in MIT’s Computer Science and Artificial Intelligence Laboratory, in a paper to be presented next week at the ACM Symposium on Operating Systems Principles, looked at the problem of optimization-unstable code: code that a compiler removes because it involves undefined behavior. Undefined behavior covers operations for which the language specification imposes no requirements, such as dividing by zero, dereferencing a null pointer and overflowing a buffer. Compiler writers are free to handle undefined behavior however they wish. In some cases, they assume it never happens and eliminate the affected code completely, which can lead to vulnerabilities if that code contains security checks.
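To make the idea concrete, here is a minimal sketch of the kind of pattern being described. It is a hypothetical illustration, not code taken from the paper: the `struct device` type and both function names are invented for this example. In the first function the pointer is dereferenced before it is checked, so a compiler is entitled to conclude the pointer can never be null and drop the check. The second function shows the same logic with the check placed first.

```c
#include <stddef.h>

/* Hypothetical type, invented for this illustration. */
struct device {
    int flags;
};

/* Unstable: dev is dereferenced before the null check. Dereferencing a
 * null pointer is undefined behavior, so an optimizing compiler may
 * assume dev is non-null and remove the check below entirely. */
int get_flags_unstable(struct device *dev)
{
    int flags = dev->flags;   /* undefined behavior if dev is NULL */
    if (dev == NULL)          /* a compiler may delete this check */
        return -1;
    return flags;
}

/* Stable: the check comes before any dereference, so the compiler
 * has no grounds to remove it. */
int get_flags_checked(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

Whether a particular compiler actually removes the check in the first function depends on the compiler, its version and the optimization level, which is part of what makes such code hard to spot by testing alone.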

The MIT researchers studied a dozen common C/C++ compilers to see how they dealt with undefined behavior. They found that, over time, compilers are becoming more aggressive about such code, more often simply removing it, even at default or low optimization levels. Since C/C++ is fairly liberal about allowing undefined behavior, programs written in it are more susceptible to subtle bugs and security threats as a result of unstable code.

"As compilers improve their optimizations, for example, by implementing new algorithms… or by exploiting undefined behavior from more constructs (e.g., library functions), we anticipate an increase in bugs due to unstable code."

The good news is the researchers have developed a model and a static checker for identifying unstable code. Their checker is called STACK, and it currently works for checking C/C++ code. The idea is that it will warn programmers about unstable code in their applications, so they can fix it, rather than have the compiler simply leave it out. They also hope it will encourage compiler writers to rethink how they can optimize code in more secure ways.
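As an illustration of the kind of fix such a warning might prompt, consider a bounds check on a buffer. This is a sketch under stated assumptions, not output from STACK: the function names and parameters are hypothetical, and the stable version assumes `buf` and `buf_end` point into the same array with `buf <= buf_end`. The first version tries to detect wraparound with `buf + len < buf`, but forming a pointer past the end of the array is itself undefined, so a compiler may treat that test as always false and drop it. The second version compares lengths instead and never forms an out-of-range pointer.

```c
#include <stddef.h>

/* Unstable: if len is huge, buf + len points outside the array, which is
 * undefined behavior. A compiler may therefore assume buf + len >= buf
 * always holds and remove the second comparison. */
int out_of_bounds_unstable(const char *buf, const char *buf_end, size_t len)
{
    return buf + len >= buf_end || buf + len < buf;
}

/* Stable: compare len against the space remaining in the buffer.
 * Assumes buf and buf_end point into the same array and buf <= buf_end,
 * so the subtraction is well defined and non-negative. */
int out_of_bounds_stable(const char *buf, const char *buf_end, size_t len)
{
    return len >= (size_t)(buf_end - buf);
}
```

For a 16-byte buffer, the stable version accepts a length of 4 and rejects lengths of 16 or more, including values large enough that the pointer arithmetic in the first version would wrap.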

STACK was run against a number of systems written in C/C++ and found 160 new bugs in them, including in the Linux kernel (32 bugs found), Mozilla (3), Postgres (9) and Python (5). The researchers also found that, of the 8,575 packages in the Debian Wheezy archive that contained C/C++ code, STACK detected at least one instance of unstable code in 3,471 of them, which, as they write, “suggests that unstable code is a widespread problem.”