Tuesday, December 06, 2016

Undefined Behaviour in C

Summary: I tripped over undefined behaviour in C. It's annoying.

I've recently been writing some C code to parse XML quickly. While working on that project, I inadvertently wrote some code which is undefined according to the C language standard. The code compiled and ran fine using Visual Studio, but under gcc (even at -O0) it corrupted memory, sometimes leading to a segfault, but usually just leading to a wrong answer. The code in question was (see full code at GitHub):

d->nodes.nodes[0].nodes = parse_content(d);

To give some context, d is a structure that contains various pieces of state - what the string to be parsed is, how much we have parsed, along with a pointer to the output nodes. The parse_content function parses the bit inside an XML tag, returning the indicies in nodes which it used.

The complication comes from nodes not being a fixed size, but dynamically resized if the number of nodes exceeds the capacity. For big documents that means parse_content will reallocate d->nodes.nodes.

According to the C spec, the compiler can evaluate the LHS and RHS of an assignment in any order. Since gcc computes the location of d->nodes.nodes[0] before calling parse_content it uses the address of the node before reallocation. After reallocation the address will have changed, and the assignment will be made to the wrong location.

I spotted the bug by inserting printf statements, and in doing so, I had to rewrite the code to:

str content = parse_content(d);
d->nodes.nodes[0].nodes = content;

That fixes the issue, since now the evaluation order is strictly defined. As a simplified example of the same issue:

Here the line array[0] = f() might assign to either the result of malloc(0) or malloc(42), at the compilers discretion.

I manually checked if I had made any other such mistakes, and I couldn't find any. Naturally, I wanted to find a static checker that could detect such a mistake, so I tried a bunch of them. I wasn't very successful:

greg, just tried it with both clang and gcc address sanitisers and no luck. The problem is that in truth I allocate a shared buffer that stores the document structure, a small number of attributes, and a small number of nodes, so that small parse trees require only one allocation. If the nodes grow, then I allocate some space just for some nodes, and if they grow more, I repeat that process. It means the initial small nodes buffer remains valid on reallocation, it just happens to point at somewhere I no longer care about, and the address sanitiser is perfectly happy.

Did you try FlexeLint? I pasted your simplified example (after adding #include <stdlib.h>) into http://gimpel-online.com//cgi-bin/genPage.py?srcFile=diy.c&cgiScript=analyseCode.py&title=Do-It-Yourself+Example+%28C%29&intro=This+example+allows+you+to+specify+your+own+C+code.&compilerOption=online32.lnt&includeOption={{quotedIncludeOption}} , and it gave me a warning: "Warning 591: Variable 'array' depends on the order of evaluation; it is modified through function 'f(void)' via calls: f()"

Try running your debug binary under Valgrind, it'll immediately tell you which line is offending. I treat Valgrind like a last-resort antibiotics, as using it too often can develop lazy habits. In the world of C/C++ you have to be on your toes all the time. 60% of your brain should be spared for proving that what you're doing at a given moment is not UB.

FWIW, http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html Three part blog entry, What Every C Programmer Should Know About Undefined Behavior by LLVM founder Chris Lattner. C is a vey scary language.

Actually, the root cause of your issue is unspecified behavior, where the C standard specifies that the order of evaluation of the LHS with respect to the RHS is not specified. The undefined behavior then happens in only one of the two cases, which might explain why it is difficult for the static analysis tools to spot it.

A fast parser with a C interface for Haskell -- almost perfect excuse to try out Rust IMO. I would refrain from asking for a rewrite in Rust tho since the project is already published. But honestly, check it out at https://www.rust-lang.org , and maybe try it out on another system-level project.