Menu

American Fuzzy Lop’ing Rust

In the last six months I’ve been all in on Rust. The work I’m doing at Postmates is low-level and finicky, which is my kind of thing. In particular, the software I’m constructing:

must have high throughput and tolerable latency (the system is high-scale but is not real-time),

must consume CPU and RAM in a tolerably deterministic fashion, relative to input requests,

must perform its computations with a high degree of certainty that the computation has been done correctly and

the system must not experience runtime crashes.

There’s a fair bit of wiggle room in there. The system is high-scale but not real-time so latency spikes are okay sometimes. Postmates’ really, really needs the system not to be a resource hog but regressions in that area can be accepted with the understanding that the regression is bounded. Correctness is important–we’re building an empirical total system model off this work–but not so important that the organization has invested in rigorous formal methods. Where there isn’t wiggle room is that last requirement. In this post I’m going to talk about how I use american fuzzy lop to demonstrate correctness for purpose with regard to runtime crashes against a Rust codebase.

It’s a neat tool!

Wait, bunnies?

American fuzzy lop is a security-oriented fuzzer that employs a novel type of compile-time instrumentation and genetic algorithms to automatically discover clean, interesting test cases that trigger new internal states in the targeted binary. This substantially improves the functional coverage for the fuzzed code. The compact synthesized corpora produced by the tool are also useful for seeding other, more labor- or resource-intensive testing regimes down the road.

What this means is that AFL is a tool that will:

chuck a bunch of random data at specially compiled program and seek out crashes or hangs and

be really good at finding interesting random data to chuck at said program.

As AFL chucks random data into your program its uses special instrumentation placed in the binary to determine when AFL has stumbled on a new control path. AFL will work to discover new paths and, in so doing, will end up exercising more of your program–likely enough–than you might have by careful thinking and manual experimentation alone. In C and C++ where memory accesses are something of a tightrope act this is a really important tool! Rust for sure cuts down on some of the balancing skill needed but it’s still open to runtime crashes because of integer overflows, unwraps that should “never happen” or judicious use of unsafe blocks that looked totally safe but it turns out actually were not.

I’m sold. Lead me to the promised land.

We’re going to need two things:

a library that’s been specially compiled with AFL instrumentation and

runner programs.

A “runner program” is a wee thing that reads AFL’s random nonsense from stdin, converts into structured random nonsense for your system under test and then feeds the one into the other. How do we get a compilation environment going and what does a runner program look like? Let’s get a motivating example going.

AFL can work without specially compiled programs but it is sloooooow. We’re going to need to compile our Rust code with AFL instrumentation built in. Fortunately frewsxcv has really kindly done all the hard work for us and made a special build of the Rust compiler to do what’s needed and made a docker container for this build of Rust. Per the setup documentation all that’s needed to get our environment going is to

> docker pull corey/afl.rs

and we’re in business. Let’s now clone quantiles:

> git@github.com:postmates/quantiles.git

Per the afl.rs tutorial you’ll need to make some changes to quantile’s Cargo.toml as well as a few small changes sprinkled here and there. From commit 12337344ff2e53c95ff4f1dfc38178ced1efa1b6 the diff you need to make happen is:

That done, we need a runner! Quantiles is a library only crate and we’ll create src/main.rs for our runner. (Later, when you have a library and executable crate, it’ll be sufficient to create a new executable in src/bin/.) The src/main.rs should be the following:

Here we’re parsing stdin into a vector of f64. The first element of the vector is the error bound for the CKMS[1], the second is the quantile to query once all insertions have been made and the remaining elements are values for insertion. AFL requires an initial valid seed to start its genetic mutations from so go ahead and do the following

> mkdir in/
> echo "+0.0001 0.5 2.2 -3.2" > in/basics

Now there’ll be something for AFL to start with. Enter the container we pulled earlier

After a little warm-up you'll be greeted by AFL's UI, which at first is inscrutable but gradually comes to be inscrutable and familiar.

While writing this article I realized that I could change the runner and discover paths more quickly inside quantiles. Look at the "findings in depth" sub-UI. Turns out, quantiles still suffered some crashes as of da81258737046ba65fa52ec4a827238872db8459! All the crash inputs that AFL discovers go into out/crashes/ as per the -o flag given above. Cracking one open I find:

To be honest, I wasn't sure what to make of it when I first saw this. Reasoning about the crash inputs it struck me that the control columns were all suspiciously close to zero or actually zero. Time to run the input through the runner:

Well look at that! Line 173 is here. Tracing back up to where insert_threshold is defined and doing the math it turns out that when the input error is ridiculously small as above the insert_threshold will be quite large, 7575757575757576000000000000000000000000000000000000000000000000000 here. Oops! On line 84 this value is chopped down with as usize and that ends up truncating to 0! To resolve this, I adjusted the minimum error code in CKMS::new to look like so:

The minimum error bound a CKMS user can request is 1 in 1,000,000,000. This is an arbitrary lower bound and the most correct would be one which takes into account the value of std::usize::MAX. The PR for this change is here, which is now included in quantiles 0.3.1.

Now the question comes up: how long should I let AFL run? Answer: a long time! The AFL process works through different heuristics to generate inputs and doing so is not fast. It is possible to perform parallel fuzzing to reduce the run time, but the more control paths you have in your program the longer AFL will need to run to discover them. An overnight run for smallish programs–like quantiles–is more than enough to give me confidence. For much larger systems I set up a server to continually run AFL and periodically check in on it. Your milage may vary.

That sure was a lot of fiddling!

It was! AFL in Rust is still a relatively new thing so it is for sure fiddly. There's an open issue to bring fuzzing more inline with the way that quickcheck works, which would simply the code somewhat, but would still leave the need for a special compilation toolchain in place. There is work being done on LLVM specific fuzzing at the libFuzzer project and when that bubbles up into rust-land it'll be a good time.

For now, all the fiddling is well worth it. Rust doesn't suffer the same odd memory access problems that C/C++ do but there are still opportunities for computation ending goofs to occur. AFL is real good at finding those goofs!

A CKMS is a structure to answer quantile queries in limited space. That is, the CKMS is able to throw inputs away but only so long as tossing inputs does not deviate the quantile query from its true value by more than a pre-configured error boundary. ↩︎