Developing Software in a Hostile Environment

Ted Unangst

tedu@openbsd.org

intro

What's this talk about and who is it for? I'll begin by referencing a series
of talks that Theo has given over the years regarding exploit mitigation. What
it is, how it works, why you want it. Usually the focus is on stopping the bad
guys.

I noticed something a bit funny about the exploit mitigation material. On the
one hand, it's very technical, how exploits work and are mitigated. On the
other hand, the people most interested in exploit mitigation are more likely
to be users, stuck running software they don't trust. I wanted to look at
things from a different viewpoint. How can we help the good guys?

This is a talk about developing software. My examples are all going to come
from OpenBSD, but that doesn't mean I'm only talking to OpenBSD developers.
Actually my target audience is everyone who develops software that does, or
may, run on OpenBSD. In this case, OpenBSD is the hostile environment. If
you're only now discovering that the internet is a hostile environment, you're
about 20 years late to the party.

What makes OpenBSD a hostile environment? It doesn't always conform to
expectations and it certainly doesn't condone many mistakes. Developers talk a
lot about standards (the C standard, the POSIX standard), but there are also
real world de facto standards and assumptions. Let's challenge some of those
assumptions and push the boundaries of the standard. A strictly conforming
program will continue to run just as it should, but a program that takes
shortcuts will quickly find itself in trouble.

Everybody loves secure software, but as we've maintained for some time, secure
software is simply a subset of correct software.

outline

I'm going to start off by discussing a few features that someone developing
software on OpenBSD should know about. Then I'm going to discuss some of the
theory behind these features, and how your design decisions can affect your
ability to detect bugs. Along the way, I'll mix in a few example bugs that
were identified and fixed.

philosophy

Whenever we've added new exploit mitigations to OpenBSD, something always
seems to stop working. Always. This makes the development of such features
very exciting, of course. Did I break it, or was it always broken?

From a high level, my philosophy is that instability today leads to stability
tomorrow. The sooner we can break it, the sooner we can fix it. Everybody will
tell you the same thing, that it's better to find bugs in development than
in production. True of course, but that doesn't mean you won't have production
bugs. Earlier is better there too. There is an unfortunate mindset that once
something ships, we play it safe and become very conservative. But this only
delays the inevitable, and it makes me very uncomfortable. Debugging a live
system running code from two years ago is much more difficult than debugging a
live system running code from last month. You didn't avoid the bug; you've
only made it harder to deal with.

malloc.conf

Let's turn to some specifics.

I'm guessing that malloc.conf is the most popular, familiar feature I'm going
to talk about. It's a good place to start, with the user visible interface to
malloc, and then we can move along from there to how allocators work.

All BSDs support the malloc.conf feature, which made its appearance in
phkmalloc. It was subsequently retained in jemalloc and ottomalloc, although
with some divergence in the available options. The internal behavior of malloc
can be controlled by creating a symlink named malloc.conf. The letters in the
name of the symlink target enable or disable various options.
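
As a rough illustration of the letter scheme, here is a sketch of how an
option string could be parsed. The struct, its field names, and
parse_malloc_options() are hypothetical and not taken from any real malloc.
The sketch assumes the convention that an uppercase letter turns an option on
and the matching lowercase letter turns it off, and it only knows the letters
mentioned in this talk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option set; the field names are invented for this sketch. */
struct malloc_opts {
	bool junk;	/* J */
	bool guard;	/* G */
	bool opt_f;	/* F */
	bool opt_u;	/* U */
};

/*
 * Walk the option string one letter at a time. Assumed convention:
 * uppercase enables an option, lowercase disables it. Letters this
 * sketch does not know about are ignored.
 */
void
parse_malloc_options(const char *s, struct malloc_opts *o)
{
	assert(o != NULL);
	if (s == NULL)
		return;
	for (; *s != '\0'; s++) {
		switch (*s) {
		case 'J': o->junk = true; break;
		case 'j': o->junk = false; break;
		case 'G': o->guard = true; break;
		case 'g': o->guard = false; break;
		case 'F': o->opt_f = true; break;
		case 'f': o->opt_f = false; break;
		case 'U': o->opt_u = true; break;
		case 'u': o->opt_u = false; break;
		default: break;
		}
	}
}
```

The string itself would be the symlink target, read with something like
readlink(2). Because later letters override earlier ones, several sources of
options can simply be parsed in order.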

The J for junk option is the one I'd like to focus on. When enabled, this
option prefills allocated memory with a non zero pattern and overwrites freed
memory as well. This catches two different classes of bugs.

First, many programs fail to completely initialize heap objects. Many malloc
implementations, at least for a while after program startup, will return zero
filled memory for allocations because that's what malloc gets from the kernel.
Much like an uninitialized stack variable, you'll probably get lucky until you
don't. At some point, malloc will switch to returning previously used memory,
which won't be zero, and the bug manifests. Better to catch it on the first
allocation.

Second, many use after free bugs rely on the memory remaining unchanged for
some time after free. Until the memory is recycled, the program can perhaps
continue using it without consequence. By immediately overwriting the memory,
we can trigger erroneous behavior.

There's no guarantee that junking memory will flush out a bug. However, we
can hope that the junked memory is sufficiently atypical that it causes
observable deviations.
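
To make the idea concrete, here is a minimal sketch of junking done in a
wrapper around malloc and free. This is not how any real malloc implements
the J option. The fill bytes, the junk_* functions, and the struct conn
example are all assumptions made up for illustration. The constructor has a
deliberate bug: it never sets the flags field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define JUNK_ALLOC 0xdb	/* arbitrary non zero fill for new memory */
#define JUNK_FREE  0xdf	/* arbitrary, different fill for freed memory */

/*
 * Byte fill through a volatile pointer so the compiler keeps the
 * stores even when the memory is freed right afterwards.
 */
static void
junk_fill(void *p, unsigned char c, size_t n)
{
	volatile unsigned char *b = p;

	assert(p != NULL);
	while (n-- > 0)
		*b++ = c;
}

/* Allocate and prefill with junk instead of whatever was there. */
void *
junk_malloc(size_t size)
{
	void *p = malloc(size);

	if (p != NULL)
		junk_fill(p, JUNK_ALLOC, size);
	return p;
}

/* Overwrite, then free. The caller passes the size; no metadata is kept. */
void
junk_free(void *p, size_t size)
{
	if (p == NULL)
		return;
	junk_fill(p, JUNK_FREE, size);
	free(p);
}

#define CONN_KNOWN_FLAGS 0x3	/* hypothetical: only two flag bits exist */

struct conn {
	int fd;
	int flags;
};

/* Buggy constructor: sets fd but forgets flags. */
struct conn *
conn_new(int fd)
{
	struct conn *c = junk_malloc(sizeof(*c));

	if (c != NULL)
		c->fd = fd;
	return c;
}

/* True only if no unknown flag bits are set. */
bool
conn_flags_valid(const struct conn *c)
{
	return (c->flags & ~CONN_KNOWN_FLAGS) == 0;
}
```

With zero filled memory, conn_flags_valid() would happily return true for a
fresh object and the missing initialization would go unnoticed. With the junk
fill, the very first object already fails the check.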

The man page for malloc.conf unfortunately downplays how significant and
helpful it can be. It's not just for testing or debugging, and despite
whatever warnings you may find in the man page, running with J in production
isn't such a bad idea. If the program behaves, it won't hurt. There may be a
performance hit, true, but it may be acceptable. If it does hurt, you probably
don't want to be running that program in production, regardless of what malloc
options are in use.

A short while ago, I changed the default on OpenBSD to always junk small
memory after free. We didn't necessarily want to impose a penalty on all code,
particularly considering that some memory may still be unmapped zero on demand
pages, so we restricted it to small chunks. And use after free bugs are
probably more dangerous and more pervasive than uninitialized memory, so we
focused the effort there.

Lots of OpenBSD users do run with malloc options enabled, so many of the bugs
it catches have already been caught, but occasionally some slip through. When
we started junking memory by default after free, the postgres ruby gem stopped
working. You can always find more bugs by conscripting more testers.

There is (or was) a related option, Z for zero. It basically does the
opposite, and always zeros newly malloced memory. jemalloc still has this
option, but I removed it from OpenBSD because it seems like a crutch for
poorly written applications. I don't want to help bad programs run; I want to
stop them from running.

Some other options that may be interesting are the F and U options, also
designed to help track down use after free bugs. By unmapping the freelist
whenever possible, they can trigger segfaults when freed memory is accessed.

The G option enables guard pages. Unlike the others, I don't think this option
is necessarily a good trade off. By default, the kernel will return randomized
and well separated pages with unmapped regions between them. The G option
enforces this behavior from the userland side, but it adds several system
calls to some important code paths and is mostly redundant. If you're looking
to conserve performance, skip this one.

poison

Another term for junking memory is poisoning. This is the term used in the
OpenBSD kernel for instance. A little more on the theory behind it and some
other uses. malloc is a general purpose allocator that deals in memory, but
others like pool, uma, zone, slab, etc. deal in objects. That's a better way
to think about what we're doing. Every object has a lifetime. It's allocated,
used, and then freed (destroyed). Lots of bugs result when the code using an
object fails to respect the lifetime. We'd like an enforcement mechanism.
Enter poison.

Poisoning an object can be as simple as overwriting the memory with a simple
pattern, but it can also be considerably more complex. The fixed pattern can
be selected (crafted) for maximum invalidity. For example, we might want to
overwrite pointers with a value that we know is not mapped. Find a hole in the
address space and use that as the fill value. Use a few different patterns. Bugs
have a tendency to adapt to whatever fill patterns you use.

Theo and I had been discussing the possibility that the poison values in use
might be conveniently aligned with harmless flag values. As an experiment, I
inverted the bit patterns used. It wasn't long before the smoke started
pouring out. In the OpenBSD kernel, we alternate between two values.
Unfortunately, the two values happen to be quite similar, and this allowed
some bugs to escape. The function to establish interrupts on i386 failed to
initialize a flags argument. This was mostly harmless because the default fill
value from malloc didn't set any interesting flags. When the bit values were
inverted, the MP safe flag was set. Marking an interrupt handler as MP safe
when it is not quickly leads to trouble.

Another thing one can do, and the kernel does this even though userland malloc
does not, is to check that the poison values are still in place. This can
detect some use after free writes. When code frees an object, it's immediately
poisoned. If buggy code later changes the object, that will erase some of the
poison. When the allocator decides to recycle the object, it checks that all
the poison is in place. If not, panic.
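
The check can be sketched with a toy fixed size pool. This is a simplified
model and not the kernel's pool code. The names, the single poison value, and
the four slot layout are assumptions for illustration, and abort() stands in
for panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_SLOTS 4
#define OBJ_WORDS  8
#define POISON     0xdeadbeefU	/* arbitrary pattern for this sketch */

struct obj {
	uint32_t words[OBJ_WORDS];
};

struct pool {
	struct obj slots[POOL_SLOTS];
	bool is_free[POOL_SLOTS];
};

/* Fill every word of the object with the poison pattern. */
void
obj_poison(struct obj *o)
{
	size_t i;

	for (i = 0; i < OBJ_WORDS; i++)
		o->words[i] = POISON;
}

/* True if nobody has written to the object since it was poisoned. */
bool
obj_poison_intact(const struct obj *o)
{
	size_t i;

	for (i = 0; i < OBJ_WORDS; i++)
		if (o->words[i] != POISON)
			return false;
	return true;
}

/* All slots start out free and poisoned. */
void
pool_init(struct pool *pp)
{
	size_t i;

	for (i = 0; i < POOL_SLOTS; i++) {
		obj_poison(&pp->slots[i]);
		pp->is_free[i] = true;
	}
}

/* Hand out the first free slot, but only after verifying its poison. */
struct obj *
pool_get(struct pool *pp)
{
	size_t i;

	for (i = 0; i < POOL_SLOTS; i++) {
		if (!pp->is_free[i])
			continue;
		if (!obj_poison_intact(&pp->slots[i]))
			abort();	/* stand-in for panic */
		pp->is_free[i] = false;
		return &pp->slots[i];
	}
	return NULL;
}

/* Poison immediately on free; the assert also catches double frees. */
void
pool_put(struct pool *pp, struct obj *o)
{
	size_t i = (size_t)(o - pp->slots);

	assert(i < POOL_SLOTS && !pp->is_free[i]);
	obj_poison(o);
	pp->is_free[i] = true;
}
```

A stale pointer that writes to a freed slot flips obj_poison_intact() to
false, and the next pool_get() that reaches that slot aborts. Reads through a
stale pointer are not detected by this check; they only see the poison.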

recycling

Any allocator, be it malloc or pool or uma, has a recycling policy. When an
object is freed, it is returned to the free list where it waits for a
subsequent allocation. The free list need not actually be a list, but
regardless of the data structure used, there will be either an explicit or
implicit policy regarding the selection of the next object.

One common policy is fast recycle. Last in, first out. The most recent object
freed is the next object allocated. This is great for performance because the
object is probably already in cache. Unfortunately, it's not so great for
detecting bugs. When the object is recycled, it will be reinitialized. All the
poison is washed off. Any dangling pointers to the old freed object will
instead see the new object. But since the new object is a perfectly legitimate
object, the buggy code will continue running. Usually until something goes
really wrong. From a security standpoint, this is also troublesome because
it's predictable. If the attacker can control the contents of either the new
or old object, they have some control over the other as well. That's always
the case, but fast recycle makes it easy to control the interleavings of
malloc and free as well, to guarantee that old and new objects overlap.

The opposite policy would be slow recycle. First in, first out. Or last in,
last out. Or LRU. The buffer cache uses slow recycle. This is great for
detecting bugs. The longer an object remains on the free list, the more
opportunities you have to check that the poison is intact.
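
The two policies differ only in which end of the free list the next
allocation is taken from. A small sketch, using integer ids in place of real
objects; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FL_MAX 16

/* A ring buffer of freed object ids, oldest at head. */
struct freelist {
	int ids[FL_MAX];
	size_t head;
	size_t count;
};

void
fl_init(struct freelist *fl)
{
	fl->head = 0;
	fl->count = 0;
}

/* Free: append the id at the newest end. */
void
fl_put(struct freelist *fl, int id)
{
	assert(fl->count < FL_MAX);
	fl->ids[(fl->head + fl->count) % FL_MAX] = id;
	fl->count++;
}

/* Fast recycle: hand back the most recently freed id. */
int
fl_get_lifo(struct freelist *fl)
{
	assert(fl->count > 0);
	fl->count--;
	return fl->ids[(fl->head + fl->count) % FL_MAX];
}

/* Slow recycle: hand back the id that has waited the longest. */
int
fl_get_fifo(struct freelist *fl)
{
	int id;

	assert(fl->count > 0);
	id = fl->ids[fl->head];
	fl->head = (fl->head + 1) % FL_MAX;
	fl->count--;
	return id;
}
```

After freeing 1, 2, 3 in that order, fl_get_lifo() returns 3 right away,
while fl_get_fifo() returns 1 and leaves 3 sitting on the list, poisoned,
until everything older has been reused.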

There's also indeterminate recycling, although that may be a poor name. What I
mean is, it's not immediately obvious what policy is in effect. For instance,
malloc and pool both operate on a current working page. Freed objects will be
returned to whatever page they came from, but allocated objects always come
from the current page. So for some objects, not on the current page, this is
slow recycling. But for some objects, it's fast recycling. For example, even
though userland malloc selects a chunk at random, if you allocate the last
object in a page, then free it, then malloc again, you'll get the same object.
We addressed this in both pool and malloc by occasionally reselecting a random
current page.

Finally, there's random recycling. You can deliberately try to recycle
objects randomly. In OpenBSD, we do this by keeping a stash of recently freed
objects. Whenever something is freed, a random index into the stash is
selected. The newly freed object goes into that slot, and the older object
that was there comes out and actually goes onto the free list. This was added
as a security feature, but it's also great at mixing things up in everyday
programs.
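
A minimal sketch of such a stash follows. It is not OpenBSD's malloc code.
The stash size, the names, and the use of rand() are assumptions for
illustration; on OpenBSD arc4random_uniform() would be the natural random
source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STASH_SIZE 16

struct stash {
	void *slots[STASH_SIZE];	/* NULL means the slot is empty */
};

/* Stand-in random source; rand() is only good enough for a sketch. */
static size_t
stash_index(void)
{
	return (size_t)rand() % STASH_SIZE;
}

/*
 * Park p in a randomly chosen slot and return whatever was parked
 * there before. The result may be NULL if the slot was empty.
 */
void *
stash_swap(struct stash *st, void *p)
{
	size_t i;
	void *old;

	assert(st != NULL);
	i = stash_index();
	old = st->slots[i];
	st->slots[i] = p;
	return old;
}

/* The new pointer waits in the stash; an older one is really freed. */
void
delayed_free(struct stash *st, void *p)
{
	if (p == NULL)
		return;
	free(stash_swap(st, p));	/* free(NULL) is a no-op */
}
```

The time an object spends in the stash depends on how long it takes for its
slot to be picked again, so the order in which memory returns to the free
list no longer mirrors the order of the free calls.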

more info

I'd like to refer you to Google's Project Zero blog, especially two posts. An
earlier one about Safari and a recent one on Flash. Even if you're not
interested in exploit development, they do a good job of explaining how the
heap works, particularly in regards to recycling. Understanding how malloc
works is good for everyone.

mostly harmless

One of the more difficult to integrate mitigation technologies was the stack
smashing protector. ProPolice includes not only a stack cookie to detect
overwrites, but it also rearranges stack frames so that buffers are more
likely to hit the cookie and not something else. In practice, this means lots
of even tiny one byte overflows are detected. We had a similar experience
adding guard pages to malloc.
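
Why the layout matters can be shown with a toy model of a frame. This is a
byte array standing in for a stack frame, not real compiler output. The sizes,
the names, and the fixed cookie bytes are assumptions; a real cookie would be
random.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN    8
#define COOKIE_LEN 4

/* Fixed bytes for the sketch only; a real cookie is random. */
static const unsigned char cookie[COOKIE_LEN] = { 0x5a, 0xc3, 0x17, 0xe9 };

/* The buffer sits directly below the cookie, with nothing in between. */
struct frame {
	unsigned char mem[BUF_LEN + COOKIE_LEN];
};

/* Function entry: clear the buffer and place the cookie after it. */
void
frame_enter(struct frame *f)
{
	memset(f->mem, 0, BUF_LEN);
	memcpy(f->mem + BUF_LEN, cookie, COOKIE_LEN);
}

/*
 * The buggy copy: writes n bytes starting at the buffer without
 * comparing n to BUF_LEN. The assert only keeps the model in bounds.
 */
void
frame_fill(struct frame *f, unsigned char c, size_t n)
{
	assert(n <= sizeof(f->mem));
	memset(f->mem, c, n);
}

/* Function exit: the cookie must be unchanged. */
bool
frame_cookie_ok(const struct frame *f)
{
	return memcmp(f->mem + BUF_LEN, cookie, COOKIE_LEN) == 0;
}
```

Filling exactly BUF_LEN bytes leaves the cookie alone. Filling BUF_LEN + 1
bytes changes its first byte and the exit check fails. If some other local
variable sat between the buffer and the cookie, the same one byte overflow
would corrupt that variable silently instead.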

Whenever a one byte overflow is found, people tend to have this reaction that
it's harmless. That gets revised to mostly harmless. Then possibly harmless.
Then maybe not so harmless.

gcc -fstack-shuffle

A quick digression to discuss another feature. martynas added a new option to
gcc that shuffles stack buffers. I already mentioned that the stack protector
arranges buffers to be near the cookie. But when there are two buffers, it can
only pick one buffer. This option re-sorts the buffers so that each compilation
results in a different ordering. It's compile time, not run time, but it still
found several bugs in OpenBSD.

practice

There are other approaches to bug detection. Static analysis, of course. Being
careful, code review. In the category of dynamic tooling, lots of options
exist. Electric Fence is the classic I usually refer to, and of course
Valgrind. Back in school, we had the chance to use Purify. It worked great,
but you always had to remember to build with it, then rerun your program. It
split development in half. You'd play around and get the feature working, then
you'd start over and run some tests with the Purify build. It makes a lot more
sense to just always use the bug finding build, but that's not human nature.
So instead, let's make the normal build capable of flushing out bugs as well.

We're not doing anything that Electric Fence or Valgrind can't do, but we do
it all the time. No matter how robust your test coverage is, it's never going
to cover everything that happens in the real world.

putting to use

The first thing you can try doing is running OpenBSD. There are many reasons
to pick OpenBSD, but hopefully I've given you one more. Software that works on
OpenBSD tends to work elsewhere.

The reverse is not always true, and unfortunately I think this affects
OpenBSD's reputation negatively. "Hey, this program crashes when I run it on
OpenBSD. OpenBSD sucks." I beg to differ. More likely that it's the program
that sucks. Just because a program doesn't always crash doesn't mean
it can't be induced to crash.

If you're developing a library, don't fight the operating system. If you're
developing an application, be aware of what implicit behaviors the libraries
you use may have, and how this may mask bugs. Fast recycling is pretty common
in custom allocators. It hides lots of bugs. Another OpenBSD developer told
me that they patched the APR (Apache Portable Runtime) library to simply use
the system malloc instead of custom pools. subversion stopped working.

assertions

Usually we add assertions to code to indicate that something must happen or
not happen, specifically past tense though. Something did happen or something
did not happen. Occasionally though, we can assert that something is allowed
to happen. For instance, in the kernel, both pool and malloc take a flag
indicating whether it is acceptable to wait. Most of the time, the system has
ample memory and doesn't wait, so the waiting paths are less tested. In an
effort to smoke out some bugs, I changed malloc to always wait if allowed.
Boom. This broke nfs, ptrace, and the vm system. There were latent race
conditions everywhere.
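
The trick can be sketched in userland. This is a model, not the kernel's
malloc. The flag names, the alloc_force_wait switch, and the counter are
hypothetical, and alloc_wait() merely counts where a kernel would sleep and
let other code run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define ALLOC_NOWAIT 0x0	/* caller cannot tolerate sleeping */
#define ALLOC_WAITOK 0x1	/* caller says sleeping is acceptable */

bool alloc_force_wait = true;	/* debug switch: wait whenever allowed */
unsigned long alloc_wait_count;	/* how many times a wait was taken */

/* Stand-in for sleeping until memory is available. */
static void
alloc_wait(void)
{
	alloc_wait_count++;
}

void *
toy_alloc(size_t size, int flags)
{
	void *p;

	assert(size > 0);

	/*
	 * The caller promised it can cope with a wait, so take one
	 * every time, not only on the rare occasion memory is short.
	 */
	if ((flags & ALLOC_WAITOK) && alloc_force_wait)
		alloc_wait();

	while ((p = malloc(size)) == NULL) {
		if (!(flags & ALLOC_WAITOK))
			return NULL;
		alloc_wait();
	}
	return p;
}
```

Any caller that passes the wait flag while holding state that must not change
across a sleep now hits the waiting path on every call, instead of once in a
blue moon under memory pressure.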

randomization

If something can be random, make it random. Process IDs, pids, are one
example. Originally randomized ages ago to prevent predictable PID race
conditions, they also helped uncover a bug in libpthread. At the time,
libpthread had a reaper function which would garbage collect the stack from an
exited thread, but it relied on the PID returned from kqueue to do that. If a
newly created thread received the same PID, fireworks. The probability of a
PID collision was low, so this wasn't noticed for a while, but after receiving
a bug report and test case, it was easy to reproduce. With sequential PIDs,
the bug would have been much harder to trigger because you'd need to time the
creation of the new thread with the exact moment the PID sequence rolled over,
but it's not impossible. Instead of being detected soon after release, the bug
would have lain dormant for years, only to topple some 4 year old java process
and never be seen again. So ironically, a feature designed to prevent
exploitation of race conditions helped make another race condition
reproducible.

Another great bug. I'm going to go light on the details, but the gist of how
hibernation and resume work is that the hibernating kernel writes its memory
to swap, then the resuming kernel reads that memory back in, overwriting
itself. This obviously requires that the two kernels be identical. There was a
bug that suddenly appeared where the stack protector was being triggered in
resume. Since the introduction of stack protection for the kernel, it used a
fixed cookie. Where would it get randomness from? That was recently changed.
The bootloader now fills in the random data segment of the kernel. What
happened with resume is that the currently running kernel had a different
cookie than the saved kernel. As the saved kernel was restored, the cookie
value was replaced, but the stack value wasn't updated. The real bug was that
the resume code should have switched to a different stack, but continued
running with the wrong stack for longer than it should have. Conditions
changed, assumptions were challenged, a bug was found.
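
The failure mode can be modeled in a few lines. This is a toy, not the actual
resume code. The global, the function names, and the explicit prologue and
epilogue are assumptions standing in for what the compiler inserts.

```c
#include <assert.h>
#include <stdbool.h>

/* Filled in once at boot in the real system; fixed here for the sketch. */
unsigned long stack_cookie = 0x1234abcdUL;

/* Models restoring the saved image: global data is replaced wholesale. */
void
restore_image(unsigned long saved_cookie)
{
	stack_cookie = saved_cookie;
}

/* Returns true if the epilogue check passes. */
bool
resume(unsigned long saved_cookie)
{
	/* Zero would mean nobody ever filled in a cookie. */
	assert(saved_cookie != 0);

	/* Prologue: copy the current cookie into this frame. */
	unsigned long frame_copy = stack_cookie;

	/* Body: the saved kernel's data overwrites the running kernel's. */
	restore_image(saved_cookie);

	/* Epilogue: the frame copy is compared against the global. */
	return frame_copy == stack_cookie;
}
```

With a fixed cookie, both kernels agree and the check always passes. Once
each boot gets its own cookie, the frame still holds the old value while the
global holds the restored one, and the check fails even though nothing
overflowed.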

QA

Colin: Have you considered returning EINTR for read and write?

That's a good idea. In general, we're limited somewhat by the state of
programs. We can't break too much at once. Netflix has a program, Chaos
Monkey, that goes around killing processes to ensure that their redundancy is
working. I think that's pushing it a little too far. Adding options to aid in
debugging is great, but it's moving a little farther from the mission of a
general purpose OS.