A few billion lines of code later: using static analysis to find bugs in the real world
Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, Dawson Engler. Communications of the ACM, Volume 53, Issue 2 (February 2010), pages 66-75.
According to ACM this is their most downloaded paper ever.
I wrote it, but everyone else did all (I
mean: ALL) of the technical work. It documents the oft-times bizarre
things that happened when we took our static bug-finding research and
commercialized it. It's a light read, full of stuff you are glad happened
to other people. Someone other than me did a final edit pass, doing
odd things to some transitions and neutering some language; if you want
the sharper-edged draft, send
email. (I think the information content is about the same in both.) I have
30-, 60-, and 90-minute talks based on it that I'm always happy to give to avoid
real work.

My research focuses on effective, automatic methods for finding
lots of bugs in real code. We have used three main approaches:
system-specific static analysis, implementation-level model checking, and
most recently, using symbolic execution to automatically generate
high-coverage tests.
The following papers
give a reasonable overview of the first two approaches (the Klee paper
above describes the last):

eXplode: a Lightweight, General System for Finding Serious
Storage System Errors, Junfeng Yang, Can Sar, and Dawson Engler,
Proceedings of the 7th Symposium on Operating System Design and
Implementation, 2006. (postscript,
PDF)

Bugs as Deviant Behavior: A General Approach
to Inferring Errors in Systems Code
(postscript)
(PDF).
Slides:
PS,
PDF,
PPT.
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf.
Appeared in: Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles, 2001.
This paper has some of my favorite ideas, though unfortunately I can't
say the same for the writing. If a static checking person were going to
read just one of our papers, this should be it: I've used its ideas in
every static checker I've built since writing it, typically more than
once. I've never seen a checker that could not be improved by its trick
of belief analysis: inferring what to check, or what state a checked
system is in, from what programmers seem to believe.
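To make the belief-analysis trick concrete, here is a toy sketch (my illustration, not the paper's implementation). A dereference of pointer p implies the programmer believes p is non-null; a later "if (p == NULL)" check implies a belief that p may be null. Holding both beliefs about the same pointer is contradictory, so one of the two sites is likely a bug (usually the unchecked dereference). The event-list input format is an assumption for illustration; the real checkers ran over compiler flow graphs.

```python
def check_beliefs(events):
    """events: ordered (kind, ptr) pairs, kind in {"deref", "null_check"}.
    Flags pointers checked for NULL *after* already being dereferenced."""
    believed_nonnull = set()   # pointers already dereferenced on this path
    reports = []
    for kind, ptr in events:
        if kind == "deref":
            believed_nonnull.add(ptr)
        elif kind == "null_check" and ptr in believed_nonnull:
            # Contradictory beliefs: one of the two sites is likely a bug.
            reports.append(f"contradictory beliefs about '{ptr}': "
                           f"dereferenced before NULL check")
    return reports

# The classic pattern: "*card; ... if (!card) return;" in one function.
print(check_beliefs([("deref", "card"), ("null_check", "card")]))
```

The check-then-dereference order is fine; only the reverse order signals a contradiction.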

Checking System Rules Using
System-Specific, Programmer-Written Compiler Extensions (Best Paper)
(postscript)
(PDF).
Slides:
PS,
PDF,
PPT.
Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem. Appeared
in: Proceedings of the 4th Symposium on Operating System Design and
Implementation, 2000.
This paper
discusses a set of small extensions that found
roughly 500 bugs in Linux, OpenBSD, and the Xok exokernel. The extensions
were usually less than 100 lines. Our current static checking systems
are much more powerful, but the basic approach is roughly the same.
I co-founded a company, Coverity,
several years ago with three of my students
(Seth Hallem, Andy Chou and Ben Chelf) to commercialize this static
checking work. Coverity has 500+ customers and 130+ employees.
A free trial of the tool can be obtained here.
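For a flavor of what a sub-100-line extension checks, here is a hypothetical miniature of one of the rules from the paper: don't call a blocking function while interrupts are disabled. The real extensions were written in the metal language and ran over compiler-produced flow graphs; this sketch just walks a linear event list, but the state-machine shape is the same.

```python
def check_intr_rule(events):
    """events: ordered kernel-ish events ("cli", "sti", "blocking_call").
    Flags blocking calls made while interrupts are disabled."""
    state, reports = "enabled", []
    for i, ev in enumerate(events):
        if ev == "cli":                 # interrupts disabled
            state = "disabled"
        elif ev == "sti":               # interrupts re-enabled
            state = "enabled"
        elif ev == "blocking_call" and state == "disabled":
            reports.append(f"event {i}: blocking call with interrupts off")
    return reports

print(check_intr_rule(["cli", "blocking_call", "sti"]))
```

Most of the extensions in the paper are variations on this pattern: a few states, a few transitions, and an error report when a transition fires in the wrong state.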

Automatically generating malicious disks using symbolic execution,
Junfeng Yang, Can Sar, Paul Twohey, Cristian Cadar, and Dawson Engler.
(PDF) (postscript),
IEEE Security and Privacy, 2006.
This paper uses a system we built, EXE, to automatically generate disk
images of death that, when mounted, will cause a file system to crash.
Many systems now let untrusted users mount files as file systems, which
opens them up to all sorts of exploits, given the somewhat erratic input
checking done by current file system code.

Automatic test case generation by executing code on symbolic inputs.
A technical report containing our SOSP05 submission gives preliminary
results: Execution generated test cases: how to make systems code crash
itself, CSTR-2005-04 (pdf). The trick
is simple and cute. A shortened version appeared as an invited paper
in SPIN05 (pdf). The two EXE papers given above
are probably better to read.
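The trick can be rendered in a few lines (my sketch, not EXE itself): run the code on a symbolic input, fork execution at each branch while accumulating the branch condition in a path constraint, and solve each feasible path's constraints to get a concrete input that drives real execution down that path. A real tool uses a constraint solver; this stand-in just enumerates a small input domain.

```python
def solve(constraints, domain=range(256)):
    """Return a concrete input satisfying every constraint, or None.
    (Brute force stands in for a real constraint solver.)"""
    return next((x for x in domain if all(c(x) for c in constraints)), None)

def explore(branches, path=()):
    """branches: predicates the code tests, in order. Yields
    (path constraints, concrete test input) for each feasible path."""
    if not branches:
        x = solve(list(path))
        if x is not None:               # infeasible paths are dropped
            yield path, x
        return
    b, rest = branches[0], branches[1:]
    yield from explore(rest, path + (b,))                          # true arm
    yield from explore(rest, path + ((lambda v, b=b: not b(v)),))  # false arm

# Code under test: "if (x > 100) if (x < 103) crash();"
for _, test_input in explore([lambda x: x > 100, lambda x: x < 103]):
    print(test_input)    # one concrete test per feasible path
```

Only three of the four branch combinations are feasible here (x <= 100 and x >= 103 contradict), so three tests come out, including one that reaches the crash.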

Software model checking

Using Model Checking to Find Serious File System Errors (Best Paper)
(postscript)
(PDF),
Junfeng Yang, Paul Twohey, Dawson Engler, and
Madanlal Musuvathi, Operating System Design and Implementation (OSDI) 2004.
Describes FiSC, a file system model checking tool. We applied it to
three widely-used, heavily-tested file systems: ext3, JFS, and ReiserFS.
We found serious bugs in all of them, 32 in total. Most have led to
patches within a day of diagnosis. For each file system, FiSC found
demonstrable events leading to the unrecoverable destruction of metadata
and entire directories, including the file system root directory ``/''.
The actual bugs we found are
here.

Model-checking large network protocol implementations
(postscript)
(PDF),
Dawson Engler and
Madanlal Musuvathi. Proceedings of the First Conference
on Network System Design and Implementation (NSDI), 2004.
Describes the challenges in model checking 50K lines of TCP code. One
of the surprising results was that it was easier to run the entire Linux
kernel in Madan's CMC model checker than to extract TCP into a
stand-alone version.

Static analysis versus software model checking for bug finding
(postscript)
(PDF). Extended version with better
formatting:
(postscript)
(PDF),
Dawson Engler and
Madanlal Musuvathi.
Invited paper for VMCAI04. It describes some of our experiences finding
bugs with both model checking and static analysis, including some of
the less pleasant aspects of both. Slides are available:
postscript,
PDF,
PPT.

The paper is a reworked version of an invited paper for the SoftMC
2003 workshop:
"Some lessons from using static analysis and software model checking for bug finding"
(postscript)
(PDF), Madanlal Musuvathi and Dawson R. Engler.
The writing could be better, but some of the tables are reasonable.
Slides are available:
postscript,
PDF,
PPT.

CMC: A pragmatic approach to model checking real code
(postscript)
(PDF)
,
Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler,
David L. Dill. Appeared in OSDI 2002. A model checker that can
check C code directly,
dispensing with the need to write a specification. We applied it
to three
different AODV routing protocol implementations, where it found roughly
one bug per 300 lines of code.

A Simple Method for Extracting Models from Protocol Code
(postscript)
(PDF)
,
David Lie, Andy Chou, Dawson Engler, and David Dill. Appeared
in ISCA 2001. It shows how to check deeper properties than static
analysis can reach by using MC to automatically extract specifications
(models) from actual C code and then running those models through a
formal verifier.

Correlation Exploitation in Error Ranking
(postscript)
(PDF)
,
Ted Kremenek, Ken Ashcraft, Junfeng Yang, and Dawson Engler. Appeared in:
Foundations of Software Engineering (FSE) 2004. This paper demonstrates an
error-ranking scheme that uses machine learning to reduce the observed
false positive rate by a factor of 2 to 8 compared to random ranking.
It exploits the fact that both true errors and false positives cluster:
given a group of related error messages, if one is a true error, the
others are likely bugs as well; if one is a false positive, the others
are likely false positives as well.
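A hypothetical sketch of how that clustering gets exploited (mine, not the paper's algorithm, which uses machine learning over richer features): group reports into clusters, treat each inspected report as a vote on its cluster, and after each inspection move reports sharing a cluster with a confirmed bug forward and reports sharing a cluster with a false positive back.

```python
def rerank(reports, labels):
    """reports: list of (report_id, cluster_id); labels: {report_id: bool},
    True = confirmed bug, False = false positive. Returns the uninspected
    report ids, best-first, by their cluster's accumulated evidence."""
    def evidence(cluster):
        return sum(1 if labels[r] else -1
                   for r, c in reports if c == cluster and r in labels)
    pending = [(r, c) for r, c in reports if r not in labels]
    return [r for r, c in sorted(pending, key=lambda rc: -evidence(rc[1]))]

reports = [("r1", "A"), ("r2", "A"), ("r3", "B"), ("r4", "B")]
# r1 (cluster A) was a real bug; r3 (cluster B) was a false positive,
# so r2 should now be inspected before r4.
print(rerank(reports, {"r1": True, "r3": False}))
```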

Z-Ranking: Using Statistical Analysis to Counter the Impact of
Static Analysis Approximations
(postscript)
(PDF)
,
Ted Kremenek and Dawson Engler. Appeared in SAS 2003. A simple
technique for detecting when static analysis makes a mistake.
The basic
intuition: the most trustworthy analysis decisions are those
that lead to many successful checks and relatively few errors. Analysis
mistakes tend to flag many errors and have few successful checks.
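That intuition reduces to a small computation (a minimal sketch of the idea, not the paper's code): each analysis decision, e.g. "this function pair is lock/unlock," produces some number of successful checks and some number of error reports, and a one-sample z-test against an assumed expected success rate p0 separates trustworthy decisions from likely analysis mistakes.

```python
import math

def z_score(successes, errors, p0=0.9):
    """z-statistic for the observed success proportion vs. an assumed
    baseline rate p0 (the 0.9 default is an illustrative assumption)."""
    n = successes + errors
    p = successes / n
    return (p - p0) / math.sqrt(p0 * (1 - p0) / n)

def z_rank(groups):
    """groups: {decision_name: (successes, errors)} -> names, best first."""
    return sorted(groups, key=lambda g: z_score(*groups[g]), reverse=True)

print(z_rank({"lock/unlock pair A": (98, 2),   # many successes: trust it
              "bogus pair B": (1, 9)}))        # mostly errors: distrust
```

Reports from high-z decisions get inspected first; reports from low-z decisions are probably artifacts of a bad analysis guess rather than real bugs.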

(Reasonably) generic static analysis for bug finding

MECA: an Extensible, Expressive System and Language for Statically Checking Security Properties (PDF)
Junfeng Yang, Ted Kremenek, Yichen Xie, and Dawson Engler. Proceedings of
the 10th ACM conference on Computer and communication security (ACM CCS),
2003. Flexible annotations plus methods for propagating them in a way
that does not require much manual labor nor kill you with false positives.

RacerX: Effective, Static Detection of Race Conditions and Deadlocks
(postscript)
(PDF)
Dawson Engler and Ken Ashcraft. Appeared in SOSP 2003. Has about 20 tricks
or so (unlockset analysis is the coolest) to help find races and deadlocks.
Slides are available:
PS,
PDF,
PPT.
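The baseline those tricks refine is classic lockset analysis; here is a toy version (illustrative only, far simpler than RacerX): track the set of locks held at each access to a shared variable and intersect across accesses. If the intersection goes empty, no single lock consistently guards the variable, which is a potential race.

```python
def lockset_check(accesses):
    """accesses: iterable of (var, frozenset_of_held_locks).
    Reports variables no single lock consistently protects."""
    candidate = {}          # var -> locks held at *every* access so far
    races = []
    for var, held in accesses:
        candidate[var] = candidate.get(var, held) & held
        if not candidate[var]:
            races.append(var)
    return races

# Two accesses to "count" under different locks: no common lock, so
# neither lock actually protects it.
print(lockset_check([("count", frozenset({"L1"})),
                     ("count", frozenset({"L2"}))]))
```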

Using Redundancies to Find Errors (Award Paper)
(postscript)
(PDF)
Yichen Xie and Dawson Engler. Appeared in TSE. Finds funny bugs by
looking for redundant operations (dead code, unused assignments, etc.).
From empirical measurements, code with such redundancies is 50-100%
more likely to contain hard errors. Also describes how to check for
redundancies to find holes in specifications.

It is an expanded version of an identically titled award
paper
(postscript)
(PDF) that appeared in Foundations of
Software Engineering (FSE) 2002.
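One of the redundancy checks can be sketched in a few lines (my sketch, not the paper's checker): an assignment whose value is overwritten before any read is a dead store. Harmless in itself, but per the papers above, such redundancies correlate strongly with real bugs nearby.

```python
def dead_stores(stmts):
    """stmts: ordered ("write", var) / ("read", var) pairs. Returns the
    indices of writes whose value is overwritten before any read."""
    last_write, dead = {}, []
    for i, (op, var) in enumerate(stmts):
        if op == "write":
            if var in last_write:        # previous write never read: dead
                dead.append(last_write[var])
            last_write[var] = i
        else:
            last_write.pop(var, None)    # the value was used
    return dead

# "x = f(); x = g(); use(x);" -- the first write to x is dead.
print(dead_stores([("write", "x"), ("write", "x"), ("read", "x")]))
```

Often the dead store itself is the bug: the programmer computed a value and then accidentally clobbered it instead of using it.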

How to Write System-specific, Static Checkers in Metal
(postscript)
(PDF),
Benjamin Chelf, Seth Hallem, and Dawson Engler.
Invited paper for PASTE'02.
A simple tutorial on writing extensions. It does not introduce new
ideas, but gives concrete examples that help in understanding some of
the previous papers.

This paper validates the earlier SOSP paper, which was big on religion
and microbenchmarks and a bit light on application numbers.
It also has a cool use of deterministic Turing machines that lets an
exokernel trust applications to track what they own without understanding
how they do so. Slides are available
here.

Dynamic code generation papers

The following papers discuss several dynamic code generation systems.

Reverse-Engineering Instruction Encodings,
by Wilson C. Hsieh, Dawson R. Engler, and Godmar Back. This paper
appeared in the proceedings of the 2001 USENIX Annual Technical
Conference. It uses a cool trick to automatically extract
instruction encodings from a system assembler. Its genesis was
the fact that writing a binary encoder for the x86 is an
irredeemably disgusting experience.