Proving Correctness of an OS Kernel

By Gernot Heiser, January 25, 2010

This proof surpasses by far what other formal-verification projects have achieved

Formal Verification

Other than exhaustive testing, formal verification is the only known way to guarantee that a system is free of bugs. The approach is little known and seldom used because it is considered expensive and applicable only to very small bodies of code. However, a small microkernel is amenable to formal verification, as the project demonstrated.

The technique used for formal verification is a refinement proof using interactive theorem proving. Refinement shows a formal correspondence between a high-level "abstract" model and a low-level "concrete" model: it proves that all possible behaviors of the concrete model are captured by the abstract one.

In this case, the abstract model is the specification of kernel behavior, essentially its ABI, expressed in a language called "higher-order logic" (HOL). The concrete model is the kernel's implementation -- the C code processed by a C compiler to produce an executable kernel. Refinement then proves that the kernel truly behaves according to its specification, i.e., it is functionally correct.

Functional correctness means that the implementation always strictly follows the high-level abstract specification of kernel behavior. Provable functional correctness encompasses traditional design and implementation safety properties, such as that the kernel will never crash and will never perform an unsafe operation. Functional correctness also enables prediction of precisely how the kernel will behave in every possible situation. Such assurances rule out any unexpected behavior. One implication is that it becomes impossible to inject arbitrary code into the kernel, as happens with popular "stack-smashing" attacks -- our kernel is provably immune to such attacks.

At the concrete level, the translation from C into HOL is correctness-critical; the team took great care to model the semantics of the C subset precisely, treating C semantics, types, and the memory model exactly as the standard prescribes, e.g., with architecture-dependent word size, padding of structs, type-unsafe casting of pointers, and arithmetic on addresses. The actual proof was conducted with and checked by an interactive theorem prover called Isabelle/HOL. Interactive theorem proving requires human intervention and creativity to construct and guide the proof. However, it has the advantage of not being constrained to specific properties or finite, feasible state spaces, unlike more automated methods of verification, such as static analysis or model checking.

Kernel Design for Verification

To facilitate verification, kernel design should minimize the complexity of its components. Ideally, kernel code (and associated proofs) would consist of simple statements that rely on explicit local state, with simple invariants. These smaller elements could then be integrated into larger elements that avoid exposing the underlying local elements. Unfortunately, OS kernels are not usually structured this way; they instead feature interdependent subsystems, and this is especially true of a small, high-performance microkernel, such as the various members of the L4 microkernel family. Because everything that can securely be implemented in user space has been removed from the kernel, the microkernel is the "essence of OS messiness," and its components are highly interconnected.

A major design goal of OK:Verified was suitability for real-world use, requiring performance comparable to the fastest existing microkernels. Therefore, the kernel design methodology aims to minimize proof complexity without compromising performance.

OS developers tend to take a bottom-up approach to kernel design. They strive for high performance by managing hardware efficiently, leading to designs governed by low-level details. In contrast, formal methods practitioners tend toward top-down design, as proof tractability is determined by system complexity.

A top-down approach results in designs based on simple models with a high degree of abstraction from hardware. As a compromise that blends both views, we adopted an approach based around an intermediate target, readily accessible to both OS developers and formal methods practitioners. We used the functional programming language Haskell as a tool for OS developers, while at the same time providing an artifact for automatic translation into the theorem-proving tool.

This resulted in a high-level prototype of the kernel, implemented in Haskell (a general-purpose, purely functional programming language with non-strict semantics and strong static typing), that is itself an executable program. The prototype required design and implementation of algorithms that manage the low-level hardware details, and as such it shares many implementation details with the real kernel.

To execute the Haskell prototype in a reasonably realistic setting, the team linked it with software derived from QEMU, a portable processor emulator that relies on dynamic binary translation for speed. Normal user-level execution is enabled by the emulator, while traps are passed to the kernel model which computes the result of the trap. The prototype modifies the user-level state of the emulator to appear as if a real kernel had executed in privileged mode.

This arrangement provides a prototyping environment that enables low-level design evaluation from both the user program and kernel perspective, including low-level physical and virtual memory management. It also provides a realistic execution environment that is binary-compatible with the real kernel. Employing the standard ARM tool chain for developing software and executing it on the Haskell kernel had the nice side effect of letting the team port software to the new kernel and get experience with its ABI before the "real" kernel existed.

Although the Haskell prototype provides an executable model and implementation of the final design, it is not itself the final production kernel. We manually re-implemented the model in the C programming language for several reasons. First, the Haskell runtime is a significant body of code (much bigger than our kernel) that would be hard to verify for correctness. Second, the Haskell runtime relies on garbage collection, which is unsuitable for real-time environments.

Additionally, using C enables optimization of the low-level implementation for performance. Although an automated translation from Haskell to C would have simplified verification, it would have cost the team most opportunities to optimize the kernel, which are required for adequate microkernel performance. Instead, we could build a kernel that performs on par with the fastest kernels.

Other elements of the kernel design, such as incorporating global variables, memory management, concurrency, and I/O, are fully described in seL4: Formal Verification of an OS Kernel. The kernel implementation used a slightly restricted subset of the C language to ease verification. Specifically, we outlawed goto statements, fall-through in switch statements, and the use of functions with side effects in expressions -- all features good programmers avoid anyway. The most serious restriction was prohibiting the passing of pointers to stack variables as function arguments (pointers to global variables are allowed).

