I'm interested in most aspects of computer programming, in
particular in programming languages, tools, and systems to make
programs safer, easier to write, and more efficient. Most of my work
revolves around the following topics:

Programming language design, formalization and implementation.
Popular programming languages are most often unsafe, giving
programmers way too much rope to hang themselves with, and/or
inexpressive, forcing programmers to deploy painful,
time-consuming and error-prone encodings of domain concepts in terms
of language constructs. I believe that safer, more expressive programming
languages based on the functional programming paradigm can
tremendously improve the quality and timeliness of software. The
OCaml language embodies some
of my results in this area.

Type systems, module systems and related static analyses.
Many common programming errors can be detected automatically by static
analysis of the code. I have worked on type systems,
module systems and static analyses to detect large classes
of programming errors at compile time,
ensure safety properties in programs, and
support software reuse and parameterization in a type-safe way.
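
As a small illustration (my own example, not tied to any particular
project), OCaml's exhaustiveness check catches a whole class of
"forgot a case" errors at compile time:

    type shape = Circle of float | Square of float | Rect of float * float

    (* The compiler warns that Rect (_, _) is not matched: the error
       is reported before the program ever runs. *)
    let area = function
      | Circle r -> Float.pi *. r *. r
      | Square s -> s *. s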

Compilers. I have a long-standing fascination with compilers
and more generally for all sorts of translations from one computer
language to another. I have worked
on the efficient compilation of the ML language, resulting in
the OCaml system.
These days, in the CompCert project,
I'm having great fun proving semantic preservation for
optimizing compilers using the Coq proof assistant.

Mechanized proofs.
I am a firm believer in formal methods and enjoy using proof
assistants such as Coq to specify
and prove properties of programs, operational semantics, type systems,
compilers and other tools that participate in the software development
process.

The remainder of this page lists the main research projects I've been
involved in, roughly in reverse chronological order.

Formal verification of compilers

Compilers sometimes contain bugs causing wrong machine code to be
generated for a correct source program. This is a concern for
critical software certified using formal methods: the guarantees so
painfully obtained by formal verification of the source program are
useless if the compiler generates wrong code. This problem can be
avoided by formally verifying the compiler itself, proving that it
preserves the semantics of source programs. I currently work on
the CompCert project: the
development and proof of correctness, using the Coq proof assistant,
of a moderately optimizing C compiler usable for critical embedded
systems. This is a fascinating line of work, combining compilation,
functional programming, operational semantics and program proof.
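
Semantic preservation theorems for such a compiler are often stated
along the following lines (a simplified sketch, not the exact CompCert
statement, which is refined to account for unsafe source programs and
non-determinism):

    \forall S\,C\,B,\quad \mathit{Comp}(S) = \mathsf{OK}(C) \;\land\; S \Downarrow B \;\Longrightarrow\; C \Downarrow B

where "S ⇓ B" means that program S has observable behavior B:
terminating with a given trace of input/output events, diverging, or
going wrong.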

OCaml

OCaml is a popular, typed
functional programming language of the ML family. I led the
development of OCaml from 1995 to 2012; my main technical
contributions were the design and implementation of its module system
(a more syntactic variant of the Standard ML module system) and of its
two compilers: a simple yet efficient bytecode compiler and an
optimizing native-code compiler generating high-performance assembly
code for a variety of platforms.

Mechanized semantics

The Coq proof assistant and
other systems based on the propositions-as-types,
proofs-as-programs paradigm provide a fascinating way to do
mathematics with the assistance of the computer. Besides my work on
the formal verification of compilers, for several years I have been
using Coq to formalize and prove properties of programming language
components such as bytecode verifiers, operational semantics and type
systems. Recently, I rediscovered a
coinductive approach to big-step operational semantics (natural
semantics) that enables this style of semantics to describe
non-terminating computations quite nicely (sketched below). I have
also participated in the
POPLmark
challenge, a collective experiment in mechanizing the metatheory
of type systems.
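
To give the flavor of the coinductive approach mentioned above, here
are divergence rules for call-by-value function application, in the
spirit of that work (a sketch, not the exact formalization):

    \frac{a_1 \Uparrow}{a_1\;a_2 \Uparrow}
    \qquad
    \frac{a_1 \Downarrow v_1 \quad a_2 \Uparrow}{a_1\;a_2 \Uparrow}
    \qquad
    \frac{a_1 \Downarrow \lambda x.\,b \quad a_2 \Downarrow v_2 \quad b[x \leftarrow v_2] \Uparrow}{a_1\;a_2 \Uparrow}

Interpreted coinductively, that is, allowing infinite derivations,
these rules define a divergence predicate "a ⇑" that complements the
usual inductive evaluation predicate "a ⇓ v".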

Language-based security

Circa 1995, Java applets promoted the idea of type-checking
untrusted compiled bytecode before execution on a trusted device.
While Java applets never really caught on, they popularized the idea
that high-level programming languages and static type systems are good
not only for program safety, but also for security -- an
approach known today as language-based security. With my colleague
François Rouaix, I tried to characterize the security
implications of strong typing. We formalized and proved secure a few
security patterns that rely on strong typing in an essential way.
Later, I got interested in Java bytecode verification -- the static
analysis that establishes the well-typedness of compiled Java bytecode.
In the context of the Trusted Logic
start-up company, I developed a variant of Java bytecode verification
that runs in very low memory space and can therefore be embedded in
Java smart cards.
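
To give the flavor of such an analysis, here is a toy sketch in OCaml
(my own simplification: no branches, no subroutines, and a made-up
instruction set). The verifier executes the code abstractly, replacing
values by their types:

    (* Toy bytecode verifier: abstract execution over a stack of types. *)
    type ty = Int | Ref

    type instr =
      | Push_int        (* push an integer constant *)
      | Add             (* pop two ints, push their sum *)
      | Load of ty      (* push the contents of a register of type ty *)
      | Store of ty     (* pop a value into a register of type ty *)

    (* One abstract execution step; None signals a type error. *)
    let step (stack : ty list) (i : instr) : ty list option =
      match i, stack with
      | Push_int, s -> Some (Int :: s)
      | Add, Int :: Int :: s -> Some (Int :: s)
      | Load t, s -> Some (t :: s)
      | Store t, t' :: s when t = t' -> Some s
      | _, _ -> None

    (* Straight-line verification: thread the abstract stack through
       the instructions, starting from an empty stack. *)
    let verify (code : instr list) : bool =
      let rec go stack = function
        | [] -> true
        | i :: rest ->
            (match step stack i with
             | Some stack' -> go stack' rest
             | None -> false)
      in
      go [] code

A real verifier must additionally iterate to a fixpoint over branch
targets; keeping the memory needed for that bookkeeping small is the
difficulty that the embedded variant addresses.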

Module systems

Modular programming can be done in any language, with sufficient
discipline from the programmers. However, it is facilitated if the
programming language provides constructs to express some aspects of
the modular structure and check them automatically. The Standard ML
language introduced a powerful module system featuring multiple views
of modules and parameterized modules (called "functors").
Between 1993 and 1997, I designed, formalized and implemented within
Objective Caml a variant of the Standard ML module system that is very
close to an ordinary type system and uses only syntactic signatures.
Later, my Ph.D. student Tom Hirschowitz and I worked on mixin
modules, which extend ML modules with late binding and extensibility
features reminiscent of class-based object-oriented programming.
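
To make this concrete, here is a small OCaml signature and functor (a
sketch; the standard library's Set.Make has this general shape):

    (* A signature: an abstract view of a module. *)
    module type ORDERED = sig
      type t
      val compare : t -> t -> int
    end

    (* A functor: a set implementation parameterized over the element
       type and its ordering. *)
    module MakeSet (Elt : ORDERED) = struct
      type elt = Elt.t
      type t = elt list          (* kept sorted, without duplicates *)
      let empty : t = []
      let rec add x = function
        | [] -> [x]
        | y :: rest as s ->
            let c = Elt.compare x y in
            if c = 0 then s
            else if c < 0 then x :: s
            else y :: add x rest
    end

    (* Instantiation: apply the functor to a concrete module. *)
    module IntSet = MakeSet (struct
      type t = int
      let compare = Int.compare
    end)

The functor body never learns what Elt.t really is: the signature
ORDERED is all it may rely on, which is what makes the
parameterization type-safe.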

Types in compilation and static analysis

As evidenced by Fortran and C, static type checking was initially
introduced in programming languages not to improve the safety of
programs, but to enable the compiler to generate more efficient code: the
static types of data determine their run-time representation, and the
compiler-generated code for data accesses is based on these types.
For quite a while, I was puzzled why such type-based compilation schemes
were not used for advanced, statically typed languages such as ML and
Haskell. I eventually realized that the reason was parametric
polymorphism and type abstraction, which lead to statically unknown
(or partially known) types. Between 1991 and 1994, I designed
and implemented a type-directed data representation scheme for ML,
which circumvents the problem with polymorphism and type abstraction
through the use of type-directed coercions between multiple
representations for the same data type. Experimental results were not
uniformly positive, so I gave up on this idea. However, it sparked
considerable interest in the type systems community, which
discovered many other interesting interactions between type systems
and the compilation process. (See the
TIC & TLDI workshops.)
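
To give the flavor of the idea, here is a toy model in OCaml (purely
illustrative, not the actual compilation scheme): polymorphic code
manipulates values in a uniform boxed representation, monomorphic code
in a flat one, and coercions are inserted where the two meet.

    (* The uniform representation that polymorphic code sees. *)
    type uniform = Boxed_int of int | Boxed_float of float

    (* Coercions between the flat and uniform representations. *)
    let wrap_float (x : float) : uniform = Boxed_float x
    let unwrap_float (u : uniform) : float =
      match u with Boxed_float x -> x | Boxed_int _ -> assert false

    (* A "polymorphic" function operates on the uniform representation,
       so applying it to an unboxed float goes through coercions. *)
    let id (u : uniform) : uniform = u
    let y : float = unwrap_float (id (wrap_float 3.14))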

Later, my Ph.D. student François Pessaux and I worked on a
type-based static analysis for uncaught exceptions in Caml programs.
Initially, we planned to use untyped static analysis techniques such
as k-CFA, but discovered that a better compromise
between analysis speed and analysis precision could be obtained by
using a type and effect system based on unification between row types.
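
In such a system, each function type carries a row of the exceptions
the function may raise; schematically (an illustrative notation, not
the exact one used by the analysis):

    f : \mathsf{int} \xrightarrow{\{\mathsf{Failure};\;\rho\}} \mathsf{int}

meaning that f may raise Failure, plus whatever exceptions the row
variable ρ is later unified with. Unification on these rows then
propagates exception information through the program in much the same
way as ordinary ML type inference.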

Type systems

During my Master's and my Ph.D. (1988-1992),
I worked on type systems and type inference, more specifically on
dynamic types, on type abstraction, and on polymorphic typing of
imperative features such as references, communication channels and
first-class continuations.
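
The classic example of the tension between polymorphism and mutable
state, in modern OCaml syntax (whose value restriction is one answer
to the problem studied in this work):

    (* If r received the fully polymorphic type 'a list ref, the next
       two lines would both type-check, allowing an int list to be
       read back at type bool list: unsoundness. *)
    let r = ref []       (* inferred: '_weak1 list ref, not 'a list ref *)
    let () = r := [1]    (* this use pins the type: r : int list ref *)
    (* let bools : bool list = !r      <- now rejected at compile time *)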