Menu

The Compcert verified compiler

Compcert is a moderately optimizing compiler from most of the ISO C 1999 language to PowerPC, ARM and x86 assembly language. It comes accompanied with
a Coq proof of semantic preservation, ensuring that the generated
assembly code behaves as prescribed by the source code. The whole Compcert
development is available under a non-commercial licence. Commercial versions are available from AbsInt.

OCaml

I am the proud architect and main developer of the OCaml system, an implementation of the
Caml dialect of ML that supports equally well functional, imperative
and object-oriented programming, all brought together by an ML-style
static type system with polymorphism and type inference.
OCaml's object system, due
to Didier Rémy
and Jérôme Vouillon, is
the first to offer
ML-style type inference within a full class-based OO language.
Other notable features include a powerful SML-style module calculus,
a bytecode compiler that generates portable intermediate code that
runs very efficiently, and a native-code compiler that generates
high-performance assembly code for a variety of platforms.

Various Caml libraries and tools that I developed as side projects
are collected below.

Caml Light

Caml Light is a predecessor to
OCaml. It is a simple bytecoded implementation of the core
Caml language. Caml Light was made obsolete by OCaml, but remained in use for teaching during many years, especially in French preparatory classes
(classes préparatoires scientifiques). It is now officially abandoned.

LinuxThreads

I'm also a happy user of Linux. My contribution to the Linux effort is
the first version of
LinuxThreads, a multithreading
library for Linux implementing the Posix 1003.1c standard for threads.
This library was the standard thread library in Linux distributions
between 1998 and 2004. Since then, it has been completely
reimplemented by Ulrich Drepper and Ingo Molnar at Redhat. Their
implementation is much cleaner and more efficient than mine.

Java Card technologies

I have also worked on smart cards, more specifically on
compact, efficient and secure Java execution environments for Java Card
(Java-enabled smartcards). I started this work as a consultant for
Bull CP8, then joined the
Trusted Logic start-up company
between 1999 and 2004.
The most visible outcome of this work is the world's first on-card
bytecode verifier for JavaCard, marketed by Axalto under the name
Codeshield.

SpamOracle: detection of spam by statistical
analysis of e-mail contents

SpamOracle, a.k.a. "Saint Peter",
is a tool to help detect and filter away "spam"
(unsolicited commercial e-mail). It proceeds by statistical analysis
of the words that appear in the e-mail, comparing the frequencies of
words with those found in a user-provided corpus of known spam and
known legitimate e-mail. The classification algorithm is based on
Bayes' formula, and is described in Paul Graham's paper,
A plan for
spam.

This program is designed to work in conjunction with
procmail.
The result of the analysis is output as an additional message header
X-Spam:, followed by
yes, no or unknown, plus additional
details. A procmail rule can then test this X-Spam: header and
deliver the e-mail to the appropriate mailbox.

Cryptokit: cryptographic primitives for OCaml

The Cryptokit library for OCaml provides a variety of
cryptographic primitives that can be used to implement cryptographic
protocols in security-sensitive applications. The primitives provided
include:

Hash functions and MACs: SHA-1, SHA-256, RIPEMD-160, MD5, and MACs based on AES and DES.

Random number generation.

Encodings and compression: base 64, hexadecimal, Zlib compression.

Additional ciphers and hashes can easily be used in conjunction with
the library. In particular, basic mechanisms such as chaining modes,
output buffering, and padding are provided by generic classes that can
easily be composed with user-provided ciphers. More generally, the library
promotes a "Lego"-like style of constructing and composing
transformations over character streams.

CamlIDL: automatic generation of Caml/C
interface stubs

CamlIDL
is a stub code generator and COM binding for OCaml. It
automates the generation of C stub code required for the Caml/C
interface, based on an MIDL specification. (MIDL stands for
Microsoft's Interface Description Language; it looks like C header
files with some extra annotations, plus a notion of object interfaces
that look like C++ classes without inheritance.) In addition, it can
be used to interface Caml code with software components that follow
Microsoft's now semi-defunct COM interface.

OCamlAgrep: string searching with errors

This OCaml library implements the Wu-Manber algorithm for string searching
with errors, popularized by the "agrep" Unix command and the "glimpse"
file indexing tool. It was developed as part of a search engine for a
largish MP3 collection; the "with error" searching comes handy for those
who can't spell Liszt or Shostakovitch.

Given a search pattern and a string, this algorithm determines whether
the string contains a substring that matches the pattern up to a
parameterizable number N of "errors". An "error" is either a
substitution (replace a character of the string with another
character), a deletion (remove a character) or an insertion (add a
character to the string). In more scientific terms, the number of
errors is the Levenshtein edit distance between the pattern and the
matched substring.
The search patterns are roughly those of the Unix shell, including
one-character wildcard (?), character classes
([0-9]) and multi-character wildcard (*).
In addition, conjunction (&) and alternative
(|) are supported. General regular expressions are not
supported, however.

CamlJava provides a low-level, weakly-typed OCaml interface
very similar to the JNI. Java object references are mapped to an
abstract type, and various JNI-like operations are provided to allow
Java method invocation, field access, and more. A basic callback
facility (allowing Java code to invoke methods on Caml objects) is
also provided, although some stub Java code must be written by hand.

This library has been tested against Sun's JDK under Linux and Windows.
On other platforms, some Makefile tweaking is probably necessary.

For higher-level, type-safe interfaces between Caml and Java,
see O'Jacare
and also OCaml-Java.

CamlZip: reading and writing ZIP, JAR and GZIP files

This OCaml library provides easy access to compressed files in ZIP
and GZIP format, as well as to Java JAR files. It provides functions
for reading from and writing to compressed files in these formats.

OCamlMPI: interface with the MPI message-passing interface

OCamlMPI provides Caml bindings for a large subset of MPI functions.
MPI is a popular library for distributed-memory parallel programming
in SPMD (single program, multiple data) style.
MPI offers both point-to-point message passing and group communication
operations (broadcast, scatter/gather, etc).
Several implementations of MPI are available, both for networks of
Unix workstations and for supercomputers with specialized communication
networks.
More info on MPI is available
here.

OCaml-callcc: call/cc for OCaml

This library implements the call/cc
(call-with-current-continuation)
control operator for OCaml.
This is a very naive implementation: it works only in bytecode, and
performance is terrible (call/cc copies the whole stack). It is
intended for educational and experimental purposes. Use in production
code is not advised.
See also
delimited continuations in OCaml by O. Kiselyov.

PGF driver for GNU pic

GNU PIC
(gpic) is a text preprocessor that translates
high-level geometric descriptions of pictures to Troff and LaTeX
low-level graphics primitives.
PGF
(the Portable Graphics Format) is a LaTeX macro package for generating
graphics that are compatible with several TeX back-ends. In
particular, PGF is used by the excellent Beamer LaTeX macro
package to compose good-looking presentations. As an avid user of
both GNU PIC and Beamer, I was frustrated that I was not able to use
both at the same time. This patch against the GNU Groff distribution
adds a new back-end to GNU PIC that generates PGF graphics
primitives. Usage:

gpic -p < inputfile > outputfile

There is no documentation and no support. It works for me; maybe it
will work for you as well.