First-class modules: hidden power and tantalizing promises

Introduction

First-class modules introduced in OCaml 3.12 make type constructors
first-class, permitting type constructor abstraction and polymorphism. It
becomes possible to manipulate and quantify over types of higher kind. We
demonstrate that as a consequence, full-scale, efficient generalized
algebraic data types (GADTs) become expressible in OCaml 3.12 as it is,
without any further extensions. Value-independent generic programming along
the lines of Haskell's popular ``Generics for the masses'' becomes possible
in OCaml for the first time. We discuss extensions such as a better
implementation of polymorphic equality on modules, which can give us
intensional type analysis (a.k.a. type-case), permitting generic programming
frameworks like SYB.

Simplistic GADTs in OCaml

We illustrate one simplistic, pure, magic-free implementation of a
form of GADTs in OCaml that is sufficient for the common applications
of GADTs, such as data structures with embedded invariants, typed
printf/scanf, and tagless interpreters. The implementation is a simple
module, requiring no changes to the OCaml system. The implementation,
based on the common technique of term witnesses of type equality, is
so trivial that it should work on any ML system (although, like nested
data types, GADTs aren't very useful on an SML system without support
for polymorphic recursion).
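
As an illustration of the witness technique, here is a minimal sketch of
a tagless interpreter. It is not the module described above: the names
Eq, exp, eval and the smart constructors are assumptions made for this
example, and the witness is represented, for simplicity, as a pair of
conversion functions hidden behind an abstract type. Each constructor
carries a witness that the type index equals the type the constructor
produces; eval needs polymorphic recursion, requested here with an
explicit polymorphic annotation.

```ocaml
(* Term witness of type equality. The type is abstract, so refl is the
   only way to obtain a witness from scratch. (Representation assumed
   for this sketch: a pair of conversions.) *)
module Eq : sig
  type ('a, 'b) t
  val refl : ('a, 'a) t
  val symm : ('a, 'b) t -> ('b, 'a) t
  val cast : ('a, 'b) t -> 'a -> 'b
end = struct
  type ('a, 'b) t = { to_ : 'a -> 'b; from_ : 'b -> 'a }
  let refl = { to_ = (fun x -> x); from_ = (fun x -> x) }
  let symm w = { to_ = w.from_; from_ = w.to_ }
  let cast w = w.to_
end

(* An ordinary (non-regular) data type; the witnesses play the role of
   the refined result types of GADT constructors. *)
type 'a exp =
  | Int of (int, 'a) Eq.t * int
  | Bool of (bool, 'a) Eq.t * bool
  | IsZero of (bool, 'a) Eq.t * int exp
  | If of bool exp * 'a exp * 'a exp

(* Smart constructors supply the witnesses. *)
let int n = Int (Eq.refl, n)
let bool b = Bool (Eq.refl, b)
let is_zero e = IsZero (Eq.refl, e)
let if_ c t e = If (c, t, e)

(* The evaluator returns an untagged value of the indexed type.
   The recursive calls are at types int and bool: polymorphic recursion. *)
let rec eval : 'a. 'a exp -> 'a = function
  | Int (w, n) -> Eq.cast w n
  | Bool (w, b) -> Eq.cast w b
  | IsZero (w, e) -> Eq.cast w (eval e = 0)
  | If (c, t, e) -> if eval c then eval t else eval e
```

An ill-typed term such as is_zero (bool true) is rejected by the type
checker, since no witness of type (bool, int) Eq.t can be built.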

Our examples include:

enforcing invariants on data structures: statically ensuring that in a tree representation of an HTML document, a link node is never an ancestor of another link node;

typed printf/scanf sharing the same format descriptor, which is first-class and can be built incrementally.
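
The first example can be sketched as follows. This is only an
illustration under simplifying assumptions, not the published code: the
node type, the context types outside and inside, and the function
count_links are all made up here, and the equality witness carries no
run-time content because this example never needs to cast.

```ocaml
(* Abstract equality witness; refl is the only constructor exported. *)
module Eq : sig
  type ('a, 'b) t
  val refl : ('a, 'a) t
end = struct
  type ('a, 'b) t = Refl
  let refl = Refl
end

(* Contexts (hypothetical names): a node is either outside any link or
   somewhere below a link. *)
type outside = Outside
type inside = Inside

(* A Link may only be built in the outside context, and all of its
   descendants live in the inside context. *)
type 'ctx node =
  | Text of string
  | Div of 'ctx node list
  | Link of ('ctx, outside) Eq.t * string * inside node list

let link href kids : outside node = Link (Eq.refl, href, kids)

(* Traversal needs polymorphic recursion: the children of a Link have a
   different index than the Link itself. *)
let rec count_links : 'c. 'c node -> int = function
  | Text _ -> 0
  | Div kids -> List.fold_left (fun n k -> n + count_links k) 0 kids
  | Link (_, _, kids) ->
      List.fold_left (fun n k -> n + count_links k) 1 kids

let doc : outside node =
  Div [ Text "see "; link "http://example.org" [ Text "a"; Div [ Text "b" ] ] ]
```

Nesting one link inside another, as in link "x" [link "y" []], does not
type check: the inner link has type outside node where inside node is
required.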

We see that common GADTs are available in OCaml here
and now. We can truly write the published examples that motivated
GADTs, without too much violence to their notation. We can translate
GADT code from Haskell, more or less mechanically. No changes to the
OCaml type system or the type checker are necessary. Of course, changes
such as explicit existential quantification, better support for rank-2
types, etc. would be greatly appreciated -- but they are not necessary
to start using and enjoying GADTs. First-class modules introduced
in OCaml 3.12 let us implement genuine Leibniz equality and better
GADTs.

GADTs

First-class modules -- first-class functors -- permit type
constructor abstraction and polymorphism. Type constructor polymorphism makes
it possible to encode genuine Leibniz equality. The latter, along with
existentials and polymorphic recursion (both of which can also be encoded
with first-class modules), let us implement GADTs. We can now work with
``real'' GADTs in OCaml, without the need for extensions.
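
To make the idea concrete, here is a minimal sketch of Leibniz equality
encoded with first-class modules. The names TyCon, EQ, refl, cast, symm
and cast_list are chosen for this illustration, and the syntax is that
of current OCaml rather than 3.12. Two types a and b are equal if a can
be replaced with b in any context, that is, under any type constructor
F.tc; the quantification over the type constructor is expressed by the
functor Cast.

```ocaml
(* An arbitrary type constructor, i.e. a context with a hole. *)
module type TyCon = sig type 'a tc end

(* a and b are equal if a value of type a F.tc can be turned into a
   value of type b F.tc for every F. *)
module type EQ = sig
  type a
  type b
  module Cast (F : TyCon) : sig
    val cast : a F.tc -> b F.tc
  end
end

type ('a, 'b) eq = (module EQ with type a = 'a and type b = 'b)

(* Reflexivity: the only primitive way to build a witness. *)
let refl : type t. unit -> (t, t) eq = fun () ->
  (module struct
    type a = t
    type b = t
    module Cast (F : TyCon) = struct
      let cast (x : a F.tc) : b F.tc = x
    end
  end : EQ with type a = t and type b = t)

(* Instantiating the context with the identity gives the plain cast. *)
module Id = struct type 'a tc = 'a end

let cast : type s t. (s, t) eq -> s -> t = fun eq x ->
  let module E = (val eq : EQ with type a = s and type b = t) in
  let module C = E.Cast (Id) in
  C.cast x

(* Substitution under a type constructor, with no traversal of the list. *)
module ListTC = struct type 'a tc = 'a list end

let cast_list : type s t. (s, t) eq -> s list -> t list = fun eq xs ->
  let module E = (val eq : EQ with type a = s and type b = t) in
  let module C = E.Cast (ListTC) in
  C.cast xs

(* Symmetry comes for free: substitute a with b in the context
   ('x, s) eq, starting from the witness (s, s) eq. *)
let symm : type s t. (s, t) eq -> (t, s) eq = fun eq ->
  let module E = (val eq : EQ with type a = s and type b = t) in
  let module Ctx = struct type 'x tc = ('x, s) eq end in
  let module C = E.Cast (Ctx) in
  C.cast (refl ())
```

Unlike a witness made of a pair of conversion functions, this one
converts a value inside any type constructor in a single step, which is
what makes the encoded GADTs efficient.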

GADTs let us express a form of bounded polymorphism, where only
some instances of a type schema are populated. The populated instances
represent a relation among types -- the role GADTs share with type classes.

The injectivity of our GADTs is, alas, rather restricted: it holds
only for functors, and only after we downgrade GADTs to simplistic GADTs.

Version

The current version is June 2010.

References

polyrec.ml [5K]
Six encodings of polymorphic recursion in OCaml 3.12, including two hitherto unavailable encodings using first-class modules

Generics for the OCaml masses

Generic programming in ML typically relies on the value representation
of types. In most generic programming libraries so far, the
type representation has been a collection of generic functions
specialized for that particular type. First-class modules
permit value-independent generic programming for the first time.
A type s is represented as a value of the type s repr.

The type representation Repr contains no information about
specific generic functions such as show. Rather, Repr
receives the interpretation from the user and selects the part
of it that pertains to the represented type. A generic function like
show takes the type representation s repr and supplies
the Interpretation argument, which describes how to
show primitive and composite types.
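
The following sketch shows one way such a design could look. It is a
reconstruction for illustration only: the signatures Interpretation and
Repr follow the description above, but their exact contents, the
universe of types (unit, int and pairs), and the names pair, ShowI,
NameI and type_name are assumptions of this example. The syntax is that
of current OCaml.

```ocaml
(* What a user must supply to define a generic function: one case per
   primitive type and per type former. *)
module type Interpretation = sig
  type 'a tc
  val unit : unit tc
  val int : int tc
  val pair : 'a tc -> 'b tc -> ('a * 'b) tc
end

(* A type representation: given any interpretation, select the part
   that pertains to the represented type a. *)
module type Repr = sig
  type a
  module Interpret (I : Interpretation) : sig
    val result : a I.tc
  end
end

type 'a repr = (module Repr with type a = 'a)

let unit : unit repr =
  (module struct
    type a = unit
    module Interpret (I : Interpretation) = struct let result = I.unit end
  end : Repr with type a = unit)

let int : int repr =
  (module struct
    type a = int
    module Interpret (I : Interpretation) = struct let result = I.int end
  end : Repr with type a = int)

let pair : type s t. s repr -> t repr -> (s * t) repr = fun ra rb ->
  (module struct
    type a = s * t
    module A = (val ra : Repr with type a = s)
    module B = (val rb : Repr with type a = t)
    module Interpret (I : Interpretation) = struct
      module AI = A.Interpret (I)
      module BI = B.Interpret (I)
      let result = I.pair AI.result BI.result
    end
  end : Repr with type a = s * t)

(* The generic function show is one interpretation... *)
module ShowI = struct
  type 'a tc = 'a -> string
  let unit () = "()"
  let int = string_of_int
  let pair sa sb (a, b) = "(" ^ sa a ^ ", " ^ sb b ^ ")"
end

let show : type s. s repr -> s -> string = fun r ->
  let module R = (val r : Repr with type a = s) in
  let module S = R.Interpret (ShowI) in
  S.result

(* ...and the same representations serve an unrelated one. *)
module NameI = struct
  type 'a tc = string
  let unit = "unit"
  let int = "int"
  let pair a b = "(" ^ a ^ " * " ^ b ^ ")"
end

let type_name : type s. s repr -> string = fun r ->
  let module R = (val r : Repr with type a = s) in
  let module N = R.Interpret (NameI) in
  N.result
```

For instance, show (pair int (pair int unit)) (1, (2, ())) evaluates to
the string "(1, (2, ()))", while type_name (pair int unit) gives
"(int * unit)". Neither unit, int nor pair mentions show or type_name.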