Could the compiler guess a part of the program I am
currently writing? It's impossible in general, but it may be possible,
and hopefully interesting, in special cases. For a start, I am
interested in a situation where there is only one possibility: at
the expected type, there is only one possible program -- modulo
(an approximation of) program equivalence.

In a fixed environment, a type is a singleton type if it has only
one inhabitant. How widespread are those? Would a language feature to
infer their inhabitants be useful? Can we have more interesting
singleton types by enriching the type system?

Don't expect a structured story with a happy ending: this is only
a writing-ideas-down session for something I have been thinking about
lately.

From erasable coercions to singleton types

I started to think about this kind of singleton types seriously when
discussing with one of our fellow students at Gallium, Julien
Cretin. Julien has been working on
coercions for long enough that, if you listen to him, basically
anything in a type system is a coercion in disguise. But for the most
part he has worked on erasable coercions, that are terms from a type
A to a type B that are just the identity function (once you erase
the type annotations that make them useful).

One example in System F is the coercion

λ(f:A→∀α.B). Λα.λ(x:A). f x α

from A→∀α.B to ∀α.(A→B). Note that if you erase all the type
annotations (including type abstractions and applications), this
becomes just λfλx. f x, which is an η-expansion of the identity
function. Those identities are "erasable" in the sense that, after the
type-checking phase, you can just drop them from your program and you
will never notice: they do not affect the runtime semantics of the
program.

Erasable coercions are fairly interesting as a way to structure and
explain type systems. For example, they allow to understand various
variants of System F (System Fη, Fsub, and MLF) as instances of
a general version of System F enhanced with abstraction on coercions,
whose formal study is surprisingly challenging. You can learn about
this in Julien's 2012 paper with Didier
Rémy
(long version).

But what would be a notion of non-erasable coercion? My suggestion
was to consider terms that may not exactly be the identity, but that
are unique at their type. This is actually a fairly different concept:
some coercions (for example int → int) are not unique in their
types, and some unique inhabitants (for example in ∀αβ.(α*β)→(β*α))
have a runtime behavior that is fairly different from the identity.

Singleton types in practice

As a thought experiment, imagine adding the keyword ? in the
syntactic class of expressions of your programming language. The
semantics of ? is as follows: at typing time, it can take absolutely
any type and will never raise a type error. After the program has been
verified to be type-correct, we look at each occurrence of ? and its
inferred type. If the type is a singleton (assume we know how to
check this), we replace ? with the unique inhabitant, otherwise we
fail with an error.

That's an inference-centered point of view. In languages with less
robust type inference than ML languages, we could have a more robust
feature with a construction (? : σ) that is given a precise type,
which should later turn out to be a singleton. A simple ? is then
explained as (? : _), where _ means "infer me" at the type level.

Consider for example those lines of OCaml or Haskell code found in
the wild (in fact in the OCaml type-checker and in GHC):

The functions passed to List.map and zipWithM here are the
typical kind of boilerplate that is boring to write. Provided that
"the types are distinct enough", they are also singleton types: fun
(x,y,z) -> (x,y) is a singleton type as long as the types for x,
y and z are distinct (in this example I checked and they are),
likewise in the second example. Wouldn't it be nice to be able to write:

(List.map ?)
zipWith3M f xs ys ys = zipWithM (? f) xs (zip ys zs)

The hope is that some of the case where you think "This is dumb,
I should not have to write it" may turn out to be, when looking at
them the right way, inferrable through singleton types.

Singleton types precisely

What does it mean for a type to be a singleton type? We have to make
a choice on four aspects:

The type system we consider (the static semantics). Are we looking
for singleton types in ML, in System F, in a dependently typed
calculus? All those choices have merit, but for the purpose of the
demonstration I will mostly use examples in System F.

The language of the terms we infer (the dynamic semantics). If we
consider the full term language of OCaml (or Haskell) as the space
of our search, no interesting type is a singleton because of
non-terminating terms inhabiting any type. We want to restrict
ourselves to a pure (including total) lambda-calculus. For a start,
just the obvious explicitly typed thing with variables, lambdas and
applications will be fine.

Our notion of program equivalence: if we consider t and (λx.x) t
as two different programs, we will have trouble finding unique
inhabitants. On the other end, observational equivalence is often
undecidable for realistic languages, so we may have to pick
reasonable restrictions. I will just imagine I can perfectly decide
observational equivalence for now.

The typing context where inhabitants live: I am asking whether a type
σ has a unique inhabitant t : σ. To consider this properly
I need to pick a typing context Γ and ask about Γ ⊢ ? : σ. We
could imagine giving the user some freedom to specify the context in
which the search should take place. A good starting choice is to
consider the super-empty context: a context with no variables, but
also no constants: forget about 0, 1, 2, true, false, etc., to
get even more singleton types (bool → int → bool,
for example). Other reasonable choices include the empty context
with constants, the current context of the program, or a restriction
of the current context with only the variables that have been marked
"implicit" of sorts.

The restriction to an empty context is why I wrote (? f) in the
previous example:

zipWithM3 f xs ys zs = zipWithM (? f) xs (zip ys zs)

If I just write zipWithM ? xs .., this tries to infer code at the
type (a → (b, c) → m d), which is not inhabited in an empty
context. On the contrary, the ? in (? f) has type
(a → b → c → m d) → (a → (b, c) → m d), which is a singleton
type. This shows that this construct can actually be used for
partial term inference, just add the hints as parameters to
?. (Another possibility would be to add syntax to specify
a context by hand, eg. (f,g |- ? : σ).)

Similarly, it can also be used for plumbing: if you cannot remember
which of x or y comes first as a parameter to f, and they have
different types, (? f x y) can do the job. (If they don't have
different types there is a potential for non-statically-catchable
error and that's bad. OCaml's labeled
arguments
are very nice when used to fight this, but it's orthogonal to the
present considerations.)

Related Work

I know of surprisingly little work on the topic of "code inference"
(looking for this search string on Google Scholar for example brings
up very little hits), and would welcome pertinent pointers. There is
a paper at PLDI last year, Type-Directed Completion of Partial
Expressions
by Daniel Perelman, Sumit Gulwani, Thomas Ball and Dan Grossman, but
it is much more practically oriented: "let's look for completions
using typing hints and see how often we were right in ranking highly
a suggestion to the user". In the limit, you get something fuzzy and
interactive and machine-learning-based, where type-correctness is just
one ranking criterion among others, as you can find in Sean
McDirmid's work
on touch-based programming or calls to Programming as collaborative
reference, but
that is not where I want to go now. The idea here more to study the
mathematical structure of singleton types, which is quite a different
subject.

Of course, "singleton types" is just a special case: it could also be
interesting for code inference to determine which types have only two
inhabitants; the compiler then wouldn't choose automatically
(that would break principality) but instead give the choice to the
user. Joe B. Wells has a paper with Boris Yakobowski (which is,
incidentally, a former Gallium PhD
student), Graph-based proof
counting and enumeration with applications for program fragment
synthesis,
that does exactly this: count the number of inhabitants of types in an
intuitionistic type theory, and devise a term enumeration algorithm
out of it. It is the most pertinent previous work that I know of, and
was done with program synthesis in mind.

Singleton types and programming experience

I've mentioned that singleton types is only one specific case in the
wider setting of "please find me the inhabitants of this
type". I think however that focusing on singleton types is maybe not
as ad-hoc at is looks like: if a type has a finite number of
inhabitants, it can probably be decomposed as a sum of singleton
types. Maybe we could tell to the user of our ? construction that
"you're not a singleton yet, you have to make a choice between this
type and this type", and let him add just enough information to
determine which situation she's interested in. That would be
potentially even more useful than enumerating the full terms and
letting the user choose one, as ? may be more robust in the face of
code maintenance than system-spewed boilerplate.

More robust, or more fragile: being a singleton is not stable by
instantiation. The type ∀αβ. α*β → β*α is a singleton, but
instantiating α:=int, β:=int breaks this property. (Note that it
is interesting in practice only if the type is inferred,
because it is just as long as the term.). People are
surprised/dismayed by this aspect when I give them examples of
singleton types, but I think it's fine: we are checking singletons
after type inference, where type abstractions and applications
have been fully determined. Yet I don't know how robust those ?
would be in practice. I think the idea of doing all this after
type inference is a sane protection: I have this intuition that we
probably don't want a system that tries to infer simultaneously
the types of some part of our program, and the terms of another.

Note that it is possible to see the work on type classes as a specific
instance of the idea of singleton types. Using an operation of a type
class would add a ? to synthesize the expected dictionary. Instance
declarations are function declarations (instance dependencies being
function arrows) that populate the context of search. The coherence
relations that reduce ambiguities in Haskell are a way to ensure
unicity of inhabitants. Finally, inference of class predicates
(the way Eq a => gets added to an inferred function type if you use
(==) inside the body of the function) is a way to ensure that there
is always an inhabitant.

This is not a realization of the highest importance, because it
actually tells us little new about type classes; singletons are
a rather general way to look at everything that involves some form of
"guessing code" that is a bit principled (and doesn't just produce
code at random); this recasting should work for other features that
can be expressed as code inference. It tells us, however, that some
particular forms in the design space for putting singleton types in
practice have already found themselves useful enough for
programmers. I think that justifies looking at the concept at a wide
angle.

A temporary ending -- or delay

This post is getting quite long, so I'll temporarily conclude
here. Stay tuned for another blog post with even more ramblings about
singleton types !