Friday, December 12, 2014

One of my favorite things in language design is when a high-level
language feature lets you change your low-level runtime system in
interesting ways. In this post, I'll talk about one such idea, which I
probably won't have the time to work out in detail any time soon, but
which is pretty enough that I want to offer it to the interweb.

In 2004, David Bacon, Perry Cheng, and V.T. Rajan published their
paper A Unified Theory of Garbage Collection,
which showed that tracing collectors and reference counting were the
two extreme ends of a spectrum of memory management strategies.

Consider Cheney's two-space collector. When performing a garbage
collection, it will start with the root objects, copy them from
old-space to new-space, and then recursively traverse the objects
reachable from the roots, moving them from old-space to new-space as
it goes.
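
To make that concrete, here is a minimal OCaml sketch of a Cheney-style
copying collection over a toy heap. The object representation, the use of a
queue in place of Cheney's scan/free pointers into to-space, and all the
names are my own illustration, not anything from the paper.

```ocaml
(* A toy heap object: an array of pointers to other objects, plus a
   forwarding pointer that is set once the object has been copied. *)
type obj = { mutable fields : obj array; mutable forwarded : obj option }

let copy_collect (roots : obj array) : obj array =
  let scan = Queue.create () in
  (* Copy a single object into new-space (if it hasn't been already),
     leaving a forwarding pointer behind in old-space. *)
  let evacuate o =
    match o.forwarded with
    | Some copy -> copy
    | None ->
        let copy = { fields = Array.copy o.fields; forwarded = None } in
        o.forwarded <- Some copy;
        Queue.add copy scan;
        copy
  in
  let new_roots = Array.map evacuate roots in
  (* Scan new-space in allocation order, evacuating everything each copied
     object points to.  Objects that need to be freed are never visited. *)
  while not (Queue.is_empty scan) do
    let copy = Queue.pop scan in
    Array.iteri (fun i child -> copy.fields.(i) <- evacuate child) copy.fields
  done;
  new_roots
```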

An important property of Cheney's algorithm is that it never touches
any object that needs to be freed; it only follows pointers from live
objects to live objects.

On the other hand, consider the naive reference counting algorithm.
When the reference count on an object goes to zero, the algorithm will
decrement the reference counts of everything the object points to,
recursively invoke itself on all of the objects whose reference counts
also went to zero, and then free the original object.
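
And here is the corresponding sketch of naive reference counting, over an
equally made-up object representation of my own:

```ocaml
(* A toy reference-counted object: a count plus its outgoing pointers. *)
type rc_obj = { mutable count : int; mutable children : rc_obj list }

(* Called whenever a reference to [o] is dropped.  Note that the traversal
   only ever visits objects whose counts have hit zero, i.e. dead objects. *)
let rec release (o : rc_obj) : unit =
  o.count <- o.count - 1;
  if o.count = 0 then begin
    List.iter release o.children;   (* recursively drop our references *)
    o.children <- []                (* "free" the object itself *)
  end
```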

Bacon, Cheng, and Rajan observed that this reference counting algorithm
has the opposite property to Cheney's algorithm -- it only traverses
dead objects, and never follows a pointer inside a live
object.

When object lifetimes are very short, tracing collectors beat
reference counting, because very few objects will be live at gc time,
and tracing only follows live objects. On the other hand, when object
lifetimes are very long, reference counting wins, because very few
objects are dead at each collection, and reference counting only
follows dead objects.
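
A back-of-envelope way to summarize the trade-off (my own gloss, not a
formula from the paper): per collection,

$$\mathrm{work}_{\mathrm{tracing}} \;\propto\; \#\{\mbox{live objects}\},
\qquad
\mathrm{work}_{\mathrm{refcounting}} \;\propto\; \#\{\mbox{dead objects}\},$$

so tracing is cheap exactly when survival rates are low, and reference
counting is cheap exactly when survival rates are high.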

So (a) the best memory management strategy to use depends on the
lifetime of the object in question, and (b) every memory management
algorithm can be seen as a hybrid of reference counting and tracing,
based on which objects it chooses to follow.

In 2003, Stephen Blackburn and Kathryn McKinley gave an incredibly
slick application of this idea in their paper Ulterior
Reference Counting. (Yes, the chronology is backwards:
research is asynchronous.)

Most production garbage collectors are based on what is called "the
generational hypothesis". This says that in typical programs, most
objects have a very short lifetime, and only a few have a long
lifetime. So it's a good idea to allocate objects into a region of
memory called "the nursery", and when it fills up, to copy live
objects out of it. Because of the generational hypothesis, most
objects in the nursery will be dead, and so collecting the nursery
will be very fast.

Blackburn and McKinley observed that the generational hypothesis also
implies that if an object survives a young collection, it's likely to
live for a very long time. So in their algorithm, they have a nursery
as usual for tracing collectors. But for objects copied out of the
nursery, they use reference counting. That is, for objects likely to
have a short lifetime, they use tracing, and for objects likely to
have a long lifetime, they use reference counting!
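
Here is a rough sketch of that hybrid, in the same toy OCaml style as the
snippets above. This is my own drastic simplification: Blackburn and
McKinley's real collector also defers and buffers reference-count updates
for pointer stores, which I ignore entirely, and none of these names come
from their paper.

```ocaml
(* Objects carry a count, which only becomes meaningful once the object
   has been promoted out of the nursery. *)
type obj = {
  mutable count    : int;
  mutable children : obj list;
  mutable promoted : bool;
}

(* New objects go into the nursery and are not reference counted. *)
let nursery : obj list ref = ref []

let allocate children =
  let o = { count = 0; children; promoted = false } in
  nursery := o :: !nursery;
  o

(* Minor collection: trace the nursery from the roots (most young objects
   are dead, so we touch only the few survivors), promote the survivors,
   and start counting references to whatever they point at.  Promoted
   objects are thereafter managed by counting, as in the [release] sketch
   above. *)
let minor_collect (roots : obj list) : unit =
  let rec promote o =
    if not o.promoted then begin
      o.promoted <- true;
      List.iter (fun child -> child.count <- child.count + 1) o.children;
      List.iter promote o.children
    end
  in
  List.iter promote roots;
  nursery := []   (* anything left unpromoted in the nursery is dead *)
```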

Now, if you're a functional programmer, the mere mention of reference
counting very likely rings some alarm bells --- what about cycles?
Reference counting is, after all, notorious for its inability to
handle cyclic data.

Blackburn and McKinley handle this issue with a backup mark-and-sweep
algorithm that is periodically run on the old generation. But wouldn't
it be nice if we could just know that there isn't any cyclic
data in the heap? Then we could do away with the backup collector, and
implement a very simple "trace young, reference count old" collector.

Surprisingly, this is possible! If we program in a pure functional language,
then under the usual implementation strategies, there will never be nontrivial
cycles in the heap. The only way a cyclic reference could occur is in
the closure of a recursive function definition, and we can simply mark such
recursive pointers as something the gc doesn't need to follow.
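
A tiny OCaml illustration of the point (OCaml rather than a genuinely pure
language, but the reasoning only uses its pure fragment):

```ocaml
(* The recursive closure [step] captures a reference to itself (at least
   conceptually), so its heap record is the one place a cycle can appear;
   a collector could mark that back-edge as "do not follow". *)
let counter (start : int) =
  let rec step n = if n <= 0 then start else step (n - 1) in
  step

(* Ordinary pure data can only point to values that already exist, so the
   heap it builds is acyclic: [ys] points to [xs], never the reverse. *)
let xs = [1; 2; 3]
let ys = 0 :: xs
```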

So a very high-level property (purity) seems to imply something
about our low-level runtime (the gc strategy)! Proving this
works (and benchmarking it) is something I don't have room on my plate
for, but it's something I wish I could do...

Tuesday, December 9, 2014

Indeed, when we realize the usual axioms of mathematics, we need to introduce, one after the other, the very standard tools of systems programming: for Peirce's law, these are continuations (particularly useful for exceptions); for the axiom of dependent choice, these are the clock and process numbering; for the ultrafilter axiom and the well ordering of $\mathbb{R}$, these are no less than read and write instructions on a global memory, in other words assignment.
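
For the first of those correspondences, here is a small self-contained
OCaml sketch (mine, not Krivine's): under a continuation-passing-style
reading, Peirce's law $((A \to B) \to A) \to A$ is precisely the type of
call/cc.

```ocaml
(* A CPS computation of type ['a], answering in ['r]. *)
type ('a, 'r) cont = ('a -> 'r) -> 'r

(* call/cc realizes Peirce's law ((a -> b) -> a) -> a: the captured
   continuation, when invoked, discards its local continuation and jumps
   back to the point where callcc was called. *)
let callcc : (('a -> ('b, 'r) cont) -> ('a, 'r) cont) -> ('a, 'r) cont =
  fun f k -> f (fun x _discarded -> k x) k

(* Run a computation by handing it the identity continuation. *)
let run (c : ('a, 'a) cont) : 'a = c (fun x -> x)

(* Example: escape early with 0 on bad input, double otherwise. *)
let double_or_escape (n : int) : (int, 'r) cont =
  callcc (fun escape ->
    if n < 0 then escape 0 else fun k -> k (n * 2))

let () = assert (run (double_or_escape 3) = 6)
let () = assert (run (double_or_escape (-1)) = 0)
```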

Thursday, November 13, 2014

Together with Jennifer Paykin and Steve Zdancewic, I have written a short note about the next phase of the long project to make GUI programming intellectually manageable.

Essentially, the idea is that the natural language of graphical user interfaces is $\pi$-calculus, typed using classical linear logic (plus some temporal modalities). Furthermore, we think that the implementation strategy of callbacks and event loops can be understood in terms of Paul-André Melliès' tensorial logic. So we think we can:

Explain how there are secretly beautiful logical abstractions
inside the apparent horror of windowing toolkits;

Illustrate how to write higher-order programs which automatically
maintain complex imperative invariants, and

Write some JavaScript programs which we think are actually $\pi$-calculus terms in disguise.

Tuesday, November 4, 2014

I'm very happy to announce that Integrating Linear and Dependent Types will appear at POPL 2015! The link above goes to the final version, which (at the behest of the reviewers) has been significantly expanded from the original submission. (Added up, the total length of the reviews was almost half the length of the submission, which says something about the degree of care taken in the process.)

One of the big difficulties in applying contract checking to functional programming is that it breaks tail call optimization. This paper says that you can do it without breaking TCO, which is (a) a real breakthrough, and (b) probably has all kinds of applications.

I knew fibrations were important for characterizing inductive definitions in type theory, parametricity, and the semantics of dependent types. Apparently they are also important for characterizing refinement types.

This is a dependent type theory which works up to the congruence closure of the equality hypotheses in the context, rather than using a judgmental equality. The treatment of equality is the central problem in the design of dependently typed languages, so it's nice to see exploration of the design space. (This approach reminds me a bit of the Girard/Schroeder-Heister equality rule, which semi-secretly underpins GADTs.)

Monday, October 20, 2014

Ever since I learned about them, I've thought of
call-by-push-value and focusing (aka polarization) as
essentially two different views of the same problem: they both give a
fine-grained decomposition of higher-order effectful programs which
permits preserving the full βη-theory of the language.

Until this morning, I had thought that the differences were merely
cosmetic, with CBPV arising from Paul Levy's analysis of the
relationship between denotational semantics and operational semantics,
and focusing arising from an analysis of the relationship between
operational semantics and proof theory (a lot of people have looked
at this, but I learned about it from Noam Zeilberger). Both systems decompose a
Moggi-style computational monad into a pair of adjoint operators,
which mediate between values and computations (in CBPV) and positive
and negative types (in focusing). So I thought this meant that “value
type” and “positive type” were synonyms, as were “computation type”
and “negative type”.

This morning, I realized I was wrong! Focusing and call-by-push-value
make precisely the opposite choices in their treatment of variables!
To understand this point, let's first recall the syntax of types for
a call-by-push-value (on top) and a polarized (on bottom) calculus.

$$\begin{array}{llcl}
\mbox{value types} & A & ::= & 1 \mid A \times A \mid 0 \mid A + A \mid U\,X \\
\mbox{computation types} & X & ::= & A \to X \mid F\,A \\[0.5em]
\mbox{positive types} & P & ::= & 1 \mid P \otimes P \mid 0 \mid P \oplus P \mid {\downarrow}N \\
\mbox{negative types} & N & ::= & P \to N \mid {\uparrow}P
\end{array}$$

At first glance, these two grammars look identical, save only for the
renamings $U \leftrightarrow {\downarrow}$ and $F \leftrightarrow {\uparrow}$. But this is
misleading! If they really are the same idea, the reason has to be
much more subtle, because the typing judgements for these two systems
are actually quite different.
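
Before looking at the judgements, here is a purely illustrative rendering
of the call-by-push-value grammar as a pair of mutually recursive OCaml
datatypes; the constructor names are mine, not standard notation.

```ocaml
(* Value types A and computation types X of call-by-push-value. *)
type vtype =
  | VUnit                      (* 1 *)
  | VProd  of vtype * vtype    (* A × A *)
  | VZero                      (* 0 *)
  | VSum   of vtype * vtype    (* A + A *)
  | VThunk of ctype            (* U X: a suspended computation is a value *)

and ctype =
  | CArrow of vtype * ctype    (* A → X: take a value, produce a computation *)
  | CFree  of vtype            (* F A: a computation that returns a value *)
```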

In call-by-push-value, the idea is that $F$ is a functor which
is left adjoint to $U$. As a result, values are interpreted in a
category of values $\mathcal{V}$, and computations are interpreted in a
category of computations $\mathcal{C}$. The adjunction between values and
computations means that the hom-set $\mathcal{C}(F\,A, X)$ is isomorphic
to the hom-set $\mathcal{V}(A, U\,X)$.
This adjunction gives rise to the two basic judgement forms of
call-by-push-value, the value judgement $\Gamma \vdash v : A$
and the computation judgement $\Gamma \vdash t : X$. The idea
is that $[\![\Gamma \vdash v : A]\!] \in \mathcal{V}([\![\Gamma]\!], [\![A]\!])$ and
$[\![\Gamma \vdash t : X]\!] \in \mathcal{C}(F\,[\![\Gamma]\!], [\![X]\!])$.

The key bit is in the interpretation of contexts in computations, so
let me highlight that:

$$[\![\Gamma \vdash t : X]\!] \in \mathcal{C}(F\,[\![\Gamma]\!], [\![X]\!])$$

Note that we interpret contexts as $F\,[\![\Gamma]\!]$, where $[\![\Gamma]\!]$
lives in the category of values, and so this says that variables refer
to values.

However, in a polarized type theory, we observe that positive types
are “left-invertible”, and negative types are “right-invertible”. In
proof theory, a rule is invertible when the conclusion implies the
premise. For example, the right rule for implication introduction in
intuitionistic logic reads

$$\frac{\Gamma, A \vdash B}{\Gamma \vdash A \to B}$$

This is invertible because you can prove, as a theorem, that

$$\frac{\Gamma \vdash A \to B}{\Gamma, A \vdash B}$$

is an admissible rule of the system. Similarly, sums have a
left rule:

$$\frac{\Gamma, A \vdash C \qquad \Gamma, B \vdash C}{\Gamma, A + B \vdash C}$$

such that the following two rules are admissible:

$$\frac{\Gamma, A + B \vdash C}{\Gamma, A \vdash C}
\qquad
\frac{\Gamma, A + B \vdash C}{\Gamma, B \vdash C}$$
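
A programming-level reading of that invertibility, in OCaml (using the
standard library's Either, available since 4.12; the example is mine): a
function out of a sum type carries exactly the same information as the
pair of its restrictions to each branch.

```ocaml
(* Left-invertibility of sums, concretely: [split] and [join] are mutually
   inverse (up to eta), so hypothesising A + B gives you nothing beyond
   hypothesising A and B in separate branches. *)
let split (f : ('a, 'b) Either.t -> 'c) : ('a -> 'c) * ('b -> 'c) =
  ((fun a -> f (Either.Left a)), (fun b -> f (Either.Right b)))

let join ((g, h) : ('a -> 'c) * ('b -> 'c)) : ('a, 'b) Either.t -> 'c =
  function Either.Left a -> g a | Either.Right b -> h b
```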

The key idea behind polarization is that one should specify the
calculus modulo the invertible rules. That is, the judgement on the
right should fundamentally be a judgement that a term has a positive
type, and the hypotheses in the context should be negative. That is,
the two primary judgements of a polarized system are the positive introduction
judgement

$$\Gamma \vdash v : P$$

which explains how introductions for positive types work, and the
negative elimination (or spine) judgement

$$\Gamma; N \vdash s : P$$

which explains how eliminations for negative types work. The
eliminations for positive types and the introductions for negative
types are derived judgements (which end up being the rules for pattern
matching and lambda-abstraction), set up so that cut-elimination
holds, plus a few book-keeping rules to hook these two judgements
together. The critical point is that the grammar of contexts $\Gamma$
consists of negative types:

$$\Gamma ::= \cdot \mid \Gamma, x : N$$
This is because positive types are (by definition) left-invertible,
and so there is no reason to permit them to appear as hypotheses. As
a result, the context clearly has a very different character than in
call-by-push-value.

I don't have a punchline for this post, in the sense of “and therefore
the following weird things happen as a consequence”, but I would be
astonished if there weren't some interesting consequences! Both
focalization and call-by-push-value teach us that it pays large
dividends to pay attention to the fine structure of computation, and
it's really surprising that they are apparently not looking at the
same fine structure, despite apparently arising from the same
dichotomy at the type level.

In this paper, we show how to integrate linear types with
type dependency, by extending the linear/non-linear
calculus of Benton to support type dependency.

Next, we give an application of this calculus by giving a
proof-theoretic account of imperative programming, which requires
extending the calculus with computationally irrelevant
quantification, proof irrelevance, and a monad of computations. We
show the soundness of our theory by giving a realizability model in
the style of Nuprl, which permits us to validate not only the
β-laws for each type, but also the η-laws.

These extensions permit us to decompose Hoare triples into
a collection of simpler type-theoretic connectives,
yielding a rich equational theory for dependently-typed
higher-order imperative programs. Furthermore, both the
type theory and its model are relatively simple, even when
all of the extensions are considered.

Sometimes, it seems like every problem in programming languages research can be solved by either linear types, or dependent types. So why not combine them, and see what happens?

Effective support for custom proof automation is
essential for large-scale interactive proof development.
However, existing languages for automation via tactics
either (a) provide no way to specify the behavior of
tactics within the base logic of the accompanying
theorem prover, or (b) rely on advanced type-theoretic
machinery that is not easily integrated into established
theorem provers.

We present Mtac, a lightweight but powerful extension to Coq that
supports dependently-typed tactic programming. Mtac tactics have
access to all the features of ordinary Coq programming, as well as a
new set of typed tactical primitives. We avoid the need to touch the
trusted kernel typechecker of Coq by encapsulating uses of these new
tactical primitives in a monad, and instrumenting Coq so that it
executes monadic tactics during type inference.

Since I'm not the main author of this paper, I feel free to say this is really good! Mtac manages to strike a really amazing balance of simplicity, cleanliness, and power. It's really the first tactic language that I want to implement (rather than grudgingly accepting the necessity of implementing one).

Tuesday, July 1, 2014

My university, the University of Birmingham, is looking for applicants to the CS PhD program. I'm putting our advertisement on my blog, in case you (or your students, if you're a professor) are looking for a graduate program -- well, we're looking for students! We have an imminent funding deadline -- please contact us immediately if you are interested!

We invite applications for PhD study at the University of Birmingham.

We are a group of (mostly) theoretical computer scientists who explore fundamental concepts in computation and programming language semantics. This often involves profound and surprising connections between different areas of computer science and mathematics. From category theory to lambda-calculus and computational effects, from topology to constructive mathematics, from game semantics to program
compilation, this is a diverse field of research that continues to provide new insight and underlying structure.