Friday, December 12, 2014

One of my favorite things in language design is when a high-level
language feature lets you change your low-level runtime system in
interesting ways. In this post, I'll talk about one such idea, which I
probably won't have the time to work out in detail any time soon, but
which is pretty enough that I want to offer it to the interweb.

In 2004, David Bacon, Perry Cheng, and V.T. Rajan published their
paper A Unified Theory of Garbage Collection,
which showed that tracing collectors and reference counting were the
two extreme ends of a spectrum of memory management strategies.

Consider Cheney's two-space collector. When performing a garbage
collection, it will start with the root objects, copy them from
old-space to new-space, and then recursively traverse the objects
reachable from the roots, moving them from old-space to new-space as
it goes.

An important property of Cheney's algorithm is that it never touches
any object that needs to be freed; it only follows pointers from live
objects to live objects.
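The copying step above can be sketched in a few lines of Python. This is a toy model, not a real collector: objects are dicts, "spaces" are lists, and the names (`cheney_collect`, `forwarding`) are mine, not Cheney's.

```python
# Toy model of Cheney's two-space copying collector.
# Objects are dicts whose "children" list holds references to other objects.

def cheney_collect(roots):
    """Copy everything reachable from `roots` into a fresh new-space.

    Dead objects are simply never visited. Returns the copied roots and
    the new space, with objects in breadth-first order."""
    new_space = []
    forwarding = {}  # id(old object) -> its copy in new_space

    def copy(obj):
        if id(obj) in forwarding:          # already evacuated
            return forwarding[id(obj)]
        clone = {"data": obj["data"], "children": list(obj["children"])}
        forwarding[id(obj)] = clone
        new_space.append(clone)
        return clone

    new_roots = [copy(r) for r in roots]
    # Cheney's trick: the unscanned tail of new_space *is* the work queue.
    scan = 0
    while scan < len(new_space):
        obj = new_space[scan]
        obj["children"] = [copy(c) for c in obj["children"]]
        scan += 1
    return new_roots, new_space
```

Note that an unreachable object never appears in `forwarding` or `new_space`: the algorithm does no per-object work for garbage.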

On the other hand, consider the naive reference counting algorithm.
When the reference count on an object goes to zero, the algorithm will
decrement the reference counts of everything the object points to,
recursively invoke itself on all of the objects whose reference counts
also went to zero, and then free the original object.

Bacon, Cheng, and Rajan observed that this reference counting algorithm
has the property opposite to Cheney's algorithm -- it only traverses
dead objects, and never follows a pointer inside a live
object.
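The naive algorithm can be sketched just as briefly. Again this is an illustrative model (the `rc` field and `free_list` are my invention): each object carries an explicit count, and a decrement that reaches zero recursively releases the children before freeing the object itself.

```python
# Toy model of naive reference counting.
# Objects are dicts with an "rc" count and a "children" list.

free_list = []  # records the order in which objects are reclaimed

def dec_ref(obj):
    """Decrement obj's count; on zero, release children then free obj.

    Only dead objects are ever traversed -- a decrement that leaves the
    count positive does no further work."""
    obj["rc"] -= 1
    if obj["rc"] == 0:
        for child in obj["children"]:
            dec_ref(child)
        free_list.append(obj["data"])
```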

When object lifetimes are very short, tracing collectors beat
reference counting, because very few objects will be live at gc time,
and tracing only follows live objects. On the other hand, when object
lifetimes are very long, reference counting wins, because very few
objects are dead at each collection, and reference counting only
follows dead objects.

So (a) the best memory management strategy to use depends on the
lifetime of the object in question, and (b) every memory management
algorithm can be seen as a hybrid of reference counting and tracing,
based on which objects it chooses to follow.

In 2003, Stephen Blackburn and Kathryn McKinley gave an incredibly
slick application of this idea in their paper Ulterior
Reference Counting. (Yes, the chronology is backwards:
research is asynchronous.)

Most production garbage collectors are based on what is called "the
generational hypothesis". This says that in typical programs, most
objects have a very short lifetime, and only a few have a long
lifetime. So it's a good idea to allocate objects into a region of
memory called "the nursery", and when it fills up, to copy live
objects out of it. Because of the generational hypothesis, most
objects in the nursery will be dead, and so collecting the nursery
will be very fast.
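A minor collection of the nursery can be sketched as a reachability pass whose cost is proportional to the survivors, not to the nursery. As before, this is a simplified model with invented names (`minor_collect`), not any production collector's API.

```python
# Toy model of a nursery (minor) collection.
# The nursery is a list of objects; anything unreachable from the roots
# dies in place when the nursery is reset.

def minor_collect(nursery, roots):
    """Return the nursery objects reachable from `roots`.

    These survivors would be evacuated to the old generation; the rest
    of the nursery is reclaimed wholesale, so a mostly-dead nursery is
    very cheap to collect."""
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) not in live:
            live.add(id(obj))
            stack.extend(obj["children"])
    return [o for o in nursery if id(o) in live]
```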

Blackburn and McKinley observed that the generational hypothesis also
implies that if an object survives a young collection, it's likely to
live for a very long time. So in their algorithm, they have a nursery
as usual for tracing collectors. But for objects copied out of the
nursery, they use reference counting. That is, for objects likely to
have a short lifetime, they use tracing, and for objects likely to
have a long lifetime, they use reference counting!
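The handoff between the two regimes can be sketched as follows. This is a deliberately simplified picture of the ulterior scheme -- it ignores root counts, write barriers, and deferred increments, and the `promote` function is my naming -- but it shows the key move: survivors of a nursery trace enter the reference-counted old generation with counts seeded from the pointers into them.

```python
# Toy model of the "trace young, reference count old" handoff.
# Old-generation objects are distinguished by having an "rc" field.

def promote(survivors):
    """Move traced nursery survivors into the reference-counted world.

    Each survivor gets a count seeded from the pointers among the
    promoted objects; from here on, pointer writes would adjust counts
    instead of relying on tracing."""
    for obj in survivors:
        obj["rc"] = 0
    for obj in survivors:
        for child in obj["children"]:
            if "rc" in child:  # child now lives in the old generation
                child["rc"] += 1
    return survivors
```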

Now, if you're a functional programmer, the mere mention of reference
counting very likely rings some alarm bells --- what about cycles?
Reference counting is, after all, notorious for its inability to
handle cyclic data.

Blackburn and McKinley handle this issue with a backup mark-and-sweep
algorithm that is periodically run on the old generation. But wouldn't
it be nice if we could just know that there isn't any cyclic
data in the heap? Then we could do away with the backup collector, and
implement a very simple "trace young, reference count old" collector.

Surprisingly, this is possible! If we program in a pure functional language,
then under the usual implementation strategies, there will never be nontrivial
cycles in the heap. The only way a cyclic reference could occur is in
the closure of a recursive function definition, and we can simply mark such
recursive pointers as something the gc doesn't need to follow.
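The point about recursive closures can be made concrete with a small sketch. The tagging scheme here (`traced_children`) is invented for illustration: the closure's one cyclic edge -- its pointer back to itself -- is stored in the environment but deliberately excluded from the set of edges the collector follows, so a traversal over traced edges always terminates.

```python
# Toy model: the only heap cycle in a pure language comes from a
# recursive closure's back-pointer, which we simply mark as untraced.
# Environment values are themselves heap objects (dicts) in this sketch.

def make_closure(code, env, recursive=False):
    clo = {"code": code, "env": dict(env),
           "traced_children": list(env.values())}
    if recursive:
        clo["env"]["self"] = clo  # the one cyclic edge...
        # ...deliberately NOT added to traced_children
    return clo

def reachable(obj):
    """Follow only traced edges; terminates even on recursive closures."""
    seen_ids, out, stack = set(), [], [obj]
    while stack:
        o = stack.pop()
        if id(o) in seen_ids:
            continue
        seen_ids.add(id(o))
        out.append(o)
        stack.extend(o.get("traced_children", []))
    return out
```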

So a very high-level property (purity) seems to imply something
about our low-level runtime (the gc strategy)! Proving this
works (and benchmarking it) is something I don't have room on my plate
for, but it's something I wish I could do...

Tuesday, December 9, 2014

Indeed, when we realize the usual axioms of mathematics, we need to introduce, one after the other, the very standard tools of system programming: for Peirce's law, these are continuations (particularly useful for exceptions); for the axiom of dependent choice, these are the clock and process numbering; for the ultrafilter axiom and the well-ordering of $\mathbb{R}$, these are no less than read and write instructions on a global memory, in other words assignment.