Sunday, December 8, 2013

Over the past couple of months I've had "a moment of clarity" regarding some problems with Disciple and DDC that have been bothering me for the last six years or so. Itemized, they include:

In the source language, the fact that every function type carries an effect and closure annotation upsets Haskell programmers. They want a function of type (a -> b) to be just a pure function. In Disciple, the function type constructor actually has two other parameters. We write (a -(e|c)> b) where (a) and (b) are the argument and return types as usual, (e) is the effect type of the function and (c) is the closure type. The effect type describes the side effects that the function might have, and the closure type describes what data might be shared via the function closure. Although DDC provides enough type inference that you don't need to write the effect and closure type if you don't want to, this still isn't cool with some people (for understandable reasons).

The fact that the function type has an added effect and closure parameter means that it has a different kind relative to the Haskell version. We've got (->) :: Data -> Data -> Effect -> Closure -> Data, where 'Data' is the (*) kind from Haskell. This means that Haskell programs which are sensitive to the kind of (->) can't be written directly in Disciple.

Even though Disciple has mutability polymorphism, there isn't a safe way to destructively initialize a mutable structure and then freeze it into an immutable one. In Haskell, the client programmer would use a function like (unsafeFreezeArray :: MutableArray a -> IO (Array a)) at their own peril, but you'd think in a language with mutability polymorphism we'd be able to do better than this.

Run and Box

As you might have hoped, all these problems are now solved in the head version of DDC (henceforth known as "new DDC"), and you can expect a new release sometime in January when it's properly baked. I've got the typing rules worked out and partially implemented, but the modified core language doesn't yet compile all the way to object code.
Here is an example in the core language:

The [r : Region] in the type signature means "forall", while [..] in the expression itself indicates a type argument. Things ending in # are primitives. The readRef# and writeRef# functions have the following types:

A type like (S e a) describes a suspended computation with effect (e) and result type (a). The S is a monadic type constructor, though in new DDC it is baked into the type system and not defined directly in the source language.
In the above program the 'run' keyword takes a computation type and "runs it" (yeah!), which will unleash the associated effects. The 'box' keyword takes an expression that has some effects and then packages it up into a suspended computation. Critically, the type checker only lets you abstract over pure expressions. This means that if your expression has a side effect then you must 'box' it when using it as a function body. This restriction ensures that all side effect annotations are attached to 'S' computation types, and the functions themselves are pure. The idea of using 'run' and 'box' comes from Andrzej Filinski's "monadic reification and reflection" operators, as well as the paper "A judgemental reconstruction of modal logic" by Frank Pfenning, both of which I wish I had known about a long time ago. Note that the mechanism that combines the two atomic effect types (Read r) and (Write r) into the compound effect (Read r + Write r) is part of the ambient type system. You don't need to hack up effect composition in the source language by using type classes or some such. In summary, new DDC gives you composable effect indexed state monads, with none of the pain of existing Haskell encodings.

This code creates a new region 'r2' which extends the existing region 'r1', and says that the new one supports allocation and writing. It then allocates a new reference, and destructively updates it before returning it. The overall type of the above expression is:

:: [r1 : Region].Unit -> S (Alloc r1) (Ref# r1 Nat#)

Note that the returned reference is in region 'r1' and NOT region 'r2'. When leaving the body of 'extend', all objects in the inner region (r2) are merged into the outer region (r1), so the returned reference here is in (r1). Also, as far as the caller is concerned, the visible effect of the body is that a new object is allocated into region r1. The fact that the function internally allocates and writes to r2 is not visible.
Importantly, we allow objects in the inner region (r2) to be destructively updated, independent of whether it is possible to destructively update objects in the outer region (r1). Once the body of the 'extend' has finished, all writes to the inner region are done, so it's fine to merge freshly initialized objects into the outer region, and treat them as constant from then on.

Coeffects

Of the previous example one might ask: what would happen if we also returned a function that can destructively update objects in region 'r2', even after the 'extend' has finished? Here we go:

Here, 'T2#' is the data constructor for 'Tuple2#'. The result tuple contains a reference in region r1, as well as a function that produces a computation that, when run, will destructively update that reference. The worry is that if the calling context wants to treat the returned reference as constant, then we can't allow the computation that updates it to ever be executed.

The solution is to say that (Write r1) is NOT an effect type -- it's a coeffect type! Whereas an effect type is a description of the action a computation may have on its calling context when executed, a coeffect type is a description of what capabilities the context must support to be able to execute that computation. It's a subtle, but important difference. A boxed computation with an effect can be executed at any time, and the computation does whatever it says on the box. In contrast, a boxed computation with a coeffect can only be executed when the bearer has the capabilities listed on the box. To put this another way, the effect type (Write r) says that the computation might write to objects in region 'r' when executed, but the coeffect type (Write r) says that the context must support writing to region 'r' for the computation to be executed. There is also a duality: "the context of a computation must support writing to region 'r', because when we execute that computation it might write to region 'r'". Most importantly, though, is to CHECK that the context supports some (co)effect before we run a computation with that (co)effect.

When we take the view of coeffects instead of effects (or as well as effects), then the problem of type safe freezing becomes straightforward. If there is no (Write r1) capability in the calling context, then the type system prevents us from executing computations which require that capability. Here is a complete example:

In the above example, 'private' introduces a new region that is not an extension of an existing one. In the case-expression we unpack the tuple and get a computation that wants to write to an object in region 'r'. Because we have not given 'r' a (Write r) capability, then the type system stops us from running the computation that needs it:

Yeah, you're right, I need to refine the examples. The first example came from a unit test showing that if you omit the use of 'box', then the type checker complains that you can't use an impure expression as a function body. I've kept the dummy Unit argument from this test, but it's not really necessary.

Hi Ben! This is an interesting example. Did you know that Tomas Petricek and I have been working on coeffect typing for a couple of years? (see our ICALP paper from last year: http://www.cl.cam.ac.uk/~dao29/publ/coeffects-icalp13.pdf and Tomas' recent blog post: http://tomasp.net/blog/2014/why-coeffects-matter/).I'm having a little bit of a hard time digesting the notation. The important thing about our coeffect system is that the abstraction rule can "split" the coeffects of a function body between immediate coeffects (context that must be available in the definition context) with latent coeffects (context that must be available in the calling context) (which contrasts with effects where all effects are latent, i.e. delayed to the calling site). In your example is fair to say the coeffects are all latent? Dominic