Saturday, March 06, 2010

When I wrote about memoizing polymorphic types I mentioned that you can think of forall a. F(a) as the product over all types a of F(a), where F is some type-level function. For example, F might be a type constructor like []. That's not completely accurate, as I hope to now explain. Along the way we should get some insight into the meaning of the limit of a functor in a category.

Suppose we have two types, A and B. We can form their product (A, B). We have the two projections fst and snd and if we have an element x in (A, B) we know that there is no necessary relationship between fst x and snd x. We can freely choose x so that each of fst x and snd x can take on any values we like in A and B.

But now consider an element of forall a. F(a). For each concrete type X we have a projection πX :: (forall a. F(a)) -> F(X). So it looks like a product over all types. However, we can't freely choose an element of forall a. F(a) so as to get any element of F(X) we like for each choice of X. To demonstrate this, consider an element x of the type forall a. [a]. For any choice of X we get a projection. For example, picking X to be Int or String gives:
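These two projections can be written out directly. Here is a sketch, with p1 and p2 as names for the instantiations at Int and String (the names are just for illustration):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical names p1 and p2 for the projections at Int and String:
p1 :: (forall a. [a]) -> [Int]
p1 x = x

p2 :: (forall a. [a]) -> [String]
p2 x = x

-- The only inhabitant of forall a. [a] is the empty list:
empty :: forall a. [a]
empty = []

main :: IO ()
main = print (map show (p1 empty) == p2 empty)  -- the free theorem, with f = show
```

Note that p1 and p2 do nothing but instantiate the quantifier; that is all a projection out of a polymorphic type can do.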

However, forall a. [a] comes with a free theorem. For this particular type it says that for any f :: X -> Y, fmap f (πX x) == πY x. For example, take f to be the well-known function show :: Int -> String, and write p1 and p2 for the projections at Int and String. Then the free theorem tells us that fmap show (p1 x) == p2 x, i.e. that this diagram commutes:

So if p1 x == [3] then p2 x == ["3"]. We have lost free choice. But we have lost a lot more freedom than this. We have a commuting triangle like this for absolutely any function f :: X -> Y. It should be clear that there is no way we can pick elements of our list to satisfy all of these constraints. So x must be the empty list.

Cones

This scenario of having one projection for each type has a name. It's an example of a cone. Let's borrow the definition from Wikipedia:

Let F : J → C be a functor and let N be an object of C. A cone from N to F is a family of morphisms, one for each object X of J,

πX:N→F(X)

so that for every morphism f : X → Y in J, the following diagram commutes (i.e. F(f) ∘ πX = πY):

For any Haskell functor F, the free theorem tells us that we have exactly these diagrams with forall a. F(a) playing the role of N. So forall a. F(a), with its projections to F(X), forms a cone.

Limits

If F is an instance of the Haskell Functor type class, i.e. an endofunctor on Hask, then the type forall a. F(a) gives us a cone. But not just any old cone. I don't know how to prove this, but I'm pretty sure the free theorems for a functor are the only non-trivial relations between F(X) and F(Y) that we're forced to obey. If that's true then, in a sense, forall a. F(a) is the "biggest" type satisfying the free theorems. We can make this more precise by saying that any cone N, with associated projections πX, can be mapped uniquely to forall a. F(a) so that the following diagram commutes:

This special kind of cone has a name. It's called a limit. In other words, ∀a. F(a) = lim F. In fact, this is exactly how Limit is defined in category-extras.
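Here is a sketch of how such a Limit type can be packaged in Haskell (modulo naming, this is essentially what category-extras does; the actual module and field names may differ):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A sketch of a Limit type; the quantifier is hidden behind a newtype.
newtype Limit f = Limit { runLimit :: forall a. f a }

-- The projection to any F(X) just instantiates the quantifier:
proj :: Limit f -> f x
proj (Limit x) = x

-- The universal property: any cone with tip n factors through Limit f.
factor :: (forall x. n -> f x) -> n -> Limit f
factor cone n = Limit (cone n)
```

The commuting triangles come for free: proj does nothing but instantiate, so composing factor with proj recovers the original cone.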

In a sense you can think of a limit of a functor in any category as being like a product for which a version of the free theorems for a functor holds.

Colimits

A dual story can be told for existential types and colimits. But to do this we need free theorems for existential types. I'll leave that until I've figured out a nice way to derive these free theorems...
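The dual packaging, at least, is easy to sketch (hypothetical names; this is just the obvious existential dualization of the Limit type above):

```haskell
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

-- Dual sketch: the colimit of f packaged as an existential.
data Colimit f = forall a. Colimit (f a)

-- The injection from any F(X):
inject :: f x -> Colimit f
inject = Colimit

-- Clients can only consume a Colimit with a uniform (polymorphic) function:
fold :: (forall a. f a -> r) -> Colimit f -> r
fold k (Colimit x) = k x
```

What's missing is the story about which equations fold is forced to respect; that's where the free theorems for existentials would come in.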

Final words

I should have written this article before I wrote one on coends. Think of it as a prequel.

The fact that we don't have complete freedom of choice when defining polymorphic elements in Haskell is what we mean by 'parametric' polymorphism. Instead of specifying one individual value for each type, we define the elements in a uniform way. In a language like C++ we can use template specialisation to freely construct a rule for getting a value from a type using 'ad hoc' polymorphism. That freedom comes at a price: it becomes harder to reason about polymorphism.

It's amazing that many definitions in category theory emerge naturally (pun fully intended) from the free theorems. I keep hoping that one day I'll find a paper on exactly what is going on here that I understand. The original free theorems paper is very uncategorical in its language.

18 comments:

We can describe cones in a nicely categorical way: simply as natural transformations.

A cone N over the functor F is the same thing as a natural transformation KN → F, where KN is the constantly-N functor. Now L is a limit of a functor F if Hom(-,L) ≅ Nat(K-,F), which is to say, if L represents the cone functor. This can be shown to be equivalent to the definition of limit in your post.

Nat(K-,F) is ∀a.K b a → f a or just ∀a.b → f a in Haskell. Then, essentially by continuity of Hom or by continuity of right adjoints, we can move the ∀ inwards and get: b → ∀a.f a, which, by Yoneda, means L is isomorphic to ∀a.f a. A similar manipulation can be done for colimits, and indeed, this was how Kan extensions, (co)ends, and (co)limits were derived for category-extras.
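The two directions of that manipulation can be sketched in Haskell, packaging the limit in a newtype (hypothetical names inward and outward):

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Limit f = Limit { runLimit :: forall a. f a }

-- Nat(K_b, f) is forall a. b -> f a; moving the forall inward gives b -> Limit f:
inward :: (forall a. b -> f a) -> b -> Limit f
inward nat b = Limit (nat b)

-- and back out again:
outward :: (b -> Limit f) -> b -> f a
outward g b = runLimit (g b)
```

These are mutually inverse, which is the Hom(b, L) ≅ Nat(K_b, F) isomorphism in code.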

Suppose i has type (∃a. a → a). For example, it might be (+1), or (++"foo"). Then there exists a function f of type X → Y, for some types X and Y, such that f (i x) == i (f x). f = id, for example, does the trick.

Similarly, suppose xs has type (∃a. [a]). For example, it might be [1, 2, 3], or ["foo", "bar"]. Then there exists a function f of type X → Y, for some types X and Y, such that fmap f xs == xs. f = id, for example, does the trick.

I've already guessed the free theorems for those cases. A more interesting case is exists a.(a->X, F a) for functor F which I mentioned in my coend post. It gets more interesting for exists a.X a where X is an interesting algebraic structure of some sort.

In each case, I have a handwavey argument for what the free theorem should be. But I haven't derived them formally yet.

In some cases I think they are useful, but they encode knowledge about the notion of an abstract interface that object-oriented programmers already use informally.

Wait a minute; it's even worse than I thought. We're not free to functionalize the relation since all we know is that there exists some relation such that so and so. Here is a detailed example of the mistake.

The free theorem for (1) is (2), which we can specialize to (3) by instantiating R as in (4).

So can you derive that theorem about ∃a.(a,a->Z) from the free theorem theorem?

I think these theorems for existentials are non-trivial and useful. They encode the folk knowledge that if a type is hidden behind an interface, and you can replace the type with an isomorphic one, people in front of the interface won't be able to tell. The free theorem theorem should tell you the precise meaning of 'isomorphism' for each type. (Actually, it doesn't have to be an *iso*morphism, just a morphism that doesn't lose too much information.)
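That folk knowledge is easy to illustrate for ∃a.(a, a→Z). Here is a sketch with hypothetical names, taking Z to be String: two different hidden representations, related by the morphism show, are indistinguishable to clients of the interface.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

type Z = String

-- exists a. (a, a -> Z), with hypothetical names
data Pack = forall a. Pack a (a -> Z)

-- all a client can do is apply the hidden function to the hidden value
observe :: Pack -> Z
observe (Pack x f) = f x

-- two representations related by the morphism show :: Int -> String
p, q :: Pack
p = Pack (3 :: Int) show
q = Pack "3" id

main :: IO ()
main = print (observe p == observe q)
```

The morphism here (show) is not an isomorphism, but it loses no information that observe can detect, which is exactly the weaker condition mentioned above.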

@Dan: The free theorem for () is that () == (). This is because constant types, like Bool and (), are read as identity relations, which are only satisfied when both arguments are equal (see Wadler's paper, page 5, first paragraph).

I need to distinguish the A type from the A constructor, so let me give them distinct names.

data T = forall a. C (a, a->Z)

The theorem you gave is the free theorem for C, not T. The fact that T is an existential type plays no role in the derivation of the free theorem, since the type of C begins with a universal quantifier. If it is theorems like the one you gave which you want to derive, it suffices to follow the technique given in Wadler's paper, treating T as a constant type.

For example, here is how to derive the theorem you gave using parametricity.

The free theorem for (1) is (2), which can be specialized to the theorem you gave, (3), by instantiating R as in (4).

Yes, I'd got as far as realizing these were theorems for the constructor type. But when I used the free theorem generator I was using it was taking me a long time to interpret the result. I just tried @free on #haskell and I'm finding it much easier to parse the result. I'll have to install the underlying generator.

I guess the interesting properties of existentials may be related to free theorems about their 'constructors'. For instance, one likes to think of (exists a. a) as isomorphic to the unit type. And the free theorem for:

c :: forall a. a -> R

is c = c . f, which if we take f to be of the form 'const y', can turn into

forall x, y. c x = c y

which, if c is the 'constructor' for exists a. a, states that all elements of that type are equal.
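That conclusion is easy to spot-check; here c is a hypothetical inhabitant of forall a. a -> R with R = Int (up to non-termination, any total inhabitant must ignore its argument like this):

```haskell
{-# LANGUAGE RankNTypes #-}

-- c can't inspect a value of unknown type, so it must be constant
c :: forall a. a -> Int
c _ = 42

main :: IO ()
main = print (c True == c "a value of a different type")
```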

forall a. (a -> a) -> R

comes with a similar theorem, although it has a precondition that might limit it from saying that all elements are equal (I'm not really sure).

Dan, restricting yourself to the functional specialization of the free theorems is slowing you down.

The free theorem for (1) is (2), which can indeed be specialized to (3) by instantiating R as in (4). But if you instead let R be always satisfied, as in (5), you immediately get (6), a proof that c always returns the same value.

The second type (7) you state also guarantees that results are always the same, but as you have noticed, the functional specialization (9) of its free theorem (8) hides this fact behind an assumption. If you once again let R be always satisfied, as in (5), you easily obtain (10), a proof that d always returns the same value.
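A quick way to see this in code: here d is a hypothetical inhabitant of forall a. (a -> a) -> R with R = Int. With no value of type a in hand, a total d has nothing to apply its argument to, so it must be constant.

```haskell
{-# LANGUAGE RankNTypes #-}

-- d cannot usefully use its argument, so it ignores it
d :: forall a. (a -> a) -> Int
d _ = 7

main :: IO ()
main = print (d id == d not)
```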

Although it has no diagrams, all the constructions are essentially categorical. And the free theorem for existentials is at the bottom of p. 11 below Thm 7—see the brief paragraph broken over pp. 11–12.

Moreover, this paper finally enabled me to understand parametricity (i.e. Reynolds' "Abstraction Theorem", which Wadler renamed the "Parametricity Theorem"). Only after reading Plotkin and Abadi was I able to make sense of Reynolds' original papers and Wadler's follow up. In case anyone finds it helpful, I also have an annotated bibliography for these papers from a course last spring.