A Tour of Some Useful Recursive Types

I’m presently at NIPS and so felt like writing about some
appropriate machine learning topic, but along the way I wound up talking about
parameterized recursive types, and here we are. Enjoy!

One starts to see common ‘shapes’ in algebraic data types after working with
them for a while. Take the natural numbers and a standard linked list, for
example:

dataNatural=One|SuccNaturaldataLista=Empty|Consa(Lista)

These are similar in some sense. There are some differences - a list has an
additional type parameter, and each recursive point in the list is tagged with
a value of that type - but the nature of the recursion in each is the same.
There is a single recursive point wrapped up in a single constructor, plus a
single base case.

Consider a recursive type that is parameterized by a functor with kind ‘* ->
*’, such that the kind of the resulting type is something like ‘(* -> *) ->
*’ or ‘(* -> *) -> * -> *’ or so on. It’s interesting to look at the
‘shapes’ of some useful types like this and see what kind of similarities and
differences in recursive structure that we can find.

In this article we’ll look at three such recursive types: ‘Fix’, ‘Free’, and
‘Cofree’. I’ll demonstrate that each can be viewed as a kind of program
parameterized by some underlying instruction set.

Fix

To start, let’s review the famous fixed-point type ‘Fix’. I’ve talked about it
before, but will go into a bit more detail here.

Note: I’ll omit interpreter output for examples throughout this article, but
feel free to try the code yourself in GHCi. I’ll post some gists at the
bottom. The above code block also contains some pragmas that you can ignore;
they’re just there to help GHC derive some instances for us.

Anyway. ‘Fix’ is in some sense a template recursive structure. It relies on
some underlying functor ‘f’ to define the scope of recursion that you can
expect a value with type ‘Fix f’ to support. There is the degenerate constant
case, for example, which supports no recursion:

Here ‘NatF’ provides both a ‘running’ case - ‘SuccF’ - and a ‘terminating’ case
in - ‘OneF’. ‘Fix’ just lets ‘NatF’ do whatever it wants, having no say of its
own about termination. In fact, we could have defined ‘Fix’ like this:

dataProgramf=Running(f(Programf))

Indeed, you can think of ‘Fix’ as defining a program that runs until ‘f’
decides to terminate. In turn, you can think of ‘f’ as an instruction
set for the program. The whole shebang of ‘Fix f’ may only terminate if ‘f’
contains a terminating instruction.

Richness of ‘Fix’

It’s worthwhile to review two functions that are useful for working with ‘Fix’,
unimaginatively named ‘fix’ and ‘unfix’:

fix::f(Fixf)->Fixffix=Fixunfix::Fixf->f(Fixf)unfix(Fixf)=f

You can think of them like so: ‘fix’ embeds a value of type ‘f’ into a
recursive structure by adding a new layer of recursion, while ‘unfix’ projects
a value of type ‘f’ out of a recursive structure by peeling back a layer of
recursion.

This is a pretty rich recursive structure - we have a guarantee that we can
always embed into or project out of something with type ‘Fix f’, no matter
what ‘f’ is.

Free

Next up is ‘Free’, which is really just ‘Fix’ with some added structure. It is
defined as follows:

There’s also another branch here called ‘Pure’, though, that just bluntly wraps
a value of type ‘a’, and has nothing to do with the parameter ‘f’. This has an
interesting consequence: it means that ‘Free’ can have an opinion of its own
about termination, regardless about what ‘f’ might decree:

(Note that here I’ve returned the value of type unit when terminating under the
‘Pure’ branch, but you could pick whatever else you’d like.)

You’ll recall that ‘InfiniteF’ provides no terminating instruction,
and left to its own devices will just recurse endlessly.

So: instead of being forced to choose a branch of the underlying functor to
recurse on, ‘Free’ can just bail out on a whim and return some value wrapped up
in ‘Pure’. We could have defined the whole type like this:

dataProgramfa=Running(f(Programfa))|TerminatedaderivingFunctor

Again, it’s ‘Fix’ with more structure. It’s a program that runs until ‘f’
decides to terminate, or that terminates and returns a value of type ‘a’

As a quick illustration, take our simple stack-based instruction set again. We
can define the following embedded language terms:

Richness of ‘Free’

Try to define the equivalent versions of ‘fix’ and ‘unfix’ for ‘Free’. The
equivalent to ‘fix’ is easy:

free::f(Freefa)->Freefafree=Free

You’ll hit a wall, though, if you want to implement the (total) analogue to
‘unfix’. One wants a function of type ‘Free f a -> f (Free f a)’, but the
existence of the ‘Pure’ branch makes this impossible to implement totally. In
general there is not going to be an ‘f’ to pluck out:

unfree::Freefa->f(Freefa)unfree(Freef)=funfree(Purea)=error"kaboom"

The recursion provided by ‘Free’ is thus a little less rich than that provided
by ‘Fix’. With ‘Fix’ one can always project a value out of its recursive
structure - but that’s not the case with ‘Free’.

It’s well-known that ‘Free’ is monadic, and indeed it’s usually called the
‘free
monad’.
The namesake ‘free’ comes from an algebraic definition; roughly, a free ‘foo’
is a ‘foo’ that satisfies the minimum possible constraints to make it a ‘foo’,
and nothing else. Check out the
slides
from Dan Piponi’s excellent talk from Bayhac a few years back for a deeper dive
on algebraic freeness.

Cofree

‘Cofree’ is also like ‘Fix’, but again with some extra structure. It can be
defined as follows:

(Note that here I’ve again used unit to fill in the first field - you could
of course choose whatever you’d like.)

This looks a lot like ‘Free’, and in fact it’s the categorical dual of
‘Free’. Whereas ‘Free’ is a sum type with two branches, ‘Cofree’ is a
product type with two fields. In the case of ‘Free’, we could have a program
that either runs an instruction from a set ‘f’, or terminates with a value
having type ‘a’. In the case of ‘Cofree’, we have a program that runs an
instruction from a set ‘f’ and returns a value of type ‘a’.

A ‘Free’ value thus contains at most one recursive point wrapping the value
with type ‘a’, while a ‘Cofree’ value contains potentially infinite recursive
points - each one of which is tagged with a value of type ‘a’.

Rolling with the ‘Program’ analogy, we could have written this alternate
definition for ‘Cofree’:

A ‘Cofree’ value is thus a program in which every instruction is annotated with
a value of type ‘a’. This means that, unlike ‘Free’, it can’t have its own
opinion on termination. Like ‘Fix’, it has to let ‘f’ decide how to do that.

We’ll use the stack-based instruction set example to wrap up. Here we can
annotate instructions with progress about how many instructions remain to
execute. First our new embedded language terms:

Richness of ‘Cofree’

If you try and implement the ‘fix’ and ‘unfix’ analogues for ‘Cofree’ you’ll
rapidly infer that we have the opposite situation to ‘Free’ here. Implementing
the ‘unfix’ analogue is easy:

uncofree::Cofreefa->f(Cofreefa)uncofree(Cofree_f)=f

But implementing a total function corresponding to ‘fix’ is impossible - we
can’t just come up with something of arbitrary type ‘a’ to tag an instruction
‘f’ with, so, like before, we can’t do any better than define something
partially:

Just as how ‘Free’ forms a monad, ‘Cofree’ forms a comonad. It’s thus known as
the ‘cofree comonad’, though I can’t claim to really have any idea what the
algebraic notion of ‘cofreeness’ captures, exactly.

Wrapping Up

So: ‘Fix’, ‘Free’, and ‘Cofree’ all share a similar sort of recursive structure
that make them useful for encoding programs, given some instruction set. And
while their definitions are similar, ‘Fix’ supports the richest recursion of
the three in some sense - it can both ‘embed’ things into and ‘project’
things out of its recursive structure, while ‘Free’ supports only embedding and
‘Cofree’ supports only projecting.

This has a practical implication: it means one can’t make use of certain
recursion schemes for ‘Free’ and ‘Cofree’ in the same way that one can for
‘Fix’. There do exist analogues, but they’re sort of out-of-scope for this
post.

I haven’t actually mentioned any truly practical uses of ‘Free’ and ‘Cofree’
here, but they’re wonderful things to keep in your toolkit if you’re doing any
work with embedded languages, and I’ll likely write more about them in the
future. In the meantime, Dave Laing wrote an excellent series of
posts on ‘Free’ and
‘Cofree’ that are more than worth reading. They go into much more interesting
detail than I’ve done here - in particular he details a nice pairing that
exists between ‘Free’ and ‘Cofree’ (also
discussed by Dan
Piponi), plus a whack of examples.

You can also find industrial-strength infrastructure for both ‘Free’ and
‘Cofree’ in Edward Kmett’s excellent
free library, and for ‘Fix’ in
recursion-schemes.

I’ve dumped the code for this article into a few gists.
Here’s one of everything
excluding the running ‘Program’ examples, and here are the corresponding
‘Program’ examples for the
Fix,
Free, and
Cofree cases
respectively.