But, first things first: if you’d like to install the library and play along, just

cabal update; cabal install species

(Man, do I ever love cabal-install! But I digress.)

Combinatorial what?

So, what are combinatorial species? Intuitively, a species describes a certain combinatorial structure: given an underlying set of labels, it specifies a set of structures which can be built using those labels. For example, L, the species of lists, when applied to an underlying set of labels U yields the set of all linear orderings of the elements of U. So in general a species can be viewed as a function which takes any set (the labels) and produces another set (of structures built out of those labels).
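To make the "function from labels to structures" view concrete, here is a minimal sketch in plain Haskell. It does not use the species library; the name `listsOn` is hypothetical, and a set of labels is approximated by a Haskell list with no repeated elements.

```haskell
import Data.List (permutations)

-- Hypothetical stand-in for the species of lists: given the labels,
-- produce every linear ordering of them. The input list is assumed
-- to contain no duplicates, so that it can play the role of a set.
listsOn :: [a] -> [[a]]
listsOn = permutations
```

For instance, `listsOn "abc"` should produce the six orderings of the labels a, b and c.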

The species L of lists.

Actually, this isn’t quite enough to capture our intuition about what a species ought to be: we want a species to work “independently” of the underlying set; which labels we use shouldn’t matter at all when it comes to describing structures. So, additionally, we require that if F is a species, any bijection σ : U → T between two sets of labels can be “lifted” to a bijection between sets of F-structures, F[σ] : F[U] → F[T], in a way that respects composition of bijections. (Of course, the categorists ought to be jumping out of their seats right now: all this just amounts to saying that a species is an endofunctor on the category of sets with bijections.) Importantly, it is not too hard to see that this requirement means that for any species F, the size of F[U] depends only on the size of U, and not on the actual elements of U.
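For the species of lists, lifting a bijection is just relabelling every element of a list. The following sketch (plain Haskell, not the species library; `listsOn`, `liftLists`, `sigma` and `transportsStructures` are hypothetical names) checks on one small example that relabelling carries the list structures on one label set exactly onto those on the other.

```haskell
import Data.List (permutations, sort)

-- Hypothetical stand-in for the species of lists.
listsOn :: [a] -> [[a]]
listsOn = permutations

-- Lift a relabelling to list structures by applying it to every label.
-- Since map (g . f) == map g . map f, lifting respects composition.
liftLists :: (a -> b) -> [a] -> [b]
liftLists = map

-- A sample bijection between the label sets {1,2,3} and {'a','b','c'}.
sigma :: Int -> Char
sigma 1 = 'a'
sigma 2 = 'b'
sigma _ = 'c'

-- Relabelling the lists on {1,2,3} gives exactly the lists on "abc",
-- so in particular both sets of structures have the same size.
transportsStructures :: Bool
transportsStructures =
  sort (map (liftLists sigma) (listsOn [1, 2, 3])) == sort (listsOn "abc")
```

This is only a spot check on one bijection, not a proof, but it illustrates why the number of structures can depend only on how many labels there are.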

Counting labelled structures

So, let’s see some examples already! What sorts of things might we want to compute about species?

First, we of course want to be able to count how many structures are generated by a species. As a first example, consider again the species L of lists. Given an underlying set U of size n, how many lists are there? That’s easy: n!.

The function labelled takes a combinatorial species F as an argument, and computes an infinite list where the entry at index n is the number of labelled F-structures on an underlying set of size n.
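As an independent sanity check that does not call the library at all, the counts for the species of lists can be sketched as an infinite list of factorials and compared against brute-force enumeration. The names `labelledListCounts` and `bruteForceCount` are hypothetical; this shows what one would expect such a count to be, not how the library computes it.

```haskell
import Data.List (permutations)

-- Entry n is n!, the number of linear orderings of n distinct labels.
labelledListCounts :: [Integer]
labelledListCounts = scanl (*) 1 [1 ..]

-- Count the list structures on n labels by enumerating all of them.
-- Only practical for small n.
bruteForceCount :: Int -> Integer
bruteForceCount n = fromIntegral (length (permutations [1 .. n]))
```

Here `take 10 labelledListCounts` evaluates to `[1,1,2,6,24,120,720,5040,40320,362880]`, and `map bruteForceCount [0..6]` agrees with the first seven entries.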

(This is also a good time to mention that the species library depends on the Numeric Prelude, an alternative Haskell Prelude with a mathematically sane hierarchy of numeric types; hence we must pass ghci the -XNoImplicitPrelude flag so we don’t get lots of horrible name clashes. I’ll write some additional thoughts on the Numeric Prelude in a future post.)

Now, so far this is nothing new: Dan Piponi wrote a blog post about a Haskell DSL for counting labelled structures back in 2007, and in fact, that post was part of my inspiration for this library. Counting labelled structures works by associating exponential generating functions to species. (More on this in a future post.) But we can do more than that!

Counting unlabelled structures

For one, we can also count unlabelled structures. What’s an unlabelled structure? Intuitively, it’s a structure where you can’t tell the difference between the elements of the underlying set; formally, it’s an equivalence class of labelled structures, where two labelled structures are equivalent if one can be transformed into the other by permuting the labels.

So, how about unlabelled lists?

> take 10 $ unlabelled lists
[1,1,1,1,1,1,1,1,1,1]

Boring! This makes sense, though: there’s only one way to make a list out of n identical objects.
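The equivalence-class definition can also be checked by brute force for small sizes. The sketch below is plain Haskell rather than the species library, and all of its names are hypothetical. It picks a canonical representative for each labelled structure (the smallest relabelling under every permutation of the labels) and counts the distinct representatives. Structures are assumed to be representable as lists of labels, which is enough for the species of lists.

```haskell
import Data.List (nub, permutations)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the species of lists.
listsOn :: [Int] -> [[Int]]
listsOn = permutations

-- Apply a relabelling, given as (old, new) pairs, to a structure.
relabel :: [(Int, Int)] -> [Int] -> [Int]
relabel sigma = map (\x -> fromMaybe x (lookup x sigma))

-- Canonical representative of a structure's equivalence class:
-- the least structure reachable by permuting the labels.
canonical :: [Int] -> [Int] -> [Int]
canonical labels s =
  minimum [relabel (zip labels p) s | p <- permutations labels]

-- Number of equivalence classes of structures on the labels 1..n.
unlabelledCount :: ([Int] -> [[Int]]) -> Int -> Int
unlabelledCount species n =
  length (nub (map (canonical labels) (species labels)))
  where
    labels = [1 .. n]
```

With these definitions, `map (unlabelledCount listsOn) [0..5]` gives `[1,1,1,1,1,1]`, matching the start of the output above. The enumeration grows factorially, so it is only usable for very small n.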

This is a bit magical, and of course I will… explain it in a future post. For now, I leave you with this challenge: can you figure out what the asterisks are doing there? (Hint: the curly brackets denote a cycle…)

Of course, no DSL would be complete without operations with which to build up more complicated structures from simpler ones; in my next post I’ll talk about operations on combinatorial species.

I got the following error while installing:
[10:52 PM]$ sudo cabal install species
Resolving dependencies…
cabal: cannot configure unamb-0.2.2. It requires base ==4.*
There is no available version of base that satisfies ==4.*
Any ideas on how to solve this?

What version of ghc do you have? base-4 comes with ghc-6.10.x, so it appears that (for now) the species library will only compile with ghc-6.10.1 or later. I may not ultimately need the unamb dependency, though.

It defines a “fan”: “fan.F of a datatype F is as a non-deterministic program that, given a so-called seed, constructs an arbitrary F structure in which the only stored value is the seed.” Which sounds really similar to what your unlabelled does.