Sunday, November 11, 2007

I thought it'd be fun to dig up my first ever Haskell project from about 2 years ago. I wrote a little about part of it before but I didn't write about what I did with it. Apologies for some of the formatting, I haven't yet got around to updating every line of this code. But it definitely still works. (As usual with old code, I had to fix it in places. How is it that old code ceases to work? Is there some kind of bit rot that sets in if you leave code unused for more than a couple of years?)

And as usual, this post is written in literate Haskell so you can just run it interacively in ghci.

> import Ratio

The idea here was to define a little DSL in Haskell for counting the size of a variety of combinatorial objects. Rather than start with the theory, I'm going to look at some examples in action first.

The idea is that I'll be considering different ways to structure the set {1,2,...,n} (with n=0 meaning the empty set). Let's start with the simplest example: the singleton set with one element. I'll denote this by the symbol z. Consider how many ways we can structure the set {1,2,...,n} as a one element set. It's pretty obvious that you can only structure this set as a one element set if it has one element. So if we use count to count the number of ways of doing things, we expect count 1 z = 1 and count n z = 0 for n equal to anything else.

I haven't told you what value z has, just how it responds to the count function. You can now try evaluating sample ex0.

Now consider the next function in our DSL, list. list a gives the number of ways to form a list of a's from {1,...,n}. A good example is simply

> ex1 :: Fractional a => [a]> ex1 = list z

ie. the number of ways to form a list of n http://sigfpe.blogspot.com/elements. For example, for n=3 we have [1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]. Obviously count n (list z) is n!.

Many of the examples I'm going to look at are listed in Sloane's famous encyclopedia of integer sequences. n! is listed as A000142.

Next consider pair a which is the number of ways of forming an ordered pair of a's. For example, the pairs of lists from {1,2,3} are ([],[1,2,3]),([1],[2,3]),([2],[1,3]),([3],[1,2]),([1,2],[3]) and so on. With a little work you should count 24 of those. This is given by

> ex2 :: Fractional a => [a]> ex2 = pair (list z)

Conversely we can consider lists of pairs. Clearly this is zero for n odd. For n=4 we get terms like [(1,2),(3,4)], [(1,3),(2,4)] and so on. Again there are 24. The code is:

> ex3 :: Fractional a => [a]> ex3 = list (pair z)

This DSL also supports constraints. For example we can redo the pair of lists example with the proviso that the lists are both non-empty:

> ex4 :: Fractional a => [a]> ex4 = pair (nonEmpty (list z))

More generally the // operator can provide any numerical constraint. For example:

> ex5 :: Fractional a => [a]> ex5 = pair ((list z) // (>=3))

Gives the ways of forming pairs of lists of length >=3. Clearly this is zero for n<6 and for n=6 we just get 6!

At this point I need to point out that these examples all run petty quickly so we're not simply generating and counting.

The next function is ring. A ring is like a list except that cyclic permutations are considered equal. So [1,2,3], [2,3,1] and [3,1,2] are the same ring. How many rings are there on {1,2,...,n}?

> ex6 :: Fractional a => [a]> ex6 = ring z

If you evaluate sample ex6 you may recognise that it's simply (n-1)!.

The next useful function is set. set a counts the number of sets of a's you can form from {1,...,n}. Clearly count n (set z) is 1 as there's only one set you can form from {1,...,n}. So here's a nice example:

> ex7 :: Fractional a => [a]> ex7 = set (nonEmpty (ring z))

That's counting the ways of forming {1,...,n} into sets of non-empty rings. If you apply sample you may spot that we have n! again. There's a really neat explanation why. Every permutation of {1,...,n} is a product of disjoint cycles. The order of the product doesn't matter so we have a set (rather than a list) of disjoint cycles. A cycle is obviously non-empty and cycles are just rings. So ex7 counts permutations. And the number of permutations on {1,...,n} is n!.

This is the number of ways of partitioning {1,...,n} into sets of non-empty sets. For n=3 we get {{1,2,3}},{{1},{2,3}},{{2},{1,3}},{{3},{1,2}} and {{1},{2},{3}}. In other words, 5.

Another useful function is oneOf. oneOf a b means forming {1,...,n} into either an a or a b.

For example we can count the ways to split a set into a set of singletons and pairs. For n=3 we have {1,(2,3)},{2,(1,3)},{3,(1,2)}, the previous three with the pair flipped and {1,2,3}, in other words count 3 ex9 is 7:

Thirteen examples is enough for anyone. So how does it work? I haven't time to give a crash course on generating functions but I can at least sketch the idea. In particular, exponential generating functions. The idea is this: if I have a sequence a0,a1,a2,... then its exponential generating function is:

a0+a1z+a2z2/2!+a3z3/3!+...

So we can represent any sequence of integers as a power series. For some sequences this series doesn't converge. But that's not a problem. We can still formally manipulate the series as long as we don't try to substitute an actual real number for z. So we call these formal power series. The amazing fact is this: the sorts of operations we might perform on power series when performing ordinary arithmetic (adding hem, multiplying them, finding their logarithms and so on) turn out to be exactly the kind of operations you would perform on them to solve combinatorial problems. So many combinatorial problems can be reduced to finding power series for various arithmetical combinations of other power series.

So now we need some code to manipulate power series. This is such a popular Haskell example that I'll breeze through the code and give some links to other people's documentation. I represent power series simply as a list of coefficients.

When computing with formal power series we frequently require products of the form (a0+a1*t+a2*t2+...)*0 We know the result is zero no matter what the ai are and by immediately returning zero we can eliminate bottomless recursions. We use the operator *! for this purpose. !* is the symmetric counterpart of *!

> (*!) _ 0 = 0> (*!) a b = a*b

> (!*) 0 _ = 0> (!*) a b = a*b

We represent formal power series over an algebraic structure a as infinite list of type a.

Addition and subtraction are straightforward pointwise applications of addition and subtraction in R.

> (^+) a b = zipWith (+) a b> (^-) a b = zipWith (-) a b

Multiplication itself is achieved through the operation of convolution:

Note how that function fails if the series for g starts with anything other than zero. A few of my operations will be like that. They correspond to situations where the operation doesn't make sense on a power series (at least no without some additional structure or assumptions).

We apply transcendental functions to formal power series by finding suitable integro-differential representations of the function. For exp consider:d/dt exp(f(t)) = f'(t)*exp(f(t))We can immediately writeexp(f(t)) = integral f'(t)*exp(f(t))This gives a recursive definition of the exponential of a power series.

The inverse trignometric functions are similarly straightforward except for the inverse cosine. The inverse cosine at zero is transcendental and so might not exist in R. On the other hand there is no power series for acos(1+x). So we can give no formal power series for this function.

t is the generator of the formal power series. t' is the generator for formal power series over the rationals.

> t :: (Num a) => ([a])> t = 0:1:repeat 0

> t' :: [Rational]> t' = t

The ... operator is used to specify a formal power series with known leading terms. If l is a list of n elements then the first n elements of l ... f are given by the elements of l. The remaining terms are those of l.

> lead [] x = x> lead (a:as) x = a : (lead as (tail x))

> a ... x = lead a x

And now we can get back to combinatorics. The definitions of the functions I gave above are stunningly simple:

Now for some more examples. A tree is, by definition, a single parent node and a set of child trees. So we have the recursive definition:

> tree x = p where p = [0] ... union (set p) x

The [0] ... is to get the recursion started. Try sample (tree z) to count the number of trees.

The last example is neat. graph is the exponential generating function of the number of ways of making a graph from {1,...,n}. (A006125) It's not too hard to see what's going on: a graph on a set S is given by a set of edges. In other words it's a subset of the set of unordered pairs of elements of S. If S has size n, there are n(n-1)/2 such pairs.

exp has a really amazing property. If f is the expnential power series corresponding to counting some structure, say F, on {1,...,n} then exp(f) is the power series of the structure consisting of the union of a bunch of non-zero sized F's. We can put this into practice: all graphs can be decomposed as the union of connected graphs. So the generating function of connected graphs is given by

By the way, this plays a big role in quantum field theory where we have to enumerate Feynman diagrams.

Anyway, if you didn't previously know about generating functions I hope this whets your appetite to read up on them some more. They are an almost magical part of mathematics that makes the solution of some very difficult seeming combinatorial problems almost trivial. They also have countless real world applications ranging from statistics to digital signal processing.

6 comments:

Wow! This is great. I've been interested in Haskell and have been learning it, but until now I haven't found a way to directly apply it to my current research on the combinatorics of matchings. Your example 9 is just what I need.

Hrm, your post isn't working in ghci, which complains about the second line in the definition of inverse, invert, and sqrt' (and probably others). The underscores seem to be causing the difficulty; do I need to use a special GHC flag? I'm running 6.6.1 on Ubuntu Gutsy.

blogger.com seems to chew up my spaces. I've already tried to fix this once but if the code still doesn't work, simply insert extra spaces at the beginning of each line that fails to compile. Just keep inserting spaces until each line does compile :-)