Cartesian Product

September 6, 2013

Since our Standard Prelude includes list comprehensions, that’s easy to translate to Scheme. It gets a little bit uglier because Scheme assembles the arguments to a function into a list when the function is defined with define’s dot-notation, which is the only way to accept a variable number of arguments, and the recursive call has to pick apart the pieces of the list with apply so the define can package them up again:
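For comparison, the same recursive idea can be sketched in Haskell, the language used further down this page. This is only an illustration, not the Scheme code: it takes a list of lists instead of a variable number of arguments, and the name cartRec is made up here.

```haskell
-- Recursive Cartesian product over a list of lists.
-- cartRec is a hypothetical name chosen for this sketch.
cartRec :: [[a]] -> [[a]]
cartRec []         = [[]]  -- the empty product has exactly one (empty) tuple
cartRec (xs : xss) = [x : rest | x <- xs, rest <- cartRec xss]
```

With this definition, cartRec [[1,2],[3,4]] evaluates to [[1,3],[1,4],[2,3],[2,4]].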

We turn now to the iterative version. The basic idea is similar to the odometer in your car, except that instead of counting from 0 to 9, each digit counts from 0 to one less than the length of its list; at each step the odometer digits give the index of the item in each list that appears in the current element of the cross product. I find it best to think about this as counting in a mixed-radix positional number system: one function converts a number to a list of mixed-radix digits, and a second function counts up from zero and outputs the list elements corresponding to the mixed-radix digits:
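The odometer idea can be sketched in Haskell as follows. This is a minimal illustration under stated assumptions, not the Scheme code: the names toDigits and cartIter are invented here, the radices are taken to be the lengths of the input lists, and list indexing with (!!) is used for clarity rather than speed.

```haskell
-- Convert n to mixed-radix digits, most significant digit first.
-- Each radix is the length of the corresponding input list.
toDigits :: [Int] -> Int -> [Int]
toDigits radices n = snd (foldr step (n, []) radices)
  where
    -- Peel off the least significant digit first, working right to left.
    step r (q, ds) = (q `div` r, q `mod` r : ds)

-- Count from 0 to (product of lengths - 1) and use the digits of each
-- number as indices into the input lists.
cartIter :: [[a]] -> [[a]]
cartIter xss =
  [zipWith (!!) xss (toDigits radices n) | n <- [0 .. total - 1]]
  where
    radices = map length xss
    total   = product radices  -- 0 if any list is empty, so nothing is produced
```

For example, toDigits [2,3] 5 is [1,2], and cartIter [[1,2],[3,4]] is [[1,3],[1,4],[2,3],[2,4]].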

The cartesian product function in the Standard Prelude, called cross, uses a functional pearl collected by Mike Spivey that uses the higher-order function fold-right so that neither recursion nor iteration is apparent in the source code:

Haskell doesn’t really have functions with variable numbers of arguments, and faking such is more for programmers of Okasaki’s caliber than mine, so I’ll just have it take a list of lists. I started out with a recursive solution, then realized I could make it a little prettier:

cart2 xs ys = [x:y | x <- xs, y <- ys]
cart :: [[a]] -> [[a]]
cart ll = foldr cart2 [[]] ll

Then I looked on the Internet for advice. Oddly enough, it turns out that Haskell has Cartesian products in the Prelude, but not in an obvious way.

cart :: [[a]] -> [[a]]
cart = sequence

does the trick. It took me a while to figure out how it works, so I figured I’d type up my reasoning below:
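One plausible shape for a definition of sequence, specialized to lists, is sketched here. It is an assumption consistent with the names used in the reasoning (m, m', x, xs), not a quotation of any library source, and it is renamed sequenceL to avoid clashing with the Prelude.

```haskell
-- sequence written with foldr and do notation, specialized to lists.
-- sequenceL is a hypothetical name for this sketch.
sequenceL :: [[a]] -> [[a]]
sequenceL ms = foldr k (return []) ms
  where
    k m m' = do
      x  <- m        -- pick an element of the current list
      xs <- m'       -- pick an already-built tail
      return (x : xs)
```

For example, sequenceL ["ab","cd"] is ["ac","ad","bc","bd"].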

In particular, note that return [] is actually the same as [[]]. Also, for lists, we can translate the do notation directly to list comprehension notation, so the above definition for sequence, specialized to lists, is written much more transparently thus:
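A sketch of what such a list-comprehension form might look like, again with made-up names (sequenceLC, k) and [[]] written in place of return []:

```haskell
-- sequence for lists, with the do block replaced by a list comprehension.
-- sequenceLC is a hypothetical name for this sketch.
sequenceLC :: [[a]] -> [[a]]
sequenceLC ms = foldr k [[]] ms
  where
    k m m' = [x : xs | x <- m, xs <- m']
```

Here sequenceLC [[1,2],[3,4]] gives [[1,3],[1,4],[2,3],[2,4]], the same answer as sequence.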

Now it’s much clearer what is going on: this is just translating the Cartesian product of [l1,l2,…,ln] into l1*(l2*(l3*…)).
Now how is that do notation or list comprehension treated under the hood? Well, it’s translated to

m >>= \x -> m' >>= \xs -> return (x:xs)

Now what does that funky >>= do? Well, m >>= k is foldr ((++).k) [] m, which is not immediately helpful. But letting m=[m1,m2,…,mn] and expanding the foldr simplifies things a drop:

m1 `(++).k` (m2 `(++).k` (m3 `(++).k` (... [])))

It’s still a tad weird, though. What the heck is (++).k? It’s a function, obviously, that takes a value, applies k to it, and then applies (++) to the result. Writing this out, it’s

k m1 ++ (k m2 ++ (k m3 ++ (... [])))

so we’re just applying k to each of the elements of m and concatenating the results!
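To make that concrete, here is a small sketch of the list bind written with foldr exactly as described above. The name bindL is invented for this example; concatMap is the standard Prelude function.

```haskell
-- List bind expressed as a right fold: apply k to every element of m
-- and concatenate the resulting lists.
-- bindL is a hypothetical name for this sketch.
bindL :: [a] -> (a -> [b]) -> [b]
bindL m k = foldr ((++) . k) [] m
```

For instance, bindL [1,2,3] (\x -> [x, x*10]) is [1,10,2,20,3,30], which matches both [1,2,3] >>= \x -> [x, x*10] and concatMap (\x -> [x, x*10]) [1,2,3].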
So

m >>= \x -> m' >>= \xs -> return (x:xs)

just means that for each element x of m and each element xs of m' we form the singleton list [x:xs]; concatenating these over xs gives [x:xs1, x:xs2, …], and concatenating those over x gives [x1:xs1, x1:xs2, …, x2:xs1, x2:xs2, …], which is the Cartesian product.
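The two views of that expression can be put side by side in a short sketch. The names pairUp and pairUpConcat are hypothetical; the first is the bind chain as written, the second spells out the "form singletons, then concatenate twice" reading.

```haskell
-- The bind chain exactly as it appears in the text.
pairUp :: [a] -> [[a]] -> [[a]]
pairUp m m' = m >>= \x -> m' >>= \xs -> return (x : xs)

-- The same thing with the concatenations made explicit:
-- the inner concat runs over xs, the outer concat over x.
pairUpConcat :: [a] -> [[a]] -> [[a]]
pairUpConcat m m' = concat [concat [[x : xs] | xs <- m'] | x <- m]
```

Both pairUp [1,2] [[3],[4]] and pairUpConcat [1,2] [[3],[4]] give [[1,3],[1,4],[2,3],[2,4]], and folding either one over a list of lists with foldr and a starting value of [[]] agrees with sequence.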