Mazes in Haskell, My Version

Earlier today, I read this article by Mihai Maruseac about generating mazes in Haskell. It’s very interesting, but it turns out I’m a maze generation algorithm bigot. As far as I’m concerned, there’s only one right way to generate mazes, and it’s this:

Construct the maze with all possible walls (so each cell is isolated).

Pick a random wall such that there’s no way to get from one side to the other.

Tear down that wall.

Repeat until the entire maze is navigable.

Sadly, this isn’t how Mihai decided to do it, so I was forced (forced, I tell you!) to spend some time writing my own maze generator.

Equivalence Relations

It turns out that the tricky step here is #2: how can you efficiently tell whether there’s a way to get from one side of a wall to the other. A maze solver would do it, of course, but solving the maze once per wall isn’t the most scalable answer out there. Instead, you want to keep track of the separate “rooms” in your maze as you go.

An efficient data structure for this task, called “union find”, has been known to the imperative world for decades: Tarjan showed back in the 70s that it runs in time that’s very nearly linear (with some additional factor that grows like the inverse of the Ackerman function… that is, ludicrously slowly). Sadly, this seems to be one of those data structures that’s confined to the imperative world. There’s not an obvious translation into the functional world. I could just use an imperative implementation, in IO or ST… but that would be too easy. Instead, I decided to find an implementation of an externally-pure, internally-stateful version of that algorithm. I failed to find one, so I wrote one instead. Here it is: persistent-equivalence.

I based the general technique off of one by Conchon and Filliatre. They used a persistent array under the hood, and threw in a bit of unsafe mutation to implement path compression. Well, the path compression bit is easy: it’s just an atomicModifyIORef. It’s perfectly safe, since no exposed function can ever give a different result depending on whether it’s been modified or not. In fact, I’m not even certain it needs to atomic, but I’ve played it safe for now. The second major change made by Conchon and Filliatre is sadly less safe: they arranged to make changes to the persistent array (naively a DiffArray) to “reroot” it when old versions are accessed. However, their code for doing this is clearly a minefield in the presence of multithreading… I was up for tackling the task, until I realized that STM is forbidden inside unsafePerformIO, and the interactions between various locks are mind-boggling…

Rather than enter the depths of thread-safety hell, or else potentially expose an API that claims it’s safe but really isn’t, I instead gave up. DiffArray is good enough for us anyway, since we won’t be doing any backtracking.

Just as a side comment, this structure is a widely known example of an imperative structure that’s hard to translate into the functional world… but when you *do* translate at least its public interface, the result is rather beautiful. I’ve never seen a specification of “union find” that I’d consider particularly enlightening… but when it’s converted to a functional interface, it’s immediately clear what you’re dealing with: equivalence relations. Instead of talking about some operation names that were made up for this specific purpose, we’re looking at a very simple idea from mathematics. The imperative viewpoint, though, obscured this by encouraging you to only speak in terms of the operations, and avoid ever talking about a specific equivalence relation you’ve got. I’m much happier with the functional interface.

Generating the Maze

Now that I’ve got an implementation of equivalence relations, I’m well on my way to having a maze. I declare a few data types for cells and walls:

-- Vertical walls are to the right of their cell (so the x component
-- must be less than width - 1), and horizontal walls are to the top
-- of their cell (so the y component must be less than height - 1).
type Cell = (Int, Int)
data Wall = H Cell | V Cell deriving (Eq, Show)

And I write the code to generate a maze, which works out in a nice recursive style.

To generate a maze, you make a list of all the walls in the chosen size, shuffle them (using the random-shuffle package from Hackage), build an initial empty equivalence relation between cells (that is, each cell is its own separate room), and start considering the walls one by one in the random order chosen. For each wall, if its two sides are already connected (the cells are in the same equivalence class), you keep the wall. Else, you knock it down, and proceed with a new equivalence relation considering those cells connected now. This is exactly the algorithm given at the start.

Showing the Result

I turned to Gtk2Hs for a quick GUI displaying the resulting maze. The code is not terribly interesting, so I’ll just drop it in for the sake of completeness.

First, we have Cairo code for drawing the maze (for the sake of simplicity, I’ve hard-coded each cell to 30×30 pixels):

Like this:

Related

13 Comments

I was reading the code from hackage to try and get a feeling for how this works, and I was looking for where the ranks array is changed, and it looks like there might be a bug. In the equate function, there is the following code: