Using Arrows for Dependency Handling

I recently found a very good use case for Arrows, so I thought I'd share it here.
Published on March 26, 2010 under the tag haskell

What is this about

Arrows are, like Monads or Monoids, a mathematical concept that can be used in the Haskell programming language. This post analyzes and explains a certain use case for Arrows, namely dependency tracking. But let’s begin with a quite unrelated quote from Jimi Hendrix.

Well my arrows are made of desire
From far away as Jupiter’s sulphur mines

Arrows are less common than Monads or Monoids (in Haskell code, that is), but they are certainly not harder to understand. This blog post aims to give a quick, informal introduction to Arrows. As a concrete subject and example, we use dependency handling in Hakyll.

The problem

Hakyll is a static site generator I use to run this blog. The principles behind it are pretty simple: you write your pages in Markdown or something similar, and you write templates in HTML. Then, you render the pages with some templates using a configuration DSL.

The catch is this: say _site/contact.html is “newer” than contact.markdown and the HTML templates. In this case, we do not want to do anything. Haskell's laziness will not help us a lot here, since we’re dealing with a lot of IO code.

Suppose we’re currently reading the page from a file. We know the timestamp of the file we’re reading, but since we don’t yet know the timestamp of the other files on which the final result depends, we don’t know if we can skip this read or not. This means dependency handling should happen on a higher level, above these specific functions – so we need to abstract dependency handling.
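One way to abstract this is a datatype that carries its dependencies alongside the computation. The following is a minimal sketch consistent with the field descriptions in this post, not necessarily the exact definition Hakyll uses: the `actionDependencies` field name, the `Maybe FilePath` type for `actionUrl`, and the `type Hakyll = IO` stand-in are assumptions made to keep the sketch self-contained.

```haskell
-- Assumption: a stand-in for Hakyll's monad stack, which has IO at the bottom.
type Hakyll = IO

-- An action from input 'a' to output 'b' that also records its dependencies.
data HakyllAction a b = HakyllAction
    { -- Files the result depends on (field name is an assumption).
      actionDependencies :: [FilePath]
      -- Final destination, or Nothing if it is not yet known.
    , actionUrl          :: Maybe FilePath
      -- The actual computation.
    , actionFunction     :: a -> Hakyll b
    }
```

Because the dependencies and the destination are plain data, they can be inspected without running `actionFunction` at all.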

In Hakyll, this abstraction is the HakyllAction a b datatype, and some explanation might be needed here. You can think of a as the input for our action, and b as the output. The actionUrl field contains the final destination of our computations – this can be Nothing if it is not yet known. And finally, the actionFunction field contains the actual action.

The Hakyll monad is a typical monad stack with IO at the bottom.

Categories

To qualify as an Arrow, a datatype needs to be a Category. So let's create an instance Category HakyllAction first. There are two functions we need to implement:

id: The identity morphism of the category. This is comparable to the Prelude.id function.

.: Category composition – this is comparable to function composition.

The id action has no dependencies, no destination, and simply returns its input.

The . action is not complicated either. The new dependencies consist of all the dependencies of the two actions. For our destination, we use an mplus with the latest applied function first, so it gets chosen over the other destination.
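The two definitions described above can be sketched as follows. This is a sketch under assumptions: the datatype is repeated with an assumed `actionDependencies` field, a simplified `Maybe FilePath` destination, and `type Hakyll = IO` as a stand-in for the real monad stack.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Monad (mplus, (>=>))

-- Assumption: a stand-in for Hakyll's monad stack.
type Hakyll = IO

data HakyllAction a b = HakyllAction
    { actionDependencies :: [FilePath]      -- assumed field name
    , actionUrl          :: Maybe FilePath  -- simplified destination type
    , actionFunction     :: a -> Hakyll b
    }

instance Category HakyllAction where
    -- No dependencies, no destination; the input is returned unchanged.
    id = HakyllAction
        { actionDependencies = []
        , actionUrl          = Nothing
        , actionFunction     = return
        }

    -- In 'x . y', y runs first and x runs last.
    x . y = HakyllAction
        { -- All dependencies of both actions.
          actionDependencies = actionDependencies x ++ actionDependencies y
          -- The destination of the last applied action wins, if it has one.
        , actionUrl          = actionUrl x `mplus` actionUrl y
          -- Kleisli composition: feed the result of y into x.
        , actionFunction     = actionFunction y >=> actionFunction x
        }
```

Using `mplus` on `Maybe` means a destination set later in the pipeline overrides an earlier one, while an action without a destination leaves the earlier one in place.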

Arrows

To make our action a real Arrow, we need to implement two more functions:

arr: This should lift a pure function (thus, with an a -> b signature) into HakyllAction, so we have the type signature (a -> b) -> HakyllAction a b.

first: This is a function that should operate on one value of a tuple. This all happens “inside” HakyllAction. Given f :: a -> b, first f takes an (a, c) tuple, applies f to the first value, and passes the second value through unchanged, producing a (b, c) tuple.
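A possible Arrow instance, building on the Category instance, might look like the sketch below. The datatype details (`actionDependencies`, `Maybe FilePath`, `type Hakyll = IO`) are assumptions, and `readPageAction` and `example` are hypothetical names introduced only to show how actions compose.

```haskell
import Prelude hiding (id, (.))
import Control.Arrow (Arrow (..))
import Control.Category (Category (..))
import Control.Monad (mplus, (>=>))
import Data.Char (toUpper)

type Hakyll = IO  -- assumption: stand-in for Hakyll's monad stack

data HakyllAction a b = HakyllAction
    { actionDependencies :: [FilePath]      -- assumed field name
    , actionUrl          :: Maybe FilePath  -- simplified destination type
    , actionFunction     :: a -> Hakyll b
    }

instance Category HakyllAction where
    id = HakyllAction [] Nothing return
    x . y = HakyllAction
        { actionDependencies = actionDependencies x ++ actionDependencies y
        , actionUrl          = actionUrl x `mplus` actionUrl y
        , actionFunction     = actionFunction y >=> actionFunction x
        }

instance Arrow HakyllAction where
    -- A pure function has no dependencies and no destination.
    arr f = HakyllAction
        { actionDependencies = []
        , actionUrl          = Nothing
        , actionFunction     = return . f
        }

    -- Run the action on the first component, pass the second through.
    first x = x
        { actionFunction = \(a, c) -> do
            b <- actionFunction x a
            return (b, c)
        }

-- Hypothetical helper: read a file and record it as a dependency.
readPageAction :: FilePath -> HakyllAction () String
readPageAction path = HakyllAction [path] Nothing (const (readFile path))

-- Hypothetical pipeline: read a page, then upper-case its contents.
example :: HakyllAction () String
example = arr (map toUpper) . readPageAction "contact.markdown"
```

Defining `example` only builds a value; no file is read until its `actionFunction` is applied.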

Combining actions this way creates a HakyllAction but doesn’t actually do anything. We still need to run it. And now we can see the benefits of this method, since we can write a function that does dependency checking on the combined functions.

The isFileMoreRecent function checks if the first file is more recent than all of the other files.
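A runner along these lines could be sketched as below. The name `runHakyllActionIfNeeded`, the choice to always run an action without a known destination, and the datatype details are assumptions for illustration; the timestamp check uses `doesFileExist` and `getModificationTime` from the `directory` package.

```haskell
import System.Directory (doesFileExist, getModificationTime)

type Hakyll = IO  -- assumption: stand-in for Hakyll's monad stack

data HakyllAction a b = HakyllAction
    { actionDependencies :: [FilePath]      -- assumed field name
    , actionUrl          :: Maybe FilePath  -- simplified destination type
    , actionFunction     :: a -> Hakyll b
    }

-- True if the first file exists and is more recent than all other files.
-- Note: getModificationTime throws if a dependency does not exist.
isFileMoreRecent :: FilePath -> [FilePath] -> IO Bool
isFileMoreRecent file dependencies = do
    exists <- doesFileExist file
    if not exists
        then return False
        else do
            fileTime <- getModificationTime file
            depTimes <- mapM getModificationTime dependencies
            return (all (< fileTime) depTimes)

-- Hypothetical runner: skip the action when the destination is up to date.
runHakyllActionIfNeeded :: HakyllAction () () -> Hakyll ()
runHakyllActionIfNeeded action = case actionUrl action of
    -- Assumption: with no known destination we cannot check, so we run.
    Nothing -> actionFunction action ()
    Just destination -> do
        upToDate <- isFileMoreRecent destination (actionDependencies action)
        if upToDate
            then return ()
            else actionFunction action ()
```

The check happens once for the whole composed action, so the individual reading and rendering functions never need to know about timestamps.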

Profit!

We have now developed a more robust and better dependency checking system, and learned something about Arrows. If you are interested, the complete code that led to this blog post is available here on GitHub. There are also a few other interesting things to consider: