In this blog article we explore the meaning of zippers as well as higher order zippers. The idea for this blog article came while discussing with fasta on Freenode today about zippers for graphs. Out of the blue, the question popped to me, what does it mean to have higher order zippers. Not having thought it through completely, I hope this blog will offer some clarity, and perhaps some feedback.

[EDIT: This is a new version as of 2007-07-11 that takes some comments from reddit and the comments here into account, cleaning up a bit and adding some more content]

For those new to the concept of Zippers, I shall briefly explain what they are and how one can reason with them. The idea of a zipper is to provide an iterator that one can move through an arbitrary functor.

A functor is basically a container of elements. For instance, [] in Haskell is a functor. If you apply to it to a specific type, you get a container-type (in this case a list) of that type. Other typical examples include Maybe, Binary trees and the N-ary tree introduced in Flatten Benchmark.

Due to the fact that Haskell does not have mutability, at first it can seem inefficient to mutate an element inside such a Functor or container. That is why Huet invented the concept of a Zipper. A Zipper allows one to walk through such a container such that one specific element has O(1) access, namely the element that is currently in the ‘hole’. Then, typically, there will be functions that allow one to move the element we’re currently looking at.

First Order Zippers

To illustrate, I will demonstrate the Zipper for a list first.

>{-# OPTIONS_GHC -fglasgow-exts #-}>module Zipper where>import Data.Maybe(fromJust)>>-- A zipper is the current element and a context that stores where this>-- element is.>data Zipper d a = Zipper a (d a)deriving(Eq, Ord, Show)>>-- The context for a list is a combination of the elements before and the>-- elements after the current element.>data DList a = DList [a][a]>deriving(Eq, Ord, Show)>type ZList a = Zipper DList a

The advantage of this zipper is that one can easily (with O(1) complexity) move the current hole we’re looking at back and forth. It also provides an O(1) complexity way of changing the current element. As last requirement is that it should be possible to reconstruct the entire data-type that has been ‘zipped’ from just the Zipper. This process is often referred to as ‘unzipping’ the data structure. The methods to perform these operations, as well as to create a zipper from a list are given here:

In The derivative of a
Regular Type is a Type of One-Hole Contests Conor McBride shows us how it is possible to determine the context for any data-type using algebra. Basically, the context for a data-type is it’s derivative. How can one derive a data-type, however? Well this is done by using the more algebraic notation of a data type. If one takes a list, then the actual data type is: ‘List a = Nil | Cons a (List a)’. In an algebraic notation, this would be: L = 1 + a·L. Namely, a list is a sumtype of a constant (1) and a tuple of an element ‘a’ and the list again. Well, this is not fully correct, a list is actually the least-fixed-type that satisfies this equation (namely: L = 1 + a·Lμ L.1 + a·L). However I will gloss over this. The context of a list is then the derivative of this equation using standard derivation as taught in algebra/calculus:

∂a L = ∂a a · L + a·∂a L = 1·L + a·∂a L = L + a·∂a L

=> data DList a = [a] | DCons a (DList a)

By factoring out the [a] (because any DList will contain exactly one), one gets “[a]*(mu dL.1 + a*dL)” or “[a]*[a]“. In general, I believe that in an equation “(mu x.S + Tx)” one can factor out the S (as long as it doesn’t contain a reference to x of course) to get “(S*(mu x.1 + Tx))”. For exercise’s sake, we will derive the context for a binary tree:

Looking at this from an intuitive point of view, this makes sense. The context of an element in a tree, are the left and right child-trees of the node this element is in, as well as the path that we took to get there by saving the trees that we skipped over along the way to the element.

I will leave the implementation for an N-ary tree as an exercise for the reader.

Second Order Zippers

Now nothing introduced so far is really new. The question that arose to me is: what would a higher-order zipper be? What does it mean to have a double derivative of a data-type. I will first show what the structure for the double derivative of a list looks like by working it out and then give an intuition of what the meaning of this data-type might mean. To make sure that this intuition makes any sense, we will then work out the double derivative for a binary tree, to ascertain that the intuition we had for the list is valid for other data-types. So first we define the generic meaning of a second-order zipper:

A second-order zipper is simply the zipper of the zipper of a datatype parametrized with the second-order derivative of that data-type and a the element-type. So we can already see that a second-order zipper will be containing two ‘a’s as well as this second-order derivative. So what is the second-order derivative of a list? Well we had: “[a]‘ = [a]_h*[a]_t”. Applying simple derivation in the algebraic sense we get: “[a]” = [a]_h*[a]_t’ + [a]_h’*[a]_t”.

Second Order Zipper for a List

If we remember the meaning we gave to the two lists in the case of “[a]‘”, then we had the head part and the tail part of the list around the current element we’re looking at. Since we’re now either deriving one or the other, my intuition tells me that we’ll have either the prefix and suffix of the head part or the prefix and suffix of the tail part. If we take the definition of “[a]‘” and expand it in “[a]”” we get: “[a]*[a]*[a] + [a]*[a]*[a]“. Putting this in a Haskell data-type with names based on my intuition, we get:

So what my intuitition tells me is that where a Zipper gives O(1) mutation of an element, a Zipper2 gives O(1) splicing. After all, we can directly access the part that is between the two singular elements stored in the Zipper2. Of course, to ascertain that this statement is valid for more than just lists, we should ascertain this is true for other data-types. In this case, we shall do the experiment for binary trees. Again, the exercise for N-ary trees is left as an exercise for the reader (feel free to put in a comment if you want to).

Deriving the rules for a tree

Note: I redid the math with a cleaner technique, and it seems I had made some mistakes in the original math, which is why my data-type was not as intuitive as I had hoped it to be. The correct version makes much more sense Namely: I was missing a “p” in the last term.

In this blog article we explore the meaning of zippers as well as higher order zippers. The idea for this blog article came while discussing with fasta on Freenode today about zippers for graphs. Out of the blue, the question popped to me, what does it mean to have higher order zippers. Not having thought it through completely, I hope this blog will offer some clarity, and perhaps some feedback.

[EDIT: This is a new version as of 2007-07-11 that takes some comments from reddit and the comments here into account, cleaning up a bit and adding some more content]

For those new to the concept of Zippers, I shall briefly explain what they are and how one can reason with them. The idea of a zipper is to provide an iterator that one can move through an arbitrary functor.

A functor is basically a container of elements. For instance, [] in Haskell is a functor. If you apply to it to a specific type, you get a container-type (in this case a list) of that type. Other typical examples include Maybe, Binary trees and the N-ary tree introduced in Flatten Benchmark.

Due to the fact that Haskell does not have mutability, at first it can seem inefficient to mutate an element inside such a Functor or container. That is why Huet invented the concept of a Zipper. A Zipper allows one to walk through such a container such that one specific element has O(1) access, namely the element that is currently in the ‘hole’. Then, typically, there will be functions that allow one to move the element we’re currently looking at.

First Order Zippers

To illustrate, I will demonstrate the Zipper for a list first.

>{-# OPTIONS_GHC -fglasgow-exts #-}>module Zipper where>import Data.Maybe(fromJust)>>-- A zipper is the current element and a context that stores where this>-- element is.>data Zipper d a = Zipper a (d a)deriving(Eq, Ord, Show)>>-- The context for a list is a combination of the elements before and the>-- elements after the current element.>data DList a = DList [a][a]>deriving(Eq, Ord, Show)>type ZList a = Zipper DList a

The advantage of this zipper is that one can easily (with O(1) complexity) move the current hole we’re looking at back and forth. It also provides an O(1) complexity way of changing the current element. As last requirement is that it should be possible to reconstruct the entire data-type that has been ‘zipped’ from just the Zipper. This process is often referred to as ‘unzipping’ the data structure. The methods to perform these operations, as well as to create a zipper from a list are given here:

In The derivative of a Regular Type is a Type of One-Hole Contests Conor McBride shows us how it is possible to determine the context for any data-type using algebra. Basically, the context for a data-type is it’s derivative. How can one derive a data-type, however? Well this is done by using the more algebraic notation of a data type. If one takes a list, then the actual data type is: ‘List a = Nil | Cons a (List a)’. In an algebraic notation, this would be: L = 1 + a·L. Namely, a list is a sumtype of a constant (1) and a tuple of an element ‘a’ and the list again. Well, this is not fully correct, a list is actually the least-fixed-type that satisfies this equation (namely: L = 1 + a·L μ L.1 + a·L). However I will gloss over this. The context of a list is then the derivative of this equation using standard derivation as taught in algebra/calculus:

∂a L = ∂a a · L + a·∂a L = 1·L + a·∂a L = L + a·∂a L

=> data DList a = [a] | DCons a (DList a)

By factoring out the [a] (because any DList will contain exactly one), one gets “[a]*(mu dL.1 + a*dL)” or “[a]*[a]“. In general, I believe that in an equation “(mu x.S + Tx)” one can factor out the S (as long as it doesn’t contain a reference to x of course) to get “(S*(mu x.1 + Tx))”. For exercise’s sake, we will derive the context for a binary tree:

Looking at this from an intuitive point of view, this makes sense. The context of an element in a tree, are the left and right child-trees of the node this element is in, as well as the path that we took to get there by saving the trees that we skipped over along the way to the element.

I will leave the implementation for an N-ary tree as an exercise for the reader.

Second Order Zippers

Now nothing introduced so far is really new. The question that arose to me is: what would a higher-order zipper be? What does it mean to have a double derivative of a data-type. I will first show what the structure for the double derivative of a list looks like by working it out and then give an intuition of what the meaning of this data-type might mean. To make sure that this intuition makes any sense, we will then work out the double derivative for a binary tree, to ascertain that the intuition we had for the list is valid for other data-types. So first we define the generic meaning of a second-order zipper:

A second-order zipper is simply the zipper of the zipper of a datatype parametrized with the second-order derivative of that data-type and a the element-type. So we can already see that a second-order zipper will be containing two ‘a’s as well as this second-order derivative. So what is the second-order derivative of a list? Well we had: “[a]‘ = [a]_h*[a]_t”. Applying simple derivation in the algebraic sense we get: “[a]” = [a]_h*[a]_t’ + [a]_h’*[a]_t”.

Second Order Zipper for a List

If we remember the meaning we gave to the two lists in the case of “[a]‘”, then we had the head part and the tail part of the list around the current element we’re looking at. Since we’re now either deriving one or the other, my intuition tells me that we’ll have either the prefix and suffix of the head part or the prefix and suffix of the tail part. If we take the definition of “[a]‘” and expand it in “[a]”” we get: “[a]*[a]*[a] + [a]*[a]*[a]“. Putting this in a Haskell data-type with names based on my intuition, we get:

So what my intuitition tells me is that where a Zipper gives O(1) mutation of an element, a Zipper2 gives O(1) splicing. After all, we can directly access the part that is between the two singular elements stored in the Zipper2. Of course, to ascertain that this statement is valid for more than just lists, we should ascertain this is true for other data-types. In this case, we shall do the experiment for binary trees. Again, the exercise for N-ary trees is left as an exercise for the reader (feel free to put in a comment if you want to).

Deriving the rules for a tree

Note: I redid the math with a cleaner technique, and it seems I had made some mistakes in the original math, which is why my data-type was not as intuitive as I had hoped it to be. The correct version makes much more sense Namely: I was missing a “p” in the last term.<!–$t = 1 + a * t^2 $

The first term, D, is basically a derivative of one or the left or right sub-trees of the tree we are currently looking at it with our first-order zipper. More specifically, if we go down the left-subtree, we save the right subtree in t1. Then the second finger will be pointing at a tree further down, so we need to save it’s children, namely: (t2)2. And obviously we have the two paths, one from the root of the tree to the first finger, p3, and one from this finger to the second finger, p4.

The second term, U, is analogous, except that the first finger will now be at the bottom and the second finger more at the top. This means that the path that goes from the first finger to the second finger, p4, will be reversed. Additionally (at least I think this should be so for efficiency reasons), the path from the root, p3, will only extend down to the second finger. Moving the second-finger around is then O(1) as we can just pop a path-element from one of the paths and put it on the other path or vice-versa.

The final term, UD, should then obviously be a path that walks up along the tree and then down from another node. In this case, we still have the path from the root of the tree, p3, but instead of going down to the first finger or second finger, it will go down to the join-point of these two fingers (namely the place where the path stops going up and starts going down again. Because neith
er subtree of this node is being stored in this path (as we’re deriving both sides), we need to save the element at this point, namely in a1. The two paths, (p2)4, then simply consist of the path up from the first finger and then down to the second finger. We have two options for this one, as we might be going up from the left side of the join point or the right side of the join point. Finally, the two pairs of trees, (t2)2L and (t2)2R obviously are the subtrees of the two nodes where our fingers are pointing at.

The operations for this tree are left as an exercise for the reader (mostly for conciseness purposes and because I have other things to do, I might add them later). Suffice to say that intuitively (due to the way the lists are ordered (reverse path or not), moving the second finger will be easier than moving the first finger). Obviously, splicing will depend on whether we have a stricly unidirectional path, or an updown path.

I hope the above is clear, otherwise I can add figures, though if possible I’d prefer not to to say on time (and I don’t have a nice diagram editor :/).

Generating higher-order zippers

By now the pattern for creating higher-order zippers should be clear. Given the definition of a data-type as an algebraic entity. The first-order zipper would be:

type FirstOrderZipper d a = Zipper d a

where ‘d’ is the first derivative of the data-type in question. To then go to higher-order zippers, we simply apply the meta-rule (note that this is a meta-rule as there are no haskell facilities to derive data-types that I am aware of).

Zipper d a -> Zipper (Zipper d') a

Where d’ obviously stands for the derivative of d. As a reader of reddit pointed out, this is “simply” having several pointers that point into a functional data-structure. However, due to the way that these ‘pointers’ work (for instance, you can not just get a second one from a data-type, you need to go higher-order to get more pointers) we get extra functionality, for instance the O(1) splicing which I mentioned previously. For completeness’ sake, I here show the rules that were introduced in McBride’s paper.

Anyone in contact with me either online or IRL knows that I’m looking forward to the ICFP Contest with much anticipation. If possible, I will push for the #oasis channel on Freenode to be the de-facto discussion channel for different teams. Last year, there was a grass-roots movement whereby #oasis ended up to be this icfp-discussion channel, and it’d be nice to have it again.

Last year our team, Lazy Bottoms, made it in the top 5% (18th) and was mentioned several times in the results presentation. Hopefully this year we can make it top-10 or better. I think that our team has some very smart members of Haskell-noteworthiness, so I’m hopeful.

For more details about our team: Lazy Bottoms Team . I think the double entendre of the name should be obvious to most people reading this blog.

Anyone in contact with me either online or IRL knows that I’m looking forward to the ICFP Contest with much anticipation. If possible, I will push for the #oasis channel on Freenode to be the de-facto discussion channel for different teams. Last year, there was a grass-roots movement whereby #oasis ended up to be this icfp-discussion channel, and it’d be nice to have it again.

Last year our team, Lazy Bottoms, made it in the top 5% (18th) and was mentioned several times in the results presentation. Hopefully this year we can make it top-10 or better. I think that our team has some very smart members of Haskell-noteworthiness, so I’m hopeful.

For more details about our team: Lazy Bottoms Team . I think the double entendre of the name should be obvious to most people reading this blog.

In this blog article, we look at the benchmarks as proposed in Kyle’s Smith’s blog regarding the flattening of lists in Scheme. However, we perform the same experiment in Haskell. Obviously, due to typing reasons, this will not be done on lists but N-ary trees. It should be interesting to see how these algorithms perform in Haskell, especially due to the very different semantics caused by laziness.

In fact, a very common idiom in Scheme, or even SML or O’Caml, namely ensuring tail-call recursion, is not always a good idea in Haskell when working with data-structures. They are too strict in a sense, and as such they do not work well on infinite data-structures.

The first thing to do in Haskell, of course, is to define a module. We also define the data-type that will hold a list to be flattened. In this case, it is an N-ary tree where the options for the data-type is sumtype of a child of a single element, or a parent of a list of this sum-type. To keep it concise, I will give the data-constructors short names.

As example, we show the encoding of a one of the ‘lists’ from Kyle’s blog. The notation is a bit verbose, but one could easily add a read-instance to fix this. Since this is not crucial, I will not do so at the moment (let’s just say, yay for VIM and some regexp-fu to turn the data into Haskelly data).

All the algorithms will by definition be slightly different as we have one more case to check due to the design of our data-type. First of all, instead of checking on whether the input is null, we will instead check when we have a child element. And secondly, because each level of the tree is a pure list, we can use standard list operations like map and fold.

Looking through the algorithms, a lot use side-effects or reversing at the end. Obviously, side-effects are not a standard idiom in Haskell. Reversing is not a good idiom for Haskell programming, because reversing tends to destroy laziness. As such, the version that is most straightforward to code in Haskell is the one that uses ‘cons’ with an accumulator. The version in scheme required reversing at teh end. We do not need to reverse the list at the end because we use ‘foldr’ while the algorithms in scheme used a fold-left mechanism. The reason why we want foldr is because it respects laziness, while the reason why schemers and mlers want foldl is because it is tail-call recursive.

I will add type-declarations to the function but these are purely for documentation purposes and are not actually required for Haskell. The scheme version uses a mutation on the result variable. Instead of doing this, we will take a more Haskelly approach (I was going to say pure, in the purely functional, but that could be misconstrued to mean pure as in purity of algorithm) and thread this result through the calls.

To test we simply run ‘ghci Flatten.lhs’ and then when we run ‘flatten abc’ we get “abcdefghijklmnopqrstuvwxyz. We also see that this definition respects lazyness in the horizontal (but therefore not perse the vertical) direction. Namely, we construct an infinite flat tree of elements and see whether it succeeds in flattening. We will also construct an infinite vertical tree (obviously with some elements at eachc level, or it will take an infinite amount of time to reach the first element.

Indeed, if we load Flatten.lhs in ghci and run flatten, everything works out fine (make sure to do something like: ‘take 10 $ flatten infinite_flat_tree’, or your terminal will spam a lot of numbers). Similarly, it also respects the infinitely vertical tree (which the reader should note, has been built up in a way to ensure the creation of the tree itself is also lazy).

Originally I had intended to try out all the different algorithms that were written in Kyle’s blog, but it seems most of the trade-off is between using set! as well as reversing and other techniques to circumvent this. However, when using foldr and lazy structures, as long as you make sure to respect laziness, this is not required and there’s only one way of coding this function, which is probably the way anyone would write it in Haskell. If any of the readers have feedback on other versions that might be applicable for Haskell, I am definitely interested in seeing their version.

Therefore, to give this article any meaningful content, we will add some benchmarks. My computer is not so powerful, it is a simple Dell Latitude D610 laptop with 1GB of memory and a Centrino of 1.6Ghz.

First we need to generate some data-sets besides the one above. The first one was far too small and infinite data-sets will have obvious results if computed completely. To ensure the calculation actually occurs, I will get th
e last element of the computed list to force the entire list to be generated. To measure how complicated an N-ary tree is, Kyle introduces the concept of complexity. I will translate this to Haskell (If anyone spots algorithmic differences, feel free to point them out).

Due to laziness, it is important that we force the creation of the input tree before actually timing the flattening of the tree. But since we need to know the complexity of the tree before we try to flatten it, we can let the complexity function force the creation of the tree. Therefore, to get a test,case, we simply get the following code:

Finally, we can tie it all together with a little main that will allow us to run this program to test different complexities. While the scheme version allowed the user to choose which subtrees to use to build up the big tree, I will not give this option as in the end, the only metric used was complexity. I would like to briefly point out, however, that it might be interesting to test deeper trees, as the trees seem to have been limited to the depth of trees defined at the beginning.

In this blog article, we look at the benchmarks as proposed in Kyle’s Smith’s blog regarding the flattening of lists in Scheme. However, we perform the same experiment in Haskell. Obviously, due to typing reasons, this will not be done on lists but N-ary trees. It should be interesting to see how these algorithms perform in Haskell, especially due to the very different semantics caused by laziness.

In fact, a very common idiom in Scheme, or even SML or O’Caml, namely ensuring tail-call recursion, is not always a good idea in Haskell when working with data-structures. They are too strict in a sense, and as such they do not work well on infinite data-structures.

The first thing to do in Haskell, of course, is to define a module. We also define the data-type that will hold a list to be flattened. In this case, it is an N-ary tree where the options for the data-type is sumtype of a child of a single element, or a parent of a list of this sum-type. To keep it concise, I will give the data-constructors short names.

As example, we show the encoding of a one of the ‘lists’ from Kyle’s blog. The notation is a bit verbose, but one could easily add a read-instance to fix this. Since this is not crucial, I will not do so at the moment (let’s just say, yay for VIM and some regexp-fu to turn the data into Haskelly data).

All the algorithms will by definition be slightly different as we have one more case to check due to the design of our data-type. First of all, instead of checking on whether the input is null, we will instead check when we have a child element. And secondly, because each level of the tree is a pure list, we can use standard list operations like map and fold.

Looking through the algorithms, a lot use side-effects or reversing at the end. Obviously, side-effects are not a standard idiom in Haskell. Reversing is not a good idiom for Haskell programming, because reversing tends to destroy laziness. As such, the version that is most straightforward to code in Haskell is the one that uses ‘cons’ with an accumulator. The version in scheme required reversing at teh end. We do not need to reverse the list at the end because we use ‘foldr’ while the algorithms in scheme used a fold-left mechanism. The reason why we want foldr is because it respects laziness, while the reason why schemers and mlers want foldl is because it is tail-call recursive.

I will add type-declarations to the function but these are purely for documentation purposes and are not actually required for Haskell. The scheme version uses a mutation on the result variable. Instead of doing this, we will take a more Haskelly approach (I was going to say pure, in the purely functional, but that could be misconstrued to mean pure as in purity of algorithm) and thread this result through the calls.

To test we simply run ‘ghci Flatten.lhs’ and then when we run ‘flatten abc’ we get “abcdefghijklmnopqrstuvwxyz. We also see that this definition respects lazyness in the horizontal (but therefore not perse the vertical) direction. Namely, we construct an infinite flat tree of elements and see whether it succeeds in flattening. We will also construct an infinite vertical tree (obviously with some elements at eachc level, or it will take an infinite amount of time to reach the first element.

Indeed, if we load Flatten.lhs in ghci and run flatten, everything works out fine (make sure to do something like: ‘take 10 $ flatten infinite_flat_tree’, or your terminal will spam a lot of numbers). Similarly, it also respects the infinitely vertical tree (which the reader should note, has been built up in a way to ensure the creation of the tree itself is also lazy).

Originally I had intended to try out all the different algorithms that were written in Kyle’s blog, but it seems most of the trade-off is between using set! as well as reversing and other techniques to circumvent this. However, when using foldr and lazy structures, as long as you make sure to respect laziness, this is not required and there’s only one way of coding this function, which is probably the way anyone would write it in Haskell. If any of the readers have feedback on other versions that might be applicable for Haskell, I am definitely interested in seeing their version.

Therefore, to give this article any meaningful content, we will add some benchmarks. My computer is not so powerful, it is a simple Dell Latitude D610 laptop with 1GB of memory and a Centrino of 1.6Ghz.

First we need to generate some data-sets besides the one above. The first one was far too small and infinite data-sets will have obv
ious results if computed completely. To ensure the calculation actually occurs, I will get the last element of the computed list to force the entire list to be generated. To measure how complicated an N-ary tree is, Kyle introduces the concept of complexity. I will translate this to Haskell (If anyone spots algorithmic differences, feel free to point them out).

Due to laziness, it is important that we force the creation of the input tree before actually timing the flattening of the tree. But since we need to know the complexity of the tree before we try to flatten it, we can let the complexity function force the creation of the tree. Therefore, to get a test,case, we simply get the following code:

Finally, we can tie it all together with a little main that will allow us to run this program to test different complexities. While the scheme version allowed the user to choose which subtrees to use to build up the big tree, I will not give this option as in the end, the only metric used was complexity. I would like to briefly point out, however, that it might be interesting to test deeper trees, as the trees seem to have been limited to the depth of trees defined at the beginning.

Edit: I’ve changed the way my posts are displayed to have “Read more” to keep the top-page cleaner. However, it’s still not what I want. For one, bloghelper says it leaves as homework how to remove the “Read more!”s from pages that do not have an extended article. I will go back and edit the other posts to add this span. If anyone knows how to remove the readmores from short posts that do not have an extended article, I’d be greatful. And if someone knows how to do it without just hiding the content (thereby reducing download time for readers), I’d be even more greatful.

Edit: I’ve changed the way my posts are displayed to have “Read more” to keep the top-page cleaner. However, it’s still not what I want. For one, bloghelper says it leaves as homework how to remove the “Read more!”s from pages that do not have an extended article. I will go back and edit the other posts to add this span. If anyone knows how to remove the readmores from short posts that do not have an extended article, I’d be greatful. And if someone knows how to do it without just hiding the content (thereby reducing download time for readers), I’d be even more greatful.

Recently I gave a short intro course on functional programming languages to people at work, introducing some basic comments. At the end, I added a very very tiny compiler to show how easy it is to create a compiler in Haskell.

I thought it might be interesting for the people out there to see it as well. As mentioned, it is very minimalistic. Keeping with the trend of the previous post, I will ensure this blogpost is proper literal haskell code.

So first we create our module. We also import the Control.Monad for liftM and liftM2. Finally, we import QuickCheck so we can create some easy tests at the end.

> module Alg where> import Control.Monad> import Test.QuickCheck

Next, we define the domain of our compiler. Namely, our compiler will compile simple arithmetic expressions, that can be nested arbitrarily deep, to a stack machine. An expression consists of either a simple number, the addition of two expressions, the multiplication of two expressions, or the substraction of two expressions. We add some standard typeclasses that allow us to easily work with them in the GHC interpreter (for instance Show to show them).

Without compiling, we can already write a mini interpreter that interprets an expression straight away. One option of making this more generic is abstracting away the specific binary operator instead of creating a specific data-constructor for each, but I will leave that as an excercise.

Next we define the codomain, or the target, of our compiler. For this I have opted for a very simple stack machine with only four instructions. Either one pushes a number, or one applies an operator to the top two numbers on the stack. As for the stack, it is simply a list of numbers.

We can also write an ‘interpreter’ for this stack-based language. First, we write the function that interprets a single stack operation with a given stack and returns a new stack. For clarity sake, I have not included error code for when the stack is empty but numbers are required.

To run a set operations, one can simply fold over the list of operations, starting with an empty stack:

> run :: [Op] -> Stk> run = foldl interpret []

Next, we define the compiler function. This compiles algebraic expressions to a list of stack operations. Notice to do this, first we calculate the two sub expressions, and then compile the operation in question:

The code is now done, and in fact, one can simply type ‘ghci Alg.lhs” and try it out. However, we will add a quickcheck instance so we can test the correctness of the compiler. Simply, we require that interpreting an expression yields the same result as the top of the stack after compiling and interpreting the stack operations. To enable this, we first need to define a quickcheck instance for the domain, namely algebraic expressions. The code is a bit more complicated as it makes sure that it never generates infinite expression trees, so I will not explain it in detail. I suggest for those interested to check The quickcheck manual, or the haskell documentation.

Now that we have an implementation that generates arbitrary algebraic expressions, it’s time to write our test case. Namely we always require (True ==>), that the result of interpretation of the algebraic expression is the same as the top of the stack after compiling and interpreting the stack operations. We could add an additional requirement that the stack only has 1 element remaining in it.

Recently I gave a short intro course on functional programming languages to people at work, introducing some basic comments. At the end, I added a very very tiny compiler to show how easy it is to create a compiler in Haskell.

I thought it might be interesting for the people out there to see it as well. As mentioned, it is very minimalistic. Keeping with the trend of the previous post, I will ensure this blogpost is proper literal haskell code.

So first we create our module. We also import the Control.Monad for liftM and liftM2. Finally, we import QuickCheck so we can create some easy tests at the end.

> module Alg where> import Control.Monad> import Test.QuickCheck

Next, we define the domain of our compiler. Namely, our compiler will compile simple arithmetic expressions, that can be nested arbitrarily deep, to a stack machine. An expression consists of either a simple number, the addition of two expressions, the multiplication of two expressions, or the substraction of two expressions. We add some standard typeclasses that allow us to easily work with them in the GHC interpreter (for instance Show to show them).

Without compiling, we can already write a mini interpreter that interprets an expression straight away. One option of making this more generic is abstracting away the specific binary operator instead of creating a specific data-constructor for each, but I will leave that as an excercise.

Next we define the codomain, or the target, of our compiler. For this I have opted for a very simple stack machine with only four instructions. Either one pushes a number, or one applies an operator to the top two numbers on the stack. As for the stack, it is simply a list of numbers.

We can also write an ‘interpreter’ for this stack-based language. First, we write the function that interprets a single stack operation with a given stack and returns a new stack. For clarity sake, I have not included error code for when the stack is empty but numbers are required.

To run a set operations, one can simply fold over the list of operations, starting with an empty stack:

> run :: [Op] -> Stk> run = foldl interpret []

Next, we define the compiler function. This compiles algebraic expressions to a list of stack operations. Notice to do this, first we calculate the two sub expressions, and then compile the operation in question:

The code is now done, and in fact, one can simply type ‘ghci Alg.lhs” and try it out. However, we will add a quickcheck instance so we can test the correctness of the compiler. Simply, we require that interpreting an expression yields the same result as the top of the stack after compiling and interpreting the stack operations. To enable this, we first need to define a quickcheck instance for the domain, namely algebraic expressions. The code is a bit more complicated as it makes sure that it never generates infinite expression trees, so I will not explain it in detail. I suggest for those interested to check The quickcheck manual, or the haskell documentation.

Now that we have an implementation that generates arbitrary algebraic expressions, it’s time to write our test case. Namely we always require (True ==>), that the result of interpretation of the algebraic expression is the same as the top of the stack after compiling and interpreting the stack operations. We could add an additional requirement that the stack only has 1 element remaining in it.