Read my book

Friday, September 23, 2011

Infinite Lists in Haskell

Lately I've been shaking up my world by delving into Haskell. It has been quite an interesting experience. Compared to the languages I've used before it's somewhat different given it's a pure, functional language. The language is filled with features. In this post I'm going to discuss one of those, infinite lists in particular.

Infinite Lists in a Nutshell

Infinite lists provide a way to define mathematical series in a neat manner. Consider a definition such as [1,3..] for example. As you might have guessed this gives you an ascending series with numbers 1, 3, 5 , 7 and so on. You can do this the other way too. [-1,-3..] gives you the same in negative.

The same idea works with characters too. ['a','b'..] gives you "abcd..." and some weirdness after you reach the end of alphabet. To give you an example of this, consider the following:

As noted by "cgibbard" at Reddit, ['a','b'..] doesn't actually yield an infinite list. Instead you just get a really long one. It's easy to verify this simply by checking the length of the list (length ['a','b'..]). What a curious property!

Besides the behavior, this example should give you an idea how functions work in Haskell (roughly). "take" function receives two parameters, namely 50 and our infinite list. Based on those it picks 50 items from the list itself. The example also highlights the way Haskell separates characters from strings via notation. Note that a string is essentially just a list of characters.

It is possible to use ".." to produce bounded lists as well. Just consider [1..10] or [10,9..1]. Curiously [10..1] yields an empty list. It seems in case you want to generate a decreasing list, you need to explicitly state it.

The notation has its limits, though. You cannot state something like [1,2,3..] for instance and have it infer the rest of your list. It's easy to see this could be quite hard. Should the fourth item be 4 or 5 or perhaps something else?

Generating Fibonacci Series as an Infinite List

You can get only so far by knowing the basic notation. What if you wanted to generate a more complicated list? Haskell's standard module, Prelude, contains quite a few ways to accomplish this. It can be quite an intimidating API to study, especially for a beginner. You might want to study this alternative description to get a nice idea of the basic functionality.

"Infinite list tricks in Haskell" contains many nice ways to generate various infinite lists. Just to give some idea of these, consider the following definition of the Fibonacci series I picked from the article: fibs3 = 0 : scanl (+) 1 fibs3 . Let's spell that out a bit.

In the beginning you see a "cons" operator. It inserts a new item to the beginning of the given ( : ). This operation is a cheap one given lists are defined as singly linked lists in Haskell. There's an operator for appending (++) as well. This operation is expensive given the aforementioned fact and probably should be avoided if possible.

"scanl" is a function that receives some function ((+) in this case), the first item of the resulting list (1 in this case) and a list. It then executes the function successively as follows ({firstItem} {op} {listItem0}, {listItem0} {op} {listItem1}, {listItem1} {op} {listItem2}, ...). In case of our algorithm, it is applied like this: 1 + 0, 0 + 1, 1 + 1, 1 + 2, 2 + 3, 3 + 5, ... .

Just like with infinite lists in general Haskell generates new list items on demand, lazily. There are various other ways to define the Fibonacci series. Thanks to Haskell's expressiveness you can get quite close to its pure mathematical definition if you want to. The other implementations available are definitely worth looking into as this will give you some additional insight to the language.

Joining Multiple Infinite Lists into One

So what if you wanted to get a bunch of cool infinite lists like the ones mentioned above and combine them into one really neat one? You might want to generate something like [1, -1, 2, -2, 3, -3, ...] for instance. As it happens there are a few nifty functions that allows us to get the job done.

Our desired list contains two smaller ones ([1,2..] and [-1,-2..]). They also need to be combined, positive list first. We can do this as follows: concat $ zipWith (\x y -> [x, y]) [1,2..] [-1,-2..] .

"zipWith" is one of those nice functions that allows us to join lists and define the way they are joined. In this case we join them using a lambda. After this operation we have a list like this: [[1, -1], [2, -2], ...]. That's not quite what we're after, though. That's why we "concat" at the end.

If we want just ten first items, we can use "take" just like above. In this case, however, we need to use the $ operator just like in the function definition itself: take 10 $ concat $ ...

What does that dollar do there? It is easier to visualize like this (concat)(zipWith (...) [...] [...]). As you can see, it splits the expressions at its location. This is just a neat way to avoid extra parentheses in the code.

Conclusion

I hope this post served as a some kind of introduction to Haskell's infinite lists. They are a powerful concept that will come in handy especially if you have to tackle mathematical problems. This was just a tip of an iceberg, however. The language is filled with little gems like this. It's definitely worth looking into in case you want to take your programming skills to the next level.