A Taste of Haskell

In this post I want to highlight a few fun aspects of the Haskell programming language. The purpose is to give you a taste of Haskell so that you will want to learn more of it. Don’t consider this as a tutorial or guide but rather as a starting point, as it is based on a short talk I held at work, which in turn is based on my favorite material from holding practical courses about Haskell at university.

Let’s start by seeing how programmers compare Haskell to a mainstream programming language, for example Java:

While this definition sounds complicated you should understand more of it after reading this post. Let’s start with the basics: What’s the difference between an imperative programming language and a functional one? In imperative programming the basic operation is changing a stored value. In functional programming the basic operation is applying a function to arguments.

Haskell itself was designed in 1990 by a scientific committee with the purpose of being a basis for functional programming research. The Glasgow Haskell Compiler (GHC) is the most popular implementation and the one we will be using. You can get it as part of the Haskell Platform as well. Haskell code can be compiled (ghc), but also interpreted (ghci, interactive).

In the rest of this blog post we will look at some key features of Haskell interactively, so you can get your own installation of GHC and follow along and experiment with the code.

Arithmetic

I believe that the best way to learn a programming language is by playing around with it. So that’s what we’re going to do now. I’ll show a few examples and explain the cool Haskell features that we encounter on the way.

We start out by running ghci, the interactive Glasgow Haskell Compiler interpreter (indicated by λ>). We also create a single file tut.hs which we will use to write some more advanced Haskell code and load in the interpreter.
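As a minimal sketch of that workflow, tut.hs could start out with a single definition. The function name double is only an illustration and is not taken from the original session.

```haskell
-- tut.hs: a first definition to load into GHCi.
-- `double` is a hypothetical example function.
double :: Int -> Int
double x = 2 * x
```

Typing :load tut.hs at the GHCi prompt (and :reload after each edit) makes double available for experiments.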

For starters let’s do some basic arithmetic in GHCi, replacing our dusty old calculator:

We see that Integers can be of unbounded size, instead of the usual limits of 32 or 64 bits in many programming languages. Of course if you really want them there are also more efficient machine ints in Haskell:

λ> (2::Int)^10
1024
λ> (2::Int)^100
0

Using the :: Int notation we specify that the number 2 is explicitly of type Int instead of being automatically inferred to be of type Integer. While we’re looking at types, there are also floating point numbers of course:

λ> 2**10
1024.0
λ> 2**1000
1.0715086071862673e301
λ> 2**10000
Infinity

As usual, floating point numbers are of limited precision, so at some point we just reach “approximately infinity”. Let’s do some more math:

Looks like sin is a function, but the error message is already confusing. The part in brackets gives us a good hint: “maybe you haven’t applied enough arguments to a function?” Let’s check out the type of sin:

λ> :type sin
sin :: Floating a => a -> a

Invoking :type tells us that sin is a function that takes a value of type a and returns a value of type a, where a can be any floating point type (any instance of the Floating type class). So let’s pass a number to sin:
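A minimal sketch of what that looks like in a source file such as tut.hs. The value names are made up for illustration. Note that function application needs no parentheses around a simple argument.

```haskell
-- Applying sin at two different Floating types.
sinAtZero :: Double
sinAtZero = sin 0

-- Parentheses are only needed to group the argument expression.
sinAtHalfPi :: Double
sinAtHalfPi = sin (pi / 2)

-- The same function also works for single-precision Float.
sinAsFloat :: Float
sinAsFloat = sin 0
```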

div is another function. As you can see it accepts two parameters, both of type a and finally returns a value of type a. In this case a has to be an Integral, some kind of integer-like type. We can even ask GHCi what an Integral is supposed to be:

λ> :info Integral
class (Real a, Enum a) => Integral a where
  quot :: a -> a -> a
  rem :: a -> a -> a
  div :: a -> a -> a
  mod :: a -> a -> a
  quotRem :: a -> a -> (a, a)
  divMod :: a -> a -> (a, a)
  toInteger :: a -> Integer
  -- Defined in ‘GHC.Real’
instance Integral Word -- Defined in ‘GHC.Real’
instance Integral Integer -- Defined in ‘GHC.Real’
instance Integral Int -- Defined in ‘GHC.Real’

Integral is a type class which requires a few functions to be defined for the type, like quot and rem. We also see that an Integral type needs to have all properties of a Real and an Enum type. Three types are known to GHCi which adhere to this type class: Word, Integer and Int.
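To see why such a type class is useful, here is a small sketch of a function that is written once against Integral and then used at all three instances. The name isEven' is hypothetical and only serves as an example.

```haskell
-- `isEven'` is a hypothetical helper: it works for any Integral type,
-- because it only relies on `mod` (from Integral) and on comparing with 0.
isEven' :: Integral a => a -> Bool
isEven' n = mod n 2 == 0

-- The same function used at Int, Integer and Word.
checks :: [Bool]
checks =
  [ isEven' (4 :: Int)
  , isEven' (2 ^ 100 :: Integer)
  , isEven' (7 :: Word)
  ]
```

Evaluating checks should give [True,True,False].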

It looks a bit awkward to write div 15 6, so Haskell offers some syntactic sugar to use the div function in infix notation:

λ> 15 `div` 6
2
λ> 15 `mod` 6
3

The same thing works in reverse to use operators as regular functions, using brackets:

λ> (+) 2 3
5
λ> :t (+)
(+) :: Num a => a -> a -> a

Of course there are quite a lot of other functions that we won’t have time to explore, but you can always explore them yourself in the documentation or using Hoogle:

Lists

Lists are the most important data structure for us. They are simply a collection of values of the same type:

λ> :info []
data [] a = [] | a : [a] -- Defined in ‘GHC.Types’
...

What this definition tells us is that a list is a data type that is either an empty list ([]) or a value prepended to another list (a : [a]). So the data type is recursively defined, referring to itself. With this knowledge we can create basic lists ourselves:

λ> []
[]
λ> 1:[]
[1]
λ> 1:2:[]
[1,2]

We can also use some syntactic sugar to create lists instead:

λ> [1,2,3,4,5]
[1,2,3,4,5]

We just said that a list is supposed to be a collection of values of the same type. What happens if we try to break that?

Good, that shouldn’t work and indeed it doesn’t. The error message tells us that the list is assumed to be a list of booleans because of the final value. But 1 is not a boolean value, so there is no valid type for this list.

λ> [1..5]
[1,2,3,4,5]
λ> [1,3..20]
[1,3,5,7,9,11,13,15,17,19]

We can store values in variables, but the name “variable” might confuse you a bit, because variables cannot be overwritten:

λ> let xs = [1..5]
λ> let xs = [1..6]

The second line creates a new variable xs that makes the old one invisible in the new scope. Actually variables are always immutable in Haskell. That means you can easily share access to the same data because there is no way in which it can be overwritten:

Knowing how lists are implemented we can easily implement our own definitions of these functions:

-- data [] a = [] | a : [a]
head' (x:xs) = x

This definition uses pattern matching. The pattern is (x:xs) where x and xs are variables delimited by the cons (:). We use the knowledge of the definition to split up the passed value into two parts, the first element x and the rest of the list xs. Then the result of our function is simply the first element x.
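The same pattern can be used to take a list apart in the other direction. The sketch below adds an explicit case for the empty list and a matching tail'. It is an illustration in the style of the definition above, not necessarily the original listing. The underscore is a wildcard pattern for a part of the value we do not need.

```haskell
-- First element of a list; the empty list has no first element.
head' :: [a] -> a
head' (x:_) = x
head' []    = error "head': empty list"

-- Everything except the first element.
tail' :: [a] -> [a]
tail' (_:xs) = xs
tail' []     = error "tail': empty list"
```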

Here the idea is to reduce the problem to something that we can solve. If the list only contains a single element, we know that this exact element is the last one. If the list has more than one element, we remove the first element of the list and recursively call last' on the rest of the list. Yet again the last element of an empty list makes no sense, so we throw an error.
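Put into code, this idea might look as follows. This is a sketch reconstructed from the description, so details such as the error message are assumptions.

```haskell
-- Last element of a list, following the three cases described above.
last' :: [a] -> a
last' [x]    = x          -- a single element is the last one
last' (_:xs) = last' xs   -- drop the first element and recurse
last' []     = error "last': empty list"
```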

The idea here is a bit more complicated. We know that init of a list with one element [x] is the empty list []. If the list has more than one element we know that init contains the first element x, concatenated with init of the rest of the list.
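A possible definition along these lines is sketched below. It is reconstructed from the description, and the error case for the empty list is an assumption.

```haskell
-- All elements except the last one.
init' :: [a] -> [a]
init' [_]    = []               -- one element: nothing remains
init' (x:xs) = x : init' xs     -- keep x, continue with the rest
init' []     = error "init': empty list"
```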

Finally let’s define the length function, which works similarly:

length' [] = 0
length' (x:xs) = 1 + length' xs

The length of an empty list is 0. The length of a list with more than 0 elements is 1 plus the length of the rest of the list. These functions read like mathematical definitions of the properties they are encoding.

Higher-order Functions

So we have defined our first basic functions and noticed that there is no magic happening in Haskell’s standard library. Instead all of these functions are easily implementable. Now let’s look at some more advanced operations that we can perform on lists, for example mapping a function to a list, thus applying it to each value in the list:

λ> let f x = 2 * x
λ> map f [1..5]
[2,4,6,8,10]

It’s a common theme in functional programming to pass functions as parameters to higher-order functions. But it’s a bit annoying to define a named function explicitly all the time, so we can quickly create an unnamed function, called a lambda function, instead:

λ> map (\x -> x * 2) [1..5]
[2,4,6,8,10]

You can see the lambda function (\x -> x * 2), which is specified to take a parameter x and return x * 2. Of course Haskell has some sweet syntactic sugar to do this even more succinctly by just writing (*2):

λ> map (*2) [1..5]
[2,4,6,8,10]

Let’s see how we can define our own map function:

map' f [] = []

The base case is easy: When we get an empty list passed, the result is also an empty list.

map' f (x:xs) = f x : map' f xs

Otherwise we take the first element x in the list, apply the function f to it and create a new list with this new value as the initial element. We recurse and apply map to the rest of the list in the same way.

At this point we can take a look at the actual implementations in GHC’s standard library and we notice that map is implemented exactly as we wrote it.

Note that we’re not modifying the passed data structure directly; instead we create a new one. Actually in Haskell there is no way to modify data structures: they are all immutable. And functions are pure, so there is no way for a function to have any side effects other than returning a value directly. That’s also why we have referential transparency: when you call a function with the same inputs, it will always return the same output.
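A small sketch makes this concrete. The names original, doubled and sameResult are made up for the example.

```haskell
original :: [Int]
original = [1, 2, 3]

-- map builds a new list; `original` is left untouched.
doubled :: [Int]
doubled = map (* 2) original

-- Referential transparency: both sides denote the same value,
-- so one can always be substituted for the other.
sameResult :: Bool
sameResult = map (* 2) original == doubled
```

After doubled has been evaluated, original is still [1,2,3].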

Let’s turn to the next function, filter, which keeps only those elements in a list which fulfil a predicate:

λ> filter odd [1..5]
[1,3,5]

Filter is also easy to implement:

filter' f [] = []
filter' f (x:xs)
  | f x = x : filter' f xs
  | otherwise = filter' f xs

Each guard (introduced by | at the start of the line) checks a boolean condition. When f x is true we use the first equation, which keeps x in the result; otherwise we use the second one, which leaves x out. In both cases we recurse on the rest of the list, until we reach the base case for the empty list.
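Guards work the same way in other list functions. As a further sketch, here is our own version of the standard takeWhile function, which keeps elements only until the predicate fails for the first time. The name takeWhile' follows the naming used in this post.

```haskell
-- Keep the longest prefix of elements that satisfy the predicate p.
takeWhile' :: (a -> Bool) -> [a] -> [a]
takeWhile' _ [] = []
takeWhile' p (x:xs)
  | p x       = x : takeWhile' p xs   -- keep x and continue
  | otherwise = []                    -- stop at the first failure
```

Unlike filter', the second guard does not recurse, so takeWhile' (< 3) [1,2,3,1] stops at 3 and yields [1,2].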

By implementing a few functions on lists, a common pattern emerges. Let’s see how to implement the sum over a list:

λ> sum [1..5]
15

sum' [] = 0
sum' (x:xs) = x + sum' xs

We already implemented the length of a list:

length' [] = 0
length' (x:xs) = 1 + length' xs

How do we define the and function over an entire list?

λ> and [True,False,True]
False
λ> and [True,True,True]
True

and' [] = True
and' (x:xs) = x && and' xs

Notice the similarity by now?

The functions sum, length and and all are built in pretty much the same way. So we can create an abstraction over this pattern, which is called a right-associative fold, or foldr in short:

foldr' f i [] = i
foldr' f i (x:xs) = x `f` foldr' f i xs

Using this pattern it becomes trivial to define these functions:

sum'' xs = foldr' (+) 0 xs

Even better, we don’t need to write down the last parameter, xs:

sum'' = foldr' (+) 0

What’s going on there? Actually in Haskell when you pass a parameter to a function, a new function is returned. So you can consider each function as accepting a single parameter, then returning a new function, which is applied to the next parameter, and so on.

So in our definition of sum'' we don’t need to have a parameter and put it into foldr', instead we can also have no parameter and just return the function that is returned from (foldr' (+) 0), which takes a list as its parameter.
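The other two functions from the fold pattern can be sketched in the same parameter-free style. The definitions of length'', and'', add and increment below are illustrations and not taken from the original listings; foldr' is repeated so that the block stands on its own.

```haskell
-- foldr' as defined earlier in this post.
foldr' :: (a -> b -> b) -> b -> [a] -> b
foldr' _ i []     = i
foldr' f i (x:xs) = x `f` foldr' f i xs

-- length via a fold: ignore each element and add 1 to the count so far.
length'' :: [a] -> Int
length'' = foldr' (\_ n -> 1 + n) 0

-- and via a fold: combine all elements with (&&), starting from True.
and'' :: [Bool] -> Bool
and'' = foldr' (&&) True

-- Partial application: `add 1` is itself a function
-- that is still waiting for its second argument.
add :: Int -> Int -> Int
add x y = x + y

increment :: Int -> Int
increment = add 1
```

Here increment is defined without mentioning its parameter, exactly like sum'' above.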

Lazy Evaluation and Sharing

While working with lists in Haskell you might wonder what happens if we never reach the base case. Haskell uses lazy evaluation, which means that a data structure is only evaluated when it’s actually needed. So we have no problem handling lists of infinite size:

So, zipWith is a function that takes a function (a -> b -> c) as its first parameter. The second parameter is a list of values of type a, the third parameter is a list of values of type b and finally a list of type c is returned.
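A short sketch of zipWith in use, first on finite lists and then on infinite ones. The names sums and firstSquares are made up for the example.

```haskell
-- Combine two lists element by element with (+).
sums :: [Int]
sums = zipWith (+) [1, 2, 3] [10, 20, 30]

-- [1 ..] is infinite, but take only demands the first five results,
-- so lazy evaluation never builds the rest of the list.
firstSquares :: [Int]
firstSquares = take 5 (zipWith (*) [1 ..] [1 ..])
```

Here sums evaluates to [11,22,33] and firstSquares to [1,4,9,16,25].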

Of course to understand it we best implement it:

zipWith' f (x:xs) (y:ys) = f x y : zipWith' f xs ys
zipWith' f xs ys = []

We can use zipWith to trivially define a list of all Fibonacci numbers:

fibs = 0 : 1 : zipWith' (+) fibs (tail fibs)

λ> take 10 fibs
[0,1,1,2,3,5,8,13,21,34]

The start makes sense: a list of Fibonacci numbers starts with 0 and 1, then the rest is defined with zipWith' over fibs itself and (tail fibs). This smells like magic, how can it possibly work? Functions are pure in Haskell. That means that a function always returns the same value when you call it with the same arguments. There is no way to have a variable in which you store some state. Since data structures and functions in Haskell are immutable and pure we cannot change them in any way, so we can reuse them without having to recalculate them. So in this case we can refer to fibs multiple times, even inside the definition of fibs itself.
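The same self-referential trick works for other sequences. As a sketch, here is a list of all powers of two that is defined in terms of itself. The names powersOfTwo and firstPowers are hypothetical.

```haskell
-- Each element after the first is twice the element before it.
-- The list refers to itself, and laziness makes sure that only
-- the elements that are actually demanded get computed.
powersOfTwo :: [Integer]
powersOfTwo = 1 : map (* 2) powersOfTwo

firstPowers :: [Integer]
firstPowers = take 8 powersOfTwo
```

Evaluating firstPowers gives [1,2,4,8,16,32,64,128].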

List Comprehensions

Instead of all these filters and maps we can also use list comprehensions:

λ> map (*2) $ filter odd [1..5]
[2,6,10]
λ> [x * 2 | x <- [1..5], odd x]
[2,6,10]
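List comprehensions can also draw from several lists at once, which gets clumsy with map and filter alone. A small sketch with made-up names:

```haskell
-- Squares of the even numbers from 1 to 10: one generator, one condition.
squaresOfEvens :: [Int]
squaresOfEvens = [x * x | x <- [1 .. 10], even x]

-- All pairs (x, y) with x < y: two generators, one condition.
-- For each x, every y is tried before moving on to the next x.
orderedPairs :: [(Int, Int)]
orderedPairs = [(x, y) | x <- [1 .. 3], y <- [1 .. 3], x < y]
```

Here squaresOfEvens is [4,16,36,64,100] and orderedPairs is [(1,2),(1,3),(2,3)].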

We can even implement the Sieve of Eratosthenes to calculate all prime numbers as a one-liner with a list comprehension:

sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
primes = sieve [2..]

Because in Haskell functions are pure and data is immutable we can perform equational reasoning: we can guarantee that code can be replaced by its definition. Basically this means that refactoring becomes much less risky, because there is no hidden state that could change the outcome. Every function only takes its arguments and returns a value based on those, always the same value. So you can reorganize your functions however you want; the order does not matter and there is actually no order semantically enforced in Haskell.
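As a sketch of such a refactoring, the two functions below compute the same result. The second one merges two traversals into one by composing the functions with the (.) operator, and purity is what makes this substitution safe. The names twoPasses and onePass are made up for the example.

```haskell
-- Original version: double every element, then add one to every element.
twoPasses :: [Int] -> [Int]
twoPasses xs = map (+ 1) (map (* 2) xs)

-- Refactored version: a single traversal with the composed function.
-- ((+ 1) . (* 2)) first doubles its argument and then adds one.
onePass :: [Int] -> [Int]
onePass xs = map ((+ 1) . (* 2)) xs
```

For any input list both functions return the same output, for example [3,5,7] for [1,2,3].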

Conclusion

I hope you enjoyed this small excursion into functional programming land with Haskell. Maybe you learned something that gives you something to think about when programming in your favorite programming language. If you’re interested in learning more Haskell, here are a few books you can read: