1 Preface

This page shows several examples of how code can be improved. We try to derive general rules from them, though they cannot be applied deterministically and are a matter of taste. We all know this, so please don't add "this is disputable" to each item!

Instead, you can now add "this is disputable" on /Discussion and change this page only when some sort of consensus is reached.

2 Be concise

2.1 Don't reinvent the wheel

The standard libraries are full of useful, well-tuned functions. If you rewrite an existing library function, the reader of your code might spend a minute trying to figure out why you've done that. But if you use a standard function, the reader will either immediately understand what you've done, or can learn something new.

2.2 Avoid explicit recursion

Explicit recursion is not generally bad, but you should spend some time trying to find a more declarative implementation using higher order functions.

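Avoid defining a function such as raise, which adds x to each element of a list, by hand-written recursion, because it is hard for the reader to find out
how much of the list is processed and
on which values the elements of the output list depend.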
Just write

raise x ys = map (x+) ys

or even

raise x = map (x+)

and the reader knows that the complete list is processed and that each output element depends only on the corresponding input element.
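For comparison, here is a sketch of an explicitly recursive definition next to the map-based one. The name raiseRec and the type signatures are introduced here for illustration only.

```haskell
-- Explicit recursion (raiseRec is a hypothetical name): the reader has to
-- inspect both equations to see that every element is visited exactly once.
raiseRec :: Num a => a -> [a] -> [a]
raiseRec _ []     = []
raiseRec x (y:ys) = x + y : raiseRec x ys

-- Declarative version from the text: map already says
-- "one output element per input element".
raise :: Num a => a -> [a] -> [a]
raise x = map (x+)
```

Both definitions give the same result, for instance [2,3,4] for the arguments 1 and [1,2,3].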

If you don't find appropriate functions in the standard library, extract a general function.
This helps you and others understand the program.
Thanks to higher order functions Haskell gives you very many opportunities to factor out parts of the code.
If you find the function very general, put it in a separate module and re-use it. It may appear in the standard libraries later, or you may later find that it is already there in an even more general way.

Decomposing a problem this way also has the advantage that you can debug more easily. If the last implementation of raise does not show the expected behaviour, you can inspect map (I hope it is correct :-) ) and the invoked instance of (+) separately.

This is a special case of the general principle of separating concerns. If you can write the loop over a data structure once and debug it, then there's no need to duplicate that code.

Another example: the function count counts the number of elements which fulfill a certain property, i.e. the elements for which the predicate p is True.

I found an explicitly recursive definition of this function (buried inside a more specific function) in a Haskell program.
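A minimal sketch of what such a definition can look like, next to a version assembled from library functions. Both are illustrations written for this page, not the code from that program, and countRec is a hypothetical name.

```haskell
-- Hypothetical explicitly recursive definition.
countRec :: (a -> Bool) -> [a] -> Int
countRec _ [] = 0
countRec p (x:xs)
  | p x       = 1 + countRec p xs
  | otherwise = countRec p xs

-- The same function built from filter and length:
-- keep the elements that satisfy p, then count them.
count :: (a -> Bool) -> [a] -> Int
count p = length . filter p
```

For example, count even [1..10] yields 5 with either definition.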

2.4 Remember the zero

Don't forget that zero is a natural number. Recursive definitions become more complicated if the recursion anchor is not chosen properly. An example is the function tupel presented in DMV-Mitteilungen 2004/12-3, Jürgen Bokowski: Haskell, ein gutes Werkzeug der Diskreten Mathematik (Haskell, a good tool for discrete mathematics). This is also a good example of how to avoid guards.
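A sketch of the idea, assuming that tupel is meant to list all selections of r elements from a list while keeping their order. The name tuples and the exact behaviour are assumptions made here, not taken from the article. Anchoring the recursion at zero means that pattern matching alone covers all cases and no guards are needed.

```haskell
-- All ways to pick r elements from a list, preserving order
-- (hypothetical definition, anchored at r == 0).
tuples :: Int -> [a] -> [[a]]
tuples 0 _      = [[]]   -- exactly one way to pick nothing
tuples _ []     = []     -- nothing left to pick from
tuples r (x:xs) = map (x:) (tuples (r - 1) xs) ++ tuples r xs
```

With this definition, tuples 2 "abc" evaluates to ["ab","ac","bc"].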

As a rule of thumb, once your expression becomes too long to easily be point-freed, it probably deserves a name anyway.
Lambdas are occasionally appropriate, however, e.g. for control structures in monadic code (in this example, a control structure "foreach2" which most languages don't even support):
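A sketch of such a control structure, assuming that foreach2 runs an action on corresponding elements of two lists and can therefore be defined with zipWithM_ from Control.Monad. The definition and the example function printNumbered are illustrations, not code from the original page.

```haskell
import Control.Monad (unless, zipWithM_)

-- Hypothetical definition: visit two lists in lockstep.
foreach2 :: Monad m => [a] -> [b] -> (a -> b -> m ()) -> m ()
foreach2 xs ys f = zipWithM_ f xs ys

-- The lambda plays the role of a loop body:
-- print every non-empty line together with its line number.
printNumbered :: [String] -> IO ()
printNumbered ls =
  foreach2 [(1 :: Int) ..] ls $ \lineNr line ->
    unless (null line) $
      putStrLn (show lineNr ++ ": " ++ line)
```

Here the lambda reads naturally because it sits where a loop body would sit in an imperative language.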

3 Use syntactic sugar wisely

People who employ syntactic sugar extensively
argue that it makes their code more readable.
The following sections show several examples
where less syntactic sugar is more readable.

It is argued that a special notation is often
more intuitive than a purely functional expression.
But the term "intuitive notation" is always a matter of habit.
You can also develop an intuition for analytic expressions
that don't match your habits at first glance.
So why not make a habit of less sugar sometimes?

3.1 List comprehension

List comprehension lets you remain in imperative thinking; that is, it lets you think in variables rather than transformations. Open your mind, discover the flavour of the pointfree style!

Instead of

[toUpper c | c <- s]

write

map toUpper s

Consider

[toUpper c | s <- strings, c <- s]

where it takes some time for the reader to discover which value depends on what other value, and it is not so clear how many times the interim values s and c are used.

In contrast to that

map toUpper (concat strings)

can't be clearer.
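The two forms can be put side by side as complete definitions. The function names below are chosen for this sketch only.

```haskell
import Data.Char (toUpper)

-- Comprehension: the reader has to trace how s and c flow through the generators.
shoutComprehension :: [String] -> String
shoutComprehension strings = [toUpper c | s <- strings, c <- s]

-- Transformation pipeline: flatten first, then convert every character.
shout :: [String] -> String
shout strings = map toUpper (concat strings)
```

Both return "ABCD" for the input ["ab", "cd"].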

When using higher order functions you can switch more easily from List to other data structures.

Compare

map (1+) list

and

mapSet (1+) set

If there were a standard instance for the Functor class you could use the code

fmap (1+) pool

for both choices.

If you are not used to higher order functions for list processing
you may feel you need parallel list comprehension.
This is unfortunately supported by GHC now, but it is arguably superfluous since various flavours of zip already do a great job.
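A short sketch of how the zip family covers the typical uses of parallel comprehensions. The function names and element types are assumptions made for the example.

```haskell
-- With GHC's ParallelListComp extension one could write
--   [x + y | x <- xs | y <- ys]
-- zipWith expresses the same thing without any extension.
addPairwise :: [Int] -> [Int] -> [Int]
addPairwise = zipWith (+)

-- Three lists in lockstep: zip3 (or zipWith3) instead of a three-way comprehension.
triples :: [Int] -> [Char] -> [Bool] -> [(Int, Char, Bool)]
triples = zip3
```

As with parallel comprehensions, the result is as long as the shortest input list.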

3.2 do notation

do notation is useful to express the imperative nature (e.g. a hidden state or an order of execution) of a piece of code.

Nevertheless it's sometimes useful to remember that the do notation is explained in terms of functions.

Instead of

do
  text <- readFile "foo"
  writeFile "bar" text

one can write

readFile "foo" >>= writeFile "bar"

The code

do
  text <- readFile "foo"
  return text

can be simplified to

readFile "foo"

by the right identity law (m >>= return is the same as m), which each Monad must fulfill.

You certainly also agree that

do
  text <- readFile "foobar"
  return (lines text)

is more complicated than

liftM lines (readFile "foobar")

By the way, the Functor class method fmap and the Monad based function liftM are the same (as long as both are defined, as they should be).
Be aware that "more complicated" does not imply "worse". If your do-expression was longer than this, then mixing do-notation and fmap might be precisely the wrong thing to do, because it adds one more thing to think about. Be natural. Only change it if you gain something by changing it. -- AndrewBromage
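The rewrites in this section are not specific to IO. The following sketch uses the Maybe monad so that results can be compared directly. The function halve and all other names are made up for the illustration.

```haskell
import Control.Monad (liftM)

-- A small computation that can fail (hypothetical helper).
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- do notation ...
quarterDo :: Int -> Maybe Int
quarterDo n = do
  h <- halve n
  halve h

-- ... and the same computation written with (>>=).
quarterBind :: Int -> Maybe Int
quarterBind n = halve n >>= halve

-- "bind, then return a function of the result" ...
halveShownDo :: Int -> Maybe String
halveShownDo n = do
  h <- halve n
  return (show h)

-- ... is liftM (or, equivalently, fmap).
halveShown :: Int -> Maybe String
halveShown n = liftM show (halve n)
```

For example, quarterDo 8 and quarterBind 8 both give Just 2, while both give Nothing for 6.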

3.3 Guards

Disclaimer: This section is NOT advising you to avoid guards. It is advising you to prefer pattern matching to guards when both are appropriate. -- AndrewBromage

Consider a factorial function fac defined with two guards, one testing n == 0 and the other testing n /= 0. This example, like a lot of uses of guards, has a number of problems.

The first problem is that it's nearly impossible for the compiler to check whether guards like this are exhaustive, as the guard conditions may be arbitrarily complex (GHC will warn you if you use the -Wall option). To avoid this problem and potential bugs through non-exhaustive patterns you should use an otherwise guard, which matches all remaining cases.
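A sketch of a guarded factorial whose last guard is otherwise. The variant with both conditions spelled out is shown only as a comment, and the name facOtherwise is chosen for this illustration.

```haskell
-- Both conditions spelled out; the compiler cannot see that they cover every case:
--
--   fac n | n == 0 = 1
--         | n /= 0 = n * fac (n - 1)

-- Final guard is otherwise, so the guards are visibly exhaustive.
facOtherwise :: Integer -> Integer
facOtherwise n
  | n == 0    = 1
  | otherwise = n * facOtherwise (n - 1)
```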

Another reason to prefer this one is its greater readability for humans and optimizability for compilers. Though it may not matter much in a simple case like this, when seeing an otherwise it's immediately clear that it's used whenever the previous guard fails, which isn't true if the "negation of the previous test" is spelled out. The same applies to the compiler: It probably will be able to optimize an otherwise (which is a synonym for True) away but cannot do that for most expressions.
This can be done with even less sugar using if:

-- Less sugar (though the verbosity of if-then-else can also be considered as sugar :-)
fac :: Integer -> Integer
fac n = if n == 0 then 1 else n * fac (n-1)

Note that if has its own set of problems, for example in connection with the layout rule or that nested

might be optimized by the library-writer... In GHC, when compiling with optimizations turned on, this version runs in O(1) stack-space, whereas the previous versions run in O(n) stack-space.

Note, however, that there is a difference between this version and the previous ones: when given a negative number, the previous versions do not terminate (until stack overflow), while the last implementation returns 1.

Only use guards when you need to. In general, you should stick to pattern matching whenever possible.
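Two definitions in the spirit of this advice, given as a sketch: one by plain pattern matching and, as an assumption about what a library-based version can look like, one built on product. The names facPattern and facProduct are chosen here.

```haskell
-- Pattern matching instead of guards.
facPattern :: Integer -> Integer
facPattern 0 = 1
facPattern n = n * facPattern (n - 1)

-- Assumed library-based variant: the product of an empty range is 1,
-- so a negative argument yields 1 instead of recursing without end.
facProduct :: Integer -> Integer
facProduct n = product [1 .. n]
```

Both give 120 for the argument 5. They differ for negative arguments, where facProduct returns 1 and facPattern does not terminate.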

3.4 n+k patterns

In order to allow pattern matching against numerical types, Haskell 98 provides so-called n+k patterns, as in

take :: Int -> [a] -> [a]
take (n+1) (x:xs) = x : take n xs
take _ _ = []

However, they are often criticized for hiding computational complexity and producing ambiguities; see /Discussion for details. They are subsumed by the more general Views proposal, which has unfortunately never been implemented despite being around for quite some time now.
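Haskell 2010 dropped n+k patterns from the language, and GHC accepts them only with the NPlusKPatterns extension. A minimal sketch of the same function without them, using a guard on n (takeSketch is a name chosen here to avoid a clash with the Prelude):

```haskell
-- Behaves like the n+k version: the first equation applies only when n >= 1
-- and the list is non-empty; everything else falls through to the empty list.
takeSketch :: Int -> [a] -> [a]
takeSketch n (x:xs)
  | n > 0 = x : takeSketch (n - 1) xs
takeSketch _ _ = []
```

The pattern n+1 matches exactly the arguments that are at least 1, which is what the guard n > 0 states explicitly.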

4 Efficiency and infinity

A rule of thumb is:
If a function makes sense for an infinite data structure but the implementation at hand fails for an infinite amount of data, then the implementation is probably also inefficient for finite data.

4.1 Don't ask for the length of a list when you don't need it

Don't write

length x == 0

to find out if the list x is empty.
If you write it, you force Haskell to create all list nodes. It fails on an infinite list although the expression should be evaluated to False in this case. (Nevertheless the content of the list elements may not be evaluated.)

In contrast

x == []

is faster but it requires the list x to be of type [a] where a is a type of class Eq.

The best thing to do is

null x

Additionally, many uses of the length function overspecify the problem: one may only need to check that a list has at least a certain length, not a specific length. Thus a use of length could be replaced with an atLeast function that only checks that a list has at least the required minimum length.
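One possible definition of such a function, as a sketch. The argument order and the use of drop are choices made here, not prescribed by the text. It inspects at most n list nodes and therefore also works on infinite lists.

```haskell
-- True if the list has at least n elements.
-- drop (n - 1) leaves a non-empty list exactly when an n-th element exists.
atLeast :: Int -> [a] -> Bool
atLeast n xs = n <= 0 || not (null (drop (n - 1) xs))
```

For example, atLeast 3 [1,2,3] is True, atLeast 3 [1,2] is False, and atLeast 2 [1..] returns True without traversing the whole list.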

6 Miscellaneous

6.1 Separate IO and data processing

It's not good to use the IO Monad everywhere;
much of the data processing can be done without IO interaction.
You should separate data processing and IO
because pure data processing can be done purely functionally,
that is, you don't have to specify an order of execution
and you don't have to worry about which computations are actually necessary.
Useful techniques are described in Avoiding IO.
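A minimal sketch of this separation. The task (counting the words on each line) and the names wordCounts and runWordCounts are invented for the example. All processing lives in a pure function, and IO is confined to a thin wrapper.

```haskell
-- Pure data processing: no IO, easy to test and to reuse.
wordCounts :: String -> String
wordCounts = unlines . map (show . length . words) . lines

-- Thin IO shell: read standard input, apply the pure function, write the result.
runWordCounts :: IO ()
runWordCounts = interact wordCounts
```

The pure part can be checked without touching any file or terminal: wordCounts "a b\nc\n" evaluates to "2\n1\n".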

6.2 Forget about quot and rem

They complicate handling of negative dividends. div and mod are almost always the better choice. If b > 0 then it always holds

a == b * div a b + mod a b
mod a b < b
mod a b >= 0

The first equation is true also for quot and rem, but the two others are true only for mod, not for rem. That is, mod a b always wraps a to an element from [0..(b-1)], whereas the sign of rem a b depends on the sign of a.
This seems to be more an issue of experience than one of a superior reason. You might argue that the sign of the dividend is more important for you than that of the divisor. However, I have never seen such an application, but I have seen many uses of quot and rem where div and mod were clearly superior.

Examples:

Conversion from a continuously counted tone pitch to the pitch class, like C, D, E etc.:
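A sketch of this conversion, assuming pitches are counted in semitones with twelve pitch classes per octave and 0 standing for C. This numbering is an assumption made for the example, as are the function names.

```haskell
-- Pitch class via mod: always in the range [0..11], also for negative pitches.
pitchClass :: Int -> Int
pitchClass p = p `mod` 12

-- The same with rem: negative pitches give negative results,
-- which are not valid pitch classes.
pitchClassRem :: Int -> Int
pitchClassRem p = p `rem` 12
```

For the pitch -1, one semitone below the reference C, pitchClass returns 11 while pitchClassRem returns -1.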