The Haskell School of Expression: Learning Functional Programming through Multimedia

author

Paul Hudak

pages

363

publisher

Cambridge University Press

rating

9

reviewer

Isaac Jones

ISBN

0521644089

summary

Learn to program in a functional style in Haskell by implementing graphics, music, and robots simulations.

As the title implies, The Haskell School of Expression introduces functional
programming through the Haskell programming language and through the use of graphics and music. It serves as an effective introduction to both the language and the concepts behind functional programming. This text was published in 2000, but since Haskell 98 is the current
standard, this is still a very relevant book.

Haskell's standardization process gives us a window into two different facets of the community: Haskell is designed to be both a stable, standardized language (called Haskell 98), and a platform for experimentation in cutting-edge programming language research. So
though we have a standard from 1998, the implementations (both compilers and interpreters) are continually evolving to implement new,
experimental features which may or may not make it into the next standard.

For instance, the Glasgow
Haskell Compiler has implemented a meta-programming environment
called Template Haskell. Haskell is also easy to extend in directions that don't change the
language itself, through the use of Embedded Domain-Specific
Languages (EDSLs)
such as WASH
for web authoring, Parsec for parsing,
and Dance
(more of Paul Hudak's work) for controlling humanoid robots.

Before we get too far, I should offer a disclaimer: The Haskell
community is rather small, and if you scour the net, you may find
conversations between myself and Paul Hudak or folks in his research
group, since I use some of their software. That said, I don't work
directly with Hudak or his research group.

In fact, the small size of the Haskell community is a useful feature. It
is very easy to get involved, and folks are always willing to help
newbies learn, since we love sharing what we know. You may even find
that if you post a question about an exercise in The Haskell School of Expression, you'll get a reply from the author himself.

I consider this book to be written in a "tutorial" style. It walks
the reader through the building of applications, but doesn't skimp on
the concepts (indeed, the chapters are meant to alternate between
"concepts" and "applications"). In some ways, the code examples make
it a little difficult to jump around, since you are expected to build
upon previous code. The web site provides code, however, so you can
always grab that and use it to fill in the missing pieces.

For readers who wish to use this book as a tutorial, and to
implement all of the examples (which is highly recommended), I suggest
that you grab
the Hugs interpreter and
read the User's
Guide while you're reading the first few chapters of
The Haskell School of Expression. Hugs is very portable, free, and easy to use. It
also has an interface with Emacs.
Unfortunately, some of the example code has suffered from bit-rot, and
certain things don't work out-of-the-box for X11-based systems. The
bit-rot can be solved by using the "November 2002" version of Hugs.
This is all explained on SOE's web page.

The Haskell School of Expression should be very
effective for programmers who have experience in more traditional
languages, and programmers with a Lisp background can probably move
quickly through some of the early material. If you've never learned a
functional language, I highly recommend Haskell: Since Haskell is
purely functional (unlike Lisp), it will more or less prevent
you from "cheating" by reverting to a non-functional style. In fact,
if you've never really looked at functional programming languages, it
may surprise you to learn that Haskell has no looping constructs or
destructive assignment (that is, no x = x + 1). All of the tasks
that you would accomplish through the use of loops are accomplished
instead through recursion, or through higher-level abstractions upon
recursion.
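A small sketch of what this looks like in practice (the function names are my own, not the book's): a task that would be a loop in an imperative language becomes either direct recursion or a higher-level abstraction such as a fold.

```haskell
-- Summing a list by direct recursion: the "loop" is the recursive call.
sumRec :: [Int] -> Int
sumRec []     = 0
sumRec (x:xs) = x + sumRec xs

-- The same task via a higher-level abstraction built on recursion.
sumFold :: [Int] -> Int
sumFold = foldr (+) 0
```

Both compute the same result; the fold version just names the recursion pattern instead of spelling it out.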

Since I was already comfortable with recursion when I started this
book, it is hard for me to gauge how a reader who has never
encountered recursion would find this book's explanation of the concept. The Haskell School of Expression
introduces recursion early on, in section 1.4. It is used in examples
throughout the book, and if you follow along with these examples, you
will most certainly be using it a lot. The introduction seems natural
enough to me, but I note that Hudak does not give the reader any extra
insight or tricks to help them along. Not to worry, though; recursion
is very natural in Haskell and the reader may not even notice that
they are doing something a little tricky.

The use of multimedia was a lot of fun for me, and should quickly
dispel the myth that IO is difficult in Haskell. For instance, Hudak
has the reader drawing fractals by page 44, and throughout the book,
the reader will be drawing shapes, playing music, and controlling
animated robots.

Any book on Haskell must be appraised for its explanation of
monads in general and IO specifically. Monads are a purely
functional way to elegantly carry state across several computations
(rather than passing state explicitly as a parameter to each
function). They are a common stumbling block in learning Haskell,
though in my opinion, their difficulty is over-hyped.

Since input and output cause side-effects, they are not purely
functional, and don't fit nicely into a function-call and recursion
structure. Haskell has therefore evolved a way to deal safely and
logically with IO through the use of monads, which encapsulate mutable
state. In order to perform IO in Haskell, one must use monads, but
not necessarily understand them.
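To illustrate the point (this is a generic sketch, not an example from the book): one can sequence IO actions with do-notation quite productively without any theory of monads at all.

```haskell
-- A pure function, usable anywhere.
greet :: String -> String
greet name = "Hello, " ++ name

-- IO actions sequenced with do-notation; no monad theory required.
main :: IO ()
main = do
  putStrLn (greet "Haskell")
  mapM_ print [1, 2, 3]
```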

Some people find monads confusing; I've even heard a joke that you
need a Ph.D. in computer science in order to perform IO in Haskell.
This is clearly not true, and this book takes an approach which I
whole-heartedly agree with. It gets the reader using monads and IO in
chapter 3 without explaining them deeply until chapters 16 (IO) and 18
(monads). By the time you get there, if you have heard that monads
are confusing, you might be inclined to say "how is this different
from what we've been doing all along?" Overall, I was pleased with
the explanation of monads, especially state monads in chapter
18, but I felt that the reader is not given enough exercises where
they implement their own monads.

If you're worried that drawing shapes and playing music will not
appeal to your mathematical side, you will be pleased by the focus on
algebraic reasoning for shapes (section 8.3) and music (section 21.2),
and a chapter on proof by induction (chapter 11).

After reading this book you will be prepared to take either of the
two paths that Haskell is designed for: You can start writing useful and elegant tools,
or you can dig into the fascinating programming language research
going on. You will be prepared to approach arrows, a newer
addition to Haskell which, like monads, have a deep relationship to
category theory. Arrows
are used extensively in some of the Yale Haskell group's recent work.
You will see a lot of shared concepts between the animation in
The Haskell School of Expression and Yale's "Functional
Reactive Programming" framework, Yampa. If you like little languages,
you'll appreciate how useful Haskell is for embedded domain-specific
languages. It may
be even more useful now that Template Haskell is in the
works. Andrew
Cooke described Purely Functional Data Structures as
a great second book on functional programming. In my opinion,
The Haskell School of Expression is the great first book
you're looking for.

I've read this book and think it is a good one for learning Haskell (perhaps the best) and that it explains monads well.

However, it won't get the reader fully up and running as a productive Haskell programmer, because for that you basically need to master GHC's (the Glorious/Glasgow Haskell Compiler's) standard library. Otherwise you won't know how to use a fast unboxed array, etc. This library is actually intelligently designed, but it is poorly documented, and there are lots of quirks for people who aren't yet totally familiar with Haskell. The best way to learn still seems to be to read the Haskell newsgroup and look at other people's code.

But Haskell is an extremely interesting language and well worth learning IMHO.

The foreign function interface manual [unsw.edu.au]. Since Haskell has a small library, you will probably need to call functions written in C a lot to get anything done. Luckily, Haskell's foreign function interface is quite nice.

Who is the target audience for this book? I would assume programmers, of at least moderate experience.

Actually, the book is targeted mainly at novices (although experienced programmers who have never seen Haskell will also learn). In fact, the author even mentions high school students. This is from the preface (page xvi):

All the material in this textbook can be covered in one semester as an advanced undergraduate course. For lower-level courses, including possible use in high school, some mathematics may cause problems, but for bright students I suspect most of the material can still be covered.

So obviously an intimate knowledge of red-black trees is not required.

OCaml, which is not purely functional but still closely related to Haskell, is nearly as fast as C. Haskell somewhat acts as a testbed for ideas in the ML language family, and future versions of OCaml are expected to include many features that were first implemented in Haskell. I'd also suggest that Haskell is a good introductory language for future OCaml programmers as it ensures they won't just try and use OCaml like a weird imperative language.

OTOH, it is theoretically possible to automatically multithread purely functional programs, especially if they're lazy like Haskell. So it could end up being a very important language on multi-processor and distributed systems.

Finally, Haskell has an excellent foreign function interface for when you need C-like performance and control.

Long answer: Monads are a pattern for hiding a state that gets passed through a sequence of functions. For example, when you assign to a variable in an imperative language, the value of that variable in the implicit state is updated, and all future phrases accessing that variable will get the new value. If you're using a Haskell state monad it works the same way, but you need to explicitly specify which phrases can be executed in the future (using sequencing operators much like C's semicolon). Effectively, Haskell monads allow all imperative constructs except for backwards jumps (goto).
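Here's a minimal hand-rolled sketch of the pattern described above (names like `State`, `get`, and `incr` are illustrative; the standard library provides a more polished version): a stateful computation is just a function from the current state to a (result, new state) pair, and the monad's bind operator threads the state along invisibly.

```haskell
-- A stateful computation: a function from a state to (result, new state).
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  -- Bind runs one computation, then feeds its result and the updated
  -- state to the next; this is the hidden "semicolon".
  State g >>= f = State $ \s ->
    let (a, s') = g s in runState (f a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- The moral equivalent of "x = x + 1", with the state threaded implicitly.
incr :: State Int ()
incr = do
  x <- get
  put (x + 1)
```

Running `runState (incr >> incr) 0` yields `((), 2)`: two "assignments" sequenced without any destructive update.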

The Haskell IO Monad is a purely functional mechanism for ensuring that modifications of the RealWorld are done in the correct order. It is implemented to call system calls which have real side-effects, but wrapping IO in a monad ensures that you can still reason about the order the side-effects will occur in. Since Haskell is lazy, these side-effects won't actually be triggered until necessary, either because the program needs an input value or it exits.

(I can give examples if anyone wants, but the resources at haskell.org [haskell.org] may be more helpful.)

Ask again in comp.lang.functional; the boffins there are pretty good at explaining this precise question (a VFAQ).

For now, think of monads as bathroom fixtures and monad combinators as plumbing.

You COULD implement variables with state by creating a purely-functional assoc-list and explicitly threading it through your code. The state monad does that for you automatically, by forcing you to write your code as higher order functions, and by linking up the plumbing for you behind the scenes.

As a much simpler example, consider exception handling, where instead of returning a value of type t, a function that can fail returns the type (Maybe t), indicating either failure or a returned result.

These are normal functions; no magic is needed (written here using case expressions rather than top-level pattern matching, to highlight the similarity to the code above):

data Result t = Fail | Ok t   -- the failure type (the "Maybe t" above)

unitM r = Ok r

runM m = case m of
           Fail -> error "oops!"
           Ok r -> r

bindM m f = case m of
              Fail -> Fail
              Ok r -> f r


... anyway, so what? The cool thing here is that in the case of IO (where instead of doing exception pattern matching, the monad combinators pass a representation of the whole external world), we can prove that the code inside the monads cannot ever directly access the world representation. Thus, the world is single-threaded through the computation. This is good, because then we can skip passing a representation, and just do the IO operations directly on the real world. So we can actually do side-effecting operations in a pure language.

If this doesn't seem like magic, you are either much too smart, or have missed something.

Thus, there's a MASSIVE performance loss when a functional programming language is executed on any of the existing processors, because the compilers can't think and optimise the code to best fit the imperative model, whereas human beings can. That's why we should stick to imperative programming languages.

You are exaggerating the performance penalty. See for instance the old version of The Great Computer Language Shootout [bagley.org] where Haskell is ranked faster than Java. Of course those benchmarks don't tell the whole story and should be taken with a grain of salt. In my experience though, Haskell is only about 4 times slower than C, compared to, for instance, Python, which is about 20 times slower (I am a big python fan, so this isn't a criticism of Python).

Just as plenty of people are willing to put up with Python's slowness in exchange for better debugging, faster development, dynamic typing, etc., I think plenty of people would benefit by moving from C to Haskell, which is purely functional, has a great type inferencing system, never seg faults, etc.

One final note is that Haskell programs can often be optimized in Haskell itself and approach C speeds. This is because Haskell is compiled and statically typed and can deal with unboxed elements and so forth. This is a big difference from Python and other dynamically typed languages where optimization sometimes must be done in C for best results.

b) It's much easier to hack out some program in procedural language and then hack bits of it randomly until it kind of works. Programming in functional languages requires more thought.

This is nonsense. Functional languages almost always require less thought because:

They are garbage collected

Programs tend to be much shorter

There are fewer assignments

Type systems catch a huge number of stupid mistakes

Functional programming only feels harder because you are forced to correct your errors (instead of failing silently).

It might be hard (only at first) to wrap your head around higher-order functions, but all of you Java programmers do the same thing (with ugly syntax) when you create inner classes.

For example in C, imagine we have the following function:

void foo(int& i) {
    if (i == 1) g();
    if (i == 2) h();
}

In C, we have to assume that the call to g() may have changed the value of i. In a functional language this is not true.

That is not true. If you were to transliterate this into Caml, the type of i would be a mutable reference, which g() could still change. Odds are, though, you will very infrequently need to pass a reference.

Since the answer is always kept in acc, it's an iterative process, so it will never overflow if the compiler optimizes it. O'Caml does it, Scheme implementations do it, I think Haskell does it, many Common Lisp implementations do it, etc. So it's not a problem.
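A sketch of the pattern being described (my own example, not from the thread): an accumulator-passing sum where the recursive call is the last thing the function does, so a compiler can compile it as a loop. One Haskell-specific wrinkle, worth hedging on, is that under lazy evaluation the accumulator also needs to be forced (here with `seq`) or it builds up unevaluated thunks.

```haskell
-- Tail-recursive sum: the answer is always kept in acc, so the
-- recursion is iterative and uses constant stack (once compiled as
-- a loop). The `seq` forces acc at each step so laziness doesn't
-- accumulate a chain of suspended additions instead.
sumAcc :: [Int] -> Int
sumAcc = go 0
  where
    go acc []     = acc
    go acc (x:xs) = let acc' = acc + x
                    in acc' `seq` go acc' xs
```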

That's what bothers me about "pure" languages of any form. You're intentionally throwing away some of the available tools to prove a point.

Well, Haskell isn't throwing anything out. You can do both purely functional and imperative programming with Haskell. It's just that Haskell's designers went deep enough into the theory to come up with an elegant way to bring the functional and imperative worlds together (category theory and monads). So you need not give up the benefits of one to have the other.

A better book for functional programming is actually one about Python. It's called Text Processing in Python [amazon.com], and while it only deals with 'text', the author explains that text tends to cover almost everything we do with computers these days. He goes on to explain that Python is kind of schizophrenic, in that while it is very object-oriented, it is also very functional-programming-ly :) oriented. It's just that Python doesn't tend to showcase that side, though among its developers there is a strong bent towards functional programming. Anyway, I find the book extremely useful when I need to learn the best way to do something in Python, and I highly recommend it to anyone who wants to develop their Python skills and/or learn functional programming, since he explains it very well.
Thanks for reading

Lots of other comments, as well as the story, have pointed to a number of good tutorials and introductions. I'd like to recommend also the one I wrote for IBM developerWorks. I believe my tutorial is a bit better for real beginners to FP than are most of the others out there.

This is totally not true. I build real programs that do real stuff in SML every day.

Thus, there's a MASSIVE performance loss when a functional programming language is executed on any of the existing processors.

This is also completely false. Optimizing high-level languages is often easier, because there is more semantic information to exploit (types, higher-order code). My SML programs typically run about 20% slower than a C counterpart, while being much shorter, more frequently correct, and more secure.

Most people think of a tree and implement it in SQL essentially as a one-way linked list, i.e., RecordID, ParentRecordID. These are just a PITA to navigate.

Others (Joe Celko, for one "SQL For Smarties") have proposed that a L-R list would work better, in that instead of storing the connection points, you are storing the edges as L-R pairs. The only problem is updating the L and R keys on inserts and deletes would require either triggers or using stored procedures to maintain the key list, so this would bog down greatly for OLTP systems. But they certainly are a dream to query up *and* down the tree, unlike the single linked list implementation!

And, they would probably work better than Oracle's functions that support hierarchical queries.

Builds a new array with the same dimensions (indices from a..b), initialized with a list of tuples created using "zip", which takes the elements of the input array as the indices of the output array. Note: the signature in the first line is optional. Nice, isn't it?
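The code this comment refers to appears to have been lost; a sketch matching its own description (the name `invert` is a guess) might look like:

```haskell
import Data.Array

-- Build a new array with the same bounds, using each element of the
-- input array as an index of the output array, paired up with "zip".
-- The type signature on the first line is optional.
invert :: Array Int Int -> Array Int Int
invert arr = array (bounds arr) (zip (elems arr) (indices arr))
```

For example, inverting the permutation stored as `listArray (1,3) [2,3,1]` gives an array whose elements are `[3,1,2]`.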

You're kidding, right?
It depends, of course, on your representation of a permutation; let's take [(Int, Int)] as our permutation type. So the permutation on three elements that reverses it is [(1,3),(2,2),(3,1)].

invert :: [(Int,Int)] -> [(Int,Int)]
invert = map (\(a,b) -> (b,a))

Now, if you want to use [Int] as a representation, so that the example permutation is [3,2,1], then you need to first add the indices (linear time), reducing it to the previous problem, and then extract the corresponding list. Therein lies the rub.

It's not the imperative nature of the permutation algorithm that's magic; it's that C provides a primitive for (it seems) constant-time array access. Provide such a primitive in Haskell (see the Array class) and you have essentially the same linear-time algorithm.
In fact, the Array [haskell.org] section of the Haskell 98 report provides the following example:

-- Inverting an array that holds a permutation of its indices
invPerm :: (Ix a) => Array a a -> Array a a
invPerm a = array b [(a!i, i) | i <- range b]
            where b = bounds a

I thought about coming up with a syllabus for myself of C, Haskell, LISP and Perl (which just evades me....)

I'd like to strongly recommend some books. The first is Modern C++ Design [moderncppdesign.com] by Andrei Alexandrescu. The second is On Lisp [paulgraham.com] by Paul Graham. In conjunction with that, you will need an introductory text on Lisp if you don't already know it and a good book on C++ templates. While I don't know what the best Lisp text currently in print is, I'd be willing to give Graham's ANSI Common Lisp [paulgraham.com] a try on the strength of his other book. And C++ Templates: A Complete Guide [josuttis.com] by Vandevoorde and Josuttis illuminates a lot of dark corners in templates and explains some new features.

In the end, the goal is to understand how many times Graham and Alexandrescu are talking about the same things using very different languages. C++ templates are in many important ways a compile-time, functional macro language on top of C++ that implement many of the features of Lisp macros. For what it's worth, Bruce Eckel has mentioned this generic programming link [mindview.net] in the context of discussing Java generics [sun.com].