No monadic headaches: multi-core concurrency is easy in Haskell

Mark writes: “Haskell systems don’t, in general, parallelize well. They’re particularly bad for the kind of very coarse thread-based concurrency that we need to program for on multi-core computers, or distributed systems.”

I suspect Mark hasn’t written much multicore concurrent Haskell code, since he resorts to the usual “monads are hard” argument rather than offering example code. Here in the real world, however, concurrent programming in Haskell is pretty much identical to that in Erlang, though with nicer syntax, in a more modern language.

Importantly, there are no monadic headaches involved (monads don’t even enter the picture). And the results often outperform Erlang (and most other languages), due to native code compilation with type erasure, and better data structure support provided by Haskell libraries (particularly for strings) that is simply not available in Erlang.

This translates directly to Haskell’s forkIO and MVars for communication: we just launch the main thread, which forks a second Haskell thread, which in turn sits receiving messages and printing them:
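The code itself isn’t shown above, so here is a minimal sketch of what such a program might look like (the message strings and the Maybe-based shutdown protocol are my own illustration, not the original code):

    import Control.Concurrent

    -- The main thread forks a second thread, then sends it messages
    -- over an MVar; the child sits receiving messages and printing them.
    main :: IO ()
    main = do
        box  <- newEmptyMVar           -- one-place mailbox for messages
        done <- newEmptyMVar           -- signals that the child has finished
        _    <- forkIO (receiver box done)
        mapM_ (putMVar box . Just) ["hello", "world"]
        putMVar box Nothing            -- Nothing means "no more messages"
        takeMVar done                  -- wait for the child before exiting

    receiver :: MVar (Maybe String) -> MVar () -> IO ()
    receiver box done = do
        msg <- takeMVar box
        case msg of
            Just s  -> putStrLn s >> receiver box done
            Nothing -> putMVar done ()

putMVar blocks when the mailbox is full and takeMVar blocks when it is empty, so the two threads rendezvous on each message, much like a synchronous Erlang send/receive.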

To run such a threaded Haskell program across multiple cores (the coarse-grained, multicore concurrency Mark talks about), we need do nothing more than compile against the SMP runtime:

$ ghc -threaded -O Concurrent.hs

Now that’s compiled to real native code, with an SMP parallel runtime! We can run it over as many cores as we have, and GHC will schedule threads across them:

$ ./a.out +RTS -N2

There are no monadic headaches. Concurrent and parallel multicore programming in Haskell is simple, efficient and easy!

Since it’s so easy, and has so little impact on the structure of your Haskell programs, you can start speculatively supporting multicore hardware: your Haskell program will utilise as many real cores as it’s given, without your needing to recompile or modify the code. Just change the value of `N`, and throw forkIO around liberally, much as you would spawn processes in Erlang.
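As a sketch of that “forkIO liberally” style (the squaring workload is my own toy example, not from the post): fork ten thousand lightweight threads, each handing its result back through an MVar, and sum the answers in the main thread:

    import Control.Concurrent
    import Control.Monad

    -- Fork 10,000 lightweight threads, each computing one square,
    -- and collect all the results in the main thread.
    main :: IO ()
    main = do
        vars <- forM [1 .. 10000 :: Integer] $ \i -> do
            v <- newEmptyMVar
            _ <- forkIO (putMVar v (i * i))   -- one tiny thread per work item
            return v
        total <- sum <$> mapM takeMVar vars   -- takeMVar blocks until each is done
        print total

GHC threads are cheap green threads, so forking tens of thousands of them is routine; the same program scales from `-N1` to `-N16` without modification.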

So let’s do something useful with this. How about a little program that computes primes and Fibonacci numbers? We’ll just fork processes to compute prime numbers and Fibonacci numbers, and have the main thread lazily print results as they’re found:

    import Control.Concurrent
    import Control.Concurrent.Chan

    -- Fork some computation processes, print their results
    main = do
        primes <- run primes
        fibs   <- run fibonacci
        mapM_ print $ zip primes fibs

        -- fork a process, return any messages it produces as a list
      where
        run f = do
            c <- newChan
            l <- getChanContents c
            forkIO (writeList2Chan c f)
            return l

    -- A function to compute primes
    primes = sieve [2..]
      where sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]

    -- A function to compute fibonacci numbers
    fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)

Very importantly, we see that our computational processes are just pure Haskell code, and our main function is a thin IO skin, as usual. This is almost identical to the code you would write ignoring concurrency: nothing scary is needed to write parallel Haskell!

3 comments

I wish for an answer this easy for *distributed* Haskell programs. Erlang does seem to have a better default there, and better interoperability with other systems (using its binary pattern matching syntax).

I also know I’ve had problems in early experiments with GHC’s concurrent channels. Channels tend to fill up, as the producer gets scheduled more often than the consumer; or the consumer reliably hits STM retries, so the producer fills memory. Erlang’s VM has a nice default scheduler: it balances reductions between threads, and it rewards consumption of messages from channels. I don’t know how to imitate that behaviour in Haskell—how to write one library once and get the scheduling behaviour I’d like.

Of course, for distributed systems, Erlang’s libraries are far more developed than Haskell’s, though nothing stops you from putting nodes of Haskell runtimes across a cluster speaking, say, the Erlang wire protocol.

The channel-filling-up issue leading to STM contention is interesting. Maybe you should be using Chan rather than STM channels there.
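For what it’s worth, one way to get backpressure in Haskell is a bounded queue rather than an unbounded Chan. This is my own suggestion, not something from the thread above: a sketch using the stm package’s TBQueue, where the producer’s write retries once the queue is full, so it physically cannot outrun the consumer without bound:

    import Control.Concurrent
    import Control.Concurrent.STM
    import Control.Monad

    main :: IO ()
    main = do
        q <- newTBQueueIO 10    -- bounded: writes block once 10 items wait
        _ <- forkIO $ forM_ [1 .. 1000 :: Int] $ \i ->
            atomically (writeTBQueue q i)   -- retries while the queue is full
        forM_ [1 .. 1000 :: Int] $ \_ -> do
            i <- atomically (readTBQueue q)
            print i

The fixed capacity bounds memory use regardless of how the scheduler favours the producer, which addresses the fill-up symptom even if it doesn’t reproduce Erlang’s reduction-counting scheduler.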