Concurrent Haskell

Concurrency is "lightweight", which means that both thread creation
and context switching overheads are extremely low. Scheduling of
Haskell threads is done internally in the Haskell runtime system, and
doesn't make use of any operating system-supplied thread packages.

However, if you want to interact with a foreign library that expects your
program to use the operating system-supplied thread package, you can do so
by using forkOS instead of forkIO.

Haskell threads can communicate via MVars, a kind of synchronised
mutable variable (see Control.Concurrent.MVar). Several common
concurrency abstractions can be built from MVars, and these are
provided by the Control.Concurrent library.
In GHC, threads may also communicate via exceptions.

Basic concurrency operations

A ThreadId is an abstract type representing a handle to a thread.
ThreadId is an instance of Eq, Ord and Show, where
the Ord instance implements an arbitrary total ordering over
ThreadIds. The Show instance lets you convert an arbitrary-valued
ThreadId to string form; showing a ThreadId value is occasionally
useful when debugging or diagnosing the behaviour of a concurrent
program.

Note: in GHC, if you have a ThreadId, you essentially have
a pointer to the thread itself. This means the thread itself can't be
garbage collected until you drop the ThreadId.
This misfeature will hopefully be corrected at a later date.

Note: Hugs does not provide any operations on other threads;
it defines ThreadId as a synonym for ().

throwTo does not return until the exception has been raised in the
target thread.
The calling thread can thus be certain that the target
thread has received the exception. This is a useful property to know
when dealing with race conditions: eg. if there are two threads that
can kill each other, it is guaranteed that only one of the threads
will get to kill the other.

Whatever work the target thread was doing when the exception was
raised is not lost: the computation is suspended until required by
another thread.

If the target thread is currently making a foreign call, then the
exception will not be raised (and hence throwTo will not return)
until the call has completed. This is the case regardless of whether
the call is inside a mask or not.

Important note: the behaviour of throwTo differs from that described in
the paper "Asynchronous exceptions in Haskell"
(http://research.microsoft.com/~simonpj/Papers/asynch-exns.htm).
In the paper, throwTo is non-blocking; but the library implementation adopts
a more synchronous design in which throwTo does not return until the exception
is received by the target thread. The trade-off is discussed in Section 9 of the paper.
Like any blocking operation, throwTo is therefore interruptible (see Section 5.3 of
the paper). Unlike other interruptible operations, however, throwTo
is always interruptible, even if it does not actually block.

There is no guarantee that the exception will be delivered promptly,
although the runtime will endeavour to ensure that arbitrary
delays don't occur. In GHC, an exception can only be raised when a
thread reaches a safe point, where a safe point is where memory
allocation occurs. Some loops do not perform any memory allocation
inside the loop and therefore cannot be interrupted by a throwTo.

Blocked throwTo is fair: if multiple threads are trying to throw an
exception to the same target thread, they will succeed in FIFO order.

Scheduling

Scheduling may be either pre-emptive or co-operative,
depending on the implementation of Concurrent Haskell (see below
for information related to specific compilers). In a co-operative
system, context switches only occur when you use one of the
primitives defined in this module. This means that programs such
as:

will print either aaaaaaaaaaaaaa... or bbbbbbbbbbbb...,
instead of some random interleaving of as and bs. In
practice, cooperative multitasking is sufficient for writing
simple graphical user interfaces.

The yield action allows (forces, in a co-operative multitasking
implementation) a context-switch to any other currently runnable
threads (if any), and is occasionally useful when implementing
concurrency abstractions.

Blocking

Different Haskell implementations have different characteristics with
regard to which operations block all threads.

Using GHC without the -threaded option, all foreign calls will block
all other Haskell threads in the system, although I/O operations will
not. With the -threaded option, only foreign calls with the unsafe
attribute will block all other threads.

Using Hugs, all I/O operations and foreign calls will block all other
Haskell threads.

Merging of streams

The mergeIO and nmergeIO functions fork one thread for each
input list that concurrently evaluates that list; the results are
merged into a single output list.

Note: Hugs does not provide these functions, since they require
preemptive multitasking.

Bound Threads

Support for multiple operating system threads and bound threads as described
below is currently only available in the GHC runtime system if you use the
-threaded option when linking.

Other Haskell systems do not currently support multiple operating system threads.

A bound thread is a haskell thread that is bound to an operating system
thread. While the bound thread is still scheduled by the Haskell run-time
system, the operating system thread takes care of all the foreign calls made
by the bound thread.

To a foreign library, the bound thread will look exactly like an ordinary
operating system thread created using OS functions like pthread_create
or CreateThread.

Bound threads can be created using the forkOS function below. All foreign
exported functions are run in a bound thread (bound to the OS thread that
called the function). Also, the main action of every Haskell program is
run in a bound thread.

Why do we need this? Because if a foreign library is called from a thread
created using forkIO, it won't have access to any thread-local state -
state variables that have specific values for each OS thread
(see POSIX's pthread_key_create or Win32's TlsAlloc). Therefore, some
libraries (OpenGL, for example) will not work from a thread created using
forkIO. They work fine in threads created using forkOS or when called
from main or from a foreign export.

In terms of performance, forkOS (aka bound) threads are much more
expensive than forkIO (aka unbound) threads, because a forkOS
thread is tied to a particular OS thread, whereas a forkIO thread
can be run by any OS thread. Context-switching between a forkOS
thread and a forkIO thread is many times more expensive than between
two forkIO threads.

Note in particular that the main program thread (the thread running
Main.main) is always a bound thread, so for good concurrency
performance you should ensure that the main thread is not doing
repeated communication with other threads in the system. Typically
this means forking subthreads to do the work using forkIO, and
waiting for the results in the main thread.

Like forkIO, this sparks off a new thread to run the IO
computation passed as the first argument, and returns the ThreadId
of the newly created thread.

However, forkOS creates a bound thread, which is necessary if you
need to call foreign (non-Haskell) libraries that make use of
thread-local state, such as OpenGL (see Control.Concurrent).

Using forkOS instead of forkIO makes no difference at all to the
scheduling behaviour of the Haskell runtime system. It is a common
misconception that you need to use forkOS instead of forkIO to
avoid blocking all the Haskell threads when making a foreign call;
this isn't the case. To allow foreign calls to be made without
blocking all the Haskell threads (with GHC), it is only necessary to
use the -threaded option when linking your program, and to make sure
the foreign import is not marked unsafe.

Run the IO computation passed as the first argument. If the calling thread
is bound, an unbound thread is created temporarily using forkIO.
runInBoundThread doesn't finish until the IO computation finishes.

Use this function only in the rare case that you have actually observed a
performance loss due to the use of bound threads. A program that
doesn't need it's main thread to be bound and makes heavy use of concurrency
(e.g. a web server), might want to wrap it's main action in
runInUnboundThread.

GHC's implementation of concurrency

This section describes features specific to GHC's
implementation of Concurrent Haskell.

Haskell threads and Operating System threads

In GHC, threads created by forkIO are lightweight threads, and
are managed entirely by the GHC runtime. Typically Haskell
threads are an order of magnitude or two more efficient (in
terms of both time and space) than operating system threads.

The downside of having lightweight threads is that only one can
run at a time, so if one thread blocks in a foreign call, for
example, the other threads cannot continue. The GHC runtime
works around this by making use of full OS threads where
necessary. When the program is built with the -threaded
option (to link against the multithreaded version of the
runtime), a thread making a safe foreign call will not block
the other threads in the system; another OS thread will take
over running Haskell threads until the original call returns.
The runtime maintains a pool of these worker threads so that
multiple Haskell threads can be involved in external calls
simultaneously.

The System.IO library manages multiplexing in its own way. On
Windows systems it uses safe foreign calls to ensure that
threads doing I/O operations don't block the whole runtime,
whereas on Unix systems all the currently blocked I/O requests
are managed by a single thread (the IO manager thread) using
select.

The runtime will run a Haskell thread using any of the available
worker OS threads. If you need control over which particular OS
thread is used to run a given Haskell thread, perhaps because
you need to call a foreign library that uses OS-thread-local
state, then you need bound threads (see Control.Concurrent).

If you don't use the -threaded option, then the runtime does
not make use of multiple OS threads. Foreign calls will block
all other running Haskell threads until the call returns. The
System.IO library still does multiplexing, so there can be multiple
threads doing I/O, and this is handled internally by the runtime using
select.

Terminating the program

In a standalone GHC program, only the main thread is
required to terminate in order for the process to terminate.
Thus all other forked threads will simply terminate at the same
time as the main thread (the terminology for this kind of
behaviour is "daemonic threads").

If you want the program to wait for child threads to
finish before exiting, you need to program this yourself. A
simple mechanism is to have each child thread write to an
MVar when it completes, and have the main
thread wait on all the MVars before
exiting:

The main thread principle also applies to calls to Haskell from
outside, using foreign export. When the foreign exported
function is invoked, it starts a new main thread, and it returns
when this main thread terminates. If the call causes new
threads to be forked, they may remain in the system after the
foreign exported function has returned.

Pre-emption

GHC implements pre-emptive multitasking: the execution of
threads are interleaved in a random fashion. More specifically,
a thread may be pre-empted whenever it allocates some memory,
which unfortunately means that tight loops which do no
allocation tend to lock out other threads (this only seems to
happen with pathological benchmark-style code, however).

The rescheduling timer runs on a 20ms granularity by
default, but this may be altered using the
-i<n> RTS option. After a rescheduling
"tick" the running thread is pre-empted as soon as
possible.

One final note: the
aaaabbbb example may not
work too well on GHC (see Scheduling, above), due
to the locking on a System.IO.Handle. Only one thread
may hold the lock on a System.IO.Handle at any one
time, so if a reschedule happens while a thread is holding the
lock, the other thread won't be able to run. The upshot is that
the switch from aaaa to
bbbbb happens infrequently. It can be
improved by lowering the reschedule tick period. We also have a
patch that causes a reschedule whenever a thread waiting on a
lock is woken up, but haven't found it to be useful for anything
other than this example :-)