POSIX Parallel Programming, Part 3: Threads

In the conclusion of his parallel programming series, David Chisnall looks at using threads. Threads are not a traditional way of achieving parallelism on UNIX platforms, but the newer POSIX standards come with a comprehensive set of functions to support them.

The examples for this section are written to the latest published ISO standard for the C programming language, commonly known as C99. All POSIX-compliant systems, including any that ship with GCC, should include an executable called c99, which can compile them with the command 'c99 {sourcefile}'. A working knowledge of C99 is a prerequisite for this tutorial, and some experience with POSIX systems is useful.

So far in this series we’ve looked at spawning new processes and
communicating among them. Processes are the traditional mechanism for
parallelism on UNIX platforms. Recently, however, the POSIX threading APIs have
gained widespread support. Unlike processes created using fork(2),
threads spawned with pthread_create(3) exist in the same address space
as their parent.
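
The difference in address-space sharing can be seen in a small experiment. The following is a minimal sketch, not code from the series; the names counter, bump, counter_after_thread, and counter_after_fork are hypothetical. A thread increments a global variable that its creator can see, while a forked child increments only its own copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One global; every thread in the process sees the same object. */
static int counter = 0;

/* Thread body: increments the shared global. */
static void *bump(void *unused)
{
    (void)unused;
    counter++;
    return NULL;
}

/* Returns the caller's view of counter after a thread has incremented it,
 * or -1 if the thread could not be created. */
int counter_after_thread(void)
{
    pthread_t tid;

    counter = 0;
    if (pthread_create(&tid, NULL, bump, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);   /* wait until the increment has happened */
    return counter;
}

/* Returns the caller's view of counter after a forked child has
 * incremented its own private copy, or -1 if fork() failed. */
int counter_after_fork(void)
{
    pid_t pid;

    counter = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter++;             /* changes the child's copy only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return counter;
}
```

Calling counter_after_thread() should yield 1, because the thread wrote to the caller's memory. Calling counter_after_fork() should yield 0, because the child's write never reaches the parent.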

Older versions of Linux relied on a userspace implementation provided by
glibc. This technique put all of a process’s threads in the same
kernel-scheduled entity and used timer signals to switch between them. More
recent versions use the clone(2) system call, which is similar to
fork(2) but allows the child process to share the parent’s
address space. Other UNIX-like systems have similar mechanisms, although some
use an N:M mapping of kernel-scheduled entities to threads. This lets threads
that spend most of their time waiting for data share a single kernel-scheduled
entity, while CPU-limited threads are scheduled independently. On paper, this
strategy has a number of advantages, although in practice it is harder to get
right.

NOTE

The examples in this article can be downloaded in the accompanying
source.zip.

Workers of the World

The primary reason for creating a thread is to get some work done in the
background. Since the main way of getting work done in a C program is to call a
function, the pthread_create(3) call takes a function as an argument
and runs that function in a separate thread.

The function passed to the pthread_create(3) call takes a pointer as
an argument, and returns a pointer. This pointer can later be retrieved using
the pthread_join(3) function. This setup allows you to implement
futures quite easily; your parent thread calls a function in a new thread, does
some other work, and then waits for the worker thread to finish.
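
The future pattern described above can be sketched as follows. This is an illustrative example under stated assumptions, not the article's own listing; sum_to and sum_in_background are hypothetical names. The worker returns a heap-allocated result through its return pointer, and the caller collects it with pthread_join(3).

```c
#include <pthread.h>
#include <stdlib.h>

/* Worker: sums 1..n, where n arrives through the void* argument.
 * The result lives on the heap so it outlasts the worker's stack. */
static void *sum_to(void *arg)
{
    long n = *(long *)arg;
    long *total = malloc(sizeof *total);

    if (total == NULL)
        return NULL;
    *total = 0;
    for (long i = 1; i <= n; i++)
        *total += i;
    return total;              /* delivered to pthread_join() */
}

/* Starts the worker, then later collects its result.
 * Returns the sum, or -1 if anything failed. */
long sum_in_background(long n)
{
    pthread_t worker;
    void *result = NULL;

    /* &n stays valid because this function does not return before
     * the join below. */
    if (pthread_create(&worker, NULL, sum_to, &n) != 0)
        return -1;

    /* ... the caller is free to do unrelated work here ... */

    if (pthread_join(worker, &result) != 0 || result == NULL)
        return -1;

    long total = *(long *)result;
    free(result);
    return total;
}
```

For example, sum_in_background(100) returns 5050. Allocating the result with malloc is one design choice among several; the worker could equally write into a structure supplied by the caller.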

Listing 1 contains a simple program for determining whether a number is
prime.

When you run this program, it spawns a worker thread that creates a Sieve of Eratosthenes, an
array indicating whether each number in a range is prime. This process happens in
the background while the program asks the user to enter a number. Once the user
has entered the number, the main thread waits for the worker thread to finish
and then uses the result to see whether the entered number is prime.
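
The listing itself is not reproduced in this text, so the following is only a rough sketch of the kind of program described, not the author's code. The names build_sieve, sieve_start, and sieve_is_prime are hypothetical. Unlike the make_sieve() function discussed in this article, the worker here is declared with the void * signature directly, so it needs no cast.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Worker: builds a sieve covering 0..*limit.
 * composite[i] is true when i is NOT prime. Returns NULL on failure. */
static void *build_sieve(void *arg)
{
    int limit = *(int *)arg;
    bool *composite;

    if (limit < 1)
        return NULL;
    composite = calloc((size_t)limit + 1, sizeof *composite);
    if (composite == NULL)
        return NULL;
    composite[0] = composite[1] = true;
    for (int i = 2; i <= limit / i; i++) {      /* i * i <= limit */
        if (composite[i])
            continue;
        for (int j = i * i; j <= limit; j += i)
            composite[j] = true;                /* multiples of a prime */
    }
    return composite;
}

/* Starts building the sieve in the background. *limit must stay valid
 * until sieve_is_prime() has been called. Returns 0 on success. */
int sieve_start(pthread_t *worker, int *limit)
{
    return pthread_create(worker, NULL, build_sieve, limit);
}

/* Waits for the worker, then looks n up in the finished sieve.
 * Returns 1 if n is prime, 0 if not, -1 on error or out-of-range n. */
int sieve_is_prime(pthread_t worker, int limit, int n)
{
    void *result = NULL;
    int answer = -1;

    if (pthread_join(worker, &result) != 0 || result == NULL)
        return -1;
    if (n >= 0 && n <= limit)
        answer = ((bool *)result)[n] ? 0 : 1;
    free(result);
    return answer;
}
```

A main() function would call sieve_start(), prompt the user for a number while the sieve is being built, and then pass that number to sieve_is_prime(). Each started worker can be joined only once, so every lookup needs its own sieve_start() call in this sketch.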

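Note that the signature of the make_sieve() function doesn’t
match that expected by the pthread_create(3) function. Because both
take a pointer and return a pointer, however, we can cast the function to the
expected type and the compiler accepts it. This works on common platforms, but
ISO C leaves calling a function through an incompatible function-pointer type
undefined, so a worker declared to take and return void* is the strictly
portable choice.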

The first argument to pthread_create(3) is a pointer to a pthread_t that’s set to
an identifier for the new thread. Later thread operations use this
identifier to refer to the created thread.

The second argument specifies some attributes for the thread. This can be an
attribute set created with the pthread_attr_*(3) family of functions,
or NULL for the default options.
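
As an illustration of the second argument, the sketch below builds an attribute set that requests a specific stack size. It is a minimal example rather than anything from the article's listing; noop and run_with_stack are hypothetical names, and the stack size used in any call is an arbitrary choice.

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body: hands its argument straight back. */
static void *noop(void *arg)
{
    return arg;
}

/* Runs noop() on a thread created with the requested stack size.
 * Returns 0 on success, or the error number reported by the first
 * pthread call that failed. */
int run_with_stack(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    err = pthread_attr_init(&attr);             /* start from the defaults */
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(&tid, &attr, noop, NULL);

    /* The attribute set is only read during pthread_create(), so it
     * can be destroyed once the thread exists (or creation failed). */
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;

    return pthread_join(tid, NULL);
}
```

A call such as run_with_stack(1024 * 1024) asks for a 1 MiB stack. Requests smaller than PTHREAD_STACK_MIN are rejected by pthread_attr_setstacksize(3), so the function returns a nonzero error code in that case.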

The third and fourth arguments are the function to start and the argument to
pass to it, respectively. Notice that we put in an explicit cast here so that
our function can receive an int* rather than a void*.
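
An alternative to casting the function pointer is a small adapter function with exactly the signature pthread_create(3) expects. The sketch below shows that approach; it is not the article's code, and double_it, double_it_thread, and double_in_thread are hypothetical names chosen for illustration.

```c
#include <pthread.h>

/* The typed worker: takes an int* and returns an int*.
 * Doubles the pointed-to value in place. */
static int *double_it(int *value)
{
    *value *= 2;
    return value;
}

/* Adapter with the void *(*)(void *) signature. In C, void* converts
 * to and from int* implicitly, so no casts are needed here. */
static void *double_it_thread(void *arg)
{
    return double_it(arg);
}

/* Doubles value on a separate thread and returns the result,
 * or -1 if the thread could not be created or joined. */
int double_in_thread(int value)
{
    pthread_t tid;
    void *result = NULL;

    /* &value remains valid until this function returns, which is
     * after the join below. */
    if (pthread_create(&tid, NULL, double_it_thread, &value) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0 || result == NULL)
        return -1;
    return *(int *)result;
}
```

Here double_in_thread(21) returns 42. The adapter costs one extra function call, but the typed worker can still be called directly from ordinary code with full type checking.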