If we create 10 threads and run them, counter will be 1.000.000, won’t it? No, it won’t.

Story Time

The other day I was working on a program for parsing log files, analyzing them and storing the results in a database. Because processing my test data took somewhere between 10 and 15 minutes, I decided to split the task into two threads: one thread would parse the log files and another thread would analyze them and put them into the database.

After adding threading, I did some testing and performance actually improved – a little. Only occasionally the program wouldn’t terminate after finishing. When chatting with my boss about it, he said something along the lines of

A programmer had a problem. He thought “I know, I’ll solve it with threads!”. Now he has two problems.2

I didn’t agree with him at all. I mean, threads are easy and the bug with non-termination was just a stupid deadlock that was easy to find. But the more I think about it, the more I come to agree with him, that:

Threads are hard

To understand why threads are hard, we’ll first have a look at multitasking. How does an operating system handle running multiple programs (tasks) in parallel? There are two approaches:

Cooperative scheduling: This approach lets every program decide for itself how much CPU time it needs. For example, MS-DOS provided a system call named TSR (terminate and stay resident) that allowed programs to pass control back to the OS.
But as the name implies, cooperative scheduling requires all programs to cooperate. Nothing stops a program from consuming all resources without ever allowing other programs to run. This leads us to:

Preemptive scheduling: According to this idea, the OS has the power to interrupt any process at any time. A scheduler allows a program to run for a fixed time and then suspends it to run other programs. This is how modern operating systems implement multitasking.
Unlike its cooperative counterpart, preemptive scheduling needs hardware support in the form of a hardware timer that will interrupt the currently running process and pass control to the OS by calling its interrupt handler. That way, the OS can enforce the time limit it calculated for the currently running process.

What does this have to do with threads? Basically, multithreading allows a program to run multiple instances in the same memory context. The OS keeps track of all threads and schedules them appropriately along with all other processes.

And this is, where the troubles begin. The OS doesn’t care about the current state of the program when it takes control. When dissecting the introductory Java example, we notice that incrementing counter consists of three steps:

Read the current value of counter from main memory into a processor register.

Increment the current value.

Write the incremented value into counter into main memory.

Say we have two threads that both increase counter. Thread 1 has just read the value, when the OS passes control to Thread 2 which also reads the value. It increments the counter and writes it back. Now it’s the turn of Thread 1, which also increments the value and writes it. But in total, the counter has not been incremented two times, but one time. We’ve encountered a race condition.

Thread 1

Thread 2

Main memory

(1) Read, counter = 0

Suspended

counter = 0

Suspended, counter = 0

(1) Read, counter = 0

counter = 0

Suspended, counter = 0

(2) Increment, counter = 1

counter = 0

Suspended, counter = 0

(3) Write, counter = 1

counter = 1

(2) Increment, counter = 1

Suspended, counter = 1

counter = 1

(3) Write, counter = 1

Suspended, counter = 1

counter = 1

To fix our buggy example we have to use locks making the whole operation effectively atomic:3

Using a lock solves this particular problem, but if used wrongly, can introduce a whole other class of bugs: Deadlocks. It’s the situation when several threads wait for each other, blocking each other forever. All in all, using threads is not only hard to get right, but can lead to very subtle bugs. Quoting a paper on this topic:

[Threads] discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly non-deterministic, and the job of the programmer becomes one of pruning that nondeterminism.

Threads are useful

To explain why threads are useful, we’ll use an analogy: Let’s say you are a computer. Your brain is the CPU and interacting with the world around you is I/O. Actually, you’re a single-core computer as your brain can only concentrate on one task at once.

Now there are tasks where the limiting factor is the capacity and speed of your brain, for example math homework. Improving your brain power will lead to better and faster results. We’ll call these tasks CPU-bound. For other tasks, the limiting factor is the world around you, like cooking or waiting for the bus. Here, improving brain power doesn’t help because the performance depends on the world around you. We’ll call these tasks I/O-bound.

Here multithreading comes in. Say you want to cook dinner. Now, you can process the recipe in strictly sequential order. That’s what a single-threaded program would do. This, of course, involves a lot of waiting: Waiting for the water to boil, waiting for the food to cook through. But instead of waiting, it would be better to do other tasks like preparing the ingredients or setting the table. In other words, you make use of multithreading.

It’s worth mentioning that on singe-core computers multithreading improves I/O-bound tasks, but not of CPU-bound tasks. If you think about it, it’s quite obvious: When working on your math homework, there is no waiting you could use for other tasks. On the contrary, if you randomly jump between tasks, your productivity will suffer from the context switches.

The case of math homework changes, if you ask a friend to solve some of the tasks for you. This way, solving your homework becomes much faster. Looking at computers, this corresponds to multi-core computers where the performance of CPU-bound tasks improves with multithreading because multiple cores can solve different subtasks simultaneously.

What’s the point? The takeaway from our analogy is that on single-core computers multithreading can improve I/O-bound applications, while on multi-core computers both I/O-bound and CPU-bound applications can profit.

Bottom line

Seems like threads turn out to be hard, but pretty useful too. What can we say, whether to use threads or not? Here’s what I consider important: