Malicious Race Conditions and Data Races

This post is about malicious race conditions and data races. Malicious race conditions are race conditions that cause the breaking of invariants, blocking issues of threads, or lifetime issues of variables.

First, let me remind you what a race condition is.

Race condition: A race condition is a situation in which the result of an operation depends on the interleaving of certain individual operations.

That's fine as a starting point. A race condition can break the invariant of a program.

Breaking of invariants

In the last post, Race Conditions and Data Races, I used the transfer of money between two accounts to show a data race. There was a benign race condition involved. To be honest, there was also a malicious race condition.

The malicious race condition breaks an invariant of the program. The invariant is that the sum of all balances should always stay the same, which in our case is 200 because each account starts with 100 (1). For simplicity, the unit is the euro. I want neither to create money by transferring it nor to destroy it.

At the beginning, the sum of the accounts is 200 euro. Line (4) displays the sum by using the function printSum (3). Line (5) makes the invariant visible. Because there is a short sleep of 1ns in line (2), the intermediate sum is 182 euro. At the end, all is fine. Each account has the right balance (6) and the sum is 200 euro (8).
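As a rough sketch of what such a program can look like (not the original listing): the names Account, transferMoney, sumOf, and runTransfers, as well as the amounts 5 and 13, are assumptions. The amounts are chosen only because 200 - 5 - 13 gives the 182 euro mentioned above. The balances are atomic, so there is no data race, but the debit and the credit are still two separate steps.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <iostream>
#include <thread>

struct Account {
    std::atomic<int> balance{100};  // atomic, so the accesses are not a data race
};

// Debit and credit are two separate steps; in between, money is "missing".
void transferMoney(int amount, Account& from, Account& to) {
    from.balance -= amount;
    std::this_thread::sleep_for(std::chrono::nanoseconds(1));
    to.balance += amount;
}

int sumOf(const Account& a1, const Account& a2) {
    return a1.balance.load() + a2.balance.load();
}

// Hypothetical driver: returns the sum after both transfers are done.
int runTransfers() {
    Account account1;
    Account account2;

    std::thread thr1(transferMoney, 5, std::ref(account1), std::ref(account2));
    std::thread thr2(transferMoney, 13, std::ref(account2), std::ref(account1));

    // Depending on the interleaving, this may print a value other than 200.
    std::cout << "Intermediate sum: " << sumOf(account1, account2) << '\n';

    thr1.join();
    thr2.join();

    const int finalSum = sumOf(account1, account2);
    assert(finalSum == 200);  // the invariant holds again once both transfers are complete
    return finalSum;
}
```

Calling runTransfers() from main returns 200 on every run; only the intermediate sum depends on the interleaving of the threads.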

Here is the output of the program.

The malicious story goes on. Let's create a deadlock by using condition variables without a predicate.

Blocking issues with race conditions

Just to make my point clear: you have to use a condition variable in combination with a predicate. For the details, read my post Condition Variables. If you don't, your program may become the victim of a spurious wakeup or a lost wakeup.

If you use a condition variable without a predicate, it may happen that the notifying thread sends its notification before the waiting thread is in the waiting state. Therefore, the waiting thread waits forever. That phenomenon is called a lost wakeup.

The first invocation of the program works fine. The second invocation blocks because the notify call (1) happens before the thread t2 (2) is in the waiting state (3).
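A minimal sketch of the remedy, assuming a shared flag named dataReady (all names here are illustrative and not taken from the program above): with a predicate, the waiting thread checks the flag before it blocks, so it returns even when the notification was sent first.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex mutex_;
std::condition_variable condVar;
bool dataReady = false;  // the predicate state, protected by mutex_

void waitingForWork() {
    std::unique_lock<std::mutex> lck(mutex_);
    // A bare condVar.wait(lck) would miss a notification that was already sent.
    condVar.wait(lck, [] { return dataReady; });
    assert(dataReady);  // the predicate holds when wait returns
}

void setDataReady() {
    {
        std::lock_guard<std::mutex> lck(mutex_);
        dataReady = true;
    }
    condVar.notify_one();
}

// Hypothetical driver: forces the notification to happen before the wait.
bool notifyBeforeWait() {
    std::thread t1(setDataReady);
    t1.join();                       // the notification is definitely sent first
    std::thread t2(waitingForWork);  // returns anyway because dataReady is true
    t2.join();
    return dataReady;
}
```

Calling notifyBeforeWait() returns true instead of blocking. With the bare wait, the same ordering would typically leave t2 waiting forever.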

Of course, deadlocks and livelocks are other effects of race conditions. A deadlock depends in general on the interleaving of the threads and may or may not happen. A livelock is similar to a deadlock. While a deadlock blocks, a livelock seems to make progress. The emphasis lies on seems. Think about a transaction in a transactional memory use case. Each time the transaction should be committed, a conflict happens. Therefore, a rollback takes place. Here is my post about Transactional Memory.

Showing lifetime issues of variables is not so challenging.

Lifetime issues of variables

The recipe for a lifetime issue is quite simple. Let the created thread run in the background and you are half done. That means the creator thread will not wait until its child is done. In this case, you have to be extremely careful that the child is not using something belonging to the creator.

This is too simple. The thread t is using std::cout and the variable mess. Both belong to the main thread. The effect is that we don't see the output of the child thread in the second run. Only "Begin:" (2) and "End:" (3) are displayed.
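To make the pattern concrete, here is a sketch under my own assumptions (the function runChild and the variable seen are hypothetical). The problematic detach call is only shown as a comment; the sketch joins the child so that it stays well-defined.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <thread>

// Hypothetical helper: the child thread uses variables that belong to the creator.
std::string runChild() {
    std::string mess{"Child thread"};  // lives in the creator's stack frame
    std::string seen;

    std::thread t([&] {
        seen = mess;  // the child reads and writes the creator's variables
        std::cout << mess << '\n';
    });

    // t.detach();  // lifetime issue: runChild could return while the child
    //              // still uses mess and seen
    t.join();       // the creator waits, so both variables outlive the child

    assert(seen == mess);
    return seen;
}
```

With join, runChild() returns "Child thread". With detach instead, nothing guarantees that the child runs before the variables it refers to are gone.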

I want to emphasise this very explicitly: up to this point, all the programs in this post are without a data race. As you know, it was my idea to write about race conditions and data races. They are related but different concepts.

I can even create a data race without a race condition.

A data race without a race condition

But first, let me remind you what a data race is.

Data race: A data race is a situation in which at least two threads access a shared variable at the same time and at least one thread tries to modify the variable.

100 threads each add 50 euro (3) to the same account (1). They use the function addMoney. The key observation is that the writing to the account is done without synchronisation. Therefore, we have a data race and no valid result. That is undefined behaviour, and the final balance (4) varies between 5000 and 5100 euro.
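For contrast, here is a sketch of a synchronised variant. It assumes that the account starts with 100 euro and that the balance is a std::atomic<int>; both are my assumptions, and the driver addFromManyThreads is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

struct Account {
    std::atomic<int> balance{100};  // assumed starting balance of 100 euro
};

// The read-modify-write is one atomic operation, so no update gets lost.
void addMoney(Account& to, int amount) {
    to.balance += amount;
}

// Hypothetical driver: 100 threads each add 50 euro to the same account.
int addFromManyThreads() {
    Account account;

    std::vector<std::thread> threads;
    for (int i = 0; i < 100; ++i) {
        threads.emplace_back(addMoney, std::ref(account), 50);
    }
    for (auto& t : threads) {
        t.join();
    }

    const int finalBalance = account.balance.load();
    assert(finalBalance == 100 + 100 * 50);
    return finalBalance;
}
```

In this variant, addFromManyThreads() returns 5100 on every run. There is no data race because the balance is atomic, and the order of the additions does not change the sum.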

What's next?

At concurrency conferences, I often hear discussions about the terms non-blocking, lock-free, and wait-free. So let me write about these terms in my next post.