Having written a small amount of multi-threaded code, I've found that one of the hardest things to deal with is detecting where race conditions could occur. This is especially true of code you write believing it is thread-safe, only to find out later that it isn't.

Are there any general tips for analyzing and writing code to determine where race conditions could occur?

April 20th, 2010, 01:35 AM

dvyukov

Re: Detecting Race Conditions

The situation is quite similar to memory leaks. Just do not do chaotic ad-hoc programming. First define the system, define its components and their responsibilities, then implement them.
For example, the data-race/deadlock-bulletproof pattern - encapsulated synchronization:
class foo_t
{
private:
    mutex guard;
public:
    void bar()
    {
        scoped_lock lock (guard);
        ...
    }
    void baz()
    {
        scoped_lock lock (guard);
        ...
    }
};
It's trivial to implement correctly, and as long as the methods do not call each other or outside code while holding the lock, there is no way one can get a data race or deadlock with it.
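To make the pattern concrete, here is a minimal sketch using the standard library's std::mutex and std::lock_guard. The counter_t class and the run_counter_demo helper are hypothetical names made up for this illustration; they are not from the post above.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example class: the mutex is private, and every public
// method holds it for the whole call, so callers never lock anything.
class counter_t
{
private:
    mutable std::mutex guard;
    long value = 0;
public:
    void increment()
    {
        std::lock_guard<std::mutex> lock(guard);
        ++value;
    }
    long get() const
    {
        std::lock_guard<std::mutex> lock(guard);
        return value;
    }
};

// Hypothetical helper: starts `threads` workers that each call
// increment() `per_thread` times, then returns the final count.
long run_counter_demo(int threads, int per_thread)
{
    counter_t counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
    {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                counter.increment();
        });
    }
    for (auto& w : workers)
        w.join();
    return counter.get();
}
```

With this sketch, run_counter_demo(4, 10000) returns 40000 on every run, because no increment can interleave with another. Note that neither method calls the other or any outside code while the lock is held.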
If it's not applicable, then there are other patterns - hierarchical locking, ordered locking, etc.
Just do not do "ok, here I need to lock this, and then lock that, and now probably I can unlock this, and now I need access to that object as well, so lock it too, ...".
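As an illustration of the ordered locking mentioned above, here is one possible sketch: when two objects must be locked together, always take the locks in the same global order (here, by address). The account_t type and the transfer and run_transfer_demo functions are hypothetical, and ordering by address is only one of several ways to define the order.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical type for illustration: each account owns its own mutex.
struct account_t
{
    std::mutex guard;
    long balance = 0;
};

// Moves `amount` between two accounts. Both locks are always taken in
// address order, so two opposite transfers cannot deadlock each other.
void transfer(account_t& from, account_t& to, long amount)
{
    if (&from == &to)
        return; // nothing to do, and locking the same mutex twice is an error

    // std::less gives a well-defined total order over pointers.
    const bool from_first = std::less<account_t*>()(&from, &to);
    account_t& first = from_first ? from : to;
    account_t& second = from_first ? to : from;

    std::lock_guard<std::mutex> lock_first(first.guard);
    std::lock_guard<std::mutex> lock_second(second.guard);
    from.balance -= amount;
    to.balance += amount;
}

// Hypothetical helper: two threads transfer in opposite directions.
// Returns the combined balance, which the transfers must preserve.
long run_transfer_demo(int rounds)
{
    account_t a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

If transfer simply locked `from` and then `to`, the two threads above could each hold one lock and wait forever for the other. With the fixed order, run_transfer_demo always finishes and returns 2000.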

If you think the iterations of a loop are independent, but you're not sure, run the loop in reverse (serially) and check whether you get the same answer. For example, if your for-loop goes from 0 to 100, run it from 100 down to 0. It's not a guaranteed method, but it detects unsafe loops in many cases. Sometimes the final result is the same but the intermediate results are not: when adding up a list of numbers, the total is the same in either order, but the partial sums differ. If your code depends on those intermediate values, look at the partial results along the way and work out how they are affected by running the loop in reverse and, ultimately, under some concurrent schedule.
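The reverse-order check can be sketched like this. All names here (square_each, running_sum, same_in_both_orders) are hypothetical and chosen for this illustration; the first loop has independent iterations, the second has a dependency between iterations.

```cpp
#include <cstddef>
#include <vector>

// Independent iterations: each output element depends only on its own input.
std::vector<int> square_each(const std::vector<int>& in, bool reverse)
{
    std::vector<int> out(in.size());
    const std::size_t n = in.size();
    for (std::size_t k = 0; k < n; ++k)
    {
        const std::size_t i = reverse ? n - 1 - k : k;
        out[i] = in[i] * in[i];
    }
    return out;
}

// Dependent iterations: iteration i reads the value iteration i-1 wrote.
std::vector<int> running_sum(const std::vector<int>& in, bool reverse)
{
    std::vector<int> out(in);
    const std::size_t n = out.size();
    for (std::size_t k = 0; k < n; ++k)
    {
        const std::size_t i = reverse ? n - 1 - k : k;
        if (i > 0)
            out[i] += out[i - 1];
    }
    return out;
}

// Runs the loop forward and in reverse (both serially) and compares results.
bool same_in_both_orders(std::vector<int> (*loop)(const std::vector<int>&, bool),
                         const std::vector<int>& in)
{
    return loop(in, false) == loop(in, true);
}
```

For the input {1, 2, 3, 4}, square_each gives the same result in both directions, while running_sum gives {1, 3, 6, 10} forward and {1, 3, 5, 7} in reverse, which flags it as unsafe to parallelize as written. Passing the check does not prove a loop is safe.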