Testing Race Conditions In Python

Race conditions are a danger whenever you have more than one process or thread accessing the same data. This post explores how to test race conditions after identifying them.

Incrmnt

You’re working on a hot new startup, Incrmnt, which does one thing and does it well.

You display a global counter and a plus sign. Users can click the plus sign and the counter increases by one. It’s so simple! It’s so addictive! It’s the next big thing for sure!

Investors are tripping over themselves to get on board but you have a problem.

The race condition

During your private beta, Abraham and Belinda were so super excited that they each clicked the plus button 100 times. Your server logs show 200 requests, but the counter only shows 173. Something doesn’t add up.

Trying to push the headline “Incrmnt turns out to be Excrmnt” to the back of your mind, you inspect the code (all code used for this post can be found on GitHub).

    # incrmnt.py
    import db


    def increment():
        count = db.get_count()

        new_count = count + 1
        db.set_count(new_count)

        return new_count

Your web server uses multiple processes to increase throughput, so this function can be running in two of them at the same time (called Thread 1 and Thread 2 below). If you’re unlucky with the timing, this occurs:

    # Thread 1 and Thread 2 are executing in different processes at the same time
    # For purposes of illustration, they're placed side by side here
    # They're vertically spaced to show what code is executing at each point in time

    # Thread 1                              # Thread 2
    # set_count called with 1
    new_count = count + 1
    db.set_count(new_count)
                                            # set_count called with 1 again
                                            new_count = count + 1
                                            db.set_count(new_count)
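The same interleaving can be replayed deterministically in a single thread. The sketch below uses a hypothetical in-memory `FakeDb` class as a stand-in for the post's `db` module (an assumption, since the real module isn't shown here) and performs both reads before either write:

```python
class FakeDb:
    """Hypothetical in-memory stand-in for the post's db module."""

    def __init__(self):
        self._count = 0

    def get_count(self):
        return self._count

    def set_count(self, value):
        self._count = value


db = FakeDb()

# Both "threads" read the counter before either one writes it back.
thread_1_count = db.get_count()  # Thread 1 reads 0
thread_2_count = db.get_count()  # Thread 2 reads 0 as well

db.set_count(thread_1_count + 1)  # Thread 1 writes 1
db.set_count(thread_2_count + 1)  # Thread 2 writes 1 again

final_count = db.get_count()
print(final_count)  # 1: two increments ran, but only one was recorded
```

Scale that lost update up to a couple of hundred clicks and a counter stuck at 173 stops being surprising.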

before_after is a library that provides utilities to help reproduce this situation. It can insert arbitrary code before or after a function.

before_after relies on the mock library to patch functions. If you’re not familiar with mock then I suggest reading the excellent docs. Of particular importance is Where To Patch.
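To make the idea concrete, here's a rough sketch of how an "after" hook could be built on `unittest.mock` from the standard library. It is not before_after's actual implementation; `call_after` and the `fake_db` namespace are hypothetical names used only for illustration.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from unittest import mock


@contextmanager
def call_after(owner, name, after_fn):
    """Patch owner.<name> so after_fn runs once, right after the first call."""
    original = getattr(owner, name)
    fired = []

    def wrapper(*args, **kwargs):
        result = original(*args, **kwargs)
        if not fired:
            # Mark the hook as fired before running it, so nested calls to
            # the patched function don't trigger it again.
            fired.append(True)
            after_fn()
        return result

    # patch.object swaps the attribute in and restores it on exit.
    with mock.patch.object(owner, name, wrapper):
        yield


events = []


def get_count():
    events.append("get_count")
    return 0


# Hypothetical stand-in for a module that exposes get_count.
fake_db = SimpleNamespace(get_count=get_count)

with call_after(fake_db, "get_count", lambda: events.append("hook")):
    fake_db.get_count()
    fake_db.get_count()

print(events)  # ['get_count', 'hook', 'get_count']
```

The hook fires only after the first call, and the original function is back in place once the `with` block exits.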

We want to wait until just after Thread 1 has called get_count, then execute Thread 2 in its entirety, then resume execution of Thread 1.

We can write the following test:

    # test_incrmnt.py

    import unittest

    import before_after

    import db
    import incrmnt


    class TestIncrmnt(unittest.TestCase):
        def setUp(self):
            db.reset_db()

        def test_increment_race(self):
            # after a call to get_count, call increment
            with before_after.after('incrmnt.db.get_count', incrmnt.increment):
                # start off the race with a call to increment
                incrmnt.increment()

            count = db.get_count()
            self.assertEqual(count, 2)

We’ve used before_after’s after context manager to insert another call to increment after the first get_count call.

By default, before_after only calls the after function once. This is useful in this particular situation, since otherwise we’d blow the stack (increment would call get_count which would chain a call to increment which would call get_count…).

This test fails, since count is equal to 1, not 2. Now we have a red test that reproduces our race condition, so let’s work on fixing it.

Preventing the race

We’re going to mitigate the race using a simple lock. This is obviously not the ideal solution - we’d be better off offloading the problem to our data store using atomic updates - but this approach allows better demonstration of before_after and its usefulness for testing multithreaded applications.

We add a new function to incrmnt.py:

    # incrmnt.py

    def locking_increment():
        with db.get_lock():
            return increment()

This ensures that only one thread can read from and write to the counter at once. If one thread tries to get the lock while it’s held by another, a CouldNotLock exception will be raised.

We can now add this test:

    # test_incrmnt.py

    def test_locking_increment_race(self):
        def erroring_locking_increment():
            # Trying to get a lock when the other thread has it will cause a
            # CouldNotLock exception - catch it here or the test will fail
            with self.assertRaises(db.CouldNotLock):
                incrmnt.locking_increment()

Mitigating the race

We still have a problem here: if two requests collide in this way, one of them will not be registered. To mitigate this, we can retry the increment (something like funcy’s retry is a concise way of doing so).
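As a rough illustration of the retry idea, here's a hand-rolled decorator rather than funcy's. `retry_on`, `flaky_locking_increment` and the simulated collision are all hypothetical, and a real version would probably pause between attempts.

```python
import functools


class CouldNotLock(Exception):
    """Stand-in for db.CouldNotLock."""


def retry_on(error_type, tries):
    """Call the wrapped function up to `tries` times while it raises error_type."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return fn(*args, **kwargs)
                except error_type:
                    if attempt == tries:
                        raise  # out of attempts: let the caller see the error
        return wrapper
    return decorator


attempts = []
counter = {"count": 0}


@retry_on(CouldNotLock, tries=3)
def flaky_locking_increment():
    attempts.append(len(attempts) + 1)
    if len(attempts) == 1:
        # Simulate colliding with another request on the first attempt.
        raise CouldNotLock()
    counter["count"] += 1
    return counter["count"]


result = flaky_locking_increment()
print(result, attempts)  # 1 [1, 2]
```

The first attempt hits the simulated collision, the second succeeds, and the click is counted.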

When we need more scale than this method provides, we can move the increment into our database as an atomic update or transaction, taking the responsibility away from our application.
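For a feel of what that might look like, here's a minimal sketch using the standard library's `sqlite3` module. The `counter` table, its columns and `atomic_increment` are made up for illustration; the post doesn't say which data store Incrmnt uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, value INTEGER NOT NULL)")
conn.execute("INSERT INTO counter (id, value) VALUES (1, 0)")
conn.commit()


def atomic_increment(connection):
    # The read-modify-write happens inside one SQL statement, so the
    # application never holds a stale copy of the counter.
    with connection:  # commits on success, rolls back on error
        connection.execute("UPDATE counter SET value = value + 1 WHERE id = 1")
    # Reading back afterwards is only for display; with concurrent writers
    # it may already include other clients' increments.
    row = connection.execute("SELECT value FROM counter WHERE id = 1").fetchone()
    return row[0]


for _ in range(200):
    latest = atomic_increment(conn)

print(latest)  # 200
```

Because the database applies `value = value + 1` itself, there's no gap between reading and writing for another request to slip into.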

Conclusion

Incrmnt is now race free, and people can happily click all day long without worrying about not being counted.

This was a simple example, but before_after can be used in more complicated races to ensure that your functions deal with the situation properly. Being able to test and reproduce races in a single-threaded environment is key to being more confident that you’re handling them properly.