Clay Breshears

Dr. Dobb's Bloggers

Game of Life — Using Threads and Lists, Part 1

April 03, 2013

I'm still looking at the Game of Life simulation.

I'm still looking at the Game of Life simulation. In my last post I described the algorithm for focusing processing in only those areas of a grid where cells are volatile; that is, living or dying or being born. A list data structure was used to hold a collection of potentially changing cells and the computation in each generation only considered those locations. For this article and the next two, I want to rework that serial algorithm and describe some approaches to thread it for parallel execution. I can think of three different methods that could be attempted. Unlike my previous parallelization of the original serial code version, the threading ideas on the list-based variation of the Game of Life application will use a task decomposition for the parallelism.

Method the First: Parallelize the List Traversals

In my list-based Game of Life, the code to compute the current generation involved the traversal of the four declared lists. I can consider each of these list traversals as a separate task. These tasks aren't all independent, though. Recall the main processing loop code from the previous post:

Since the first two calls to TraverseList() take data from different lists (maylive and maydie) and update different lists (newlive and newdie, respectively), there are no obvious dependencies and these two calls can be run in parallel with some synchronization either before or after clearing the lists (depending on what you assign to each thread). Plus, you don't need to use a thread-safe list implementation. Any parallelization construct that can generate separate tasks and assign them to threads can be used. For example, creating a new thread in Pthreads for each call in each loop iteration (high overhead), assigning the task to a thread within a pool of threads, Intel Threading Building Blocks tasks, or an OpenMP sections construct, are all possible.

What about duplicate grid points on the two lists involved? This situation was raised at the end of my previous post and I showed that it was a possibility. However, as was pointed out, too, only at most one of the traversals will actually succeed since the state of neighbor counts will have stabilized before these list traversals. The conditions dictate either birthing a new cell or killing a live cell or leaving the situation as is, not a combination of two or more. So this poses no dependency between the two calls.

The second pair of traversals passes through the "might" lists to see if there could be newly living or newly dead cells in the next generation. If I try the same kind of parallelism on the second pair of TraverseList() calls (treating each call as separate task to be assigned to a thread), I run into problems. While each of these calls gets grid points from different lists, the two functions executed on each element of the list being traversed can potentially update either the maylive and maydie lists. Thus, there is the possibility that two threads will both attempt to add a cell to the maylive list and I need to have some thread-safe list structure.

My List data structure is defined as

typedef struct list {
int count;
Cell entry[MAXLIST];
} List;

where a Cell is pair of integers representing the row and column index of a grid point and MAXLIST is some upper bound on the size of each list based on the size of the grid being used. With this List structure, the code for the AddList() operation can easily be made thread-safe. Since I've implemented my example code on a Windows system, I have employed the InterlockedIncrement intrinsic function and use the return value as the index in which to write a new element into the array holding List items.

For other threading methods without the Interlocked intrinsics, any method to enforce mutually exclusive access to updating the count of items on the list and return that count for indexing will be sufficient. (Note that there is no need to require a thread-safe method of reading an element from any one list since only one thread will be taking data from any one list. Thus, no dependency on that front.)

Regardless of the need for thread-safe lists, you will have to implement some form of mutual exclusion for updating the numNeighbors values that could possibly be incremented and decremented concurrently. I'll have some discussion on how to do that in support of the second parallelization method in the next post.

Now, if you've actually read down this far, you probably have figured out that parallelizing the algorithm this way is not going to be very scalable. Each parallel portion of the code will have at most two threads active at any time and there is a good chance of load imbalance if one list is longer than the other. I don't recommend this approach, but I wanted to describe and debunk an obvious method before getting to something more substantial and satisfying in the next installment.

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task.
However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Video

This month's Dr. Dobb's Journal

This month,
Dr. Dobb's Journal is devoted to mobile programming. We introduce you to Apple's new Swift programming language, discuss the perils of being the third-most-popular mobile platform, revisit SQLite on Android
, and much more!