
If Small Tasks Are the New Program Unit for a Multicore World, When Will We Assemble Programs From Them?

The right way to write multithreaded code keeps evolving. If only we could keep up!

There are two revolutions going on right now in computing, and they are occurring at opposite ends of the spectrum. At the high end, we're finding that modest systems are capable of handling so-called "Big Data" problems with the tools we're currently inventing. We're also finding that the supercomputing powers of today's servers, coupled with their counterparts in the cloud, provide more than enough horsepower to perform analysis that was inconceivable a few years ago. On the other end of the spectrum, smartphones are running dual-core processors, and quad-core CPUs are on their way shortly. Driven by this power, phones and tablets can easily run games at full speed, and comfortably serve as scaled-down PCs and laptops. As more cores are added, the "scaled-down" aspect will relate more to things like storage and networking than to processing power.

The key link between these two ends of the processing gamut is the reliance on multiple cores to achieve the processing power. In Dr. Dobb's, we have discussed many times the day when software will need to be parallelized to find its way onto any device, because serial code will no longer be welcome. That day is no longer far off.

However, the multiprocessing techniques of old will no longer be sufficient. In days past, when multiprocessing was primarily enabled by a costly platform with multiple single-threaded processors, the general guidelines advised decomposing tasks into threadable functions (functional decomposition) or into chunks of data whose transformation could be parallelized (data decomposition). Today, we recognize that this kind of decomposition, while marginally desirable, is far too coarse-grained. If one large task is broken down into only a handful of medium tasks, chances are good that on many systems, core resources will be left fallow. Rather, the desired goal today is for programs to emit work in chunks called tasks that are pumped into work queues that drive thread pools. The threads in the pools live on endlessly, picking up a new task, executing it, and then returning to the pool for the next item to accomplish.
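To make the model concrete, here is a minimal sketch of one job cut into many small tasks and fed to a pool of long-lived threads. It uses Python's standard `concurrent.futures` module; the function names, the task size, and the sum-of-squares workload are all illustrative assumptions, not anything prescribed by a particular framework. Note that standard CPython threads do not speed up CPU-bound work like this, so the sketch shows the shape of the design rather than a performance gain.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """One small task: process a short slice of the data."""
    return sum(x * x for x in chunk)


def split_into_tasks(data, task_size):
    """Cut the data into many small slices, one per task."""
    return [data[i:i + task_size] for i in range(0, len(data), task_size)]


def run_in_pool(data, task_size=1_000, workers=None):
    """Run every slice as a task in a thread pool and combine the results."""
    workers = workers or os.cpu_count() or 4
    chunks = split_into_tasks(data, task_size)
    # The pool's threads are reused: each one picks up a chunk, runs it,
    # and comes back for the next one until the queue is empty.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials), len(chunks)


total, task_count = run_in_pool(list(range(100_000)))
```

With a task size of 1,000, the 100,000 items become 100 tasks, far more than the handful that a coarse functional or data decomposition would typically yield.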

Most major languages today have a thread pool capability built in. (Except for C/C++, where the thread pool was, sadly, deferred to a future release. However, Intel's Threading Building Blocks library (TBB), which is available at no cost, is an excellent drop-in solution.) The JVM- and .NET-hosted languages have had thread pools for a long time, and Apple recently added one to its arsenal. The model is surely on the verge of becoming ubiquitous.

The reason for this ubiquity is the undeniable set of benefits it delivers. By having work chunked into small tasks, all cores on the runtime system can be put to work. Moreover, load balancing between threads is greatly simplified: as a thread runs out of work, the pool hands it a new task from the work queue. Not all approaches are quite so dynamic. TBB tends to have fixed, per-thread task queues. If one queue starts to fall behind the others, threads that have run out of their own work remove tasks from it (this is called "work stealing"), and the load is balanced that way. For either approach to function properly, tasks have to be small and quick. Long-running tasks, such as those that can arise from old-fashioned decomposition, are highly unwelcome in this scenario, as they gum up the works and reduce the number of cores that can run tasks at any given moment.
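The work-stealing idea can be illustrated with a toy, single-threaded simulation. This is only a sketch: it is not TBB's scheduler, the function name is hypothetical, and the policy of stealing from the longest queue is a simplifying assumption (real schedulers commonly pick a victim at random). What it does keep is the essential move: a worker with nothing to do takes a task from the far end of someone else's queue.

```python
from collections import deque


def run_with_stealing(initial_queues):
    """Simulate one worker per queue, advancing in lock-step rounds.

    Each round, every worker runs one task from the front of its own
    queue. A worker whose queue is empty steals one task from the back
    of the longest queue instead. Returns the tasks each worker ran.
    """
    queues = [deque(q) for q in initial_queues]
    executed = [[] for _ in queues]
    while any(queues):
        for worker, own in enumerate(queues):
            if own:
                task = own.popleft()            # normal case: local work
            else:
                victim = max(queues, key=len)   # assumed policy: longest queue
                if not victim:
                    continue                    # nothing left to steal
                task = victim.pop()             # steal from the opposite end
            executed[worker].append(task)
    return executed


# Worker 0 starts with all nine tasks; workers 1 and 2 start idle.
result = run_with_stealing([list(range(9)), [], []])
```

Even though all the work starts on one queue, each of the three simulated workers ends up running three tasks. The balancing only works this smoothly because the tasks are small and uniform; one long-running task would pin a worker while the others sat idle.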

As long as programs are being migrated from serial code to multithreading, decomposition is a required step. However, henceforth, it needs to be done at a much finer granularity.

The real problem going forward is not program decomposition, but composition. Why are we not currently designing programs as a series of small asynchronous tasks? After all, we have already crossed into a world in which we break programs into objects. Why not then into tasks? Properly done, this would move today's OOP more closely to its original intent, which was to focus on the messages passed between objects, rather than the objects themselves (according to the widely quoted observation from Alan Kay, who coined the term "object orientation").
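As one hypothetical illustration of composing a program from message-passing pieces, the sketch below wires two small stages together with queues from Python's standard library. Each stage does nothing but receive a message, handle it, and pass the result along. The `stage` and `run_pipeline` names, the `STOP` sentinel, and the word-counting workload are assumptions made for the example; a real design would also have to address errors, back-pressure, and fan-out.

```python
import queue
import threading

STOP = object()  # sentinel message that tells a stage to shut down


def stage(inbox, outbox, handle):
    """Receive a message, handle it, and pass the result downstream."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(handle(msg))


def run_pipeline(lines):
    """Compose two stages: split each line into words, then count them."""
    raw, words, counts = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(raw, words, str.split)),
        threading.Thread(target=stage, args=(words, counts, len)),
    ]
    for w in workers:
        w.start()
    for line in lines:
        raw.put(line)
    raw.put(STOP)

    results = []
    while (n := counts.get()) is not STOP:
        results.append(n)
    for w in workers:
        w.join()
    return results


word_counts = run_pipeline(["small tasks", "passing messages between them"])
```

The program here is defined by how the stages are connected and what messages flow between them, not by the stages themselves, which is the shift in emphasis the paragraph above describes.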

The problems facing such an approach rest on its profound unfamiliarity. There are few languages that meet all the needs of this model, few frameworks that facilitate its design, and few developers conversant with its problems and limitations. I'll discuss these in my next editorial. In the meantime, it's worth considering how an existing program broken down into smaller tasks might function. What exactly would it look like?
