Reducing Computing Time with Multithreading

By Daniele Bochicchio, Stefano Mostarda, and Marco De Sanctis, June 17, 2010

Multithreading can have a big impact on application response time

Using ParallelFX

ParallelFX debuts with .NET Framework 4.0. It is a framework specifically designed for building applications that need parallelism. Parallel computing is a growing trend among developers: it is now clear that raising CPU clock speeds (in GHz) will become increasingly difficult, while multicore CPUs will be common everywhere.

Right now, the most common server hardware is multicore, but most applications are not written to take advantage of it. Parallel computing is not easy to master unless you are lucky enough, as we are, to have a framework to build on.

Problem

We want to execute multiple tasks in parallel to reduce total computation time. This time, we want to use ParallelFX, a new feature in .NET Framework 4.0, directly.

Solution

Parallel task execution is not easy to implement. You have to take care of concurrent access from multiple threads, joining threads, and the other problems we addressed earlier. .NET Framework 4.0 introduces new high-level APIs, called ParallelFX, that make it easy to use parallelism in your applications. The idea behind ParallelFX versus manual thread allocation is shown in Figure 4.

Figure 4: The upper part of this image shows how manual thread allocation works. As you can see, there is a context switch between the threads. ParallelFX, on the other hand, avoids this problem by using a new architecture, which in this example uses two cores.

The example we will use is the same as in the previous scenario: we want to provide a flight search system that queries multiple providers to find the best price for a specified fictitious flight number.

The ParallelFX Task Parallel Library (TPL) is designed as an optimized alternative to using ThreadPool directly. To scale well on multiple processors, TPL uses an algorithm that dynamically distributes work items across the threads. By default, a single thread per processor is created, to avoid the thread switching otherwise performed by the underlying operating system. A specific task manager is responsible for this action.

Each worker thread has a local task queue, representing the actions to be completed. Usually, worker threads use a simple push/pop mechanism to enqueue and dequeue tasks. To optimize computing time, when a worker thread's local queue is empty, the thread looks in another worker thread's queue, takes a task, removes it from that queue, and performs the associated work. TPL has a major advantage over manual ThreadPool use: because the queues are distributed, it does not need synchronization between worker threads to join them. This is very important for achieving true scalability. ParallelFX is not limited to tasks; it can also be used with queries (through Parallel LINQ), iterations, and collections.
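As a rough illustration of that last point, the sketch below runs the same CPU-bound work once as a Parallel LINQ query and once as a parallel loop. The class name ParallelShapes and the helper SumOfSquares are hypothetical and exist only for this example; they are not part of the article's flight search code.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelShapes
{
    // Hypothetical CPU-bound work item, used only for illustration.
    private static long SumOfSquares(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += (long)i * i;
        }
        return total;
    }

    // Parallel LINQ: AsParallel() lets the runtime partition the query
    // across cores; AsOrdered() keeps the results in source order.
    public static long[] WithPlinq(int[] inputs)
    {
        return inputs.AsParallel()
                     .AsOrdered()
                     .Select(n => SumOfSquares(n))
                     .ToArray();
    }

    // Parallel iteration: each index is written by exactly one iteration,
    // so the results array needs no locking.
    public static long[] WithParallelFor(int[] inputs)
    {
        var results = new long[inputs.Length];
        Parallel.For(0, inputs.Length, i =>
        {
            results[i] = SumOfSquares(inputs[i]);
        });
        return results;
    }
}
```

Both methods return the same values for the same input; which one reads better depends on whether the work is naturally a query or a loop.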

.NET Framework 4.0 includes new classes specifically designed to execute parallel tasks, under the System.Threading.Tasks namespace. The Task class can be used when, just like in this scenario, you want more control over the task: controlling when it ends, appending execution of other tasks, and managing extension. In simple scenarios you can also use the Parallel.Invoke() method directly. These new APIs are very simple to use; in fact, parallelizing a task takes only a few lines of code.
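The original listing is not reproduced here. As a minimal sketch of both approaches, assuming a hypothetical QueryProvider helper that stands in for a real call to a flight-price provider, the code might look like this:

```csharp
using System;
using System.Threading.Tasks;

public static class StartTaskSketch
{
    // Hypothetical stand-in for a provider call; it just derives
    // a fake price from the provider name.
    private static decimal QueryProvider(string name)
    {
        return name.Length * 10m;
    }

    // Task: StartNew schedules the delegate and returns immediately;
    // reading Result blocks until the task has finished.
    public static decimal RunWithTask()
    {
        Task<decimal> task = Task.Factory.StartNew(() => QueryProvider("ProviderA"));
        return task.Result;
    }

    // Parallel.Invoke: runs the given actions, possibly in parallel,
    // and returns only when all of them have completed.
    public static decimal[] RunWithInvoke()
    {
        var prices = new decimal[2];
        Parallel.Invoke(
            () => prices[0] = QueryProvider("ProviderA"),
            () => prices[1] = QueryProvider("ProviderBB"));
        return prices;
    }
}
```

Parallel.Invoke suits a fixed set of independent actions; Task is the better fit when you need a handle on each unit of work.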

With Task, you can write more concise code, and you do not need to handle thread creation and lifecycle directly. The Wait, WaitAll, and WaitAny methods let you wait for a single task to complete, for all the tasks in an array, or for any task in the array, respectively.
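The following sketch shows WaitAny and WaitAll on a small array of tasks. The Work helper and its delays are assumptions made for illustration; which task finishes first depends on scheduling, so the code does not rely on a particular index.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitSketch
{
    // Hypothetical work item: sleeps to simulate latency, then returns a value.
    private static int Work(int delayMs, int value)
    {
        Thread.Sleep(delayMs);
        return value;
    }

    public static int RunAndSum(out int firstFinishedIndex)
    {
        Task<int>[] tasks =
        {
            Task.Factory.StartNew(() => Work(300, 1)),
            Task.Factory.StartNew(() => Work(10, 2))
        };

        // WaitAny blocks until at least one task has completed and
        // returns that task's index in the array.
        firstFinishedIndex = Task.WaitAny(tasks);

        // WaitAll blocks until every task in the array has completed.
        Task.WaitAll(tasks);

        // All tasks are done, so reading Result no longer blocks.
        return tasks[0].Result + tasks[1].Result;
    }
}
```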

To simplify exception management, when an exception is raised in a task, it is saved by the task scheduler and then raised when all tasks are completed. TPL creates an AggregateException whose InnerExceptions property contains all the exceptions generated by your tasks, so exceptions can be managed centrally. The exceptions surface only when you observe the tasks, for example by calling Wait or WaitAll or by reading a task's Result or Exception property; if you never do, your code never sees them.
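A minimal sketch of this centralized handling follows. The FailWith helper and the error messages are hypothetical; the point is only that WaitAll surfaces one AggregateException carrying every task failure.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ExceptionSketch
{
    // Hypothetical helper that simulates a failing provider call.
    private static void FailWith(Exception error)
    {
        throw error;
    }

    public static List<string> CollectFailures()
    {
        var messages = new List<string>();

        Task[] tasks =
        {
            Task.Factory.StartNew(() => FailWith(new InvalidOperationException("provider A failed"))),
            Task.Factory.StartNew(() => FailWith(new TimeoutException("provider B timed out"))),
            Task.Factory.StartNew(() => { /* this task succeeds */ })
        };

        try
        {
            // WaitAll waits for every task, then throws a single
            // AggregateException if any of them faulted.
            Task.WaitAll(tasks);
        }
        catch (AggregateException ex)
        {
            // InnerExceptions holds one entry per failed task.
            foreach (Exception inner in ex.InnerExceptions)
            {
                messages.Add(inner.Message);
            }
        }

        return messages;
    }
}
```

The order of the entries in InnerExceptions is not something this sketch relies on, so code that inspects them should match on type or message rather than on position.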

Both a single task and an array of tasks can have code associated with them to be executed after they complete, using the ContinueWith or ContinueWhenAll methods. In Listing 6, you can find the first part, where the providers are instantiated and executed in parallel.

This method is very interesting because the tasks are loaded in an array. Since this is a typical fire-and-forget situation, we can use the ContinueWhenAll method instead of the typical WaitAll. ContinueWhenAll waits for all the tasks to complete and then runs the corresponding code asynchronously. You can find the code in Listing 7.

Listing 7: The results from the providers are aggregated when all the work is done: (a) C#; (b) VB

If you run this code in the debugger, you can verify that the code in Listing 7 is executed after the providers have completed their work. In the meantime, you are not blocking any threads to wait for the tasks to complete. This is accomplished very easily because ParallelFX simplifies the use of these techniques. In the System.Collections.Concurrent namespace you can find thread-safe collections designed for these scenarios. In our example in Listing 7, we used ConcurrentQueue to enqueue the results as they arrive. As you can see, we do not need to take care of concurrent threads writing to the collection. This is a fantastic feature, if you think of all the code necessary to do the same thing manually, as we did in the previous example!

As you can see, with TPL you can simplify your code, take care of multithreaded access to collections, handle exceptions more easily, and increase performance, thanks to the minimal thread switching it provides.
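The listings themselves are not reproduced here, so the following is only a sketch of the same pattern, not the authors' code. It assumes a hypothetical GetPrice helper in place of real provider calls, starts one task per provider, collects results in a ConcurrentQueue, and aggregates them in a ContinueWhenAll continuation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class FlightSearchSketch
{
    // Hypothetical provider call; a real one would query a remote service.
    private static decimal GetPrice(string provider)
    {
        return 100m + provider.Length;
    }

    // Returns a task that completes with the lowest price once every
    // provider has answered. The providers array must not be empty,
    // because ContinueWhenAll rejects an empty task array.
    public static Task<decimal> FindBestPriceAsync(string[] providers)
    {
        // Thread-safe queue: tasks can enqueue concurrently without locks.
        var results = new ConcurrentQueue<decimal>();

        Task[] tasks = providers
            .Select(p => Task.Factory.StartNew(() => results.Enqueue(GetPrice(p))))
            .ToArray();

        // The continuation runs only after all tasks have completed;
        // no thread is blocked while waiting for them.
        return Task.Factory.ContinueWhenAll(tasks, completed => results.Min());
    }
}
```

A caller can attach its own continuation to the returned task, or read its Result property when blocking is acceptable.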

ParallelFX is a new feature in .NET Framework 4.0 that you will probably not use directly in an ASP.NET page, as we did in our example, but will instead wrap in a middle tier or something similar. Either way, it can really help your routines run faster, and if you had trouble using ThreadPool in the past, it represents a giant step forward in taking advantage of the multicore features exposed by the operating system.

Summary

Multithreading techniques have a high impact on response time in applications with intensive I/O requests. ASP.NET is so powerful that you can literally do anything you need to: you just have to write code and express your imagination!
