My programming ramblings

C++11 async tutorial

Posted on October 17, 2012 by Paul

We have been living in a multiprocessor world for a few years now, from the phone in my pocket to the quad-core beast I have on my table. Today, you can easily buy a six or twelve core machine that is far more powerful than the supercomputers of a few decades ago.

As programmers, we need to be able to use the available computing power at full capacity. You can’t buy a new computer and expect your serial code to run faster. You need to write code that can run on multi-core machines, or you will deliver a low quality product to your potential clients.

The new C++11 standard allows us to make better use of the available hardware directly from the language. Today, you can write portable multithreaded code using only the standard library of the language.

std::async lets you write code that can potentially run in one or more threads separate from the main thread of your program. std::async takes a callable object (a function, for example) as its argument and returns a std::future that will hold either the result returned by your function or the exception it threw.
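To make the error case concrete: the "error" a future carries is an exception thrown by the callable, which is rethrown when you call get(). Here is a minimal sketch, assuming a hypothetical failing_task function and a run_error_example wrapper that stands in for main():

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical task that always fails; not part of the original article.
int failing_task() {
    throw std::runtime_error("task failed");
}

// Stands in for main(): returns the error text seen by the caller.
std::string run_error_example() {
    std::future<int> result = std::async(failing_task);
    try {
        // get() rethrows the exception stored in the future.
        result.get();
    } catch (const std::runtime_error &e) {
        return e.what();
    }
    return "no error";
}
```

The exception is not lost when the task runs on another thread; it simply waits inside the future until someone asks for the result.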

std::async can be seen as a high-level interface to std::thread. Let’s see a simplified example of async usage:
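A minimal sketch of such an example is given below. The function name called_from_async comes from the discussion that follows; its body, the printed messages, and the run_async_example wrapper (which plays the role of main()) are assumptions.

```cpp
#include <future>
#include <iostream>

// The name follows the text; the body is an assumption.
void called_from_async() {
    std::cout << "Async call" << std::endl;
}

// Plays the role of main() in a complete program.
void run_async_example() {
    // Start called_from_async, potentially on a separate thread.
    std::future<void> result = std::async(called_from_async);

    std::cout << "Message from main." << std::endl;

    // Wait until called_from_async has finished.
    result.get();
}
```

Nothing synchronizes the two writes to std::cout, so repeated runs can print the two messages in either order, or interleaved.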

What just happened? In the first run, the message from the main thread was printed on the screen before called_from_async was executed, while in the second run, the main thread and the thread in which called_from_async was launched tried to write to the screen at the same time.

Let’s try another example, one that actually returns a value. To make things more interesting, I’m going to use a lambda this time:
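Here is a minimal sketch of what this can look like. The int return type and the arguments 2 and 4 come from the text; the lambda body (a simple sum) and the run_lambda_example wrapper, which stands in for main(), are assumptions.

```cpp
#include <future>
#include <iostream>

// Stands in for main(); returns the value so a caller can inspect it.
int run_lambda_example() {
    // Arguments after the callable are forwarded to it by std::async.
    std::future<int> result = std::async(
        [](int m, int n) { return m + n; }, 2, 4);

    std::cout << "Message from main." << std::endl;

    // get() waits for the lambda to finish and returns its result.
    int value = result.get();
    std::cout << value << std::endl;
    return value;
}
```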

Please note the use of std::future<int> to indicate that the return value of the lambda is an integer and the way in which we pass arguments, 2 and 4, to the lambda in std::async.

Usually, you will need to launch more than one asynchronous call in a row. Giving a separate name to each returned future is tedious and error prone. An elegant way to avoid this problem is to store the futures in a vector:
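A minimal sketch of the idea, assuming a hypothetical task called twice and ten calls (the text does not say what each call computes or how many there are):

```cpp
#include <future>
#include <vector>

// Hypothetical task used only for illustration.
int twice(int i) {
    return 2 * i;
}

// Stands in for main(): launches the calls, then collects the results.
std::vector<int> run_vector_example() {
    std::vector<std::future<int>> futures;
    for (int i = 0; i < 10; ++i) {
        // std::future is move-only; push_back moves the temporary in.
        futures.push_back(std::async(twice, i));
    }

    std::vector<int> results;
    for (auto &f : futures) {
        // Results come back in launch order, whatever order the tasks ran in.
        results.push_back(f.get());
    }
    return results;
}
```

Launching everything first and calling get() in a second loop matters: calling get() right after each std::async would wait for that task before starting the next one.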

Now, it’s time to try a slightly more ambitious project, something that will allow us to measure the performance of serial code versus the same code using std::async. For this, I’m going to use a Perlin noise generator I built some time ago. The idea is that I can use it to make a set of pictures in parallel. We’ll use std::chrono to time the code.

The reference Perlin noise function is a 3D function that, given 3 numbers in the unit cube, will always generate the same result. In this article, we will use the third dimension as a seed to generate 2D images 1280px wide and 720px high. For our purposes, we will treat this as a black box function that, given a real number in the [0, 1] interval as input, will generate and save a 2D image. If you want more information about how I’ve implemented this function, you can read my Perlin noise in C++11 article.

We’ll generate 1800 images from the values of z (the third dimension) in the interval [0, 1]. Each image will be generated by the make_perlin_noise function in a std::async call:
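The overall shape of such a benchmark can be sketched as follows. This is not the article's code: the make_perlin_noise signature (an image id and a z value), its body (a CPU-bound stand-in that writes no image), and the helper names z_for, time_serial and time_parallel are all assumptions.

```cpp
#include <chrono>
#include <cmath>
#include <future>
#include <vector>

// Stand-in for the real generator: burns CPU time instead of rendering
// and saving an image. Signature and body are assumptions.
double make_perlin_noise(int id, double z) {
    double sum = 0.0;
    for (int i = 1; i <= 100000; ++i) {
        sum += std::sin(z * i + id);
    }
    return sum;
}

// Maps frame id to a z value spread evenly over [0, 1].
double z_for(int id, int frames) {
    return frames > 1 ? static_cast<double>(id) / (frames - 1) : 0.0;
}

// Generates all frames one after another; returns elapsed seconds.
double time_serial(int frames) {
    auto start = std::chrono::steady_clock::now();
    for (int id = 0; id < frames; ++id) {
        make_perlin_noise(id, z_for(id, frames));
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}

// Generates each frame in its own std::async call; returns elapsed seconds.
double time_parallel(int frames) {
    auto start = std::chrono::steady_clock::now();
    std::vector<std::future<double>> futures;
    for (int id = 0; id < frames; ++id) {
        futures.push_back(
            std::async(make_perlin_noise, id, z_for(id, frames)));
    }
    for (auto &f : futures) {
        f.get();  // wait for every frame before stopping the clock
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

Calling time_serial(1800) and time_parallel(1800) would mirror the experiment described here. The sketch uses the default launch policy, so how many threads actually run at once is left to the implementation.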

The serial version of the above code uses approximately 3.2 MB of memory and runs for about 14 minutes with no optimization. With full optimization it runs for 4.5 minutes, so optimization gives about a 3x speedup for the serial code.

The parallel version of the code uses about 700 MB of memory and runs for about 7.3 minutes with no optimization. With full optimization it runs for about 2.65 minutes, so optimization gives about a 2.75x speedup for the parallel code.

Comparing the serial and parallel code without optimization, we get a 1.9x speedup for the parallel version (14 / 7.3). With optimization, the parallel version is 1.7x faster (4.5 / 2.65).

As a side note, the parallel version uses about 280 threads on my machine versus a single thread for the serial version. This explains the difference in memory usage between the serial and parallel versions.

Using the resulting 1800 images, I’ve made a small, one minute, movie that shows how the image evolves for z in the interval [0, 1].
