Introduction to Parallel loops

Introduction

Loops account for a large percentage of computation time for many scientific computing problems. For this reason, GAUSS provides an easy method to make parallel loops with the threadfor keyword.

This tutorial starts by writing simple single-threaded code to compute parameter estimates for a rolling regression on simulated data. Then we will use the threadfor keyword to parallelize the code.

Simulate some data

In order to keep this tutorial focused on parallelization and make it simple for you to scale the problem up and down easily, we will just simulate random normal data.
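A minimal sketch of what such a single-threaded rolling regression could look like. The seed, the sizes `nobs` and `window`, the use of `olsqr` for the per-window estimate, and the `hsec` timing are all assumptions made for illustration, so run times will not necessarily match the figures quoted in this tutorial.

```gauss
// Seed the random number generator so runs are repeatable (arbitrary seed)
rndseed 5423;

// Problem size: scale these up or down as needed (illustrative values)
nobs = 1e5;
window = 100;

// Simulate random normal data
x = rndn(nobs, 1);
y = rndn(nobs, 1);

// One slope estimate per window position
iters = nobs - window + 1;
out = zeros(iters, 1);

// Record the start time in hundredths of a second
t0 = hsec;

for i(1, iters, 1);
    // OLS slope (no intercept) on observations i through i+window-1
    out[i] = olsqr(y[i:i+window-1], x[i:i+window-1]);
endfor;

// Elapsed time in seconds
elapsed = (hsec - t0) / 100;
print elapsed;
```

Each iteration reads one window of `x` and `y` and writes only to row `i` of `out`, which is the property that matters when the loop is parallelized later.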

The time to run the single-threaded code will vary based upon your computer and the version of GAUSS you are running. On the reference device for this tutorial, a quad-core MacBook Pro, it takes about 1.73 seconds.

threadfor basics

The threadfor keyword takes the same inputs as the standard for loop:

threadfor i(start, stop, step);
//loop body
threadendfor;

i

The loop counter

start

Value of the loop counter for the first iteration

stop

The final value of the loop counter in the last iteration

step

The amount to change the loop counter on each iteration

The difference, of course, is that threadfor will run multiple iterations at the same time. Since any two iterations could be running at the same time, we cannot write data to the same location from separate iterations. Fortunately, in this example, we write to a separate row in each iteration. Since that row is controlled by the loop counter, i, we do not have to worry about collisions. So we only need to change for to threadfor (and the closing keyword to threadendfor) and our code will run in parallel.
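As a sketch, the parallel version of a rolling regression loop might look like the following. The data setup, the sizes, and the use of `olsqr` are illustrative assumptions rather than the tutorial's exact listing; the only point being shown is the swap of the loop keywords.

```gauss
// Illustrative setup (assumed values, not from the tutorial)
rndseed 5423;
nobs = 1e5;
window = 100;
x = rndn(nobs, 1);
y = rndn(nobs, 1);

iters = nobs - window + 1;
out = zeros(iters, 1);

// Same loop body as the single-threaded version;
// only the loop keywords have changed
threadfor i(1, iters, 1);
    // Each iteration writes only to row i of out,
    // so no two iterations write to the same location
    out[i] = olsqr(y[i:i+window-1], x[i:i+window-1]);
threadendfor;
```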

On the same quad-core MacBook Pro, the threaded code runs in about 0.44 seconds. This is a speed-up of about 3.9 times (1.73 / 0.44), close to the ideal factor of 4 for four cores. Not all algorithms will scale as well as this one, but it does show that threadfor can be quite simple to use and can provide a substantial performance boost.

Temporary variables

Our first example was very easy and clean to convert to parallel, with excellent performance as well. However, we stated above that you cannot assign to the same variable (or, more precisely, the same index location) from separate threads. Since it is common for loops to require some scalar temporary variables, how can we handle this? Do we need to create an array for every single temporary variable? Fortunately, the answer is no.

In threadfor loops, GAUSS will automatically create and manage loop-specific versions of temporary variables that are created and used in a threadfor loop. To illustrate this behavior, we can make a slight modification to our parallel rolling regression example that introduces two temporary variables, a and b.
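One way such a modification could look is sketched below. What `a` and `b` hold (here, the current window of `y` and `x`) is an assumption, as are the data setup and the use of `olsqr`.

```gauss
// Illustrative setup (assumed values, not from the tutorial)
rndseed 5423;
nobs = 1e5;
window = 100;
x = rndn(nobs, 1);
y = rndn(nobs, 1);

iters = nobs - window + 1;
out = zeros(iters, 1);

threadfor i(1, iters, 1);
    // a and b are assigned without an index, so they are
    // temporary variables: each iteration gets its own copy
    a = y[i:i+window-1];
    b = x[i:i+window-1];

    // out is assigned by index, so it is a slice variable
    out[i] = olsqr(a, b);
threadendfor;
```

Both temporaries are assigned before they are read, which is what the rules for threadfor temporary variables require.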

In this case, the use of the temporary variables a and b is not critical for the code. But there are many cases in which it would be a burden to not be able to use them. However, because we do not know the order in which the iterations will run, these temporary variables have some special behavior and rules. The rules for threadfor temporary variables are:

1. A variable that is assigned to without using an index is a temporary variable. Variables that are assigned to by index, such as the `out` vector in our example, are called 'slice' variables.

2. The first reference to a temporary variable in a `threadfor` loop must be an assignment.

3. The value of a variable after the loop will not be changed by actions inside the loop.

Rules number 2 and 3 above are intended to prevent you from creating code with bugs called 'data races'. A 'data race' occurs when the output of your program depends on the order in which parallel calculations are run. To make this clearer, consider a very simple example in which a variable named my_var is assigned inside a threadfor loop.
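A minimal sketch of such an example follows. The loop bounds and the value assigned inside the loop are assumptions chosen for illustration.

```gauss
// Set my_var before the loop
my_var = 3;

threadfor i(1, 4, 1);
    // Assigned without an index, so inside the loop
    // my_var is a temporary variable
    my_var = i;
threadendfor;

// Print the value of my_var after the loop
print my_var;
```

If this were an ordinary for loop, my_var would simply hold the last value of the counter. With parallel iterations there is no guaranteed "last" iteration, which is why the rules above matter.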

What do you think my_var will be equal to after the threadfor statement? The answer is 3, the value it held before the loop. Since my_var was not assigned to by index inside the threadfor loop, it became a temporary variable. While a loop is a natural way to think about this code, the threadfor loop actually behaves more like a set of independent statements, each working on its own copy of my_var.
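Conceptually, that behavior can be pictured as in the sketch below. The names `my_var_1` through `my_var_4` are hypothetical and only stand in for the private copies; this is not a description of what GAUSS literally generates.

```gauss
my_var = 3;

// Each iteration assigns to its own private copy.
// These statements could run at the same time and in any order.
my_var_1 = 1;
my_var_2 = 2;
my_var_3 = 3;
my_var_4 = 4;

// The private copies are discarded when the loop ends;
// the outer my_var was never assigned to
print my_var;
```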
