.KnowHowTo.UseParallelRunning

Rick Mugridge, 15 July 2011

When it takes ages for tests to run, delaying the build, it's tempting to look at running them in parallel.

We look at some techniques that can be used to parallelise test execution.

We'll see that awkward interactions between our tests are the major limiting factor to speeding things up, and that we'll have to choose the right tradeoff between speed and the various costs.

1. Are Your Tests Ready?

If your tests are not in a good state, it's going to be difficult to parallelise them. It may be hard enough anyway.

First, be sure that:

The tests pass consistently.

If some tests sometimes fail, such as due to timing problems, then it's going to be confusing when you run them in parallel.

One important test smell here is having explicit delays sprinkled in your tests to allow for delays in processing, such as with ajax. The time delays may need to be tailored to the particular test execution environment, which complicates things and adds to the cost.

So avoid the use of such explicit delays and use a polling approach instead. This will also speed up your tests immediately.
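The polling idea can be sketched as a small helper that checks a condition repeatedly until it holds or a timeout expires. This is only a sketch; the method names and timings are made up, and the condition here just stands in for something like an ajax update completing:

```java
import java.util.function.BooleanSupplier;

public class Poller {
    // Poll until the condition holds or the timeout expires.
    // Returns as soon as the condition is true, rather than always
    // sleeping for the worst-case delay.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis,
                             long intervalMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) return true;
            Thread.sleep(intervalMillis);
        }
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // A condition that becomes true after ~100ms, standing in for
        // the application finishing some asynchronous processing.
        boolean ok = waitUntil(() -> System.currentTimeMillis() - start > 100,
                2000, 20);
        System.out.println(ok);
    }
}
```

Because the helper returns as soon as the condition holds, a test pays only the actual delay rather than a fixed worst-case one, which is why this speeds tests up immediately.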

The tests are independent.

If one test assumes that a previous one will leave the application in a particular state, your tests are not independent.

Eg, one test creates a particular customer and a later one assumes that customer is there.

One way to check this is to run them in random order lots of times to check that they don't break.

Rather than trying to parallelise all the tests at once, a good strategy is to do it incrementally. Pick a few of the slower-running tests, ready them, and parallelise those first. This will give you experience with the issues, making it easier to parallelise further tests (or not, if it's too hard).

2. Running on Separate Machines

The easiest way to speed up your tests is to run sub-suites on different machines, each containing a complete test environment consisting of:

The platform (eg, JVM or .net) and library code

The test framework

The application being tested

Any backend systems that are needed, including databases

Each test machine runs through its own sub-suite in sequence. If you have ten machines, you can possibly reduce the test execution time to a tenth of what it was before.

The speedup that can be achieved is usually limited by the cost of each additional test system, whether they run on local machines or in the cloud. And there is a cost to setting up and managing multiple machines, regardless of where they are.

To use this approach, we now have to:

Split the test suite up into sub-suites to run on the duplicated test machines

Test execution time will be determined by the test machine that takes the longest.

So we want to size the sub-suites so that each one takes about the same amount of time.

Automate the process of running the tests across the machines as a part of the build, collecting the test reports together

Look out for any sharing that we forgot about that ends up having an impact; eg, shared files, data warehousing, backend credit card services.

Over time, we have to:

Resplit the sub-suites as we add new tests and as the test execution times change.
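Sizing and resplitting the sub-suites can be sketched as a greedy "longest job first" split, assuming we record how long each test takes (the test names and durations below are made up):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SuiteSplitter {
    // times: test name -> measured duration (hypothetical, in seconds).
    static List<List<String>> split(Map<String, Integer> times, int machines) {
        // Consider tests longest-first.
        List<String> sorted = new ArrayList<>(times.keySet());
        sorted.sort((a, b) -> times.get(b) - times.get(a));
        List<List<String>> suites = new ArrayList<>();
        int[] load = new int[machines];
        for (int i = 0; i < machines; i++) suites.add(new ArrayList<>());
        for (String test : sorted) {
            // Assign each test to the currently least-loaded machine.
            int min = 0;
            for (int i = 1; i < machines; i++) if (load[i] < load[min]) min = i;
            suites.get(min).add(test);
            load[min] += times.get(test);
        }
        return suites;
    }

    public static void main(String[] args) {
        Map<String, Integer> times = new LinkedHashMap<>();
        times.put("Checkout", 300);
        times.put("Search", 200);
        times.put("Login", 150);
        times.put("Signup", 150);
        System.out.println(split(times, 2));
    }
}
```

Rerunning a split like this as part of the build, using fresh timings, handles the resplitting automatically as tests are added and execution times drift.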

But there are some limitations to this approach, as we see in the next section.

3. Running on Separate Machines, With Shared Backends

It may not be possible to have all backend systems running on each machine, due to:

Costs of running duplicate copies of all of the test environment, or insufficient resources (eg, each test "machine" may need to be a set of interconnected machines)

Specialised hardware that can't be duplicated

Licensing costs: eg, database systems, etc

"Political" issues related to the ownership of backend systems

Problems with configuring the various pieces so that they can run independently without clashing with each other

If some backend systems have to be shared, this approach can only work if:

The backend systems are immutable (their data doesn't change); or the changes due to different tests across the duplicate test environments don't affect each other.

Eg, this won't work if a test (on one test system) needs to ensure that some particular data is missing, while another test (on another test system) creates that data.

So any tests that would clash in the shared systems have to run on the same test machine, so that they can't run at the same time.

This may reduce the opportunities for parallelism, and not give the speedups you were hoping for.

We now have to also:

Ensure that the tests across test machines don't cause clashes in the backend systems that are shared.

This can be done by ensuring that different tests change different data so that they don't interact (but this may require many changes)
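One common way to keep the changed data disjoint is to key test data by the test machine that creates it. A minimal sketch, where the property name and default are assumptions, not part of any framework:

```java
public class TestData {
    // Each test machine is started with its own id, eg:
    //   java -Dtest.machine=m2 ...
    // (the property name "test.machine" is made up for this sketch).
    static final String MACHINE_ID = System.getProperty("test.machine", "m1");

    // Every record a test creates gets the machine id in its key,
    // so tests on different machines never touch the same data.
    static String uniqueCustomer(String base) {
        return base + "-" + MACHINE_ID;
    }

    public static void main(String[] args) {
        System.out.println(uniqueCustomer("Acme"));
    }
}
```

The cost mentioned above is real: every test that creates or looks up shared data has to be changed to go through such a scheme.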

When tests start failing, we have to find out whether the test failed due to:

A fault in the application or in the test.

We can determine this by running that test in isolation.

A good test execution tool may be able to do this automatically.

A clash in the changes in the backend due to two tests on different machines running at the same time.

So we need good logging information to find out what was running at the same time as a failing test.

If we don't have test independence, this gets much harder to diagnose.

4. Multiple Processes or Threads per Machine, with Shared Backends

It's possible to have several processes running a test environment on each machine, with sharing of (some) backend systems on each machine or across machines. But the bulk of the test environment will still be duplicated in each process, so the savings are not likely to be that great.

You need to select the tests that run in each of the processes to avoid clashes on the shared systems on each machine.

This approach can only work if all of the above restrictions are met, and:

The application system does not have absolute references to any files that may be changed or that have to differ between the processes (such as for configuration of the application).

Eg, this won't work if the application writes important data to a specific file location

The different copies of the application don't use resources that will clash

Eg, different instances of a web application will have to run on different ports (eg, specified through distinct configuration).

The different copies of the test framework don't use resources that will clash, such as where reports are generated
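Avoiding such clashes usually comes down to making every machine-shared resource configurable per process. A minimal sketch of the port case, where the property names are assumptions for illustration:

```java
public class ProcessConfig {
    public static void main(String[] args) {
        // Each process is launched with its own values, eg:
        //   java -Dapp.port=8081 -Dreport.dir=reports/p1 ...
        // (the property names here are made up for this sketch).
        int port = Integer.parseInt(System.getProperty("app.port", "8080"));
        String reportDir = System.getProperty("report.dir", "reports/p0");
        System.out.println("port=" + port);
        System.out.println("reports=" + reportDir);
    }
}
```

Anything the application or test framework writes to — ports, report directories, temp files — needs a per-process value like this, or the copies will trip over each other.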

We now have to also:

Manage the configuration of the applications, etc running on each of the processes to make sure they don't clash on machine-shared resources

We can share more of the test environment by running multiple threads on each machine. This can allow us to speed things up without increasing the number of machines, as long as the tests are not compute-bound.

The number of threads we can run without introducing extra delays is determined by the number of cores available and by the inherent delays in the execution of each of the tests due to delays in backend processing and I/O.

Eg, if we have one process running on a 4-core cpu and the tests typically spend 75% of the time waiting due to non-cpu activity in the application, we can usefully run about 16 threads.
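The arithmetic behind that estimate: if each test only needs the cpu 25% of the time, one core can keep about four threads busy, so four cores support about sixteen. A sketch of the calculation (the numbers are the illustrative ones from above, not a measurement):

```java
public class ThreadEstimate {
    // cores / (1 - waitFraction): each core can serve roughly
    // 1 / (cpu fraction) threads before they start queuing for cpu.
    static int usefulThreads(int cores, double waitFraction) {
        return (int) Math.round(cores / (1.0 - waitFraction));
    }

    public static void main(String[] args) {
        // 4 cores, tests waiting 75% of the time on backends and I/O.
        System.out.println(usefulThreads(4, 0.75));
    }
}
```

This is only a rough sizing rule; the real limit has to be found by measuring, since waiting fractions vary from test to test.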

This allows us to share much more, making a big difference to the memory footprint. But again, sharing leads to potential problems.

This approach can only work if all of the above restrictions are met, and:

The application system is thread safe. You have to assume that it's not thread-safe until you are confident otherwise.

The test framework is thread safe, so multiple instances won't clash. Some test frameworks are not thread safe, because they have not been developed with this in mind. FitLibrary in Java is thread safe, by design.

All libraries are used in a thread-safe way. This is not easy to verify; few libraries clearly document their thread safety.

The tests do not alter the date/time or other important "global" values of the application

Any system is not thread safe if it makes use of global variables (such as statics in Java or C#) or global resources without specifically allowing for such sharing (such as with locks).
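The classic example of such a global is a static counter in Java: plain increments can lose updates under concurrency, while an atomic version is safe. A minimal sketch:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounter {
    static int unsafe = 0;                            // shared global: races
    static AtomicInteger safe = new AtomicInteger();  // safe under sharing

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    unsafe++;               // read-modify-write: updates can be lost
                    safe.incrementAndGet(); // atomic: never loses an update
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println("safe=" + safe.get());
        System.out.println("unsafe=" + unsafe + " (often less: lost updates)");
    }
}
```

Any static mutable state in the application, test framework, or libraries is a potential instance of this, which is why thread safety has to be assumed absent until verified.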

We now have to also:

Ensure thread safety across all the code

5. Summary

While it is desirable to speed up test execution, there are tradeoffs in doing so.

As we saw, the easiest way to speed up your tests by a factor of 10 is to run 10 (or so) separate machines in parallel. However, each machine needs to contain a complete copy of the test environment, including all backend systems. Scaling this up to 1000 in parallel is not always possible, due to cost or other constraints.

To reduce the costs, we can introduce sharing of resources. Unfortunately, the more we share, the harder it is to ensure that our tests run independently. When tests fail, it's more difficult to diagnose the fault when we use sharing.

Managing test suites for parallel execution takes real effort. It's best to organise tests around the structure of the application domain, so it's unfortunate to have to reorganise test suites for test execution. It's possible to use tags to specify sets of tests that cannot run at the same time, but there are no FitNesse-based tools available that do that at this point.

The easiest tests to parallelise are those that don't change the "global" state of the application system and backend systems at all. They are relatively cheap to parallelise, but there are usually few such tests.

5.1 Other Approaches

There are other approaches that can be used:

Improve the chance of getting faster feedback by running the more critical tests earlier. A pipelined approach can work here. But it's not always obvious which tests are more critical, and it's difficult to determine this reliably from automatic analysis of code changes, etc.

If testing through the UI:

Avoid using explicit delays when there are timing issues

Consider whether some tests can call directly into the application, reducing the time

Consider whether some of the test setup can be done in another way, such as directly to a database or through web service calls