Test Dependencies and The Future of Build Acceleration

tl;dr Tests can depend on other tests’ execution, which can cause them to fail unexpectedly. These dependencies surface especially when developers try to speed up slow test suites with test acceleration techniques (like test parallelization or selection). We have created efficient techniques for detecting and isolating these dependencies; the most recent, ElectricTest, was published at FSE 2015.

More Tests, More Problems

With the proliferation of testing culture, many developers are facing new challenges. As projects are getting started, the focus may be on developing enough tests to maintain confidence that the code is correct. However, as developers write more and more tests, performance and repeatability become growing concerns. As test suites grow, they take longer and longer to run, and it becomes harder to run them with every single change set.

In fact, we studied open source Java projects to assess the impact of testing on the overall build process. The result was startling: in the 351 Maven+Java projects on GitHub that we built, testing consumed on average 41% of build time (that is, the total time needed to run “mvn package”). In the projects that took the very longest to build (more than an hour), testing completely dominated the build process: on average, 90% of the time was spent running tests! Clearly, running tests can take a very long time, and reducing testing time can be a big win for developers. Recent tools can help with everyday testing. For instance, Milos Gligoric’s tool, Ekstazi, is a Maven plugin that keeps track of which files change between executions of the test suite, and which files each test depends on. If a developer makes just one change to the code, Ekstazi will automatically run only the tests that might have a different outcome because of that change. Tests can also be accelerated on multicore hardware: Maven and Ant (since 1.9.4) can both now run many tests in parallel.
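The idea behind change-based test selection can be sketched in a few lines of Java. This is only an illustration of the concept, not Ekstazi’s implementation: the class name, the method name, and the file names are all hypothetical, and the sketch assumes a per-test set of dependent files has already been recorded somehow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of change-based test selection (not Ekstazi's code).
final class TestSelectionSketch {

    /** Returns the tests whose recorded file dependencies overlap the changed files. */
    static List<String> selectTests(Map<String, Set<String>> testDeps,
                                    Set<String> changedFiles) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : testDeps.entrySet()) {
            // A test is selected if it depends on at least one changed file.
            if (!Collections.disjoint(entry.getValue(), changedFiles)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up dependency data, for illustration only.
        Map<String, Set<String>> deps = new LinkedHashMap<>();
        deps.put("ParserTest", new HashSet<>(Arrays.asList("Parser.java", "Token.java")));
        deps.put("CacheTest", new HashSet<>(Arrays.asList("Cache.java")));

        Set<String> changed = new HashSet<>(Arrays.asList("Token.java"));
        System.out.println(selectTests(deps, changed)); // prints [ParserTest]
    }
}
```

Note that a selection like this is only safe if each test’s outcome depends solely on the files it touches, and not on which other tests happened to run before it.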

Unfortunately, developers sometimes notice that some tests fail when run in parallel, or when only a subset of tests is run, even though they pass when the complete test suite runs normally! This can be caused by test dependencies.

While we might think that each test should be independent (i.e. that a test’s outcome isn’t influenced by the execution of another test), we and others have found many examples in real software projects where tests truly have these dependencies: running tests in a different order causes their outcome to change! If there are test dependencies, then we are forced to always run all tests together, in their normal order. This prevents us from accelerating our test suites with typical approaches.
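A toy example makes the problem concrete. The sketch below uses plain static methods instead of JUnit so that it runs without any dependencies; the class, the shared USERS set, and the two test methods are all hypothetical. The second test silently relies on state left behind by the first.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical example of an order-dependent pair of tests.
final class OrderDependentTests {
    // Shared static state: the hidden channel between the two tests.
    static final Set<String> USERS = new HashSet<>();

    /** Writes to the shared state. */
    static boolean testCreateUser() {
        USERS.add("alice");
        return USERS.contains("alice");
    }

    /** Passes only if testCreateUser has already run. */
    static boolean testFindUser() {
        return USERS.contains("alice");
    }

    public static void main(String[] args) {
        // Normal order: both tests pass.
        USERS.clear();
        boolean create = testCreateUser();
        boolean find = testFindUser();
        System.out.println("full suite in order: " + create + ", " + find); // true, true

        // Running only the second test (as test selection might): it fails.
        USERS.clear();
        System.out.println("testFindUser alone: " + testFindUser()); // false
    }
}
```

The same failure appears if the two tests are handed to different workers, or to different JVMs, by a parallel runner.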

Test Dependencies

Prior to our work, the best chance that developers had to detect these test dependencies was to just run the tests and hope for the best. Some tools, like Maven’s surefire plugin, allow developers to run tests in a different order on each run, which might expose a dependency when tests run out of their normal order. Sai Zhang and his colleagues also developed a tool for JUnit, DTDetector, that executes every possible combination of tests and automatically reports tests that depend on each other. Unfortunately, neither of these approaches is very scalable: even for a project like joda-time, whose tests complete in less than a minute, it would take over 46 days to run all of the possible combinations of the tests.

My advisor, Gail Kaiser, and I joined up with ElectricCloud’s Eric Melski and Mohan Dattatreya to try to tackle this problem. ElectricCloud offers a tool that automatically and safely parallelizes long and complex Makefile-based software builds, and works with companies like Qualcomm, Cisco, and Huawei. To address this particular challenge of test dependency detection, we built ElectricTest, which detects data dependencies between tests with a reasonable overhead: on average, it takes only 20 times longer than running the test suite normally (on average, over 1,000 times faster than the previous approach of running all pairs of tests and observing their outputs). ElectricTest reports all data dependencies between tests, indicating which need to be run in order. It can also provide developers with a complete stack trace showing how each dependency occurred, allowing them to determine whether it is a valid dependency or one that should be eliminated. ElectricTest can then automatically enable sound test parallelization or selection by ensuring that tests with order dependencies are always executed in the correct sequence.
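To illustrate how a list of reported dependencies could drive sound parallelization, here is a minimal sketch. It is not ElectricTest’s algorithm; the class and method names are hypothetical, and it takes a deliberately conservative approach: tests connected by any chain of dependencies land in one group that keeps its original relative order, and separate groups are free to run in parallel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: grouping tests by reported dependencies (not ElectricTest's code).
final class DependencyAwareScheduler {

    /**
     * @param testsInOrder the suite in its normal execution order
     * @param dependencies pairs {dependent, prerequisite}; both names are
     *                     assumed to appear in testsInOrder
     * @return groups of tests; each group must run sequentially in the
     *         listed order, and different groups may run in parallel
     */
    static List<List<String>> groupTests(List<String> testsInOrder,
                                         List<String[]> dependencies) {
        // Union-find over test names: initially every test is its own group.
        Map<String, String> parent = new HashMap<>();
        for (String test : testsInOrder) {
            parent.put(test, test);
        }
        // Merge the groups of every dependent test and its prerequisite.
        for (String[] dep : dependencies) {
            parent.put(find(parent, dep[0]), find(parent, dep[1]));
        }
        // Collect members per group, preserving the suite's original order.
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String test : testsInOrder) {
            groups.computeIfAbsent(find(parent, test), k -> new ArrayList<>()).add(test);
        }
        return new ArrayList<>(groups.values());
    }

    /** Follows parent links until reaching the representative of a group. */
    private static String find(Map<String, String> parent, String test) {
        while (!parent.get(test).equals(test)) {
            test = parent.get(test);
        }
        return test;
    }
}
```

For a suite A, B, C, D in which only C depends on A, groupTests returns the groups [A, C], [B], and [D]: A and C stay together in order, while B and D can be scheduled anywhere.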

I’ll be presenting ElectricTest at ESEC/FSE in September 2015 (preprint available). ElectricCloud is continuing to develop ElectricTest from our research prototype.

Test Isolation: The Price of Reliability

The speedup in test execution when using VMVM instead of traditional isolation

When starting out from scratch (with no existing tests and no dependencies), one easy approach to avoiding dependencies is to simply isolate each test. If there is no way that one test can pollute the state of another, then there will never be dependencies. One way to isolate JUnit tests is to execute each in its own JVM. However, unit tests are often fairly quick (for instance, taking only hundreds of milliseconds), while the time to create a new JVM for each test is relatively steep: 1-2 seconds. Gail and I developed VMVM (ICSE ’14 paper, GitHub, demo), a tool for efficiently isolating tests’ in-memory state without requiring a new JVM for each test. VMVM can reduce the time needed to run an isolated test suite by up to 97% (on average, 62% in our evaluation of 20 open source projects), and is available under an MIT license on GitHub.
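The effect of in-process isolation can be shown in miniature. The sketch below is not how VMVM works internally; it is a hand-written stand-in, with hypothetical names throughout, in which a reset hook restores shared in-memory state before every test so that no test can observe what an earlier one left behind.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of in-process isolation via an explicit reset hook.
final class IsolatedRunnerSketch {
    // Shared in-memory state that tests could pollute.
    static final List<String> SHARED_LOG = new ArrayList<>();

    /** Runs each test after calling reset, and records whether it passed. */
    static Map<String, Boolean> runIsolated(Map<String, BooleanSupplier> tests,
                                            Runnable reset) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> entry : tests.entrySet()) {
            reset.run(); // restore a clean state instead of starting a new JVM
            results.put(entry.getKey(), entry.getValue().getAsBoolean());
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, BooleanSupplier> tests = new LinkedHashMap<>();
        tests.put("testAppend", () -> {
            SHARED_LOG.add("x");
            return SHARED_LOG.size() == 1;
        });
        // Without the reset, this test would fail whenever it runs after testAppend.
        tests.put("testStartsEmpty", () -> SHARED_LOG.isEmpty());

        System.out.println(runIsolated(tests, SHARED_LOG::clear));
        // prints {testAppend=true, testStartsEmpty=true}
    }
}
```

A manual reset like this only covers the state its author remembers to clear, which is why an automated approach is attractive for real test suites.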