Aggressive Database Optimisation

While my Enterprise Perl cartoon may seem like a joke, it's not. It's a sad fact that for larger codebases, tests can take a long, long time to run. The test suite I used on the BBC PIPs project took an hour and twenty minutes to run when I left that team. The one I use on the BBC Dynamite project takes just over an hour to run. Adam Kennedy, on the Enterprise Perl post, reported his tests can take a couple of hours to run.

On my Veure project, I've been very, very aggressive about tests. I just finished another optimisation and the tests now take just over 30 seconds to run. Out of curiosity, I disabled most optimisations and the tests took almost 6 minutes. In other words, my tests are now over an order of magnitude faster than if I had written them the way most developers write tests for modules. Modules really do foster a different testing strategy than applications.

I primarily manage to gain my optimisations by:

Rebuilding changed tables rather than rebuilding the entire database

Using a small but valid subset of data (rebuilding tables with thousands of rows is slow)

Running almost all tests in a single process
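
The first point, rebuilding only the tables a test touched, can be sketched roughly as follows. This is only an illustration, not the code used on Veure: the in-memory hashes stand in for a real database, and the names `%pristine`, `insert_row` and `reset_database` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the small, valid subset of data: table => rows.
my %pristine = (
    users  => [ { id => 1, name => 'alice' } ],
    orders => [ { id => 1, user_id => 1 } ],
    items  => [ { id => 1, sku => 'A-1' } ],
);

my %db;              # the "live" tables the tests write to
my %dirty;           # tables changed since the last reset
my $rebuilds = 0;    # counts rebuilds so the saving is visible

# Copy one table from the pristine subset back into the live database.
sub rebuild_table {
    my ($table) = @_;
    $db{$table} = [ map { +{ %$_ } } @{ $pristine{$table} } ];
    $rebuilds++;
}

# All writes go through here so the changed table gets recorded.
sub insert_row {
    my ( $table, $row ) = @_;
    push @{ $db{$table} }, $row;
    $dirty{$table} = 1;
}

# Called between tests: restore only the tables that were touched.
sub reset_database {
    rebuild_table($_) for sort keys %dirty;
    %dirty = ();
}

rebuild_table($_) for sort keys %pristine;    # one full build at startup
$rebuilds = 0;

insert_row( users => { id => 2, name => 'bob' } );
reset_database();

printf "rebuilt %d of %d tables\n", $rebuilds, scalar keys %pristine;
```

With a real database the bookkeeping would have to live wherever writes happen (triggers, or a wrapper around the database handle), which is an assumption this sketch glosses over.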

The danger is that a valid subset of data is tough to achieve. What's valid? Part of it is simply knowing your system very well, but it could mean that there are use cases you'll miss. It's a tradeoff, but given that we already know that testing can't find all bugs, it's OK to say "I know I'll miss some things with a subset of data". To handle this, when you find something you missed, simply add it back to your subset. Alternatively, create a "fixture" that specific tests can load, either in the Test::Class "setup" phase (in other words, before every test method for a subset of test classes) or just at the start of the test methods which need it.

I would love to see "tags" available on Test::Class tests, so that a test method could be marked with the fixtures it needs.

Adrian Howard has long wanted to add tags or something similar to Test::Class. Perhaps I can use this use case to bug him :)
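
Until something like that exists, the idea can be approximated by hand. The following is a minimal sketch in plain Perl and is not Test::Class API: the `%fixture` and `%tests` tables, the tag names and the test names are all hypothetical.

```perl
use strict;
use warnings;

my %db;    # hypothetical stand-in for the small test database

# Hypothetical fixtures: tag name => code that loads the extra data.
my %fixture = (
    big_orders   => sub { $db{orders} = [ 1 .. 500 ] },
    banned_users => sub { $db{banned} = ['mallory'] },
);

# Test name => [ tags, test body ]. Each body returns true on success.
my %tests = (
    test_bulk_discount => [ ['big_orders'],   sub { @{ $db{orders} || [] } == 500 } ],
    test_login_refused => [ ['banned_users'], sub { ( $db{banned} || [] )->[0] eq 'mallory' } ],
    test_homepage      => [ [],               sub { !keys %db } ],
);

my %result;
for my $name ( sort keys %tests ) {
    my ( $tags, $body ) = @{ $tests{$name} };
    %db = ();                       # start from the base subset
    $fixture{$_}->() for @$tags;    # load only what this test asked for
    $result{$name} = $body->() ? 'ok' : 'not ok';
    printf "%-20s %s\n", $name, $result{$name};
}
```

The point of the tag table is that untagged tests pay nothing for fixtures they never use.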

If you're lucky enough to be starting a new, large-scale project for work, pay attention to optimising your test suite up front. At my current rate, my tests might take 15 minutes to run if they start approaching the size of the PIPs test suite. What that means, in a nutshell, is that I save an hour per test suite run. If a team of six developers each runs the full test suite once a day, that can be an extra developer's worth of time per day. It's probably worth more. That's because while it's six hours of savings directly, indirectly it's much more, as the developers are forced to multi-task and possibly bounce back and forth between different tasks they're not familiar with.
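
For anyone who wants to plug in their own numbers, here is the same back-of-the-envelope arithmetic as a small script. The figures are the ones quoted in this post (80 minutes for PIPs, a projected 15 minutes, six developers, one full run each per day); the variable names are just for illustration. Note that the prose rounds 65 minutes down to "an hour".

```perl
use strict;
use warnings;

my $slow_suite_minutes = 80;    # PIPs: an hour and twenty minutes
my $fast_suite_minutes = 15;    # projected suite of a similar size
my $developers         = 6;
my $runs_per_day       = 1;     # full runs per developer per day

my $saved_per_run = $slow_suite_minutes - $fast_suite_minutes;
my $saved_per_day = $saved_per_run * $developers * $runs_per_day;

printf "saved per run: %d minutes\n", $saved_per_run;
printf "saved per day: %.1f hours\n", $saved_per_day / 60;
```

This counts only the direct waiting time, not the cost of context switching mentioned above.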

Now let's take it one step further: you switch to using Johan Lindström's Devel::CoverX::Covered. Make a change to some code? Rather than running the full test suite, you could use that module to only run tests which exercise the code you've just changed. It's a bit more work to set up, but once you've managed proper tool support, you can run a useful subset of tests, have them run quicker, feel more confident that they actually exercise the code in question, and only run the full test suite before merging (or going on lunch, going home, etc.).
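
The selection step itself is simple once you have a mapping from source files to the tests that exercise them. The sketch below does not use Devel::CoverX::Covered's own interface; it assumes such a mapping has already been extracted into a hash, and the file names and the `tests_for` helper are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical coverage map: source file => test files that exercise it.
# In practice this would come from coverage data, not be typed by hand.
my %covered_by = (
    'lib/Veure/Character.pm' => [ 't/character.t', 't/combat.t' ],
    'lib/Veure/Inventory.pm' => [ 't/inventory.t' ],
    'lib/Veure/Combat.pm'    => [ 't/combat.t' ],
);

# Return the sorted, de-duplicated tests covering any of the changed files.
sub tests_for {
    my @changed = @_;
    my %wanted;
    for my $file (@changed) {
        $wanted{$_} = 1 for @{ $covered_by{$file} || [] };
    }
    return sort keys %wanted;
}

my @to_run = tests_for( 'lib/Veure/Character.pm', 'lib/Veure/Combat.pm' );
print "prove @to_run\n";
```

A changed file with no entry in the map selects no tests at all, which is a good reason to keep running the full suite before merging.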

This is important because you want your developers writing code, not fencing on the office chairs, waiting for the test suite to finish. And let's face it, many developers fight like mad to avoid any non-development tasks, so while those tests are running, you're paying them to do nothing. Not good for you. Not good for their morale.

6 Comments

Is there a Perl equivalent to Ruby's autotest? That is, you start it up, it watches your files (code and test files), and when one changes it runs only the test file that changed or which corresponds to the source file; if that test passes, then it runs all tests.

@Shalon: you could use Dave Rolsky's File::ChangeNotify in a daemon to see which files have changed and hook into that. I seem to recall that the Perl Hacks book has an example of how to do this, but I don't have my copy handy (disclaimer: I'm one of the authors).

@pedro: I've been off work for a few days with a pretty serious ear infection and now I have one and a half weeks of holiday, so I can't pull in work stats.

However, on Veure, I just hit 400 tests at 97.1% coverage. They currently take 17 wallclock seconds to run.

About Ovid

Have Perl; Will Travel. Freelance Perl/Testing/Agile consultant.
Photo by http://www.circle23.com/. Warning: that site is not safe for work. The photographer is a good friend of mine, though, and it's appropriate to credit his work.