Stopping the Bad Test Death Spiral

The "SCUMmy Cycle" is all-too-common. A team with legacy (untested) code tries TDD hoping they will be able to continue making improvements. First efforts result in integration tests, perhaps because the code is tightly coupled and not cohesive. The team intends to someday replace them with proper unit tests. A team lacking essential understanding of the qualities of a good unit test will write integration tests unwittingly.

Months or years later the tests are abandoned, with a significant investment in their construction and maintenance having gone to waste. How does this happen?

Here's how the cycle generally plays out:

Only integration tests are written. One common cause: business logic is intertwined with UI or database code, perhaps as a reflection of examples found in framework and library tutorials.

More tests are added, until running them is slow/painful. Fifty to 100ms to interact with a database doesn't seem bad. But multiply that by a few hundred or thousand tests, and a even small test suite execution takes several minutes.

Tests are run less often because developers can't afford to run them. Developers will resist running ten-minute test suites more than a few times a day. Less-frequently-run tests are much harder to resolve when they fail. Large tests tend to be fragile and fail intermittently. They have runtime dependencies on external elements that are not controlled by the tests, and perhaps dependencies on side-effects other tests.

Tests are disabled because they are unreliable, obsolete from lack of maintenance, or simply too slow to tolerate.

Bugs become commonplace just as they were before the team started doing automated testing. Disabling too many tests lowers coverage and the remaining tests become ineffective.

Value of automated testing is questioned--"we're no better off than before!" And yet the team still wastes time writing and (sometimes) running tests.

Team quits testing in disgust, or managers mandate a stop to testing. The experiment is deemed an expensive failure. Teams are now free to return to the good old days of rapid coding and expensive manual testing. As W. Edwards Deming said, "Let's make toast the American way: I'll burn, and you scrape."

A sad progression, and it's real. Both of us (Tim and Jeff) have experiences confirming Rod and Ben's SCUMmy Test Progression. At each step along the progression it becomes harder to salvage the testing effort. Plenty of teams have started rough, but have recovered before reaching the bottom of the progression.

One possible lesson: it's cheaper to have no tests than to have bad tests. A better lesson: life is too short to settle for crummy tests.

The flip side of this card lists some therapeutic strategies for each downward step:

Only integration tests are written -> Learn unit testing. This also ties in with a better understanding of (and adherence to) the SRP and LoD. Consider hiring a short term coach to teach healthy habits in the team, or invest in better reading materials and the time to absorb the material. Attend a training session.

Overall suite time slows -> Break into "slow/fast" suites. Establish a time limit for the fast suite, and strive to keep the fast suite large and fast. Thousands of unit tests can easily run in under 10 seconds. Consider a tool like Infinitest to help keep tests running fast (but note that everything works better in a system that exhibits low coupling).

Tests are run less often -> Report test timeouts as build failures. The measures you institute will be arbitrary, but the key focus is on continually monitoring the health of your test suite. If the suite slows dramatically, developers will soon skimp on testing.

Tests are disabled -> Monitor coverage. New functionality should have coverage in the mid-to-high-90% range, and the rest of the system should exhibit stable or increasing coverage. System changes resulting in reductions in coverage should be rejected. Integration tests provide broad coverage, but you should either replace these with unit tests or elevate them to acceptance tests. You should otherwise delete disabled tests.

Bugs become commonplace -> Always write tests to "cover" a bug. These tests should always be written first, a la TDD. A defect is evidence of inadequate test coverage. Make sure you always track defects and understand the root cause of each and every one! Insist that these tests be fast tests.

Value of automated testing is questioned -> Commit to TDD, Acceptance Testing, Refactoring. Committing to TDD means learning how to do it properly--it is of low or negative value otherwise! Also note that many "regressions" are rooted in code duplication. Refactoring to eliminate duplication is critical for quality improvement. It is reasonable that a quality crisis causes a reduction in new features production.

Team quits testing in disgust -> Don't wait until it's too late! If a team's gotten to this point of admitting defeat, it's often too late--management won't normally tolerate a second attempt at what they think is the same thing.

In the adoption of unit testing, a few training sessions and a little time with a good coach can make all the difference in the world.

If you must self-coach, then ensure that team members don't view TDD as simply "management forcing programmers to test their code." Ensure that programmers understand the significant design and documentation benefits that TDD can deliver. Ensure they understand the scalability advantages of fast automated tests over manual testing.

A team that understands TDD and strives to attain the benefits it offers will avoid the Bad Test Death Spiral.

3 comments:

Hi Tim,Having been working at TDD for many years, the big breakthrough I've had recently is using the repository pattern for data access rather than active record. I'm sure this is old news to you. But for me, this design change has been a major plus both for testability, and the design in general.CheersMat

ActiveRecord has had a big surge in popularity (Django, Rails, NHibernate, etc) and it is great for getting something started. In some cases it is good enough for some pretty big web pages and apps. Still, I keep finding that the repository pattern works better.

I do a lot of work in legacy code. Often the legacy code uses ActiveRecord heavily, and often it is one of the biggest testability problems.

My current team is starting the migration out of active record.

Maybe we need to create a card on architectural patterns that are known to frustrate testing efforts....