public business secrets

Post navigation

A blind spot of Continuous Integration

In the early days of April 2008, we updated our hudson continuous integration (CI) server to a new version. This was no unusual action, as there was a new version every day back then, bringing new features in a rapid rate. What was unusual after the upgrade was that one of the surveilled projects failed to build all of a sudden.

Sudden (test) failure without a change

The build was started manually, without a code change. The project itself was inactive back then, meaning that no changes were made for months. And suddenly, a unit test broke. The test was in the project for two whole years without ever going off. What happened?

Good unit tests

There are rules for good unit tests. A basic set are the A-TRIP rules formulated in the excellent beginner’s book “Pragmatic Unit Testing” by Andy Hunt and Dave Thomas. The failed test clearly disobeyed the “repeatable” rule (the R in A-TRIP): It didn’t result in the same result as before while the code under test didn’t change.

Write repeatable tests or your CI will be blind (partially)

The cause of the failed test was putting the clocks back because the daylight-saving time part of the year began. The unit test secured some date calculations by taking the current date and comparing it to future and past dates that got calculated. The calculation went wrong when the daylight-saving mode changes in the calculated period, which was a bug. Repeatability of the unit test was lost when “the current date” entered the code – whether on the unit test or productive code side.

Two years of blindness

How could this bug survive two years without being noticed? The project was under CI surveillance since the beginning, the unit test to detect the bug was present along with the bugged code. The answer is: We never programmed for this project around the weeks of the year when the clock is adjusted and the bug occurs. This was a coincidence influenced by the customer’s schedule. So every time CI (or we) ran the unit test, it passed. Until that day right after putting the clocks back.

How to avoid this blind spot

There are two things you can do to avoid this scenario:

Always inject a fixed “the current date” into your code when dealing with date calculations. Only use absolute dates in your unit tests. Time isn’t a healer for your tests, it’s a beast to be tamed.

Set up a nightly build for your project that runs once a day even when no changes have been made. It would have caught this bug one and a half year earlier.

To sum it up:

CI only spots bugs when they move (aka the code is changed). Nightly builds provide a (fuzzy) security layer against non-repeatable unit tests. And unit tests with flaws provide only delusory security.

Additional background information

After fully understanding the circumstances, we were curious why the customer didn’t notice the bug and asked him about it. The answer was delightful: “Our computers don’t adjust their clocks. Daylight-saving time only causes trouble.” What a wise decision!