Wednesday, June 4, 2008

Dirty Glass

There's an anti-crime theory called the "Broken Windows" approach. The idea is that when something small is wrong, it should be fixed, or it will lead to larger things going wrong. (The example that gives the theory its name is that broken windows in a building should be fixed, or they will lead to people breaking into the building, then squatters, and eventually larger crimes.)

This theory is increasingly being applied to software development, particularly by proponents of continuous integration. If you skip the small stuff, it rapidly becomes big stuff. Ignore a small bug and all of a sudden you'll have a lot of bugs, leading to the "Whoa! That's an impossible bug list!" scenario. Ignore a failed build and eventually you won't be able to reliably build at all. This concept is closely related to "technical debt": being a little behind on the "nice to haves" (refactoring, test coverage, etc.) isn't too noticeable, but being a little behind often leads to being further behind, and then big things start to fall apart.

I think it's a great ideal to say that we won't tolerate "broken windows". If a build fails, the top priority is to fix it. If a bug occurs, the top priority is to fix it, rather than to finish whatever new feature is in progress.

But...

At some level a zero tolerance policy is unrealistic. Sometimes a build will fail, and the whole team will be in the middle of a major client issue. You just can't look at the build right then. Sometimes you can't do a code review because too many team members are on vacation and you can't get enough fresh eyes on it - so it has to wait a week. And that's okay, as long as you recognize it, account for it, and build in time to catch up.

So, to stretch our metaphor to the limit, what are the software broken windows, and what is just a bit of dirty glass? When is something really a big problem, and when is it just something to keep an eye on so it doesn't get worse?

Broken Windows:

These are the things that shouldn't be allowed to continue for longer than some very small period of time.

Failing builds

Tests that don't even run

Checking in without code review

Checking in without running tests

Dirty Glass:

These are the things that will bite you in the end, but they're not emergencies. As long as you're regularly going back and cleaning them up, it's okay to let these slip for a little while.

Build warnings

A few failing tests in edge cases

Checking in without pairing (We're an XP shop, so pairing is more desirable. But as long as we're code reviewing, we're doing okay.)
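To make the split concrete, here is a minimal sketch of how a team might encode the two lists above as a triage rule. The categories come straight from the lists; everything else is an assumption for illustration: the key names, the `Issue` and `triage` names, the action strings, and especially the time limits (one day for broken windows, fourteen days for dirty glass), which each team would pick for itself.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BROKEN_WINDOW = "broken window"
    DIRTY_GLASS = "dirty glass"


# Categories taken from the two lists above; the key names are hypothetical.
SEVERITY_BY_KIND = {
    "failing_build": Severity.BROKEN_WINDOW,
    "tests_do_not_run": Severity.BROKEN_WINDOW,
    "checkin_without_review": Severity.BROKEN_WINDOW,
    "checkin_without_tests": Severity.BROKEN_WINDOW,
    "build_warnings": Severity.DIRTY_GLASS,
    "failing_edge_case_tests": Severity.DIRTY_GLASS,
    "checkin_without_pairing": Severity.DIRTY_GLASS,
}

# Assumed limits, in days. The post only says "some very small period of time"
# and "a little while", so these numbers are placeholders.
MAX_AGE_DAYS = {
    Severity.BROKEN_WINDOW: 1,
    Severity.DIRTY_GLASS: 14,
}


@dataclass
class Issue:
    kind: str       # one of the keys in SEVERITY_BY_KIND
    age_days: int   # how long the problem has been open


def triage(issue: Issue) -> str:
    """Return a suggested action based on severity and how long the issue has lingered."""
    severity = SEVERITY_BY_KIND[issue.kind]
    overdue = issue.age_days > MAX_AGE_DAYS[severity]
    if severity is Severity.BROKEN_WINDOW:
        return "stop and fix" if overdue else "fix next"
    return "clean up now" if overdue else "keep an eye on it"


if __name__ == "__main__":
    for issue in (Issue("failing_build", 0), Issue("build_warnings", 30)):
        print(issue.kind, "->", triage(issue))
```

The point of the sketch is that neither category is ignored: the only difference is how long something is allowed to sit before it gets escalated.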

I suspect this will be a bit controversial, but I'd rather have a way to handle our imperfections than to simply state that we're just going to have to be perfect. An easy summary: