Wednesday, August 08, 2007

I am constantly producing estimates. How long will this project take? How long will that feature take? How long will it take us to test things? And my favorite: How long will it take to fix the bugs we haven't found?

Other than the last, nonsense question, these are legitimate issues to address in the process of software development. Unfortunately, our industry (and some others) tends to be very simplistic in their estimates.

There are many ways to go wrong in estimating. Perhaps the the most common is attempting to pretend that people are 'resources' and that everyone is interchangeable. But that's a topic for another day. Today, I want to look at a way of estimating that is simple enough to use yet isn't one dimensional.

One of the biggest problems with estimating is trying to incorporate some notion of risk in the estimate. Is your estimate iron-clad, where you are 99.99999% sure you'll meet the deadline (and consequently there's a chance you can beat it), or will you be working flat-out trying to meet the date with only a limited chance of making it? This is one of the most common problems in projects: one party gives an aggressive estimate and another party assumes it's conservative. We need a better way to communicate the likelihood of completing a task in a given amount of time.

To capture this idea, we need to represent the work with more than a single number, and we need to incorporate risk. Happily, this lends itself to a two-dimensional graph, with effort (time) as the x-axis and likelihood of completion (what I'll call "confidence") as the y-axis, like this:

The line represents your confidence, at the time you make the estimate, that you will solve the problem with the effort specified. Depending on the scope of the work, the effort line might be measured in hours, days, weeks, months, or even years. The basic idea is that it will take some period of time to understand the unknowns of a problem, which is the early, low-confidence part of the curve. Once the basic problem is understood work proceeds apace, and confidence goes up rapidly. The tail of the curve represents the fact that things might go well and we could get done earlier than expected. Let's take an example.

Assume you are asked to implement a new kind of cache for your system. It involves a "least recently used" (LRU) strategy along with a maximum age for cache entries. Instead of saying "it'll take 10 man-days of effort", as if it can not take only 9 or would never take 11, you can graph it, like this:

This says that you think there's about a 2% chance of completing the work in 1 man-day, a 20% chance in four man-days, a 90% chance you'll get done in 8 days, etc. You are not saying you'll be done in a specific number of man-days. Instead, you provide an estimate of the likelihood you'll complete the work in a particular number of man-days. This is useful because it helps provide some insight into the overall risk you see in the work.

This is not to say that all problems follow this curve. Some may in fact be so difficult that they never approach 100%. For example, if you thought there was only a 50% chance you would actually be able to come up with a solution to a problem, you might have a graph like this:

Similarly, if you think the problem is well-understood, your graph might look like this:

At first this might seem odd. But if you really understand the problem you'll be able to estimate the required effort very accurately. This means there's very little chance of getting done faster than your estimate, but it also means there's very little chance you'll run long.

Perhaps this is a well-known thing in project management circles, but I've never seen anything like it in over twenty years in industry. I plan to use these estimation curves to better communicate my estimates to others.