Thursday, 1 May 2008

Simulation-based estimating

I'm sat on a train at the moment, from Norwich to Ipswich. This train will take 39 minutes to reach its destination. It'll then take me 7.5 minutes to walk to the office, 1.3 minutes to make a coffee. I'll be ready to work at 08:17:48. Obviously, I'll stop to take a sip of coffee at 08:17:57, 08:18:20 & 08:18:50, and a large, cup-emptying slurp at 08:19:20. Other than that though, it'll be working straight through 'til 17:30:00. I will not get peckish and stop to raid the snack machine. There will be no interruptions. Nobody will, at short notice, book a meeting or pitch up at my desk for an impromptu chat about the current status of Project X or Initiative Y. My fiancé will not send me an SMS asking me to pick up a bottle of wine on the way home, and absolutely no recruitment consultants will call. All my days are like this. I never have a bad day, never fail to get my head around the task at hand first time, never struggle to think where to start. All my tasks are predictable, have well-defined goals and require no assistance from anybody who might be having a bad day. Of course, in this environment, I deliver what I said I would, when I said I would, with 100% certainty, every time.

Of course, I'm dreaming.

The reality is that nothing about the average day of the average employee is in any way precise. My train might take 39 minutes, or it might take 41, or if I'm lucky and there's a northerly wind, it might take 38 minutes and 59 seconds. My walk will be marred by traffic lights, and I (shock) will have bad days. If I were to use the fingers of every occupant of this rush hour train to count the number of times I've been asked by a project manager over the last 10 years "How long will this project take? How much will it cost?", I'd probably have no more than two fingers left to type the remainder of this post.

The trouble is, you see, I don't know.

Don't get me wrong, I can estimate things as well as the next guy - it's just counting widgets at the end of the day, but I know I'll be wrong. Of course, project managers are reasonable people, they say things like "Well, we have to be 100% confident in this estimate, so we'll add 10% contingency to your total, ok?" No. Not OK. The issue here isn't that I under-estimate routinely (although that might well be an issue), it's that adding ten percent will not make any estimate 100% confident, and nor will adding thirty, one hundred or even three hundred percent, regardless of how good an estimator I am.

Anyone who remembers the whale and bowl of petunias from Douglas Adams' The Hitchhiker's Guide to the Galaxy will remember that according to Adams, the sudden appearance of flora and fauna in deep space is unlikely, but nonetheless has a real, finite probability. Perhaps somewhat shockingly, this sort of thing is a reasonably well accepted side effect of quantum mechanics; this could happen at any time in any place. You could be just tucking into a medium (sorry, Grande) cappuccino at your local Starbucks when something, anything, appears. House, car, cat, dog, frog, you name it. Now, just to stop you panicking over your caffeinated beverages, the chances of this happening are infinitesimally small, but it serves to illustrate the point that you never know what might get in the way of your project.

So, finally, to the point of my post. I think it's time estimating grew up a bit, and stopped pretending that it knew all the answers. What's needed is a way of estimating the cost and duration of a project that softens the boundaries a bit, and gives a truer picture not of how long a project will take, but how long it might take. Imagine for a second that a programme manager knew that there was a 66% chance that the project would be in by Christmas, and an 88% chance it'd be in by June. He'd probably be much more likely to give the board a sensible message than if all he had was a vague message from you that it could conceivably be done before the turkey gets cold. Equally, those of us who sometimes work on fixed price contracts would have a much better way of assessing the risk of a given project, allowing us to reliably turn a profit while offering a good deal for the client.

It strikes me that there are two approaches that could be taken to this:

1. Use probability theory to attach a probability distribution to every input parameter in the estimating model, and then carry these through the model and give a final probability distribution at the end. Complex, nasty, not much fun unless you have a PhD in applied statistics.

2. Make the input parameters to the model fuzzy (more on what I mean by this later) and then run the estimating model over and over again, collect all the answers and build a histogram (chart) out of the results.

I've thought long and hard about this over the last few years, and frankly the former option is beyond the capability of my AS-level statistics. Even if it weren't, though, I'd be recommending the latter, and here's why: it's intuitive. What you're doing is running the project over and over again, and seeing how long it takes. Project leaders can be old and wise without actually being old; they've already done this project 1000 times (albeit in the mind of a machine), so they've got a good idea how long it might take.
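To make the "run it 1000 times" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three tasks, their hour ranges, and the choice of a triangular distribution as the "fuzzy" input are mine, not part of any real estimating model.

```python
import random
from collections import Counter

# Hypothetical tasks: (optimistic, most likely, pessimistic) effort in hours.
TASKS = {
    "design": (8, 12, 30),
    "build": (20, 35, 90),
    "test": (10, 15, 60),
}

def simulate_project(rng):
    """Run the project once: draw a fuzzy duration for each task and add them up."""
    return sum(rng.triangular(low, high, mode) for low, mode, high in TASKS.values())

def run_simulations(runs=1000, seed=1):
    """Run the project many times and return the sorted total hours."""
    rng = random.Random(seed)
    return sorted(simulate_project(rng) for _ in range(runs))

def chance_of_finishing_within(results, hours):
    """Fraction of simulated runs that needed no more than the given hours."""
    return sum(1 for r in results if r <= hours) / len(results)

def histogram(results, bucket_hours=10):
    """Count runs per bucket, keyed by the bucket's starting hour."""
    counts = Counter(int(r // bucket_hours) * bucket_hours for r in results)
    return dict(sorted(counts.items()))

if __name__ == "__main__":
    results = run_simulations()
    for start, count in histogram(results).items():
        print(f"{start:4d}h+ {'#' * (count // 5)}")
    for budget in (80, 100, 120):
        chance = chance_of_finishing_within(results, budget)
        print(f"chance of finishing within {budget}h: {chance:.0%}")
```

The histogram is the chart for the project leader; the "chance of finishing within" lines are the sort of statement the programme manager could take to the board.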

Using this approach, you could build absolutely anything into your model. Productive day averages 6 hours, but varies between 2 and 20 hours on occasion? Sorted. 0.04% chance of aliens abducting your senior developer? No problem, at least for the project. The options are endless.
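As a rough sketch of how those two examples might be wired into a day-by-day simulation: the 300-hour project size, the 10-day cost of an abduction, and the assumption that the 0.04% applies per working day are all invented here. The scaled beta distribution is just one convenient shape that stays between 2 and 20 hours while averaging 6.

```python
import random

WORK_HOURS = 300           # hypothetical total effort for the project
ABDUCTION_CHANCE = 0.0004  # 0.04%, assumed here to apply per working day
ABDUCTION_DELAY_DAYS = 10  # hypothetical: days lost finding a replacement

def productive_hours(rng):
    """Productive hours in one day, between 2 and 20, averaging 6.

    Beta(2, 7) has mean 2/9, so 2 + 18 * (2/9) gives a mean of 6.
    """
    return 2 + 18 * rng.betavariate(2, 7)

def simulate_days(rng):
    """Run the project once and return the number of working days it took."""
    remaining = WORK_HOURS
    days = 0
    while remaining > 0:
        days += 1
        if rng.random() < ABDUCTION_CHANCE:
            # No work gets done today, and the team loses some more days.
            days += ABDUCTION_DELAY_DAYS
            continue
        remaining -= productive_hours(rng)
    return days

if __name__ == "__main__":
    rng = random.Random(42)
    runs = sorted(simulate_days(rng) for _ in range(1000))
    print("median days:", runs[len(runs) // 2])
    print("90% of runs finished within", runs[int(len(runs) * 0.9)], "days")
```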

Nicer still, because this approach uses simulation rather than algebra, we don't need to be too anal about how the parameters are set. If it's easier to say "95% of the time it'll take 5 hours, but 5% of the time it'll take a random value between 8 and 10 hours", then that's fine. We don't have to put together some strange combination of probability distributions that models this; we just run with it. Equally, if you have a set of example data to use as a basis (well, this system has N classes, and it took X long), then these values could be used directly, without having to build a complex model from them first. That said, if the guys doing the estimating do understand probability, then they can use a Poisson distribution to determine how many use cases will be delivered by a week next Friday if they so desire.
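The three kinds of input mentioned above could each be a small sampling function, sketched here in Python. The list of past observations is made-up data, and since Python's standard `random` module has no Poisson sampler, the last function uses Knuth's multiplication method as a stand-in.

```python
import math
import random

def mostly_five_hours(rng):
    """95% of the time 5 hours; 5% of the time a random value between 8 and 10."""
    if rng.random() < 0.95:
        return 5.0
    return rng.uniform(8, 10)

# Hypothetical history: hours taken per class on earlier systems.
PAST_HOURS_PER_CLASS = [3.5, 4.0, 4.0, 5.5, 6.0, 9.0]

def hours_for_class(rng):
    """Use past observations directly by drawing one of them at random."""
    return rng.choice(PAST_HOURS_PER_CLASS)

def poisson(rng, mean):
    """Draw a Poisson-distributed count (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

if __name__ == "__main__":
    rng = random.Random(5)
    print("one task:", mostly_five_hours(rng), "hours")
    print("one class:", hours_for_class(rng), "hours")
    print("use cases delivered this week:", poisson(rng, 4))
```

None of these needs to be combined algebraically; the simulation just calls whichever one applies each time it needs a number.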

Equally, because we're actually running the project, we can apply all sorts of interesting things to the model that would be impossible using a purely statistics-driven approach. For example, in an agile project, we can simulate the team size and length of sprints against the simulated sizes of the products to determine the optimum length of a sprint for the project. We could simulate the quality of deliverables based on whether code reviews are expected, and use this to estimate the impact on the length of the test cycle. Obviously, this stuff is a bit trickier to achieve than answering the usual how long/how much question, but it's always good to know there's scope to develop things further in the future.
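A very crude sketch of the sprint-length idea follows. The rules are invented purely for illustration: a team of three, one day of overhead per sprint, 40 work items sized in person-days, and the simplification that an item which doesn't fit in what's left of a sprint waits for the next one, wasting the leftover capacity. A real model would need much better rules than these.

```python
import random

TEAM_SIZE = 3      # hypothetical team size
OVERHEAD_DAYS = 1  # hypothetical planning and review time per sprint
ITEM_COUNT = 40    # hypothetical number of work items in the project

def item_sizes(rng):
    """Fuzzy item sizes in person-days (small enough to fit any candidate sprint)."""
    return [rng.triangular(1, 10, 3) for _ in range(ITEM_COUNT)]

def project_days(sprint_days, sizes):
    """Calendar days needed when items are worked in order, in whole sprints."""
    capacity = TEAM_SIZE * (sprint_days - OVERHEAD_DAYS)
    sprints, left = 1, capacity
    for size in sizes:
        if size > left:
            # Item does not fit: start a new sprint, leftover capacity is lost.
            sprints += 1
            left = capacity
        left -= size
    return sprints * sprint_days

def average_days(sprint_days, runs=500, seed=3):
    """Average project length over many simulated runs for one sprint length."""
    rng = random.Random(seed)
    total = sum(project_days(sprint_days, item_sizes(rng)) for _ in range(runs))
    return total / runs

if __name__ == "__main__":
    for sprint_days in (5, 10, 15, 20):
        print(f"{sprint_days:2d}-day sprints: {average_days(sprint_days):.1f} days on average")
```

Comparing the averages (or, better, the full histograms) for each sprint length is the simulated equivalent of trying each one out on the real project.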

The architecture underlying this kind of estimating machine is pretty trivial. I'd say, with 100% certainty, you could deliver the underlying engine in 27 hours, 5 minutes. Whales and petunias notwithstanding.