Sunday, June 5, 2011

T-Shirt Sized Estimates

How long would it take you to write a Hello World program in a language you've never used? Minutes, probably. But how many minutes? Three? Nine? Asking for such precise estimates just doesn't feel right, does it? But if you pose the question as "Would it take you less than 15 minutes?", it suddenly seems entirely reasonable.

This, in a nutshell, is why T-shirt sized estimates work. There are other reasons, as well, but before we dive into them let's review what T-shirt sized estimates (TSE) are.

Agile development methodology has made the concept of TSE rather popular in recent years. The idea is to give tasks rough estimates in the form of T-shirt sizes: small, medium, large, extra-large and so on; you get the point. But what does "small" mean, really? This is pretty open to interpretation, and each team may define the sizes in its own terms. Some associate a duration range with each size, e.g. S is less than one hour and M is between one and three hours. Some associate T-shirt sizes with points that are later added up into an aggregate estimate. But in essence the idea is simple: instead of arguing over how many minutes or hours a task is going to take, let's just agree that it's small and move on.
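To make the two schemes above concrete, here is a minimal sketch in Python. Only the S and M duration ranges come from the example in the text; the L and XL ranges, all of the point values, and the names `Size`, `DURATION_HOURS`, `POINTS`, `total_points` and `total_hours_range` are assumptions made up for illustration. Every team would pick its own.

```python
from enum import Enum


class Size(Enum):
    S = "S"
    M = "M"
    L = "L"
    XL = "XL"


# Duration range per size, in hours: (lower bound, upper bound).
# S and M follow the example in the text; L and XL are assumed.
DURATION_HOURS = {
    Size.S: (0, 1),
    Size.M: (1, 3),
    Size.L: (3, 8),    # assumption
    Size.XL: (8, 24),  # assumption
}

# Alternative scheme: points per size. All values are assumed.
POINTS = {Size.S: 1, Size.M: 2, Size.L: 5, Size.XL: 8}


def total_points(sizes):
    """Add up the points of all tasks into one aggregate estimate."""
    return sum(POINTS[size] for size in sizes)


def total_hours_range(sizes):
    """Add up the lower and upper duration bounds of all tasks."""
    low = sum(DURATION_HOURS[size][0] for size in sizes)
    high = sum(DURATION_HOURS[size][1] for size in sizes)
    return low, high


tasks = [Size.S, Size.M, Size.M, Size.L]
print(total_points(tasks))       # 10
print(total_hours_range(tasks))  # (5, 15)
```

Note that nobody had to argue about whether a task takes 40 or 50 minutes to arrive at these totals.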

Now there's a hidden aspect to this method. The sizes are limited and have an upper bound. No T-shirt is XXXXXXXXL. This is significant. People are terrible estimators of large efforts. We're less terrible with reasonably sized efforts. So using TSE doesn't mean we don't invest any real effort into estimating and just spit-ball "2 years!". It only means we're not fooling ourselves into thinking that giving a very specific estimate would improve our project's total estimate in any way.

In fact, overly specific estimates are detrimental to delivering on time. Estimation is a well-researched area in psychology. People aren't just bad at estimating, they're also terrible at knowing how bad their estimates really are. That's why the vast majority of people consider their own intelligence or attractiveness to be above average (which is plainly impossible). Ask people to give a 90% certainty estimate and they'll consistently give an answer that is too precise. A well-known experiment has been replicated in many classrooms. Students are asked when Queen Victoria was born, and they need to give their answer as a range of years (e.g. 1800-1900) that they consider 90% certain to contain the correct year. The goal of the experiment is to see how well people estimate their own ignorance and knowledge. Statistically, you would expect students to specify ranges so wide that only 10% of them would actually miss the correct year. But instead a much larger portion of the audience gets it wrong by selecting a range that is too narrow. We're overconfident in our knowledge and underestimate our ignorance.
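If you want to run the classroom experiment yourself, scoring it takes only a few lines. The sketch below is in Python, and the list of answers is entirely made up to show the mechanics; it is not data from any real study. The function name `hit_rate` is also just an illustrative choice.

```python
# Queen Victoria was born in 1819.
VICTORIA_BIRTH_YEAR = 1819


def hit_rate(ranges, truth):
    """Fraction of (low, high) ranges that contain the true value."""
    hits = sum(1 for low, high in ranges if low <= truth <= high)
    return hits / len(ranges)


# Hypothetical "90% certain" answers from five students (invented data).
answers = [
    (1800, 1900),
    (1830, 1850),
    (1810, 1820),
    (1840, 1860),
    (1750, 1800),
]

rate = hit_rate(answers, VICTORIA_BIRTH_YEAR)
print(f"{rate:.0%} of the ranges contain the true year; the target was 90%")
```

A well-calibrated group would land near 90%. The further below that the result falls, the more overconfident the group was.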

When you think about it, effort estimates are really a form of self-estimation (uh-oh!). People are asked how long it would take them to finish a task. That's why most estimates are overly optimistic. It's tied to a host of psychological factors: over-estimating our own capabilities and skills, wanting to please our superiors (and ourselves) with a smaller estimate, ignoring potential surprises along the way, etc.

Here's how TSE solve these problems.

First, if you force people to round their answers they're more likely to factor some uncertainty into it. If you internally estimate a task at 40 minutes, but are then asked to choose between a 30 or 60 minute estimate, you'll suddenly remind yourself that 40 minutes doesn't really take into account various overheads or possible interruptions. And it's much easier to justify an over-estimation to yourself when it's someone else who's forcing you to round it up. Rounding up on a small scale is a good thing.
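The rounding rule can be sketched in a few lines of Python. This version always rounds up to the next allowed size, which is an assumption on my part: a person choosing between 30 and 60 minutes could still round down, but the point of the paragraph above is that they usually shouldn't. The function name `round_up_to_bucket` is hypothetical.

```python
def round_up_to_bucket(minutes, buckets):
    """Return the smallest allowed estimate that is >= the gut-feeling estimate."""
    for bucket in sorted(buckets):
        if minutes <= bucket:
            return bucket
    # No bucket is big enough: the task has to be split into smaller ones.
    raise ValueError(f"{minutes} minutes exceeds the largest size; break the task down")


print(round_up_to_bucket(40, [30, 60]))  # 60
```

The 20 minutes of slack that appear out of nowhere are exactly the overheads and interruptions the 40-minute gut feeling left out.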

Second, by limiting yourself to small estimates you avoid those big vague predictions which are the root of all evil. If there's no T-shirt size for 3 months, then you have no choice but to actually break your estimate down into smaller chunks. This, in turn, forces you to invest just a bit more time into planning ahead than you might have otherwise. Suddenly you're reminded of holidays and hiring efforts, and integration, and setup time... This, of course, is tied to the first rule of project underestimation: it's not the estimates that are wrong, it's the plan that's incomplete. If you allow yourself overly specific large estimates you also introduce a mental block: you've checked the box. It's "done". You wrote down "4 months" as your estimate for that distant milestone and now you don't have to look at it again until you reach that task. How convenient. If, on the other hand, you only allow yourself to provide smaller estimates, that milestone is now sitting there taunting you, begging to be further elaborated. You may be comfortable with a large estimate that's not based on reality, but if the aggregate estimate is small because you haven't yet taken the time to drill down even a bit to day-scale, well... that's just unacceptable.

Third, it's harder to argue with TSE. Which is another way of saying it's easier to accept a TSE. When your manager sees you've created a well-thought-out plan in advance that tries to capture all those details that normal intuition misses, he's not going to haggle with you over that one-week estimate for integration time ("one week? no no... it's 4 days at most!"). That's just not going to matter. Everybody knows we're not that good at estimating at that scale, anyway, so as long as an estimate isn't egregiously wrong people just move on. The discussion is suddenly focused on the content of your work plan, instead of the price tag attached to each task. Now the challenge is "who can think of more shit that can go wrong, that you forgot to include" instead of "who thinks this feature can be developed in less time".

I believe TSE needs to be the default method for estimating projects, as long as the estimates are restricted to small sizes. Gigantt uses a variation of TSE. Our estimates aren't actually S/M/L/XL. They are currently: 1h/3h/5h/1d/3d/1w. Estimating a task is a one-click operation. This really reduces the friction of estimating (something people just don't like doing). It's always going to be the preferred, easiest way of estimating in Gigantt. In future versions we'll also add 0-duration estimates, for checklist-type tasks (e.g. milestones). We may even add custom estimates (e.g. "4 months"), because we realize not everybody shares our views on how to properly estimate projects, and alternative project management solutions offer this feature ubiquitously. But even if we do, it's certainly going to be the second choice, and we hope most users won't take advantage of this feature at all.
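As a rough illustration of how a plan built only from those six sizes adds up, here is a small Python sketch. It is not Gigantt's actual code. The conversion of days and weeks into hours (8 working hours per day, 5 working days per week), the sample plan, and the names `BUCKET_HOURS` and `plan_total_hours` are all assumptions for the sake of the example.

```python
HOURS_PER_DAY = 8  # assumption: one working day
DAYS_PER_WEEK = 5  # assumption: one working week

# The six sizes listed above, converted to working hours.
BUCKET_HOURS = {
    "1h": 1,
    "3h": 3,
    "5h": 5,
    "1d": HOURS_PER_DAY,
    "3d": 3 * HOURS_PER_DAY,
    "1w": DAYS_PER_WEEK * HOURS_PER_DAY,
}


def plan_total_hours(plan):
    """Sum the estimates of a plan given as (task name, size label) pairs."""
    unknown = [label for _, label in plan if label not in BUCKET_HOURS]
    if unknown:
        # Anything like "4 months" has no size: it must be broken down first.
        raise ValueError(f"no such size: {unknown}")
    return sum(BUCKET_HOURS[label] for _, label in plan)


# A made-up plan for illustration.
plan = [
    ("design", "1d"),
    ("implement", "3d"),
    ("integrate", "1w"),
    ("write docs", "5h"),
]

total = plan_total_hours(plan)
print(total)                  # 77
print(total / HOURS_PER_DAY)  # 9.625
```

A task labelled "4 months" is rejected rather than silently added in, which is the whole point: the big number only appears once the small ones have been written down.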