Continuing with the series, this time I want to highlight a very dangerous anti-pattern: using velocity as a performance metric. Before getting into the examples of how it applies to velocity, I want to first explain my view on metrics. I am in favour of metrics and coming up with interesting ways of displaying data (information visualization is a very interesting topic). However, the problem lies in the way that these metrics are used. There are two main types of metrics that I like to categorise as:

Diagnostics Metrics: these are informative measurements that the team uses to evaluate and improve it’s own process. The purpose of collecting them is to gain insight into where to improve, and to track whether the proposed improvements are taking effect. They are not associated to a particular individual or to how much value is being produced. They’re merely informative and should have a relatively short life-cycle. As soon as the process improves, another bottleneck will be identified and the team will propose new metrics to measure and improve that area.

Performance Metrics: these are measurements of how much value your process is delivering. These are the ones you should use to track your organisation’s performance, but they should be chosen very carefully. A good approach is to “measure up”. Value should be measured at the highest level possible, so that it doesn’t fall into one team’s (or individual’s) span of control. People tend to behave according to how they’re measured and if this metric is easy to game, it will be gamed. There should also be just a few of these metrics. An example of one such metric would be a Net Promoter Score (that measures how much your custumer is willing to recommend you to a friend) or some financial metric like Net Present Value (read Software By Numbers if this interests you). As you can see, these are very much outside of a team’s control and to be able to score high on them, they should try and do a good job (instead of gaming the numbers).

Going back to velocity, a very common mistake is to use it as a performance metric instead of diagnostics. Velocity doesn’t satisfy my criteria for a good performance measure. Quite the opposite, it’s a very easy metric to game (as mentioned in my previousposts). When approached as a performance metric, it’s common to see things like:

Comparing velocity between teams:“Why is Team A slower than Team B?” Maybe because they estimate in different scales? Maybe their iteration length is different? Maybe the team composition is different? So many factors can influence velocity that it’s only useful to compare it within the same team, and even then just to identify trends. The absolute value doesn’t mean much.

Measuring individual velocity: as highlighted by Pat, this is a VERY DANGEROUS use of velocity, and it can actually harm your process and discourage collaboration.

A push to always increase velocity: it’s common to have a lower velocity in the beginning of a project, and that it tends to increase after a number of iterations. Inspite of that, I’ve seen teams pushing themselves to improve it when they reach a natural limit (Who doesn’t want to go faster, right?). Velocity measures the capability of your team to deliver and, as such, tends to stabilise itself (if you have a stable process and the number is not being gamed). A Control Chart could help you visualise that. As noted by Deming, in a stable process, the way to improve is to change the process.

It’s important to remember that velocity is a by-product of your current reality (your team, your processes, your tools). You can only improve your process once it’s stable and you know it’s current capacity. Velocity is just a health-check number that will tell your team’s capability. It will not tell you about how much value is being delivered or how fast you’re going. You can deliver a lot of points and make trade-offs on quality which, no matter how you measure it, will impact your ability to go fast in the long run. As uncle Bob says:

“The way to go fast, is to go well”

So let’s stop using velocity to measure performance and look at it as a diagnostic metric to improve our software delivery process.

Continuing with the series on how to misuse velocity, the second anti-pattern I would like to highlight is when teams start making up points. Because the definition of velocity is so simple, it’s easy to game the metric to show what looks like apparent progress. If a team is being measured on velocity (more about this on later posts), it’s quite easy to just start increasing estimates: “If we just double all estimates, the relative sizes stay the same, but our velocity doubles!“. This is an extreme behaviour that would be quickly noticed as a discrepancy, but the same thing could happen in a smaller scale and pass unnoticed.

This problem can not only be originated from the team, but I’ve also seen Project Managers/Scrum Masters coming up with “clever” ways of making up points to count as velocity:

Deciding to split a story to count the partially finished work as complete, and track whatever is left in a separate story (splitting should be business-driven and not tracking-driven: it should only happen when you come up with simpler/incremental ways of delivering value in smaller chunks)

Counting points on technical tasks. I’ve seen a team that spent a lot of effort in an iteration to make up for accumulated technical debt, and did not have a lot of time to work on new stories. The Project Manager decided to come up with a “refactoring card” and gave it a 16 to try and demonstrate how much effort was spent on such refactoring

Counting points for in-release bug fixing. In a team, stories were deemed completed on the first iteration, but bugs started to show up in later iterations, impacting he team’s ability to deliver new functionality. Instead of allowing the decrease in velocity to demonstrate how the lack of focus on quality was impacting the team (bugs should be prevented in the first place, right?), the Project Manager decided to estimate and count points on bugs, which kept velocity apparently constant, when in fact a lot less value was being delivered

The next time you catch yourself asking “Should X count as velocity?”, stop, reflect, and ask instead “Should I worry about X happening at all?”. If you are worried about having to track or show progress on things that should be embedded parts of the process (such as activities to prevent bugs or refactoring), chances are that the problem lies somewhere else. Some of these questions might make as much sense as “Should time spent on retrospectives count as velocity?” or “Should going to the bathroom count as velocity?” :-)

I’m sure that these examples drawn from my personal experience are just a few examples of how to make up points and misuse velocity. What other similar experiences did you have in your own projects?

Dan North wrote an interesting post about the perils of estimation, questioning our approach to inceptions, release planning, and setting expectations about scope. This made me think about the implications of those factors once a project starts, and I came up with some anti-patterns on the usage of velocity to track progress. This is my first attempt at writing about them.

Before we start, it’s important to understand what velocity means. My simple definition of velocity is the total number of estimation units for the items delivered in an iteration. Estimation unit can be whatever the team chooses: ideal days, hours, pomodoros, or story points. The nature of items may vary as well: features, use cases, and user stories are common choices. Iteration is a fixed amount of time where the team will work on delivering those items. Sounds simple? Well… there’s one concept that is commonly overlooked and that’s the source of the first anti-pattern: what does delivered means?

One of the most common anti-patterns I’ve seen is not having a clear definition of done. Lean thinking tells us that nothing is really done until it’s delivering value, which in software means: code running in production and being used by real users. Although I know very few teams who can deploy code to production at the end of every iteration (some even do more than once per iteration), once a story is considered done, it could be potentially shipped, if the business decides so. There shouldn’t be a lot of extra work after that.

Another bad implication of this anti-pattern is that some teams decide to change the definition of done and count half-completed work to show progress. Some of the symptoms to help diagnose if your team is suffering from this anti-pattern are:

“It’s done, but [we need to write the acceptance test/it’s not integrated with the other system/…]”

“It’s done, but not done-done”

It takes a lot of extra work to get the story deployed to production

After finished, the story goes into the next team’s backlog

Hearing terms like “development team velocity” or “test team velocity”

Counting half-points or percentages because “if we don’t count it will look like we haven’t worked”

The solution? Remember that velocity is just a number that provides information for the team to understand and improve it’s process. Forget that you’re tracking it and focus on the entire Value Stream and on what’s really value-added to get things into production. Anything else is just waste. If it’s not done, it’s not done. Accept it, move on, and don’t overcomplicate, because it will only add noise and mask what could have been important information to the team.

In this session, Joshua Kerievsky (the founder of Industrial Logic and author of “Refactoring to Patterns”) shared his experiences about stepping away from using estimations and moving into what he called ‘Micro-releases’. What got me interested is that, although he claimed not to be following the recent kanban/iteration-less techniques, he arrived in something relatively similar through his own experiences.

He started talking about the usual confusion of using different units to estimate stories (using points and people converting it into time-units when estimating, using ideal hours and people interpreting it as real hours, and so on). This confusion is also reflected in the way people interpret the team’s velocity. Instead of using it as a measure of the team’s capacity, it’s easy to make it a performance measure. Improving velocity becomes a synonym to improving productivity, and we still don’t know how to measure productivity in software.

Another usual problem that he raised is “What to do with hang-over stories between iterations?”. Are they considered waste? Should they be counted in the team’s velocity? Why is it a hang-over? Is it because it’s close to be completed or is it actually much bigger than the team expected?

His proposed practice of ‘micro-releases’ is based on a high state of collaboration with the customer: there is a small, prioritized list of what should be done next and the team picks up a small and coherent set of stories and ‘guess’ the next micro-release date based on gut feel. He reports that slipping the end date of a micro-release is not such a big deal, because the next one is always close. In this model, the length of a micro-release varies, but usually between 1-6 days (with some outliers like 15, as I recall from his slides, but it’s not too different than that). Because of that, there’s not a stable measure of velocity and using it becomes useless. Feeding the input list with the next stories is done in a just-in-time fashion, and the value produced by the software is realized earlier, as soon as the micro-release is finished.

It’s important to note that he still values the heartbeat of iterations to gather feedback (like a customer showcase or a team’s retrospective). But he simply detaches the release life-cycle from the iterations. This opinion was reinforced by Ron Jeffries in the audience, when he said they were releasing to production twice per iteration in one of his projects. What an interesting point of view: we usually tend to see a ‘release’ as the result of a set of ‘iterations’, but he is reporting smaller ‘release’ cycles than ‘iterations’. Hmm…

But there are some important caveats that should be in place in order for these techniques to work:

High collaboration with the customer

Ability to break-down stories into small-but-still-valuable chunks: He reckons it’s very important to know how to break down requirements into “stories of the right size”. If they are too big, there’s no way for the team to guesstimate when they will be finished. He thinks this practice is one of the most important to any Agile project (using micro-releases or not).

Easy to deploy: There’s no point in having a new release after 2 days if it takes 3 days to deploy it to production.

Bargaining stories: Alongside breaking stories into small chunks, teaching the customer to bargain for features and avoid what he calls “Feature Fat” (building more that’s actually needed, or gold-plating requirements) is very important (again, this is something you can use regardless of the micro-releases approach).

Since the experiences started with Idustrial Logic’s internal product development, some people argued they are only capable of using that because they have enough experience with XP and Agile, but Joshua claimed he is introducing these ideas with success in their new client projects.

This seems like a common theme in the agile space lately, and you can follow up on the discussions on InfoQ, the XP mailing list, or here. :-)

I’ve found out recently that there’s an interesting timeless statement in the Agile Manifesto that we usually forget: “We are uncovering better ways of developing software…”. We should be open to alternatives and to discuss new ideas!