Successful software projects often snatch defeat from the jaws of victory.

Most software professionals are familiar with the dreaded scenario: the project that cost you blood and sweat (and possibly tears) over a period of months results in a system that falls over under real-world conditions on the first day.

But all the use cases are complete and signed off. All the unit tests pass. The quality assurance team have been playing office golf for weeks.

There are no bugs. There are also no users, because response times can be measured in minutes, not microseconds.

And it’s not just performance. You’re lucky if the features mostly meet business requirements, and the bug counts aren’t too high.

How does a software project end up in a bad place?

The answer is: gradually. Things don’t go wrong overnight. The frog doesn’t notice the water slowly reaching boiling point. (There is one exception: if you choose the wrong basic architecture at the start you’re pretty much screwed from day one. This blog post won’t help you with that one.)

Software projects tend to go off course a little each day. So how do you stay on course over a long period of time? You need to measure your location, and the direction you’re heading in. Notice that I didn’t say “accurately measure”. It’s easy to measure lots of things accurately, like bug counts, and story points, and page latency. These are all useless if one tiny bug in the shopping cart code is killing 50% of transactions, and it just looks like you have low conversions.

At nearForm, we believe in establishing a weekly rhythm for our projects. Regardless of the software development methodology, a weekly rhythm gives our projects a healthy heartbeat. We run a weekly demo of the current state of the system to check in with stakeholders and keep ourselves honest. Nothing encourages honesty more than working code.

However, it isn’t enough just to demonstrate functionality every week. You must also measure.

We measure the things that really count: the number of new registrations per day; the number of completed purchases; the overall error rate. What to measure is driven by the real needs of our customers, the things that really matter in growing their business. This is normally expressed in the form of key performance indicators (KPIs). One of the first questions we ask in a new customer engagement is “What are your KPIs?” Then we figure out a way to measure them.
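As a rough illustration, the three example measurements above can be computed from a simple event log. This is only a sketch: the `Event` record, the event kinds ("registration", "purchase", "request", "error") and the `kpis` function are names made up for this example, and it assumes each error event corresponds to one failed request. Your own KPIs will come from your own business.

```python
from collections import Counter
from datetime import date
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    day: date
    kind: str  # hypothetical kinds: "registration", "purchase", "request", "error"


def kpis(events: Iterable[Event]) -> dict:
    """Summarise an event log into three example KPIs."""
    events = list(events)
    kinds = Counter(e.kind for e in events)
    # Count registrations separately for each calendar day.
    registrations = Counter(e.day for e in events if e.kind == "registration")
    requests = kinds["request"]
    return {
        "registrations_per_day": dict(registrations),
        "completed_purchases": kinds["purchase"],
        # Assumption: every "error" event is one failed "request".
        "error_rate": kinds["error"] / requests if requests else 0.0,
    }
```

Even a crude count like this, reviewed every week, is enough to show whether the numbers are moving in the right direction.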

This approach is very effective because it allows you to apply limited resources to maximum effect. There’s no point being feature complete if you haven’t built the right features. KPIs keep everyone focused on the end game, prompting them regularly to ask: why are we doing this? Does this task actually move us forward? The entire week’s activities are measured by their effect on the KPIs, week after week. The ship stays on course, and there are no nasty surprises.

In practice, there are some challenges to achieving this ideal, especially from day one. The project may not be live, so there’s no way to capture production KPIs. Or the KPIs themselves may not be known. They may conflict with each other. Accurate measurement may not be possible. It’s easy to come up with challenges to this approach.

Let’s look at this a different way. How much information do you actually need? If greater accuracy won’t change your mind, it’s pretty much useless.

Imagine you’re on a road trip and you’re running low on fuel. You can drive for another 24.198km at exactly 100km/h. The next station is in 12.386km. Should you stop for fuel? Seems like you should, to an accuracy of five significant digits, no less.

What if I frame the question as: “You have about 20km of fuel, and the next station is in about 10km, should you stop?” Is your answer any different? But I’m out by 20% on my numbers!

Rough numbers enable us to arrive at a quick answer that seems obvious. It’s obvious because we have a better feel for driving a car than we do for abstract business metrics.

There’s a deeper question. How are we able to reduce accuracy so much and still get useful results? Because the key thing is not accuracy. It’s uncertainty.

All business is risky, and risk comes from uncertainty. If we can lower the uncertainty to comfortable levels, we can take risks and reap the rewards. In the example above, the uncertainty level of 20% did not make the numbers useless. It was still clear we needed to stop for fuel.
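The fuel example can be sketched in a few lines of code. The decision rule here is an assumption made for illustration, not something measured: we stop unless we would still have a `reserve_km` of 50km left after passing the station. The function names are also invented for this sketch. The interesting part is `decision_is_stable`, which checks whether a 20% error on either number would change the answer.

```python
from itertools import product


def should_stop(range_km: float, station_km: float, reserve_km: float = 50.0) -> bool:
    """Stop unless we would pass the station with a comfortable reserve.

    Assumption: reserve_km is an arbitrary comfort margin chosen for this sketch.
    """
    leftover_km = range_km - station_km
    return leftover_km < reserve_km


def decision_is_stable(range_km: float, station_km: float, rel_error: float = 0.2) -> bool:
    """True if the decision is the same when each number is off by up to rel_error."""
    factors = (1 - rel_error, 1.0, 1 + rel_error)
    decisions = {
        should_stop(range_km * a, station_km * b)
        for a, b in product(factors, repeat=2)
    }
    return len(decisions) == 1


print(should_stop(24.198, 12.386))  # precise numbers: True
print(should_stop(20, 10))          # rough numbers: True
print(decision_is_stable(20, 10))   # True: more accuracy would not change our mind
print(decision_is_stable(70, 10))   # False: here the error bars do matter
```

In the last case the answer flips inside the error range, and that is the only situation in which it is worth paying for a more accurate measurement.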

How can we apply this idea to help us build better software? Let’s take the example of loading a product page on a website. We can measure the amount of time it takes to return the page. On a production system, you’ll want to know the average load time, and what percentage of people saw load times over your “acceptable” level. In production, you can measure this completely accurately, because you can capture all the data, from every single user interaction.
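Those two production numbers are cheap to compute once you have the raw timings. Here is a minimal sketch, assuming the load times have already been collected in milliseconds; the function name and the 1000ms threshold in the usage line are placeholders, not recommendations.

```python
from statistics import mean
from typing import Sequence


def summarise_load_times(times_ms: Sequence[float], acceptable_ms: float) -> dict:
    """Return the average load time and the share of loads slower than acceptable."""
    if not times_ms:
        raise ValueError("need at least one measurement")
    slow = sum(1 for t in times_ms if t > acceptable_ms)
    return {
        "mean_ms": mean(times_ms),
        "pct_over": 100.0 * slow / len(times_ms),
    }


# Four page loads, one of them slower than the (hypothetical) 1000ms limit.
print(summarise_load_times([100, 200, 300, 1400], acceptable_ms=1000))
```

Note that the single slow load drags the average up to 500ms, which is why the percentage over the limit is worth reporting alongside it.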

In development, you can’t do this. At best, you can run a series of requests against the product page, and measure the response time of the staging server.

This is very far from production. Surely you need to run a full load test at production levels, preferably the week before go-live, so that you can get accurate measurements on the final system?

Remember: it’s not about accuracy, it’s about reducing uncertainty. If you measure the load time of the product page on the staging server, under simulated conditions that in no way match production, using a database of test data, you are not wasting your time. You can still reduce uncertainty. Is there a bug that means the page takes 30 seconds to load? Perhaps the system was fine last week, but you’ve just introduced that bug. Perhaps each set of new features, introduced each week, makes the page slightly slower. Will you notice if you just test manually? Perhaps there’s a bug in the caching layer that only manifests after 16 page loads. Will you catch it?

Put in place a simple performance test for page load times, run it each week, and report on the numbers each week. Make no mistake, this process does not give you an accurate measurement of production behavior. But it does reduce uncertainty. It gives you a reasonable proxy for production that will catch all sorts of mistakes. It gives you a way to measure how new features affect system performance. It gives you a way to validate the effectiveness of the caching architecture. It gives you a number to work with.
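A test like this does not need to be sophisticated. The sketch below is one possible shape for it, not a description of any particular tool: `fetch` is whatever function loads your product page from the staging server (you supply it), and `time_requests`, `weekly_report` and the 25% `tolerance` are names and values invented for this example.

```python
import time
from statistics import median
from typing import Callable, List, Optional


def time_requests(fetch: Callable[[], object], runs: int = 30) -> List[float]:
    """Call fetch() repeatedly and return how long each call took, in seconds.

    The default of 30 runs is arbitrary, but it is deliberately more than a
    handful, so problems that only appear after repeated loads get a chance
    to show up.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings


def weekly_report(
    timings: List[float],
    last_week_median: Optional[float] = None,
    tolerance: float = 0.25,
) -> dict:
    """Summarise this week's timings and flag a slowdown against last week."""
    this_week = median(timings)
    report = {"median_s": this_week, "worst_s": max(timings), "regressed": False}
    if last_week_median is not None:
        # Flag the week if the median is more than `tolerance` slower than before.
        report["regressed"] = this_week > last_week_median * (1 + tolerance)
    return report
```

The median tracks gradual drift from week to week, while the worst single timing is there to catch the one-off 30 second page load. Neither number tells you what production will do, but a jump in either is something to investigate before the next demo.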

It may not be an accurate number, but it is much better than flying blind.

In practice, this approach turns out to be very effective.

It prevents almost all nasty surprises (the worst thing that can happen to you in business). It focuses the team on an important aspect of the system: is it fast enough to be usable? It instils a culture of measurement.

All these benefits arise from an inaccurate measurement of a test system that will never run in production. They arise because measurement of any kind, and especially repeated measurement, reduces uncertainty.

At nearForm, we believe in measurement. It is a core part of our culture as a professional services business. Sometimes we have to be quite insistent in bringing this to the table, and making sure that performance numbers are reviewed each week, from the first week. As the weekly demo and the weekly performance numbers become part of the culture of the project itself, the project team gains more confidence and ability. Each team member can see the risk going down, and is encouraged to try more and do more.

You can’t eradicate uncertainty. Almost all software development methodologies try to do this and fail miserably. Instead, embrace uncertainty! Tame it with measurement, and conquer risk.