Wednesday, February 22, 2012

We know that many development teams, especially small teams following Agile development practices, do a poor job of developing secure software. But is it Agile development specifically that is the problem? Many application security experts, especially those working in or for enterprises, think it is. I think they are wrong.

Monday, February 13, 2012

The idea behind the technical debt metaphor is that there is a cost to taking short cuts (intentional technical debt) or making mistakes (unintentional technical debt) and that the cost of not dealing with these short cuts and mistakes will increase over time.

The problem with this metaphor is that with financial debt, we know how much it would cost to pay off a debt today and we can calculate how much interest we will have to pay in the future. Technical debt though is much fuzzier. We don’t really know how much debt we have taken on – you may have taken on a lot of unintentional technical debt – and you may still be taking it on without knowing it. And we can’t quantify how much it is really costing us – how much interest we have paid so far, what the total cost may be in the future if we don’t take care of it today.

“applications carry on average $3.61 of technical debt per line of code”.

For some reason, the average cost of Java apps was even higher: $5.42 per line of code. These numbers are calculated from running static structural analysis on their customers’ code.
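To see how a per-line figure like this turns into a headline number, here is a minimal sketch of the arithmetic. The 300,000-line code base and the function name are made up for illustration; only the two per-line rates come from the figures quoted above.

```python
# Per-line rates quoted above, in US dollars per line of code.
AVERAGE_RATE = 3.61
JAVA_RATE = 5.42


def estimated_debt(lines_of_code: int, rate_per_line: float) -> float:
    """Scale a per-line debt figure up to a whole code base."""
    return round(lines_of_code * rate_per_line, 2)


# Hypothetical 300,000-line application (an assumption, not from the report).
loc = 300_000
print(f"Average app: ${estimated_debt(loc, AVERAGE_RATE):,.2f}")
print(f"Java app:    ${estimated_debt(loc, JAVA_RATE):,.2f}")
```

The result is reported to the cent, but it is only as good as the per-line rate and the line count that go into it.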

Sonar, an Open Source dashboard for managing code quality, also tries to calculate a technical debt cost for a code base, again using static analysis findings like code coverage of automated tests, code complexity, duplication, violations of coding practices, comment density.

Thinking of technical debt in this way is interesting, but let’s stop pretending that these are hard numbers that we can use to make trade-off decisions. Although the numbers appear precise, they’re arbitrary, guesses. And they assume that technical debt can be calculated by a tool looking at the structure of the code. Unfortunately, dealing with technical debt is not that straightforward.

But if debt is too fuzzy to be measured in detailed cost terms, how do you know what kind of debt is hurting you the most, how do you know when you have too much? Let’s look at different kinds of technical debt, and how much they might cost you, using a fuzzier approach.

$$$ Making a fundamental mistake in architecture or the platform technology – you don’t find out until too late, until you have real customers using the system, that a key piece of technology like the database or messaging fabric doesn’t scale or isn’t reliable, or that you can’t scale out your architecture like you need to because of core dependency problems, or you made some fundamentally incorrect assumptions on how the system is supposed to work or how customers will use it. Now you have no choice but to start again or at least rewrite big chunks of the system to get it to work or to keep it working, and you don’t have the time to do this properly.

$$-$$$ Error-prone code – the 20% of the code where 80% of bugs are found. Capers Jones says that all big systems have a small number of routines where bugs and problems cluster, code that is hard to understand and expensive and dangerous to change because it was done badly in the first place or it went to hell over time because of the accumulation of short-sighted fixes and changes. Not rewriting this code is one of the most expensive mistakes that developers make.

$-$$ The system can’t be easily tested – because you don’t have good automated tests, or the tests are brittle and slow and keep falling apart when you change the code. Testing costs can make up more than half of the cost of making any change or fix – sometimes testing can take much more time and cost much more than making the fix – and testing costs tend to go up over time as you write more code, as the system adds more interfaces and options.

$-$$ Not taking care of packaging and release and deployment. Relying too much on manual steps and manual checks, leading to mistakes and problems in production, late nights. Like testing, release and deployment costs don’t go away, they just keep adding up incrementally.

$-$$ Code that mysteriously works, but nobody is sure how or why – usually performance-critical or safety-critical low-level plumbing code written by a wizard who has long since left the company. It might be beautiful code, but if nobody on the team understands it, it’s a time bomb – someday, somebody is going to have to change it or fix it, or try to.

$-$$ Forward and backward compatibility adapters and compromises. This is necessary, short-term debt. But the cost rises the longer that you have to maintain these compromises.

$-$$ Out of date libraries and middleware stack – you’ve fallen behind on applying patches and upgrades. Even if the code that you have now is stable, you run some risk of unpatched security vulnerabilities. The longer that this goes on, the further behind you are and the higher the risk – at some point the software is no longer supported or supportable, and your hand is called.

$-$$ Duplicate, copy-and-paste code. This is one of the bugaboos of technical debt and static analysis tools. Almost everybody has it. But how bad is it, really? The cost depends on how many clones developers have made, how often they need to be changed, how many subtle differences there are between the different copies, and how easily you can find the copies and keep track of them. If the developer who made the copies is still on the team and does a good job of keeping track of all of them, it doesn't cost much if anything.

$-$$ Known, outstanding bugs in code and unresolved static analysis findings. The cost and risk depend on how many bugs and warnings you have, and how nasty they are. But if they are real problems, they should have been fixed by now. Is a bug really a bug if it isn't bugging anyone?

$-$$ Inefficient design or implementation, “throwing hardware at it”, using too much memory or network bandwidth or CPU. Hardware is cheap, but these costs can add up a lot as you scale out.

$ Inconsistent use of programming idioms and patterns – developers either didn’t understand the existing patterns, or didn’t like them and introduced new ones, or didn’t care and just wanted to get their change done. It's ugly, and it can be frustrating for developers. But the real cost of living with the situation is often less than trying to clean it all up.

$ Missing or poor error handling and exception handling. It will constantly bite you in the ass in production, but it won’t cost a lot to at least get it mostly right.

$0.01 Hard coding, magic numbers, code that isn’t standards compliant, poor element naming, missing comments, and code that needs tidying. This is a pain in the ass, but it’s the kind of thing that is easy to clean up as part of standard refactoring work.

$0.01 Out of date documentation – another issue that is commonly considered in technical debt. But let’s be honest, most programmers don’t read documentation anyways. If nobody is using it, get rid of it. If people are using it, why isn’t it up to date?

$0.00 Hand-rolled code that could have and should have been done using built-in language features or libraries, or existing framework or common services. It’s disappointing when somebody recognizes it, but unless this hand-rolled code has lots of bugs in it, it’s a sunk cost, not a cost that is increasing over time.
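The error-prone code item in the list above is one of the few that lends itself to a quick check. The following is a rough sketch, not a method taken from Capers Jones: given bug counts per module, it picks the smallest set of modules that accounts for a given share of all bugs. The module names and counts are hypothetical, and `bug_hotspots` is a name invented for this example.

```python
from typing import Dict, List


def bug_hotspots(bugs_per_module: Dict[str, int], share: float = 0.8) -> List[str]:
    """Return the smallest set of modules that accounts for `share` of all bugs.

    Modules are taken in descending order of bug count until the running
    total reaches the requested share.
    """
    total = sum(bugs_per_module.values())
    if total == 0:
        return []
    hotspots: List[str] = []
    running = 0
    ranked = sorted(bugs_per_module.items(), key=lambda kv: kv[1], reverse=True)
    for module, count in ranked:
        if running >= share * total:
            break
        hotspots.append(module)
        running += count
    return hotspots


# Hypothetical bug counts per module, e.g. exported from a bug tracker.
counts = {
    "order_router": 120,
    "pricing": 45,
    "auth": 12,
    "reports": 9,
    "config": 6,
    "utils": 5,
    "ui": 3,
}
print(bug_hotspots(counts))  # ['order_router', 'pricing']
```

In this made-up data, two of seven modules hold 165 of 200 bugs. Counts like these say where bugs cluster, not what it would cost to rewrite the code.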

There are different kinds of debt, with different costs. Figuring out where your real costs are, and what to do about it, isn't easy.
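For copy-and-paste code, at least, a first rough count of the clones is possible. The sketch below is only an illustration of the idea and is not how any particular static analysis tool works: it slides a fixed-size window over the non-blank lines of each file and reports blocks that show up more than once. The file names, contents and the `find_clones` helper are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_clones(sources: Dict[str, str], window: int = 3) -> Dict[str, List[Tuple[str, int]]]:
    """Map each repeated block of `window` lines to the places it appears.

    Lines are stripped of surrounding whitespace and blank lines are skipped,
    so this only finds exact copies, not near-duplicates.
    """
    seen: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for name, text in sources.items():
        # Keep (line_number, stripped_line) pairs for non-blank lines only.
        lines = [
            (number, line.strip())
            for number, line in enumerate(text.splitlines(), start=1)
            if line.strip()
        ]
        for start in range(len(lines) - window + 1):
            block = "\n".join(line for _, line in lines[start:start + window])
            seen[block].append((name, lines[start][0]))
    # Only blocks seen in more than one place count as clones.
    return {block: places for block, places in seen.items() if len(places) > 1}


# Two hypothetical files that share a three-line block.
sources = {
    "a.py": "x = load()\ny = clean(x)\nsave(y)\nprint('done')\n",
    "b.py": "init()\nx = load()\ny = clean(x)\nsave(y)\n",
}
for block, places in find_clones(sources).items():
    print(places)  # [('a.py', 1), ('b.py', 2)]
```

This answers only the first question, how many copies there are. How often the copies change and how they have drifted apart still has to come from version history and from the people who maintain them.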

Tuesday, February 7, 2012

Agile methods like Scrum and XP both rely on a close and collaborative relationship and continual interaction with the customer – the people who are paying for the software and who are going to use the system. Rather than writing and reviewing detailed specifications and working through sign-offs and committees, the team works with someone who represents the interests of the customer to define business features and to decide what work needs to be done and when. One of the key problems in adopting these approaches is finding the right person to play the important role of the Customer (XP) or Product Owner (Scrum). I’ll use both terms interchangeably.

The team reviews their work with the Customer, the Customer answers any questions that they have and sets the team’s direction and priorities and makes sure that the team is focused on what is important to the business, owns and manages the requirements backlog, writes acceptance tests and decides when the software really is done.

A team’s success depends a lot on how good the Customer or Product Owner is, their knowledge and experience, their commitment to the project, how they make decisions and how good these decisions are. Mike Cohn in his book Succeeding with Agile explains that Product Owners have to be:

committed to the team and available to answer questions, get the information that the team needs when the team needs it;

an expert in the business: not only do they have to understand the domain, but they also need to understand what’s important strategically to the business and what’s important to the user community;

a good communicator within the team and outside of the team;

a good negotiator so that they can balance the needs of the team and the needs of different stakeholders;

empowered – they must be able to make decisions on behalf of the customer – and they need to be willing to make tradeoffs and tough decisions when they have to.

An effective Product Owner also needs to be detail-oriented so that they can understand and resolve fine details of functionality and write functional acceptance tests. They should understand the basics of software development, at least the boundaries of what is and is not possible, what is hard to do and what isn’t and why so that they can appreciate technical dependencies and technical risks. They need to at least understand the rules of Scrum or XP – what’s expected of them, and how to play the game.

And they should understand the basics of project management and risk management. Because in the end, the Product Owner is the person who decides what gets done and what doesn’t.

The Product Owner is a product manager, project manager and business analyst all rolled into one.

Oh, and… they also need to be “collaborative by choice”, “agile in all things”, and… “fun and reasonable”.

Being in Two places at Once

The Product Owner has to work closely with the team, in touch with what the team is working on, to make sure that they can keep moving forward. But the Product Owner also has to stay involved in the business to understand what is going on and what is important. They have to be both inward-facing (working with the team, planning, holding reviews, attending meetings, prioritizing, helping to manage the backlog, defining requirements, answering questions and clarifying information) and outward-facing (working with the project’s sponsors and with the users of the system, making sure that the business understands what is happening in the project, making sure that they understand business priorities and competitive positioning and business trends and when any of this changes and how this could affect the project). They have to be in two places at once, which is physically impossible if the development team and the business aren’t co-located.

The Product Owner and Politics

There are political risks in the team’s relationship with the Product Owner, and in the Product Owner’s position and influence within their own organization. The Product Owner has to play politics with the team and inside the business, trying to promote the interests of the team and the project, reconciling conflicts between different stakeholders, trying to get and keep stakeholders onside, building coalitions. Their success, and the project’s success, depends on their ability to negotiate these issues.

Scrum assumes that the Product Owner is not only committed and talented and in touch with the business’s strategic priorities and with the concerns and needs of front line workers; but it also assumes that the Product Owner will always put the interests of the business ahead of their own interests. But a good Product Owner is usually an ambitious Product Owner – they are interested in the project as an opportunity to advance their career. Projects effect change, and with every change there are winners and losers. There is the real risk that the Product Owner’s success may put them in conflict with other important stakeholders in the business – by focusing on making their Product Owner happy, the team may be making enemies somewhere else in the organization without knowing it.

A Customer, or Many Customers?

Not only does the Product Owner decide what is important and what is going to get done, they are responsible for the project’s budget, and they are accountable for the project’s success. According to Ken Schwaber’s Agile Software Development with Scrum (the original definition of Scrum) the Product Owner is “the person who is officially responsible for the project”, “the one throat to choke” if the project fails or goes off the rails.

Expecting one person to take on all of this responsibility and this much work is unrealistic. It’s too much responsibility, too much risk, and too much work. For many Product Owners, it’s more than a full-time job – a direct contradiction to Agile values that put people first and emphasize realistic working hours and sustainable pace for the team.

This has been a problem since the beginning of Agile: a few months after the launch of the initial phase of C3 (the first XP project), the Customer representative quit due to burnout and stress, and could not be replaced, and the project was eventually cancelled.

Scrum still demands that the Product Owner role must be played by one person.

This doesn’t make sense, given that another fundamental underlying Agile principle is that people working together collaboratively make better decisions than one person working alone (“the Wisdom of Crowds”). If true, then why are the critical decisions about prioritization and direction and vision for the project made by one person?

The people behind XP eventually recognized that the simple idea of a Customer demanded too much from one person, that the workload and responsibility need to be shared by a Customer team. A Customer team means that you get the advantage of multiple perspectives, people with different specialties and experiences, and you have more help with answering questions and making decisions. It’s more sustainable and practical.

But this comes with its own set of problems:

The development team has to reconcile differences in abilities, differences in understanding, different priorities, biases and political conflicts and personal conflicts within the Customer team.

More time has to be spent explicitly keeping the Customer team itself in synch. There are more chances of mistakes and misunderstandings and dropped balls.

Somebody still has to be in charge, make the important decisions – what Mike Cohn calls “the-buck-stops-here” person. The development team has to know that Customer decisions will not be over-ridden within the Customer team so that they can commit to getting work done.

The Customer in Maintenance

Maintenance and enhancement, where most of us will spend a lot of our careers, shows other problems with the Product Owner idea. First, it’s hard enough to get someone with the talent and drive and selflessness to represent the customer on a high-profile, strategic development project. It’s much harder to get anything close to this same level of commitment and talent for smaller projects, or for ongoing maintenance work. People with this knowledge and ability are likely to be running or supporting some important part of the business, not answering questions and helping to prioritize issues for your maintenance team.

And if you are lucky enough to find someone good, it’s hard to keep them – unlike a project, maintenance work doesn’t have a clear end date, so the team will need to get used to working with different Customers at different times, with different working styles, different agendas and different strengths.

The Product Owner is supposed to act as the single voice for the customer. But for a production system you are more likely to have too many voices, too many different people with different priorities and demands, all working for different parts of the business, all talking to different people on your team, trying to get what they need or want done. Or for some old legacy systems you may run into the opposite problem – nobody wants to take ownership of the system or its data, nobody knows enough or wants to be responsible for making business decisions.

Be your own Customer

All of these challenges don’t mean that you can’t work with the Product Owner model – obviously lots of teams are following Scrum and XP in one way or another. But you need to recognize the limitations and risks of this approach, and be prepared to fill in.

For example, many development and maintenance teams who work in a different location and especially a different timezone from the business fall back on a Customer Proxy: a business analyst or somebody else on the team who can help fill part of the Customer role, and work with people in the business to help answer questions and confirm requirements and priorities. It’s not as efficient as working directly with the business, but sometimes there isn’t a choice.

The Scrum Master or team lead or senior developers or senior testers, whoever is playing a technical leadership role on the team may have to step in and help fill in when the Customer isn’t available, when they are over-worked and can’t keep up, when they don’t understand, when they aren’t qualified to make a decision, or when they don’t care. To reconcile technical and business requirements and the needs of different stakeholders, and make technical and business trade-offs and long-term and short-term trade-offs. To communicate with business stakeholders. To follow-up on outstanding issues and unanswered questions and try to get answers from someone in the business. To write the acceptance tests for the customer – a common problem on Agile teams is that nobody on the business side is willing to help write acceptance tests, but they are willing to jump on the team when something is done wrong.

Be prepared to make more mistakes working this way – you will have to work with imperfect and incomplete information, sometimes you’ll have to make a best guess and go with it, and you will get requirements and priorities wrong. Test everything that you can, and get the product out to the business as quickly and as often as you can, and be prepared for negative feedback. You may not be able to build as close a relationship with the business as you could with a strong and committed Customer.

It’s a compromise, but it’s a necessary compromise that many teams have no choice but to make. As Mike Cohn points out in The Fallacy of One Throat to Choke, in the end it’s the team that fails, not just the Customer.

Source code, the software that we create, is only a means to an end. The software itself has no value, or worse it has negative value, because it creates a drag on your ability to innovate and move forward. The more code that you have, the higher your maintenance costs will be, therefore…

“… the best code of all is the code that's never written.”

Michael Feathers, who has a lot of smart things to say about source code, joined in on this discussion. In The Carrying-Cost of Code he says that

“code is inventory. It is stuff lying around and it has substantial cost of ownership. It might do us good to consider what we can do to minimize it.”

He goes so far as to suggest a goofy thought experiment where “every line of code written disappears exactly three months after it is written”. The point of this would be to get developers and the business to understand that the “costs of carrying code are real, but no one accounts for them”.

Feathers reinforces the valid points about the drag that unmaintained or poorly maintained legacy code has on companies. Writing less code to solve a problem is a good thing – it’s (usually) more efficient and (usually) costs less to maintain a smaller code base. And yes there is a necessary cost to maintaining software and working with existing software and changing it.

But none of this changes the fact that software is an asset

If you build and operate a power plant or a bridge, you have to maintain it – just like software. And like a bridge or a power plant, a newer, more modern, better-designed, more efficient and simpler asset is better than a big, old, complicated, expensive-to-maintain one.

The “software is a liability” argument seems to be that it’s not the software that’s the asset, it’s the “features and options” – the capabilities that the software provides. This is like saying that it’s not the power plant (which a company spent millions of dollars to design and engineer) that’s a valuable asset to a company, it’s the energy that it generates. It’s not the bridge – it’s the ability to drive over water. It’s not the airplane, it’s the ability to fly.

Pretending that software has no value in itself is silly. Try explaining this to accountants (don’t depreciate the airplane, depreciate the ability to fly!) and IP lawyers and to investors who buy software companies for their IP. They all understand that software and the ideas embodied in it are valuable and need to be treated as assets. The ideas themselves are only worth so much, even if they’re patented. But the ideas realized in software, actualized and proven and ready to be used or (better) already being used – that’s where the real value is. And this is the value that needs to be maintained and preserved.

Software is more valuable than other assets

The important difference between software and other assets is that software is much more plastic than other engineering work. Software is “soft” – it can be changed easily and inexpensively and quickly. This makes software more strategically valuable than “hard” assets like a building because software can be continuously adapted and renewed in response to changing situations, and transformed to create new business opportunities.

Software has to be changed to stay useful. The problem is NOT that we HAVE TO maintain software and change it to do things that it was never intended to do, to work in ways that it was never designed to, to do things that we couldn’t imagine a few years ago. This is the opportunity that software gives us – that we CAN do this. This is why Software is Eating the World.

About Me

I am an experienced software development manager, project manager and CTO focused on hard problems in software development, software quality and security. For the last 20 years I have managed teams building and operating high-performance financial platforms.
My special interest is how small teams can be most effective in building real software: high-quality, secure systems at the extreme limits of reliability, performance, and adaptability. Software that has to work, that is built right, and built to last.
I use this blog to explore ideas and problems in software development that are important to me. To reflect and to find new answers.