Friday, May 01, 2009

Mythbusting: Secure code is less expensive to develop

Conventional wisdom says developing secure software from the beginning is less expensive in the long run. Commonly cited as evidence is an IEEE article, "Software Defect Reduction Top 10 List" (published Jan 2001), which states, "Finding and fixing a software problem after delivery is often 100 times more expensive than finding and fixing it during the requirements and design phase." Many security practitioners have borrowed this metric (and others like it) in an effort to justify a security development life-cycle (SDL) investment, because software vulnerabilities can be viewed as nothing more than defects (problems). The reason is that it's much easier to demonstrate a hard return on investment by saying, "If we spend $X implementing an SDL, we will save $Y," than to quantify a nebulous risk value by estimating, "If we spend $X implementing an SDL, we'll reduce the risk of a loss of $Y by B%."

The elephant in the room is that vulnerabilities are NOT the same as functional problems, and the quote from the aforementioned article references research from 1987. That's pre-World Wide Web! Data predating C#, Java, Ruby, PHP, and even Perl -- certainly reason enough to question its applicability in today's landscape. For now though, let's focus on what a vulnerability is and isn't. A vulnerability is a piece of unintended functionality enabling an attacker to penetrate a system. An attacker might exploit a vulnerability to access confidential information, obtain elevated account privileges, and so on. A single instance represents an ongoing business risk, not guaranteed to be exploited, until remediated, and may be acceptable according to an organization's tolerance for that risk. Said simply, a vulnerability does not necessarily have to be fixed for an application to continue functioning as expected. This is very different from a functional problem (or bug, if you prefer) actively preventing an application from delivering service and/or generating revenue, which does have to be fixed (thank you, Joel!). Functional defects often number in the hundreds, even thousands or more depending on the code base, easily claiming substantial portions of maintenance budgets. Reducing the costs associated with functional defects is a major reason why so much energy is spent evolving the art of software development into a true engineering science. It would be fantastic to eliminate security problems using the same money-saving logic before they materialize. To that end, best-practice activities such as threat modeling, architectural design reviews, developer training, source code audits, scanning during QA, and more are recommended. The BSIMM, a survey of nine leading software security initiatives, accounted for 110 such different activities. Frequently these activities require fundamental changes to the way software is built and business is conducted.
Investments starting at six to seven figures are not abnormal, so SDL implementation should not be taken lightly and must be justified accordingly. Therefore, the question we need to answer is: how much of what could we have eliminated up front, and at what cost?

Our work at WhiteHat Security reveals websites today average about seven serious and remotely exploitable vulnerabilities, leaving the door open to compromise of sensitive information, financial loss, brand damage, violation of industry regulations, and downtime. For those unfamiliar with the methods commonly employed to break into websites: they are not buffer overflows, format string issues, and unsigned integer bugs. Those techniques are most often applied to commercial and open source software. Custom Web applications are instead exploited via SQL Injection, Cross-Site Scripting (XSS), and various forms of business logic flaws -- the very same issues prevalent in our Top Ten list and, not so coincidentally, a leading cause of data loss according to Verizon's 2009 Data Breach Investigations Report: external attackers, linked to organized crime, exploiting web-based flaws.

To estimate cost-to-fix, I queried my 1,100+ Twitter (@jeremiahg) followers (and other personal contacts) for how many man-hours (not clock time) it normally requires to fix (code, test, and deploy) a standard XSS or SQL Injection vulnerability. Answers ranged from 2 to 80 hours, so I selected 40 as a conservative value, paired with $100 per hour in hard development costs. Calculated:
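The arithmetic behind the estimate is straightforward; a minimal sketch using only the figures stated in the text (about seven serious vulnerabilities per site, 40 man-hours per fix, $100 per hour):

```python
# Back-of-the-envelope cost-to-fix estimate, using the figures from the text:
# ~7 serious vulnerabilities per website, 40 man-hours per fix, $100/hour.
VULNS_PER_SITE = 7
HOURS_PER_FIX = 40      # conservative value from the informal survey (range: 2-80)
RATE_PER_HOUR = 100     # hard development cost, USD

cost_per_fix = HOURS_PER_FIX * RATE_PER_HOUR
total_cost = VULNS_PER_SITE * cost_per_fix

print(f"Cost per vulnerability: ${cost_per_fix:,}")  # $4,000
print(f"Cost per website:       ${total_cost:,}")    # $28,000
```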

To be fair, there are outlier websites so insecure, such as those having no enforced notion of authorization, that the entire system must be rebuilt from scratch. Still, we should not automatically assume supporting Web-based software is the same as supporting traditional desktop software, as the differences are vast. Interestingly, even if the aforementioned 100x-less-expensive-to-fix-during-the-design-phase metric still holds true, the calculated estimates above do not seem to be cause for serious concern. Surely a regimented SDL, preventing vulnerabilities before they happen, is orders of magnitude more expensive to implement than the $28,000 required to fix them after the fact. When you look at the numbers in those terms, a fast-paced and unencumbered release-early-release-often development model sounds more economical. Only it isn't. It really isn't. Not due to raw development costs, mind you, but because the risks of compromise are at an all-time high.

Essentially every recent computer security report directly links the rise in cybercrime, malware propagation, and data loss to Web-based attacks. According to Websense, "70 percent of the top 100 most popular Web sites either hosted malicious content or contained a masked redirect to lure unsuspecting victims from legitimate sites to malicious sites." The Web Hacking Incident Database has hundreds more specific examples, and when a mission-critical website is compromised it is basically guaranteed to surpass $28,000 in hard and soft costs. Down time, financial fraud, loss of visitor traffic and sales when search engines blacklist the site, recovery efforts, increased support call volume, FTC and payment card industry fines, and headlines tarnishing trust in the brand are all typical. Of course, this assumes the organization survives at all, which has not always been the case. Pay now or pay later: this is where the meaningful costs are located. Reducing the risk of such an event, and minimizing the effects when one occurs, is a worthwhile investment. How much to spend depends on each organization's tolerance for risk and the security professional's ability to convey it accurately to the stakeholders. At the end of the day, having a defensible position of security due care is essential. This is a big reason why implementing an SDL can be less expensive than not.

19 comments:

Interesting. I'd agree with your assertion that if it costs $4,000 to fix an XSS vulnerability, it'll cost the same regardless of whether it's during the design phase or after deployment -- IFF there's only the one instance. Catching the problem early may well prevent it from proliferating, especially if the code is reused elsewhere. If catching it early improves awareness among the developers, it may have a preventive effect. Also, fixing it in production requires another iteration of testing and code release, which adds marginally to the cost (QA people usually don't get paid as much as developers ;-). And finally, fixing it earlier guarantees that there is still money in the budget to do so; project managers tend to use up all available dollars when they believe they've "finished" an application, and in a lot of organizations that same $4,000 is impossible to come by later on; it skews the risk analysis because it's a separate, visible cost. Managers decide that the XSS vulnerability isn't so bad, compared to the risk of having to go back to ask for more money.

So there are still plenty of good reasons to fix things early and often; I don't think the myth is busted just yet.

First, architectural flaws weren't accounted for here. Sometimes, fundamental mistakes are made at the design phase that, once implemented, are much more difficult to undo than a simple XSS vulnerability. This can occur not just in the code, but in assumptions made about the security of the surrounding environment. This is often where a lot of problems go unresolved due to direct and indirect costs (time).

Second: You mention that "Frequently these activities require fundamental changes to the way software is built and business conducted." I would suggest that, if properly implemented, an SDL and a Business Security Architecture may often lead to fundamental cost-reducing changes above and beyond security-specific or development-specific changes.

This is interesting analysis, but should probably be expanded to address more types of vulnerabilities and to look at how vulnerabilities cluster in applications.

If you get your fundamental coding idioms wrong, you're going to end up with a whole bunch of XSS or SQLi vulnerabilities. Fixing the first one might take a couple of hours (environment setup, tracking down the issue, testing the fix), but fixing vulnerability N+1 is pretty cheap because you just have to go to the next vulnerability and change the code. Rinse and repeat. It isn't fun work, but at least it is predictable to schedule. Project managers love this.
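The "get the idiom right" point is easiest to see in code; a minimal sketch of a centralized output-encoding helper (the function name and markup are illustrative, not from the original discussion), where fixing the idiom once fixes every call site:

```python
import html

def render_comment(comment: str) -> str:
    # Centralized encoding idiom: every user-controlled string passes through
    # html.escape before reaching the page. If this idiom had been missing,
    # adding it here would close the XSS hole at every call site at once.
    return f"<p>{html.escape(comment)}</p>"

print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```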

Business logic issues are a lot more variable to fix. It could be as simple as adding a permissions check before allowing data access or it could require you to change the application requirements and then those changes have to flow down through the development process. Project managers like these less because they could take a lot or a little. Uncertainty and variability are not the project manager's friends.

Architectural issues are the most variable and tend to start large and get enormous with regard to the level of effort required for the fix. I worked on one system where a single vulnerability (weakness in the way authentication worked) required a multi-year change control effort. Yikes! Needless to say, project managers (and line of business managers, executives, boards of directors...) like these the least.

I like the idea of economically modeling remediation efforts, and I think the model should be expanded to look at the mix of vulnerabilities found in applications as well as the mix of applications that organizations have.

At the risk of causing the flashing lights of the Wayback Machine to produce seizures in rats, Kevin Soo Hoo, Andy Sudbury and I did a report in ~2001 on just this subject. ("Tangible ROI Through Secure Software Engineering", an @stake publication). It was the original ROSI paper, as far as I can tell.

We found that most customers fixed nearly all of the quick hit items (easy to fix, high risk) and a lesser proportion of the gnarlier ones (hard to fix, high risk), for a total of about half of the defects we found -- the remainder being lower risk items that customers couldn't or wouldn't fix. We also used the 100x multiplier that was first famously floated in the ~1981 IBM System Sciences app research paper Jeremiah alludes to. Even so -- after accounting for the defect-fix rate we were seeing in the field -- we modeled that the ROI of secure software engineering was between 12-21%. Not bad.

I've got the original paper, and would be happy to shoot it to you for posting.

@shrdlu, code proliferation! That's a really good point. I don't know how to project those costs in anything close to a generic way, but I hadn't considered it before.

I'd also agree that the Myth is absolutely not busted. Still, it's valuable to raise the discussion, since the costs involved are not always straightforward.

@sintixerr, agreed. Architectural / business logic flaw issues can indeed be much more difficult and costly to fix after the fact. I mentioned that in the text, but perhaps it wasn't clear enough. Still, these issues could be just as cheap to fix as XSS/SQLi issues; it really all depends on the instance. I couldn't make reasonable generic calculations for them.

On the "often lead to fundamental cost-reducing changes" point, it's not that I don't agree, but I would be really interested in any case studies you know of speaking to that. I would link to them in the future.

@Dan, definitely needs more vuln types explored. I'll probably write a follow-up once I get a chance to digest all the feedback. It is time to start assigning some numbers to our (royal we) wisdom to see if the results actually achieve what we think they can.

@Andrew, I would love a link to that paper and would link to it accordingly. For whatever reason this research has been hard to source.

I agree with the other posters - their insight is great, and there are lots of statistically unsound assumptions here. The formula for vulnerability-fixing cost does not consider the following facts:

1) The 'average cost' figure hides the variance. The average cost across vulnerability types does not accurately reflect the real typical cost. For example, if you have 9 XSS vulnerabilities and 1 massive authorization problem, your typical vulnerability fix cost will be way less than the average you've shown.

2) The 'average cost' to fix depends directly on how much security is in the lifecycle. If developers are trained and the right security mechanism/SME is already available, then the cost will be much less per vulnerability.

3) The text here assumes that all functionality bugs are immediately fixed (false) and that security bugs must be fixed regardless of priority (false).

4) The text here doesn't recognize, and can't measure, the fact that security problems are often built on top of in ways that make them a "feature", aggravating cost exponentially in the application's downstream life.

5) Many vulnerabilities can be addressed with a single fix. That murders your 'average cost' rate. Imagine a freshly installed AOP pointcut that installs an authorization check on all business logic - 30 lines of code just fixed 300 security vulnerabilities. Of course, I don't know exactly how you're counting vulnerabilities, so this is arguable.
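The AOP pointcut example is Java-flavored, but the idea of one cross-cutting fix covering many entry points can be sketched with a hypothetical Python decorator (all names here are illustrative, not from the original discussion):

```python
import functools

# Hypothetical user model, for illustration only.
class User:
    def __init__(self, roles):
        self.roles = set(roles)

def require_role(role):
    """One centralized authorization check applied to every business-logic
    entry point -- the moral equivalent of the AOP pointcut described above.
    A few lines here can close a missing-authorization hole at every
    decorated function at once."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            if role not in user.roles:
                raise PermissionError(f"{func.__name__} requires role {role!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator

@require_role("admin")
def delete_account(user, account_id):
    return f"account {account_id} deleted"

print(delete_account(User(["admin"]), 42))  # account 42 deleted
```

A non-admin caller raises PermissionError, so the check cannot be forgotten at individual call sites.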

@Jim, and that's the weird dilemma in this kind of calculation. The more maintainable you make the code (e.g. via ESAPI), the cheaper it is to fix security issues after the fact. However, one must take into consideration that you may not need to fix as many issues after the fact as well.

As has been previously mentioned, your analysis doesn't take into account architectural flaws, and that is where a holistic SDL really pays for itself. XSS and SQLi are implementation flaws, and individually they can be very easy to fix (if you had pervasive XSS and wanted a consistent and unified solution for the product that still provided field-specific validation, that might be a bit more expensive), but implementation flaws are easy to avoid from an SDL perspective anyway - you can simply provide a prescriptive list of coding standards (use prepared statements, use whitelist input validation and HTML character encoding, etc.). Ideally you would also have some form of source analysis to make sure that the prescriptive standards are followed. Either way, the cost is cheaper than retroactively finding and fixing, even if it isn't the markup asserted.
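The "use prepared statements" coding standard mentioned above is easy to illustrate side by side; a minimal sqlite3 sketch (the table and attacker input are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

attacker_input = "alice' OR '1'='1"

# Vulnerable idiom: string concatenation lets the input rewrite the query.
unsafe = f"SELECT role FROM users WHERE name = '{attacker_input}'"
print(conn.execute(unsafe).fetchall())  # [('admin',)] -- injection succeeds

# Prescribed idiom: a prepared statement treats the input as pure data.
safe = "SELECT role FROM users WHERE name = ?"
print(conn.execute(safe, (attacker_input,)).fetchall())  # [] -- no match
```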

The real cost is when you need to do more than tweak an implementation. How much would it cost Google or Microsoft to change the way web SSO to their services works because of a flaw? It isn't going to take 40 hours of work but hundreds of hours and a crap ton of testing. Design changes and architectural changes are expensive to make, and it doesn't really matter what sort of software you write. The only thing web apps have going for them is much cheaper deployment of fixes.

Jeremiah, I'm a software development manager, not a security expert, so I may come at this from a different perspective. I agree with your point that a vulnerability (a potential problem) is not the same as a bug (a manifested problem, found by the test team or the customer). Most bugs need to be fixed. But so do potential problems found in design and code reviews and by the static analysis tools - bugs that nobody has seen yet but are waiting to happen.

Since I work in financial services, I have to satisfy my customer's (and the regulator's) requirements for performance and reliability and security in addition to the functional requirements of the system. So we manage functional bugs and feature requests, performance and capacity requirements, operations needs and security vulnerabilities through the same risk management and triage process, deciding where to focus resources, deciding what to do next.

If I didn't have to consider security, things would be easier of course. But since I do, I am not going to leave it to the end: that would be risky and as Dan noted, difficult to predict and manage. To me the problem is the same as reliability and performance: if you know you need to hit an aggressive performance target from the start for example, then you need to set the stage and make performance a key driver in architecture and design, construction, testing, and infrastructure; take it into account in everything that you do. Trying to fix a major performance problem after the fact, if you got the requirements or architecture wrong, can be expensive, unpredictable and uncertain. Security is no different. Again, agreeing with Dan, it's easier to get and manage the resources and budget earlier, establishing performance and reliability and security needs and costs upfront, than to have to fight this fight later in the project - or, worse, wait until the system is in production and leave the problem to the maintenance and support team.

So it just makes sense to me to build security in from the beginning, as one of the demands on the project and the team. The tradeoff between features, and security (and performance and reliability and so on), where you are going to spend your money and time, has to be made by the project stakeholders. If it's important to my stakeholders, I am going to manage it from the beginning. It's all the same to me: I gotta do what I gotta do to make the project a success, and starting off right gives me a better chance of success.

@Jim, very well stated; thank you very much for sharing. Great perspective, and it makes a lot of sense. I guess the only thing I'd add is that if "security" is not important to the stakeholders from the beginning, which is not to say it absolutely should be, then that might complicate matters significantly later. Thanks again, you've added a lot of value to the discussion.

I like the idea of "getting the order of magnitude right", even if it is just a coarse approximation. However, the calculation of $4,000 per vulnerability is valid for web applications only: the fix has to be applied on a few server sites, and that's it.

However, even today not all software is web application software.

On the other end of the spectrum there is software like e.g. Internet Explorer or Adobe Reader. I'm mentioning these two because they are pretty popular, and because both Microsoft and Adobe were taking part in the BSIMM interviews.

Now consider the cost of a rollout of a security fix for this type of software, throughout the world. Even if we say that for the vendor it's still $4000 per vulnerability (neglecting the difficult-to-estimate effect of vulnerabilities on reputation): As a customer, I do feel a difference between a vulnerability which has been removed in the development cycle (invisible to me) and a vulnerability in my local software (forcing me to apply a fix).

For the same type of software, I'd rather live with a difficult-to-exploit functional defect than with a difficult-to-exploit vulnerability. I have control over what I am doing, but not over what the bad guys will do.

I am not so sure that the calculation for the cost of fixing the bug is really accurate either, though no one has really pointed that out.

Fixing a bug costs more than just the time to find and fix it. Most large systems require extensive testing and the involvement of many people outside of the programmer.

This factor can drastically increase the overall cost of the identified flaw. I would probably multiply the values by 3 to 5, at least, to have even a tiny bit of comfort in the amount.

Of course, this still makes it not cost effective, but you cannot consider the cost of fixing a single vulnerability, even if you can identify all the instances of that single vulnerability at one time, which is highly doubtful.

Comparing that cost against a process designed to prevent a wide range of vulnerabilities is not a valid comparison. It is like comparing the cost of a bag of oranges to the cost of the output of an entire orchard. (The cost of a QA check and the cost of preventative pest control.)

The pest control may cost a whole lot more than a few bad bags, but can also eliminate the depredation that is less visible and may not be caught until much later, much like those security flaws that have not been found yet.

The numbers would be dubious no matter what, but you need to add a whole bunch of costs if you really want to say whether an SDL is cheaper than the alternative in the long run. There are many more factors than are indicated here.

About Me

Jeremiah Grossman's career spans nearly 20 years in computer security, during which he has become one of the industry's biggest names. He has received a number of industry awards and been publicly thanked by Microsoft, Mozilla, Google, Facebook, and many others for his security research. Jeremiah has written hundreds of articles and white papers. As an industry veteran, he has been featured in hundreds of media outlets around the world and has been a guest speaker on six continents at hundreds of events, including at many top universities. All of this came after Jeremiah served as an information security officer at Yahoo!