OWASP Top 10 Lists are Art, Not Science

I attended the OWASP Top 10 data capture discussion session today at #OWASPSummit2017 in London.

Andrew van der Stock led the session and asked how we could improve the data collection process, and what we thought the obstacles were to people submitting their data.

My answer was that the drama around OWASP “data” collection comes from us project leaders perpetuating the false belief that what we’re collecting is good data, and that what we’re doing with it is science.

It’s not science. Not even close.

The former project leader talked about the quality of the math in the Excel spreadsheet, and said that if we had an issue with the results, he welcomed us looking at the formula for a given calculation and making recommendations.

But as I explained, this is still missing the point.

When vendors submit datasets they’re HIGHLY skewed. They’re skewed because they favor one kind of vulnerability because of their business. They’re skewed because their training focused on certain vulnerability types and not others. They’re skewed because they send their data in completely different ways.

And then a small group of smart volunteers do their best with it. They review what’s submitted. They try to normalize it somehow. They try to remove the bias. And they end up with something they call a release candidate.
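To make the normalization problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the vendor names, the categories, the counts, and both ranking functions are invented, and neither function is the method the OWASP Top 10 team actually uses. It only shows that the same submissions can produce a different ordering depending on one judgment call, namely whether to pool raw counts or to weight each vendor equally.

```python
from collections import Counter

# Hypothetical submissions: vendor -> {vulnerability category: finding count}.
# All names and numbers are invented for illustration only.
SUBMISSIONS = {
    "vendor_a": {"injection": 9000, "xss": 500, "broken_auth": 500},
    "vendor_b": {"injection": 10, "xss": 60, "broken_auth": 30},
    "vendor_c": {"injection": 5, "xss": 70, "broken_auth": 25},
}


def rank_by_raw_counts(submissions):
    """Pool every finding together, so the largest submitter dominates."""
    totals = Counter()
    for counts in submissions.values():
        totals.update(counts)  # adds each category's count to the running total
    return [category for category, _ in totals.most_common()]


def rank_by_vendor_share(submissions):
    """Give each vendor equal weight by summing per-vendor proportions."""
    shares = Counter()
    for counts in submissions.values():
        vendor_total = sum(counts.values())
        for category, n in counts.items():
            shares[category] += n / vendor_total
    return [category for category, _ in shares.most_common()]


if __name__ == "__main__":
    print("raw counts:  ", rank_by_raw_counts(SUBMISSIONS))
    print("vendor share:", rank_by_vendor_share(SUBMISSIONS))
```

With these made-up numbers, pooling raw counts puts injection first because one vendor submitted far more findings than the others, while equal vendor weighting puts XSS first. Neither answer is more "scientific" than the other. Someone has to choose.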

Nothing wrong with that. That’s the process we have, and the results tend to be helpful.

The problem is we don’t tell people this is what we do. What we tell them is that we are EXPERTS and we have DATA and we use SCIENCE to produce the results!

And so people think that if they had just submitted that one other dataset, or if their competitor hadn’t submitted more data, or if they had just had that one intern to submit from their vulnerability database, the list would have somehow ended up different.

It probably wouldn’t.

The truth about these projects (and I’ve been on a few) is that the teams tend to be highly confused about what the goal ultimately is. Are we listing risks? Are we trying to help developers avoid mistakes? Are we helping CISOs prioritize their AppSec program?

What’s the goal with these lists?

The former OWASP Top 10 leader said his summary of the goal is:

“Helping a CISO understand what to focus on to reduce the most application security risk.”

I like that. It’s simple and easy to communicate.

So then someone smart in the session today asked a great question:

“If the goal is identifying the top risks for CISOs, why aren’t we using breach data?”

Indeed, why are we only collecting random sets of vulnerabilities from random vendors? Especially when we’re not able to use that data to actually produce a list anyway. We’re sort of arbitrarily deciding, as a group, what the list should look like and using the data to guide that.

So why not use more inputs?

I use breach “data” to guide my testing methodologies, my risk rating methodologies, and similar systems. This type of input tells us what is actually working for attackers, and what is hurting us most and most often.

That’s good input, and we should be using it for OWASP lists as well.

The obsession with vulnerability data is due to the flawed belief that the data, combined with rock-solid algorithms, are producing the list themselves. They’re not. It’s people. Smart and dedicated people trying to do the right thing. The sporadic and biased data (combined with our pet formulas in Excel) are nothing but jazz hands hiding this fact.

So the problem isn’t the data submission. It’s not the form. It’s not the time you have to submit. It’s not any of that.

The problem is not being honest about how this sausage is made.

It’s a few security experts doing the best they can with some limited data and a spreadsheet. And the results are usually pretty damn good.

So let’s be honest about that. And let’s get more experts to give their opinions.

Summary

Clearly describe the process of building the list. That description might include some sort of analysis of what was submitted, but it shouldn’t make the process sound scientific.

Take more inputs as to what should be on the list. We need input from the field on what people are seeing, and that isn’t always going to come in the form of vulns. If it’s opinion that’s ultimately making up the list, let’s get more opinions.

Let’s also add breach data (and other types of data) to what we look at when arriving at those opinions.

Ultimately the answer here is transparency, and taking ourselves a bit less seriously.

It’s art, and it’s hard, and the people running these projects deserve our admiration. But let’s not pretend it’s something that it isn’t. Pretending creates problems that could be avoided.