January 29, 2012

Why Threat Modelling fails in practice

There's some malignancy in the way the Internet IT Security community approached security in the 1990s that became a cancer in our protocols in the 2000s. Eventually I worked out that the problem with the aphorism What's Your Threat Model (WYTM?) was the absence of a necessary first step - the business model - which lack permitted threat modelling to be de-linked from humanity without anyone noticing.

But I still wasn't quite there; it still felt like wise old men telling me "learn these steps, swallow these pills, don't ask for wisdom."

In my recent risk management work, it has suddenly become clearer. Taking from notes and paraphrasing, let me talk about threats versus risks, before getting to modelling.

A threat is something that threatens, something that can cause harm, in the abstract sense. For example, a bomb could be a threat. So could an MITM, an eavesdropper, or a sniper.

But, separating the abstract from the particular, a bomb does not necessarily cause a problem unless there is a connection to us. Literally, it has to be capable of doing us harm, in a direct sense. For this reason, the methodologists say:

Risk = Threat * Harm

Any random bomb can't hurt me, approximately, but a bomb close to me can. With a direct possibility of harm to us, a threat becomes a risk. The methodologists also say:

Risk = Consequences * Likelihood

That connection or context of likely consequences to us suddenly makes it real, as well as hurtful.
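As a rough illustration of that second formula, here is a minimal Python sketch. The `Exposure` class, the `risk` function and every number in it are assumptions invented for the example, not measurements or part of any methodology's official notation. It only shows how the same abstract threat yields very different risks once likelihood and consequences to a particular person are filled in.

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    """How one abstract threat connects to one particular party (hypothetical model)."""
    likelihood: float    # chance that the threat actually reaches this party, 0..1
    consequences: float  # cost to this party if it does, in arbitrary units


def risk(e: Exposure) -> float:
    """Risk = Consequences * Likelihood."""
    return e.consequences * e.likelihood


# Same threat ("a bomb"), two different contexts. All numbers are made up.
any_random_bomb = Exposure(likelihood=1e-9, consequences=1_000_000)
bomb_under_my_car = Exposure(likelihood=0.5, consequences=1_000_000)

print("any random bomb:  ", risk(any_random_bomb))
print("bomb under my car:", risk(bomb_under_my_car))
```

The threat is identical in both cases; only the connection to the victim changes, and with it the risk.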

A bomb then is a threat, but just any bomb doesn't present a risk to anyone, to a high degree of reliability. A bomb under my car is now a risk! To move from threats to risks, we need to include places, times, agents, intents, chances of success, possible failures ... *victims* ... all the rest needed to turn the abstract scariness into direct painful harm.

We need to make it personal.

To turn the threatening but abstract bomb from a threat to a risk, consider a plane, one which you might have a particular affinity to because you're on it or it is coming your way:

⇒ people dying
⇒ financial damage to plane
⇒ reputational damage to airline
⇒ collateral damage to other assets
⇒ economic damage caused by restrictions
⇒ war, military raids and other state-level responses

Lots of risks! Speaking of bombs as planes: I knew someone booked on a plane that ended up in a tower -- except she was late. She sat on the tarmac for hours in the following plane.... The lovely lady called Dolly who cleaned my house had a sister who should have been cleaning a Pentagon office block, but for some reason ... not that day. Another person I knew was destined to go for coffee at ground zero, but woke up late. Oh, and his cousin was a fireman who didn't come home that day.

Which is perhaps to say, that day, those risks got a lot more personal.

We all have our very close stories to tell, but the point here is that risks are personal, threats are just theories.

Let us now turn that around and consider *threat modelling*. By its nature, threat modelling only deals with threats and not risks and it cannot therefore reach out to its users on a direct, harmful level. Threat modelling is by definition limited to theoretical, abstract concerns. It stops before it gets practical, real, personal.

Maybe this all amounts to no more than a lot of fuss about semantics?

To see if it matters, let's look at some examples: If we look at that old saw, SSL, we see rhyme. The threat modelling done for SSL took the rather abstract notions of CIA -- confidentiality, integrity and authenticity -- and ended up inverse-pyramiding on a rather too-perfect threat of MITM -- Man-in-the-Middle.

We can also see from the lens of threat analysis versus risk analysis that the notion of creating a protocol to protect any connection, an explicit choice of the designers, led to them not being able to do any risk analysis at all; the notion of protecting certain assets such as credit cards as stated in the advertising blurb was therefore conveniently not part of the analysis (which we knew, because any risk analysis of credit cards reveals different results).

Threat modelling therefore reveals itself to be theoretically sound but not necessarily helpful. It is then no surprise that SSL performed perfectly against its chosen threats, but did little to offend the risks that users face. Indeed, arguably, as much as it might have stopped some risks, it helped other risks to proceed in natural evolution. Because SSL dealt perfectly with all its chosen threats, it ended up providing a false sense of security against harm-incurring risks (remember SSL & Firewalls?).

OK, that's an old story, and maybe completely and boringly familiar to everyone else? What about the rest? What do we do to fix it?

The challenge might then be to take Internet protocol design from the very plastic, perfect but random tendency of threat modelling and move it forward to the more haptic, consequences-directed chaos of risk modelling.

Or, in other words, we've got to stop conflating threats with risks.

Critics can rush forth and grumble, and let me be the first: Moving to risk modelling is going to be hard, as any Internet protocol at least at the RFC level is generally designed to be deployed across an extremely broad base of applications and users.

Remember IPSec? Do you feel the beat? This might be the reason why we say that only end-to-end security cuts the mustard, because end-to-end implies an application, and this draws in the users to permit us to do real risk modelling.

It might then be impossible to do security at the level of an Internet-wide, application-free security protocol, a criticism that isn't new to the IETF. Recall the old ISO layer 5, sometimes called "the security layer"?

But this doesn't stop the conclusion: threat modelling will always fail in practice, because by definition, threat modelling stops before practice. The place where users are being exposed and harmed can only be investigated by getting personal - including your users in your model. Threat modelling does not go that far; it does not consider the risks against any particular set of users that will be harmed by those risks in full flight. Threat modelling stops at the theoretical, and must by the law of ignorance fail in the practical.

Risks are where harm is done to users. Risk modelling therefore is the only standard of interest to users.

I've several times mentioned being called in as consultants to a small client/server startup that wanted to do payment transactions on the server; they had also invented something called "SSL" that they wanted to use, and the result is now frequently called "electronic commerce".

Somewhat as a result, in the mid-90s, we were invited to participate in the x9a10 financial standard working group which had been given the requirement to preserve the integrity of the financial infrastructure for *ALL* retail payments. We did some amount of detailed, end-to-end, threat & vulnerability studies of all the different kinds of retail payments (skimming, eavesdropping, data breaches, point-of-sale, internet, unattended, face-to-face, etc).

The result was the x9.59 financial transaction standard. It did nothing to address/prevent the skimming, eavesdropping, data breaches and other kinds of threats ... where the attacker uses the information to perform fraudulent financial transactions. What it did do was make the information useless to the crooks. Some past references: http://www.garlic.com/~lynn/x959.html#x959

Now the major use of "SSL" in the world today has been hiding transaction information (during transmission) as a countermeasure to attackers harvesting the information for fraudulent financial transactions. x9.59 doesn't stop or prevent the harvesting ... it just eliminates the usefulness of the information to the crooks (eliminates their ability to use the information for fraudulent financial transactions) ... and therefore eliminates the need to hide transaction information and also eliminates that use of SSL.

Later we consulted with the two people from Ellison's conference room meeting (now responsible for "commerce server"). Part of the electronic commerce was something called the payment gateway ... it handled financial transactions between webservers and payment networks ... some past posts: http://www.garlic.com/~lynn/subnetwork.html#gateway

The gateway was replicated for no-single-point-of-failure with multiple connections into different parts of the internet backbone ... as well as multiple layers of hardening against many kinds of attacks. One of the scenarios was that hardening required casting lots of things in concrete as a countermeasure to various kinds of attacks (involving corruption of service) while at the same time staying agile enough to change and adapt when there were various kinds of failures.

One of the problems I've repeatedly seen with threat modeling is that people take it too personally. That is, their bias is towards what frightens them, for whatever reason, not what actually might cause them harm. We see this in driving-v-flying, where people would rather drive 1000 km than fly, because they feel that they cannot be blown out of the sky by terrorists. But a look at the normalised risk of driving-v-flying would suggest they have no real perception of the actual probabilities of injury or death involved.

Another problem I've often seen is being too specific about individual threats as opposed to the class of threat they fall into. One result is that no action is taken, because individual threats are below some threshold, but if the individual threats are amalgamated into a class of threat, the threshold for action may be easily surpassed.
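The amalgamation point can be sketched in a few lines of Python. Everything here is hypothetical: the threat names, the class labels, the loss figures and the `ACTION_THRESHOLD` value are invented for illustration, and simply summing expected losses within a class is one assumed way of amalgamating, not the only one.

```python
from collections import defaultdict

# Hypothetical: act when expected yearly loss exceeds this value.
ACTION_THRESHOLD = 10.0

# (threat, class of threat, expected yearly loss) -- all values invented.
threats = [
    ("phishing email", "credential theft", 4.0),
    ("keylogger", "credential theft", 4.0),
    ("shoulder surfing", "credential theft", 3.0),
    ("office fire", "site loss", 6.0),
]


def totals_by_class(items):
    """Sum the expected loss of every threat that falls into the same class."""
    totals = defaultdict(float)
    for _name, threat_class, loss in items:
        totals[threat_class] += loss
    return dict(totals)


# Judged one at a time, nothing crosses the threshold, so nothing is done.
individually_actionable = [
    name for name, _cls, loss in threats if loss > ACTION_THRESHOLD
]

# Judged as classes, one class does cross it.
classes_actionable = [
    cls for cls, total in totals_by_class(threats).items()
    if total > ACTION_THRESHOLD
]

print("individual threats needing action:", individually_actionable)
print("classes needing action:", classes_actionable)
```

With these made-up figures no single threat justifies action, yet the "credential theft" class as a whole does.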

Another aspect of using classes of risk rather than individual risks is "getting mitigation for free". If you look at any business there is a reasonable risk of fire, but for the norm of businesses the risk of a bomb is very, very small. However, the mitigation for both is almost identical, "inform the authorities and evacuate", thus a tiny change in the procedure covers both eventualities. As you mentioned 9/11, it shows this actually happening for real. In one business they very regularly carried out the "fire drill" and practiced the evacuation. Now few if any businesses had the threat of "aircraft strike" in their assessments and this business did not either. However, when an "aircraft strike" did happen, the well practiced fire drill evacuation was followed and the survival rate for that company was a lot higher than for similarly placed businesses that did not do regular fire drill practice.

The use of classes of threat can also be a good indicator of when your mitigation strategy is wrong. As mentioned by one of the other commenters, sometimes you have to have a serious re-think and thus a complete change of direction.

But that aside, when it comes to ICT security there is a very real difference between threats to tangible physical objects and threats to intangible information objects. And all too often you see the wrong analysis being used to assess the risk. That is, underlying axioms (assumptions) that only hold in the physical world get applied without question to the information world.

The most obvious example of this is the difference in theft. You usually know when a physical object has been subject to theft because you no longer have it in your possession. Not so with an information object: when it is subject to theft, what is actually stolen is a copy of the information, so you still have the original and remain blissfully ignorant of the theft until some other event acts as an indicator that a theft might have occurred.

The next most obvious is locality or distance. For the theft of tangible objects you have to be local to the object to steal it. With intangible objects you can be on the other side of the globe, because you co-opt the victim's systems into acting as the local thief.

Then there are "force multipliers". In the physical world you are constrained by physical limitations, one of which is how much physical work you can do. The physical world solution to this problem is to use energy to build machines that use more energy to do greater levels of work. Thus the limit on what an individual can do in the physical world is constrained by their access to energy and the manufacture of force multiplying tools. In the intangible world force multipliers are just information; the cost of duplicating them is minimal, and from an attacker's point of view virtually zero, because they don't pay for the energy or systems involved in the duplication or operation of the force multiplier.

Failure to recognise the axioms derived from the physical world and their corresponding limitations when applied to the information world will lead to the failure of any measures that rely on the limitations imposed by those axioms.

How could we meaningfully model individual users' risks when we develop standard protocols or COTS software? These will be used in a variety of contexts with a broad range of risk profiles. I think threat modeling makes sense in this context, be it only as a set of documented design assumptions that the individual user could match against a risk profile. If done properly (I just hypothesize that it could be done properly), a threat model may even constitute a risk model, only at a higher level of abstraction, specifying the "average" or "typical" risks that prevail in the user population. If we interpret the relationship between threats and risks this way, the crucial question becomes how much variance there is in the risk profiles and how we can find out how close to or how far from the threat model an individual user is.
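To make that last question a little more concrete, here is a minimal Python sketch of one possible reading. It is an assumption on my part, not an established method: each user's risk profile is a list of scores against the same threats, the "threat model" is taken to be the per-threat average, the spread is the population variance, and "how far" is plain Euclidean distance. The user names, threat names and scores are all invented.

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical risk scores per user for the same three threats (invented numbers).
THREATS = ["eavesdropping", "mitm", "phishing"]
profiles = {
    "alice": [1.0, 1.0, 8.0],
    "bob": [2.0, 1.0, 6.0],
    "carol": [9.0, 7.0, 1.0],
}

# One column per threat, holding every user's score for that threat.
columns = list(zip(*profiles.values()))

# The "threat model" read as the average risk profile of the population.
threat_model = [mean(col) for col in columns]

# How much users disagree on each threat.
variance = [pvariance(col) for col in columns]


def distance(profile, model):
    """Euclidean distance between one user's profile and the threat model."""
    return sqrt(sum((p - m) ** 2 for p, m in zip(profile, model)))


distances = {user: distance(p, threat_model) for user, p in profiles.items()}
furthest = max(distances, key=distances.get)

for name, avg, var in zip(THREATS, threat_model, variance):
    print(f"{name}: average {avg:.1f}, variance {var:.1f}")
print("user least well served by the average model:", furthest)
```

A large variance on a threat, or a user sitting far from the average, would be a hint that the shared threat model says little about that user's actual risks.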