
Today’s security story is people turning security off. For me, the fact that it’s even a story is the story. This particular story is covered in The Register, which explains (to nobody’s surprise) that some of the patches to fix issues identified in CPUs (think Spectre, Meltdown, etc.) can actually slow down the applications running on them. The problem is that, in some cases, they don’t slow them down a little bit, but rather a lot. By which I mean up to 50%. And if you’ve bought expensive hardware – or rented it[1] – then you’d generally prefer it if it runs your applications/programs/workloads quickly, rather than just half as fast as they might run.

And so you turn off the security patches. Your decision: fine.

No, stop: this isn’t what has happened.

The mythical “you”, the person running the workload, isn’t the person who makes the decision, in most cases, because it’s been made for you. This is the real story.

Linus Torvalds, and a bunch of other experts in the Linux kernel[2], have decided that although the patch that could make your workloads secure is available, the functionality that does it should be “off” by default. They reason – quite correctly, in my opinion – that the vast majority of people running workloads won’t easily be able to turn this functionality on themselves.

They also reason – again, correctly, in my opinion – that most people will care more about how quickly their workloads run than about how secure they are. I’m not happy about this, but that’s the way it is.

What I worry about is the final step in the logic behind the decision. I’m going to quote Linus:

“Have you seen any actual realistic attacks for normal human users?” he asked. “Things where the kernel should actually care? The JavaScript thing is for the browser to fix up, not for the kernel to say ‘now everything should run up to 50 per cent slower.'”

I get the reasoning behind this, but I don’t like it. To give some context, somebody came up with an example attack which could compromise certain workloads, and Linus points out that there are better ways to fix this attack than fixing it in the kernel. My concerns are two-fold:

although there may be better places to fix that particular attack, a kernel-level fix is likely to fix an entire class of attacks, meaning better protection for users who are using any application which might include an attack vector.

pointing out that there haven’t been any attacks yet not only ignores the fact that there is a future out there[3] but also points malicious actors in the direction of a likely attack vector.

Now, I know that the more dedicated malicious actors are already looking for these things, but do we really need to advertise?

What’s my fix?

I don’t have one, or at least not an easy one.

Somebody, somewhere, needs to decide whether security is turned on or off. What I’d honestly like to see is an easier set of controls to allow people to turn on or off security, and to understand the trade-offs when they do that. The problems with that are:

the trade-offs are often much more complex than just “fast and insecure” or “slow and secure”, and are really difficult to explain.

in order to make a sensible decision about trade-offs, people need to understand risk. And people are awful at understanding risk.

And there’s a “chicken and egg problem”[7] here: people won’t understand risk until they are offered the chance to make decisions, but there’s little incentive to offer them complex decisions unless they understand risk.

My plea? Where possible, expose risk, and explain what it is. And if you’re turning off security-related functionality, make it easy to turn back on for those who need it.
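As one small illustration of "exposing risk": recent Linux kernels report the state of CPU vulnerability mitigations as text files under /sys/devices/system/cpu/vulnerabilities. The sketch below reads that directory and attaches a coarse label to each entry. The function names and the labels are assumptions made for this example, and the status strings vary between kernel versions, so treat it as a starting point rather than a complete tool.

```python
from pathlib import Path

# Directory where recent Linux kernels report CPU vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(vuln_dir: Path = VULN_DIR) -> dict[str, str]:
    """Return {vulnerability name: status line}, or {} if the directory is absent."""
    if not vuln_dir.is_dir():
        return {}
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


def classify(status_line: str) -> str:
    """Coarse label for a status line (the labels are this sketch's own)."""
    if status_line.startswith("Not affected"):
        return "not affected"
    if status_line.startswith("Mitigation"):
        return "mitigated"
    if status_line.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


if __name__ == "__main__":
    # Print one line per vulnerability: name, coarse label, raw kernel text.
    for name, line in read_mitigations().items():
        print(f"{name:28} {classify(line):14} {line}")
```

This only shows what the kernel says about itself. It does not explain the trade-off behind each setting, which is the harder part of the problem described above.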

1 – a quick heads-up: this is what “deploying to the cloud” actually is.

2 – what sits at the bottom of many of the workloads that are running in servers.

3 – hopefully. If the Three Minute Warning[4] sounds while you’re reading this, you may wish to duck and cover. You can come back to it later[6].

4 – “… sounds like this …”[5].

5 – 80s reference.

6 – or not. See [3].

7 – for non-native English readers, this means “a problem where the solution requires two pieces, both of which are dependent on each other”.

Last week, Bloomberg published a story detailing how Chinese state actors had allegedly forced employees of Supermicro (or companies subcontracting to them) to insert a small chip – the silicon in the title – into motherboards destined for Apple and Amazon. The article talked about how an investigation into these boards had uncovered this chip and the steps that Apple, Amazon and others had taken. The story was vigorously denied by Supermicro, Apple and Amazon, but that didn’t stop Supermicro’s stock price from tumbling by over 50%.

I have heard strong views expressed by people with expertise in the topic on both sides of the argument: that it probably didn’t happen, and that it probably did. One side argues that the denials by Apple and Amazon, for instance, might have been impacted by legal “gagging orders” from the US government. An opposing argument suggests that the Bloomberg reporters might have confused this story with a similar one that occurred a few months ago. Whether this particular story is correct in every detail, or a fabrication – intentional or unintentional – is not what I’m interested in at this point. What I’m interested in is not whether it did happen in this instance: the clear message is that it could have happened, and it could be happening now.

I’ve written before about State Actors, and whether you should worry about them. There’s another question which this story brings up, which is possibly even more germane: what can you do about it if you are worried about them? This breaks down further into two questions:

how can I tell if my systems have been compromised?

what can I do if I discover that they have?

The first of these is easily enough to keep us occupied for now[1], so let’s spend some time on that. First, let’s define six types of compromise, think about how they might be carried out, and then consider the questions above for each:

supply-chain hardware compromise;

supply-chain firmware compromise;

supply-chain software compromise;

post-provisioning hardware compromise;

post-provisioning firmware compromise;

post-provisioning software compromise.

This article doesn’t have space to go into detail on these types of attack, so it provides an overview of each instead[2].

Terms

Supply-chain – all of the steps up to when you start actually running a system. From manufacture through installation, including vendors of all hardware components and all software, OEMs, integrators and even shipping firms that have physical access to any pieces of the system. For all supply-chain compromises, the key question is the extent to which you, the owner of a system, can trust every single member of the supply chain[3].

Post-provisioning – any point after which you have installed the hardware, put all of the software you want on it, and started running it: the time during which you might consider the system “under your control”.

Hardware – the physical components of a system.

Software – software that you have installed on the system and over which you have some control: typically the Operating System and application software. The amount of control depends on factors such as whether you use proprietary or open source software, and how much of it is produced, compiled or checked by you.

Firmware – special software that controls how the hardware interacts with the standard software on the machine, the hardware that comprises the system, and external systems. It is typically provided by hardware vendors and its operation is opaque to owners and operators of the system.

Compromise types

See the table at the bottom of this article for a short summary of the points below.

Supply-chain hardware – there are multiple opportunities in the supply chain to compromise hardware, but the harder they are made to detect, the more difficult they are to perform. The attack described in the Bloomberg story would be extremely difficult to detect, but the addition of a keyboard logger to a keyboard just before delivery (for instance) would be correspondingly simpler.

Supply-chain firmware – of all the options, this has the best return on investment for an attacker. Assuming good access to an appropriate part of the supply chain, inserting firmware that (for instance) impacts network performance or leaks data over a wifi connection is relatively simple. The difficulty in detection comes from the fact that although it is possible for the owner of the system to check that the firmware is what they think it is, that measurement confirms only that the firmware matches what the vendor says they have supplied. So the “medium” rating relates only to firmware that was implanted by members of the supply chain who did not source the original firmware: otherwise, it’s “high”.

Supply-chain software – by this, I mean software that comes installed on a system when it is delivered. Some organisations will insist on “clean” systems being delivered to them[4], and will install everything from the Operating System upwards themselves. This means that they basically now have to trust their Operating System vendor[5], which is maybe better than trusting other members of the supply chain to have installed the software correctly. I’d say that it’s not too simple to mess with this in the supply chain, if only because checking isn’t too hard for the legitimate members of the chain.

Post-provisioning hardware – this is where somebody with physical access to your hardware – after it’s been set up and is running – inserts or attaches hardware to it. I nearly gave this a “high” rating for difficulty below, assuming that we’re talking about servers, rather than laptops or desktop systems, as one would hope that your servers are well-protected. But the ease with which attackers have shown that they can typically get physical access to systems, using techniques like social engineering, means that I’ve downgraded this to “medium”. Detection, on the other hand, should be fairly simple given sufficient resources (hence the “medium” rating), and although I don’t believe anybody who says that a system is “tamper-proof”, tamper-evidence is a much simpler property to achieve.

Post-provisioning firmware – when you patch your Operating System, it will often also patch firmware on the rest of your system. This is generally a good thing to do, as patches may provide security, resilience or performance improvements, but you’re stuck with the same problem as with supply-chain firmware: you need to trust the vendor. In fact, you need to trust both your Operating System vendor and their relationship with the firmware vendor.

Post-provisioning software – is it easy to compromise systems via their Operating System and/or application software? Yes: this we know. Luckily – though depending on the sophistication of the attack – there are generally good tools and mechanisms for detecting such compromises, including behavioural monitoring.
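The firmware check mentioned under supply-chain firmware usually amounts to comparing a digest of the image you hold with a digest the vendor has published. Here is a minimal sketch of that comparison, assuming the vendor publishes a SHA-256 hex digest; the function names and the idea of a local image file are assumptions for illustration, not any particular vendor's process.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large firmware images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """True if the image's SHA-256 equals the digest the vendor published."""
    return hmac.compare_digest(sha256_of(path), published_hex.strip().lower())
```

As noted above, a match tells you only that the image is the one the vendor says they supplied. It says nothing about whether that image is itself trustworthy.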

Table

| Compromise type | Attacker difficulty | Detection difficulty |
| --- | --- | --- |
| Supply-chain hardware | High | High |
| Supply-chain firmware | Low | Medium |
| Supply-chain software | Medium | Medium |
| Post-provisioning hardware | Medium | Medium |
| Post-provisioning firmware | Medium | Medium |
| Post-provisioning software | Low | Low |
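The table can also be read as data. The sketch below encodes its ratings and lists every compromise type that is rated harder to detect than to perform. Mapping Low, Medium and High to 1, 2 and 3 is an assumption of this example, made only so that the ratings can be compared.

```python
# Numeric ordering for the ratings (an assumption of this sketch).
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

# (compromise type, attacker difficulty, detection difficulty), from the table.
RATINGS = [
    ("Supply-chain hardware", "High", "High"),
    ("Supply-chain firmware", "Low", "Medium"),
    ("Supply-chain software", "Medium", "Medium"),
    ("Post-provisioning hardware", "Medium", "Medium"),
    ("Post-provisioning firmware", "Medium", "Medium"),
    ("Post-provisioning software", "Low", "Low"),
]


def harder_to_detect_than_perform(ratings=RATINGS):
    """Names of compromise types whose detection rating exceeds the attack rating."""
    return [name for name, attack, detect in ratings if LEVEL[detect] > LEVEL[attack]]


print(harder_to_detect_than_perform())  # → ['Supply-chain firmware']
```

Every other row has matching ratings, so supply-chain firmware is the only type where the attacker's job is rated easier than the defender's.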

Conclusion

What are your chances of spotting a compromise on your system? I would argue that they are generally pretty much in line with the difficulty of performing the attack in the first place: with the glaring exception of supply-chain firmware. We’ve seen attacks of this type, and they’re very difficult to detect. The good news is that there is some good work going on to help detection of these types of attacks, particularly in the world of Linux[6] and open source. In the meantime, I would argue our best forms of defence are currently:

for supply-chain: build close relationships, use known and trusted suppliers. You may want to restrict as much as possible of your supply chain to “friendly” regimes if you’re worried about State Actor attacks, but this is very hard in the global economy.

for post-provisioning: lock down your systems as much as possible – both physically and logically – and use behavioural monitoring to try to detect anomalies in what you expect them to be doing.
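At its simplest, behavioural monitoring means writing down what you expect a system to be doing and flagging anything else. The sketch below compares an expected set of listening services with an observed set. The Listener type, the function name and the sample data are all hypothetical, and collecting the observed set from a real host is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    """A process listening on a port (hypothetical record for this sketch)."""
    process: str
    port: int


def compare_to_baseline(
    expected: set[Listener], observed: set[Listener]
) -> dict[str, list[Listener]]:
    """Report listeners that are unexpected, and expected ones that are missing."""
    def order(listener: Listener) -> tuple[str, int]:
        return (listener.process, listener.port)

    return {
        "unexpected": sorted(observed - expected, key=order),
        "missing": sorted(expected - observed, key=order),
    }


if __name__ == "__main__":
    # Made-up example data: one extra listener appears on the host.
    baseline = {Listener("sshd", 22), Listener("nginx", 443)}
    seen = {Listener("sshd", 22), Listener("nginx", 443), Listener("nc", 4444)}
    report = compare_to_baseline(baseline, seen)
    print("unexpected:", report["unexpected"])
    print("missing:", report["missing"])
```

A set difference like this only catches changes in what is running. Real behavioural monitoring also looks at how much, how often and to whom a system talks, which needs a richer baseline.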

1 – I’ll try to write something on this other topic in a different article.

2 – depending on interest, I’ll also consider a series of articles to go into more detail on each.

3 – how certain are you, for instance, that your delivery company won’t give your own government’s security services access to the boxes containing your equipment before they deliver them to you?

4 – though see above: what about the firmware?

5 – though you can always compile your own Operating System if you use open source software[6].

There is a view that because Open Source Software is subject to review by many eyes, all the bugs will be ironed out of it. This is a myth.

Writing code is hard. Writing secure code is harder: much harder. And before you get there, you need to think about design and architecture. When you’re writing code to implement security functionality, it’s often based on architectures and designs which have been pored over and examined in detail. They may even reflect standards which have gone through worldwide review processes and are generally considered perfect and unbreakable*.

However good those designs and architectures are, though, there’s something about putting things into actual software that’s, well, special. With the exception of software proven to be mathematically correct**, being able to write software which accurately implements the functionality you’re trying to realise is somewhere between a science and an art. This is no surprise to anyone who’s actually written any software, tried to debug software or divine software’s correctness by stepping through it. It’s not the key point of this post either, however.

Nobody*** actually believes that the software that comes out of this process is going to be perfect, but everybody agrees that software should be made as close to perfect and bug-free as possible. It is for this reason that code review is a core principle of software development. And luckily – in my view, at least – much of the code that we use these days in our day-to-day lives is Open Source, which means that anybody can look at it, and it’s available for tens or hundreds of thousands of eyes to review.

And herein lies the problem. There is a view that because Open Source Software is subject to review by many eyes, all the bugs will be ironed out of it. This is a myth. A dangerous myth. The problems with this view are at least twofold. The first is the “if you build it, they will come” fallacy. I remember when there was a list of all the websites in the world, and if you added your website to that list, people would visit it****. In the same way, the number of Open Source projects was (maybe) once so small that there was a good chance that people might look at and review your code. Those days are past – long past. Second, for many areas of security functionality – crypto primitives implementation is a good example – the number of suitably qualified eyes is low.

Don’t think that I am in any way suggesting that the problem is any lesser in proprietary code: quite the opposite. Not only are the designs and architectures in proprietary software often hidden from review, but you have fewer eyes available to look at the code, and the dangers of hierarchical pressure and groupthink are dramatically increased. “Proprietary code is more secure” is less myth, more fake news. I completely understand why companies like to keep their security software secret – and I’m afraid that the “it’s to protect our intellectual property” line is too often a platitude they tell themselves, when really, it’s just unsafe to release it. So for me, it’s Open Source all the way when we’re looking at security software.

So, what can we do? Well, companies and other organisations that care about security functionality can – and, I believe, have a responsibility to – expend resources on checking and reviewing the code that implements that functionality. That is part of what Red Hat, the organisation for whom I work, is committed to doing. Alongside that, we, the Open Source community, can find – and are finding – ways to support critical projects and improve the amount of review that goes into that code*****. And we should encourage academic organisations to train students in the black art of security software writing and review, not to mention highlighting the importance of Open Source Software.

We can do better – and we are doing better. Because what we need to realise is that the reason the “many eyes hypothesis” is a myth is not that many eyes won’t improve code – they will – but that we don’t have enough expert eyes looking. Yet.

* Yeah, really: “perfect and unbreakable”. Let’s just pretend that’s true for the purposes of this discussion.

** …and which still relies on the design and architecture actually to do what you want – or think you want – of course, so good luck.

*** nobody who’s actually written more than about 5 lines of code (or more than 6 characters of Perl)