Posted by samzenpus on Friday August 07, 2015 @07:57AM from the protect-ya-neck dept.

jfruh writes: Security researcher Christopher Domas has demonstrated a method of installing a rootkit in a PC's firmware that exploits a feature built into every x86 chip manufactured since 1997. The rootkit infects the processor's System Management Mode, and could be used to wipe the UEFI or even to re-infect the OS after a clean install. Protection features like Secure Boot wouldn't help, because they too rely on the SMM to be secure.

For AMD, it was really about tightening up communication between the CPU and RAM by putting the memory controller on die (at the L2 cache level of the second core of the AM2 Athlon X2 processor, though it must have been there before that, because of the single-core processors that came before dual core became a thing). So it could affect AMD computers back to 2005 or so. Does that even sound right?

Design flaw my ass. I bet it was there deliberately and everybody knows who originally requested it. I just love the good ol US of A.

From the article linked:

"To exploit the vulnerability and install the rootkit, attackers would need to already have kernel or system privileges on a computer. That means the flaw can't be used by itself to compromise a system, but could make an existing malware infection highly persistent and completely invisible."

This doesn't let an outsider break into the system; it is a flaw that is only useful if you have already compromised the machine.

Like Windows, Linux is a complex rambling Swiss cheese and privilege escalations are pretty common.

Lean security protocols need to come first, which is why Qubes OS [qubes-os.org] is based on a Type 1 hypervisor (Xen). An attacker can try to use an exploit (like in OP) all they want in an untrusted domain, but they aren't going to get access to the hardware (or the other VMs, unless the user has done something to specifically expose those VMs to the attack).

This assumes there is a security layer that is free of exploitable bugs and that there is no way to influence the lower security layers in a way that can exploit bugs in those layers.

That's a very big assumption unless the security layer you are talking about AND all lower security layers are so simple that the code can be proven bug-free by inspection.

SMM, a.k.a. Ring -1, has been present for a long time, and does what the name says: it allows for things like emergency power-shutdown handling ("you have 50ms to sync system state before we can't guarantee power quality any more"). Yes, it's Ring -1, and you have to be careful how you misuse it, but the fact that it works as documented is hardly a new security flaw. This was documented as a security concern at least 15 years ago.

Just read the WP, it points out an ancient APIC compatibility hack that allows you to escalate from Ring 0 to Ring -1 (SMM). So in other words, if you're already running at Ring 0 to start with, you can get into SMM. Sounds like an example of what Raymond Chen [msdn.com] calls an "other side of the airtight hatchway" attack: you already have to have complete system privs in order to carry out a privileged attack.

My understanding is that SMM is used, before all the TCG stuff about Secure Boot, etc., basically to control fans and shut down the system if the temperature is too high. And also to make USB keyboards appear as PS/2 hardware to DOS.

Are those functions really so expensive that they couldn't be offloaded to hardware on a chipset instead of trying to have the main CPU in your system act like its own hardware watchdog?

To act as a mouse visible to DOS, it has to interact with the system interrupt tables. Remember the TSR days of old? You're putting stuff into main memory to have it executed whenever a certain interrupt happens. Which memory? Well, you need at least the USB Host Controller areas, plus something in low memory if you want it available to the BIOS.

Controlling fans, monitoring temperature, issuing safe shutdown commands etc.? Again all happens by talking to the main processo

Because that core would STILL NEED to interface with main memory just the same. It would still need to access the same hardware as the main processor does. It would still need to operate at the level it requires to do those operations such that they are visible to the main processor - and that's what SMM does!

All you've done is replace an in-die kind of SMM with an external chip that needs more complicated routing, all kinds of interactions with main memory (at DMA speed, no less) and peripheral buses, et

My understanding is that SMM is used, before all the TCG stuff about Secure Boot, etc., basically to control fans and shut down the system if the temperature is too high. And also to make USB keyboards appear as PS/2 hardware to DOS.

Intel uses the keyboard's chip to fix an issue with memory management; they don't want to mess with the keyboards.

Just read what was posted and the reply. It matters more in a DOS environment, since the issue is a gap in the first megabyte of accessible memory.

They remap the LAPIC to overlap the SMM memory region, which makes data loads by the SMM code fetch values from the LAPIC registers instead of from memory. Here [blackhat.com] you can find the slides and the whitepaper of the Black Hat conference talk.
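
To make the overlap concrete, here is a rough sketch of the range check involved. It is an illustration only, not code from the talk or the whitepaper. The IA32_APIC_BASE layout (base address from bit 12 up, enable flag in bit 11, a 4 KB register window, default base 0xFEE00000) follows Intel's documented layout with 36-bit physical addressing assumed; the SMRAM base and size in the example are made-up values, and the function names are hypothetical.

```python
# Illustrative sketch: does the local APIC's MMIO window overlap SMRAM?
# The SMRAM range in the example below is an assumed value, not from the talk.

APIC_WINDOW_SIZE = 0x1000        # LAPIC registers occupy one 4 KB page
APIC_BASE_MASK = 0xFFFFFF000     # bits 12..35 of IA32_APIC_BASE (36-bit addressing assumed)
DEFAULT_APIC_BASE = 0xFEE00000   # architectural default location


def apic_base_from_msr(msr_value: int) -> int:
    """Extract the physical base address from an IA32_APIC_BASE MSR value."""
    return msr_value & APIC_BASE_MASK


def ranges_overlap(start_a: int, size_a: int, start_b: int, size_b: int) -> bool:
    """True if the half-open ranges [start, start + size) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def apic_overlaps_smram(msr_value: int, smram_base: int, smram_size: int) -> bool:
    """True if the APIC register window lands anywhere inside SMRAM."""
    base = apic_base_from_msr(msr_value)
    return ranges_overlap(base, APIC_WINDOW_SIZE, smram_base, smram_size)


if __name__ == "__main__":
    smram_base, smram_size = 0x7F800000, 0x800000   # hypothetical 8 MB SMRAM
    normal = DEFAULT_APIC_BASE | (1 << 11)           # APIC enabled at the default base
    moved = 0x7F801000 | (1 << 11)                   # APIC relocated into SMRAM
    print(apic_overlaps_smram(normal, smram_base, smram_size))
    print(apic_overlaps_smram(moved, smram_base, smram_size))
```

With the APIC at its default base the check comes back negative; once Ring 0 code moves the base into the SMRAM range, reads that SMM code expects to hit memory would hit APIC registers instead.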

System Management Mode is a feature. It's meant to render separate processors unnecessary for tasks like temperature management and system specific keyboard shortcuts. These functions need to work even if an unsupported or no operating system is running. Consequently SMM behaves almost like a separate processor. That's not a flaw, that's necessarily so.

The problem isn't SMM per se. It's that there is no way to be sure what code is executing in SMM, because there is no way to guarantee which firmware the system is running. Basic firmware should be in ROM (not flash. Read Only Memory.) And it should only do one thing: Load the actual firmware from a removable medium, like a micro SD card. With all writable storage in the system accessible to external inspection, there would at least be a chance to find and reliably remove infections.
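
As a rough idea of what "external inspection" could mean in practice: dump the writable firmware storage from another machine and compare it against a known-good image. The sketch below assumes you already have a trusted SHA-256 digest from somewhere (that part is the hard bit and is not shown); the function name and the sample bytes are hypothetical.

```python
import hashlib
import hmac


def image_matches(image: bytes, known_good_sha256_hex: str) -> bool:
    """Compare a dumped firmware image against a trusted SHA-256 digest."""
    digest = hashlib.sha256(image).hexdigest()
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(digest, known_good_sha256_hex.lower())


if __name__ == "__main__":
    clean = b"example firmware image"            # stand-in for a real dump
    trusted = hashlib.sha256(clean).hexdigest()  # assumed to come from the vendor
    print(image_matches(clean, trusted))
    print(image_matches(clean + b"\x90", trusted))
```

This only works if the dump is taken by something the infected machine cannot influence, which is the point of keeping all writable storage removable.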

System Management Mode is a feature. It's meant to render separate processors unnecessary for tasks like temperature management and system specific keyboard shortcuts. These functions need to work even if an unsupported or no operating system is running. Consequently SMM behaves almost like a separate processor. That's not a flaw, that's necessarily so.

Well, the purpose of SMM mode is way back in ancient history, when PCs used DOS.

Back then "Power Management" was actually done by the system firmware - it to

It's all part of the bizarre non-design of the PC. The bootloader was always given far too much responsibility. Compare it to real computers designed by actual designers, and you never see a boot system as bloated as the PC's. There should never be a "need to work even if an unsupported or no operating system is running" feature.

"To exploit the vulnerability and install the rootkit, attackers would need to already have kernel or system privileges on a computer."

You know, even without this particular SMM attack vector, a hacker who already has system-level privileges on your PC renders your PC totally insecure. Besides, he can also rewrite the BIOS or various firmware components of your PC to allow his code to survive an HDD wipe.

The article is (as expected) light on details since this is newly disclosed. I've had machines where the BIOS would require confirmation from a connected PS/2 keyboard before certain changes were written. Added a need for physical access in order to write anything to SMM. All the terms have changed but it seems the same principle here. If I can update the firmware, I can keep a machine compromised forever.

Why is all the stuff broke? Why does all the stuff have holes in it? Why isn't there any stuff that isn't broke?

Because it's too complicated. There are too many possible failure modes and many of them can't be seen without a large effort to see them. About the only thing that might eliminate the holes is formal proofs, but that requires not only a complete revamp of how we code but makes coding itself immensely more difficult.

ARM processors from now on. All this stuff is broke.

ARM processors are just as broke as everything else. There's just fewer people looking to uncover the holes.

Fewer yes, but some are looking [blogspot.fr].
The bug in the Snapdragon TrustZone implementation described in the previous link has been fixed, BTW. Now, what percentage of Snapdragon-based smartphones in the field include the fix is anyone's guess.

Robert P. Colwell, _The Pentium Chronicles_, pp. 159-160: "For most of the Pentium design project, the floating point divider was exactly the same as the 486's. But late in the Pentium project, upper management requested that the entire project search for ways to make the die smaller.... the engineers working on the floating point divider did... an idea to save some space in a lookup table and one of them performed an analytical proof... That proof turned out to be flawed, but the insidious side effect of having

Formal proofs (of correctness, I assume) can't eliminate bugs or security flaws, though they are a cost-inefficient way to reduce bugs. A formal proof is only solving the same problem in two different languages (one the language of the formal proof), and diffing the result. It's not better or worse than any other static analysis tool, per se. It certainly won't help at all when the component is insecure by design, which is so often the problem. (Why does a document format need a way to execute arbitrary

You haven't understood the point. The formal proof abstracts away implementation details which are irrelevant for correctness. For this reason, it is much simpler to understand than the actual implementation. And it does not solve the same problem.

The proof may show that the implementation performs a certain function according to a certain specification. Knowing this rules out a lot of bugs in the implementation. E.g. a sorting function can be shown to return a correctly sorted list. Once you have a formal proof of this property you do not need to worry about any pesky implementation details of this sorting function anymore. The actual implementation could be really complicated because it is highly optimized and has many special cases which may make it
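
To show how small the specification is compared to an optimized implementation, here is the sorting spec written out as a plain check. Note this is a runtime check on one input, not a formal proof; a proof would establish the same property for every input. The function name is hypothetical.

```python
from collections import Counter


def meets_sort_spec(original: list, result: list) -> bool:
    """The whole specification of sorting: the result is in order
    and contains exactly the same items as the input."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    same_items = Counter(original) == Counter(result)
    return ordered and same_items


if __name__ == "__main__":
    data = [3, 1, 2, 1]
    print(meets_sort_spec(data, sorted(data)))   # any correct sort passes
    print(meets_sort_spec(data, [1, 2, 3]))      # dropped an item: fails
```

Nothing in those two conditions mentions how the sort works, which is why the spec stays the same no matter how hairy the implementation gets.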

If it can spot a buffer overrun during the sort (even though the correct result is achieved), then at least some value is added - but there are several static analysis tools.

Otherwise, you're just writing the same code in 2 different languages - one high level and one low level, and proving they are functionally equivalent. (In which case, why not just switch to the high level language for production). The whole idea just seems like a high-level language someone was too lazy to write a compiler for, so al

Qubes OS uses a Type 1 hypervisor to simplify and harden system security against such vulnerabilities. The privileged parts of the system are kept relatively small and aren't used for any user applications. All apps and even some drivers (like NICs) are assigned to VMs, which the user can give different trust/risk designations and color codes.

Because isolating hardware is considered part of the solution, Qubes systems need IOMMU hardware to operate securely. But this high degree of isolation is what elimina

Why is all the stuff broke? Why does all the stuff have holes in it? Why isn't there any stuff that isn't broke? ARM processors from now on. All this stuff is broke.

To a computer there is no difference between "good instructions" and "bad instructions". Any ability to update or improve existing code is also a vector for getting infected by malicious code. You can either allow updates and risk infection, or you can hard code the firmware and disallow updates, but then you're stuck with whatever the firmware is at the outset.

It's not broke. It's just upgradable. Unless you have solid protocols to control who can upgrade and what upgrades are applied, you are at ri

"ARM processors from now on" BWAHAHAHAHAHAHAHA. Good one. ARM is a joke compared to Intel. No company spends more on chip research and design than Intel. Further, ARM is supported by a ton of REALLY REALLY insecure operating systems. iOS and Android are both far more leaky by default than this exploit.

Three questions: 1) Is it possible to fix this with a downloadable firmware patch? 2) Will such a patch be forthcoming from Intel and/or AMD? 3) Until then, is there any way to protect my x86 machines, other than the obvious "avoid suspicious files" approach?

I did. 1) not stated [Intel is working on firmware patches; but to what extent? for every x86 processor ever made since 1997?]; 2) not stated; 3) not stated. I was hoping somebody here would have some more detailed information.

In the talk he said it was Sandy Bridge and older. Ivy Bridge/Haswell/Broadwell/Skylake are not affected. Ivy Bridge was apparently released in 2012 - https://en.wikipedia.org/wiki/... [wikipedia.org] But 1997-2012 is still a decent window of time.
In the talk he also said that it's un-patchable (it's not: the SMI handler can check whether the APIC overlaps the SMM range and change it).
He also said SMM controls every instruction from the boot. It doesn't. Maybe on the crappy Acer netbooks that he said he was using for tests. But on enterprise grade systems from Dell, Lenovo, or HP, they use "protected range registers" to stop SMM from being able to write to the code in the firmware.
It's a good find, but he's got a lot to learn about firmware still.
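
For what it's worth, the check-and-change idea mentioned above could look roughly like this. A real SMI handler is firmware code (C or assembly) reading and writing the MSR directly; this Python model only shows the logic. The mask assumes 36-bit physical addressing, the SMRAM range is whatever the platform configured, and the function name is hypothetical. It also assumes the default APIC base does not itself fall inside SMRAM.

```python
# Model of an SMI-entry sanity check, not real firmware code.
APIC_WINDOW_SIZE = 0x1000        # LAPIC register window is 4 KB
APIC_BASE_MASK = 0xFFFFFF000     # base address bits 12..35 (assumed width)
DEFAULT_APIC_BASE = 0xFEE00000   # architectural default


def sanitize_apic_base(msr_value: int, smram_base: int, smram_size: int) -> int:
    """Return the IA32_APIC_BASE value to write back: unchanged if the
    APIC window is clear of SMRAM, otherwise moved to the default base."""
    base = msr_value & APIC_BASE_MASK
    overlaps = (base < smram_base + smram_size
                and smram_base < base + APIC_WINDOW_SIZE)
    if not overlaps:
        return msr_value
    flags = msr_value & ~APIC_BASE_MASK   # keep the enable/BSP flag bits
    return flags | DEFAULT_APIC_BASE
```

The handler would have to run this before touching any data in SMRAM, otherwise the first load could already be reading APIC registers.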

SMM was a "nice" idea in more timid times. It let unscrupulous vendors emulate missing hardware features with (usually poorly written) firmware. I had quite enough head-banging when trying to implement realtime audio I/O on systems that turned out to emulate sound blaster and other industry standards.

Simple way to avoid the problem on Macs... don't load BootCamp, and you won't have SMM on the systems you load under BootCamp.

Mac OS X itself doesn't use SMM. Instead, it uses a PE (Platform Expert) module that is loaded as part of the OS, which knows in detail about the hardware platform it's going to be running on. Without BootCamp, there's not even ACPI support, since power management is implemented in a much more discrete level of steps than the 4 which ACPI provides.

I've been saying for years that computers should have a hardware reset button or (for chips) a pin that restores them to a known factory state. If the button is pressed or the pin is set during initial power-on from a cold boot, the factory reset occurs. Any "infected" code will never get a chance to take control before the reset is finished.

Obviously now I'm going to have to extend that recommendation to any system or subsystem - including the CPU - which can be reprogrammed or save state in a way that su