Matthew Garrett

Subverting security with kexec

Kexec is a Linux kernel feature intended to allow the booting of a replacement kernel at runtime. There are a few reasons you might want to do that, such as using Linux as a bootloader[1], rebooting without having to wait for the firmware to reinitialise, or keeping a minimal kernel and userspace ready to be booted on crash in order to save system state for later analysis.

But kexec's significantly more flexible than this. The kexec system call interface takes a list of segments (ie, pointers to a userspace buffer and the desired target destination) and an entry point. The kernel relocates those segments and jumps to the entry point. That entry point is typically code referred to as purgatory, due to the fact that it lives between the world of the first kernel and the world of the second kernel. The purgatory code sets up the environment for the second kernel and then jumps to it. The first kernel doesn't need to know anything about what the second kernel is or does. While it's conventional to load Linux, you can load just about anything.
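To make the segment model concrete, here's a toy userspace simulation of what the load step amounts to. It's only a sketch: the `Segment` class, `load_segments` function and the flat `bytearray` standing in for physical memory are all made up for illustration, and the page-alignment check is a simplification of what the kernel enforces. The thing to notice is that nothing ever looks at what the bytes actually are.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 4096  # assumed page size for this toy model


@dataclass
class Segment:
    """Hypothetical stand-in for one entry in the segment list."""
    buf: bytes   # data supplied by userspace
    mem: int     # desired destination address
    memsz: int   # size of the destination region


def load_segments(memory: bytearray, segments: List[Segment], entry: int) -> int:
    """Copy each segment to its destination and hand back the entry point."""
    for seg in segments:
        if seg.mem % PAGE_SIZE or seg.memsz % PAGE_SIZE:
            raise ValueError("destination must be page aligned")
        if len(seg.buf) > seg.memsz or seg.mem + seg.memsz > len(memory):
            raise ValueError("segment does not fit")
        # Copy the buffer and zero-fill the rest of the destination region.
        # Note that the contents are never inspected or verified.
        memory[seg.mem:seg.mem + seg.memsz] = seg.buf.ljust(seg.memsz, b"\0")
    return entry  # a real kernel would jump here; the model just returns it


# Usage: one segment of arbitrary bytes, entered at its first byte.
memory = bytearray(4 * PAGE_SIZE)
purgatory = Segment(buf=b"\x90\x90\xc3", mem=PAGE_SIZE, memsz=PAGE_SIZE)
entry = load_segments(memory, [purgatory], entry=PAGE_SIZE)
```

The only checks in the model are about where things land, not what they are, which is the property the rest of this post depends on.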

The most important thing to note here is that none of this is signed. In other words, despite us having a robust in-kernel mechanism for ensuring that only signed modules can be inserted into the kernel, root can still load arbitrary code via kexec and execute it. This seems like a somewhat irritating way to patch the running kernel, so thankfully there's a much more straightforward approach.

The beauty of this approach is that it doesn't rely on any kernel bugs - it's using kernel functionality that was explicitly designed to let you do this kind of thing (ie, run arbitrary code in ring 0). There's not really any way to fix it beyond adding a new system call that has rather tighter restrictions on the binaries that can be loaded. If you're using signed modules but still permit kexec, you're not really adding any additional security.

But that's not the most interesting way to use kexec. If you can load arbitrary code into the kernel, you can load anything. Including, say, the Windows kernel. ReactOS provides a bootloader that's able to boot the Windows 2003 kernel, and it shouldn't be too difficult for a sufficiently enterprising individual to work out how to get Windows 8 booting. Things are a little trickier on UEFI - you need to tell the firmware which virtual→physical map to use, and you can only do it once. If Linux has already done that, it's going to be difficult to set up a different map for Windows. Thankfully, there's an easy workaround. Just boot with the "noefi" kernel argument and the kernel will skip UEFI setup, letting you set up your own map.

Why would you want to do this? The most obvious reason is avoiding Secure Boot restrictions. Secure Boot, if enabled, is explicitly designed to stop you booting modified kernels unless you've added your own keys. But if you boot a signed Linux distribution with kexec enabled (like, say, Ubuntu) then you're able to boot a modified Windows kernel that will still believe it was booted securely. That means you can disable stuff like the Early Launch Anti-Malware feature or driver signing, or just stick whatever code you want directly into the kernel. In most cases all you'd need for this would be a bootloader, kernel and an initrd containing support for the main storage, an ntfs driver and a copy of kexec-tools. That should be well under 10MB, so it'll easily fit on the EFI system partition. Copy it over the Windows bootloader and you should be able to boot a modified Windows kernel without any terribly obvious graphical glitches in the process.

And that's the story of why kexec is disabled on Fedora when Secure Boot is enabled.

[1] That way you only have to write most drivers once.

[2] The address section finds the address of the sig_enforce symbol in the kernel, and the value argument tells the dummy code what value to set that address to. --load-preserve-context informs the kernel that it should save hardware state in order to permit returning to the original kernel. --mem-max indicates the highest address that the kernel needs to back up. /bin/true is just there to satisfy the argument parser.

Wouldn't it be possible to check the signature of any code that is uploaded via kexec? In fact any time a memory page is made executable is a good time to check that the code is authorized for execution in the particular context.

Yay - self modifying code :). I'd expect the kernel command line to be signed as well, though - it's an obvious attack vector.

Actually, this got me thinking - chunks of the kernel can get paged out to disk (presumably after the signature is verified). Is the signature checked again when the vmm pages it back in? Or could I try writing to /dev/hda to subvert things?

There's a huge amount of additional complexity required to support paging of kernel components, and getting it wrong results in hilarious deadlocks. Nobody thought it was worth the effort back when the kernel took up a significant portion of your RAM - it's even less interesting to implement now.

If you've verified a signature on the entire kexec image before kexecing it, which should be entirely plausible to do, then purgatory shouldn't matter: you'd only sign kexec images that don't do naughty things.
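A rough sketch of what "verify the entire image" could look like, with loud caveats: this uses an HMAC with a shared key purely as a stand-in for a real public-key signature (the Python standard library has no asymmetric signature support), and the names `image_tag`, `may_kexec` and the `(contents, destination)` segment layout are hypothetical. The point it illustrates is that the check covers every segment, its destination and the entry point, so purgatory is covered along with everything else.

```python
import hashlib
import hmac
from typing import List, Tuple

# (contents, destination address); a hypothetical layout for illustration
Segment = Tuple[bytes, int]


def image_tag(key: bytes, segments: List[Segment], entry: int) -> bytes:
    """MAC over the entry point plus every segment and its destination."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(entry.to_bytes(8, "little"))
    for buf, dest in segments:
        mac.update(dest.to_bytes(8, "little"))
        mac.update(len(buf).to_bytes(8, "little"))
        mac.update(buf)
    return mac.digest()


def may_kexec(key: bytes, segments: List[Segment], entry: int, tag: bytes) -> bool:
    """Allow the load only if the whole image matches the supplied tag."""
    return hmac.compare_digest(image_tag(key, segments, entry), tag)


# Usage: an image that verifies, and a tampered copy that does not.
key = b"demo key, not a real secret"
image = [(b"purgatory code", 0x1000), (b"second kernel", 0x100000)]
tag = image_tag(key, image, entry=0x1000)
ok = may_kexec(key, image, 0x1000, tag)

tampered = [(b"patched purgatory", 0x1000), image[1]]
rejected = not may_kexec(key, tampered, 0x1000, tag)
```

Changing any segment, any destination or the entry point invalidates the tag, so there's no unsigned gap for purgatory to hide in.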

This appears to me as another nice argument for authenticated/trusted/TPM-based boot over secure/restricted boot.

In authenticated boot, you can test your boot chain a posteriori and decide at the end if it meets your security policy. In secure boot, you need to restrict the capabilities for reaching insecure system-state a priori and for every boot step regarding your security policy.

This also leads to being able to choose and switch policies in the first case, whilst in the second case you can have only one single policy forever.

Feel free to contact me on that topic if you'd like to discuss it further. Cheers, Andreas Fuchs at Fraunhofer SIT

Yes, if you have some means of performing attestation then this attack can be identified. But you need some mechanism for performing that attestation, which is a far from solved problem.

The policy case is actually an interesting one. The only policy imposed by Secure Boot itself is the firmware→bootloader handoff. The bootloader is free to impose any policy it wants, and in fact Shim takes advantage of that - it transitions the root of trust from the firmware keys to a separate key database. As long as you're willing to put up with some bridge code, you can impose any policy you want.

I went ahead and added support for measuring files to TPM PCRs to Shim so it now measures the next boot loader prior to execution. It's hardwired right now but was easy enough to implement. It helps extend the chain of trust upwards and enable further attestation.
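For anyone unfamiliar with how measuring into a PCR works, here's a minimal sketch of the TPM 1.2 style extend operation, where the new PCR value is the SHA-1 of the old value concatenated with the digest of whatever is being measured. This is a generic illustration and not the Shim code; `pcr_extend`, `replay` and the stage names are hypothetical.

```python
import hashlib
from typing import Iterable

PCR_SIZE = 20  # SHA-1 digest length, as used by TPM 1.2 PCRs


def pcr_extend(pcr: bytes, data: bytes) -> bytes:
    """Return the PCR value after measuring `data` into it."""
    measurement = hashlib.sha1(data).digest()
    return hashlib.sha1(pcr + measurement).digest()


def replay(stages: Iterable[bytes]) -> bytes:
    """Recompute the final PCR value from an ordered list of measured stages."""
    pcr = bytes(PCR_SIZE)  # PCRs start out as all zeroes
    for stage in stages:
        pcr = pcr_extend(pcr, stage)
    return pcr


# Usage: measure each stage of a (made up) boot chain in order.
pcr = bytes(PCR_SIZE)
for stage in (b"shim", b"bootloader", b"kernel"):
    pcr = pcr_extend(pcr, stage)
```

Because each value depends on the previous one, a verifier that replays the expected chain only arrives at the same final value if every stage matched, in the same order.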

The Secure Boot spec requires that no unsigned code is executed before ExitBootServices() is called.

The difference between before and after the call is that the EfiBootServicesCode and EfiBootServicesData memory types are unloaded and become freely available to EfiLoaderCode/EfiLoaderData. I don't see a way to set the memory map before ExitBootServices(). It seems that boot services update the memory map and one needs to retrieve the current one with GetMemoryMap(), because the key to the current memory map is a required parameter that must match for ExitBootServices() to succeed. After ExitBootServices(), all hell breaks loose and one can execute unsigned code. And if the signed binary that calls ExitBootServices() is actually somehow rogue or executes unsigned code, its signature can be revoked.

The above seems to rely on a Linux kernel (the one that is used as a bootloader) signed by the trusted key. Is there such a kernel/signature available?

On Ubuntu, no unsigned code is executed before the ExitBootServices() call. It is called before the kernel is loaded. Why bother with kexec, when one can boot an unsigned kernel? =) That was a design decision, to make sure, for now, that one can easily boot custom / modified kernels.

Not entirely true: GRUB in Ubuntu will only call ExitBootServices() if the kernel it's loading is unsigned. If the kernel is signed, it's booted with boot services still available and will call ExitBootServices() itself once it's done initializing.

What does ExitBootServices() have to do with anything? If Ubuntu allow you to launch entirely unsigned kernels then that seems very much like Ubuntu's problem - it's trivial to just use the Ubuntu bootloader to compromise Fedora, for instance. I'd recommend against this design decision.

The owner of the system should always be able to load arbitrary code into the kernel, and the solution we've implemented permits that. Arbitrary privileged userspace shouldn't be able to load arbitrary code into the kernel unless the owner of the system has explicitly permitted that.
