Tuesday, 20 October 2009

System Management Mode is Evil

System Management Mode (SMM) is an evil thing. It's a feature that was introduced on the Intel 386SL and allows an operating system to be interrupted and normal execution to be temporarily suspended so that SMM code can run at a very high privilege level. It's normally configured at boot time by the BIOS and the OS has zero knowledge of it.

The System Management Interrupt (SMI) causes the CPU to enter system management mode, usually by:

The processor being configured to generate a SMI on a write to a predefined I/O address.

Signalling on a pre-defined pin on the CPU

Access to a predefined I/O port (port 0xB2 is generally used)
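To make the I/O port route concrete, here is a minimal sketch of poking a port from Linux userspace via /dev/port. The helper name write_io_port and the constant SMI_COMMAND_PORT are my own, and the byte value that actually means something to the SMI handler is firmware-specific, so I have deliberately left it as a parameter. It needs root, and writing arbitrary values to 0xB2 on a real machine can do unpredictable things, so treat this as an illustration rather than something to run blindly.

```python
import os

# Commonly used SMI command port; the real port is chipset-specific (assumption).
SMI_COMMAND_PORT = 0xB2


def write_io_port(port, value, device="/dev/port"):
    """Write one byte to an x86 I/O port through Linux's /dev/port.

    The file offset in /dev/port is the port number, so we seek to the
    port and write a single byte. Returns the number of bytes written.
    """
    if not 0 <= port <= 0xFFFF:
        raise ValueError("x86 I/O port numbers are 16-bit")
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")

    # Use unbuffered os-level calls so exactly one byte hits the port.
    fd = os.open(device, os.O_WRONLY)
    try:
        os.lseek(fd, port, os.SEEK_SET)
        return os.write(fd, bytes([value]))
    finally:
        os.close(fd)
```

Whether a given write actually raises an SMI depends on how the BIOS has configured the chipset, which is exactly the part the OS cannot see.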

SMM has been used to catch chipset errors, handle system failures such as CPU overheating, perform fan control, emulate hardware, and even run rootkits(!)

The SMI cannot be masked or overridden, which means an OS has no way of avoiding being interrupted by it. The SMI will steal CPU cycles and modify CPU state - state is saved and restored using System Management RAM (SMRAM) and apparently the write-back caches have to be flushed to enter SMM. This can mess up real-time performance by adding hidden latencies which the OS cannot block. CPU cycles are consumed while the OS is not running, so the Time Stamp Counter (TSC) skews relative to the OS's view of CPU timing, and generally the OS cannot account for the lost cycles. One also has to rely on the SMM code being written correctly and not interfering with the state of the OS - weird, inexplicable problems may occur if the SMM code is buggy.

By monitoring the TSC one can detect if a system has entered SMM. In fact, Jon Masters has written a Linux module to do this.
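The idea behind that kind of detection can be sketched in a few lines: spin reading a clock as fast as possible and flag any jump between consecutive readings that is larger than the loop could plausibly take. This is not Jon Masters' module, just a userspace illustration of the same principle. It reads Python's time.perf_counter_ns() rather than the TSC directly, and the function name detect_gaps and its parameters are my own. In userspace, ordinary preemption and interrupts also show up as gaps, which is why a real detector runs in the kernel with interrupts disabled.

```python
import time


def detect_gaps(threshold_ns, duration_ns, clock=time.perf_counter_ns):
    """Busy-poll a clock and report unexplained jumps between readings.

    Returns a list of (offset_ns, gap_ns) tuples, where offset_ns is how far
    into the run the gap started and gap_ns is how long the CPU appears to
    have gone missing. The clock argument exists so a fake clock can be
    substituted for testing.
    """
    gaps = []
    start = prev = clock()
    while prev - start < duration_ns:
        now = clock()
        delta = now - prev
        if delta > threshold_ns:
            # Time passed that this loop cannot account for.
            gaps.append((prev - start, delta))
        prev = now
    return gaps
```

For example, detect_gaps(threshold_ns=100_000, duration_ns=1_000_000_000) polls for about a second and returns every stall longer than 100 microseconds. The threshold is a guess that needs tuning per machine.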

Processors such as the MediaGX (from which the Geode was derived) used SMM to emulate real hardware such as VGA and Sound Blaster audio, which is a novel solution, but means that one cannot reliably do any real-time work on these processors.

The worrying feature of SMM is that it can be exploited and used for rootkits - it's hard to detect and one cannot block it. Doing things behind an OS's back, in a way that messes with critical timing and can lead to rootkit exploits, is just plain evil in my opinion. If it were up to me, I'd ban the use of it completely.

For those who are interested in looking at an implementation of SMM handling code, coreboot has some well-written and commented code in src/cpu/x86/smm. Well worth eyeballing. Phrack has some useful documentation on SMM and generating SMIs, and mjg59 has an article that shows an ACPI table that generates SMIs by writing to port 0xb2.

"SMIs are particularly problematic since they switch the processor into system management mode (SMM), which has a high context switch cost (1). The transition to SMM is also invisible to the OS and may involve large amounts of processing before resuming normal operation. This can lead to bad behavior like video playback skipping, network packet loss due to timeouts, and missed deadlines for OS timers, which require high precision."