Hello internet dudes. I am privileged to be presenting my EFI rootkit research at Black Hat USA this year in scorching Las Vegas. If you’re going to be at the conference come along and check out my talk on Thursday July 26, and/or hit me up on twitter. I’ll be in town for DEF CON as well, of course.

I’ll be talking about some of the same stuff that I talked about at SyScan - how EFI can be used in a Mac OS X rootkit, how the kernel payload can work, etc - but I’ll also be talking about and demonstrating a pretty sweet new attack, so stay tuned. I’ll upload the slides for the presentation and the white paper as soon as I’ve finished presenting and have had 2 beers.

Update: I tweeted the links to the materials shortly after my talk, but here they are: Slides / Paper.

Got a Mac? I need your help. I want to find out a bit more about the hardware that’s in various Intel Macs – specifically about built-in PCI devices with onboard expansion ROMs. You can help me out by sending me the output of lspci -vv on your Mac. The only catch is you need to install a kernel extension to do it. But don’t worry, it’s software written by the CoreBoot dudes! They’re trustworthy, I swear!

EDIT2: I’ve had a couple of people having trouble building DirectHW. I’ve uploaded binaries here and here if you don’t want to build/have trouble building from source (and trust me).

The mission is dangerous, but if you’re ready, willing and able, this is what you need to do:

You’ll now have those two disk images created at the paths displayed at the end of the make processes. Install the packages in each of the DMGs, and then load the kernel extension:

$ sudo kextload /System/Library/Extensions/DirectHW.kext

EDIT: Oh, I forgot: on laptops, please turn off the automatic graphics switching option in the Energy Saver settings, so that the discrete (non-integrated) graphics card is powered on. You might need to log out and back in before it is powered on.

SyScan 2012 was a blast. I talked shit about EFI rootkits, which was pretty fun. My slides are uploaded here if you’re interested.

A couple of highlights for me were Brett Moore’s talk about process continuation (I’m kinda surprised IE didn’t crash spontaneously and ruin his demos), Alex Ionescu’s talk about ACPI 5.0 rootkits (Alex lost his laptop on the way over and had to rewrite his talk AND demos - still nailed it), and Stefan Esser’s talk about the iOS kernel heap (crossover with OS X kernel is very interesting to me). Oh and the chilli crab.

I’ll definitely be making an effort to get over to Singapore for SyScan 2013. Thomas Lim knows how to put on a con/party.

KXLD doesn’t like us much. He has KPIs to meet and doesn’t have time to help out shifty rootkit developers. KPIs are Kernel Programming Interfaces - lists of symbols in the kernel that KXLD (the kernel extension linker) will allow kexts to be linked against. The KPIs on which your kext depends are specified in the Info.plist file like this:

Those bundle identifiers correspond to the CFBundleIdentifier key specified in the Info.plist files for “plug-ins” to the System.kext kernel extension. Each KPI has its own plug-in kext - for example, the com.apple.kpi.bsd symbol table lives in BSDKernel.kext. These aren’t exactly complete kexts, they’re just Mach-O binaries with symbol tables full of undefined symbols (they really reside within the kernel image), which you can see if we dump the load commands:

Damn. The allproc symbol is the head of the kernel’s list (the queue(3) kind of list) of running processes. It’s what gets queried when you run ps(1) or top(1). Why do we want to find allproc? If we want to hide processes in a kernel rootkit that’s the best place to start. So, what happens if we build a kernel extension that imports allproc and try to load it?

What do we do?

There are a few steps that we need to take in order to resolve symbols in the kernel (or any other Mach-O binary):

Find the __LINKEDIT segment - this contains an array of struct nlist_64’s which represent all the symbols in the symbol table, and an array of symbol name strings.

Find the LC_SYMTAB load command - this contains the offsets within the file of the symbol and string tables.

Calculate the position of the string table within __LINKEDIT based on the offsets in the LC_SYMTAB load command.

Iterate through the struct nlist_64’s in __LINKEDIT, comparing the corresponding string in the string table to the name of the symbol we’re looking for until we find it (or reach the end of the symbol table).

Grab the address of the symbol from the struct nlist_64 we’ve found.

Parse the load commands

One easy way to look at the symbol table would be to read the kernel file on disk at /mach_kernel, but we can do better than that if we’re already in the kernel - the kernel image is loaded into memory at a known address. If we have a look at the load commands for the kernel binary:

We can see that the vmaddr field of the first segment is 0xffffff8000200000. If we fire up GDB and point it at a VM running Mac OS X (as per my previous posts here and here), we can see the start of the Mach-O header in memory at this address:

gdb$ x/xw 0xffffff8000200000
0xffffff8000200000: 0xfeedfacf

0xfeedfacf is the magic number denoting a 64-bit Mach-O image (the 32-bit version is 0xfeedface). We can actually display this as a struct if we’re using the DEBUG kernel with all the DWARF info:
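For illustration only, here is a rough userland sketch in Python of the same check done on a copy of the bytes rather than in GDB. The function name macho_bits and the sample buffer are made up for this example, and it assumes a little-endian image.

```python
import struct

MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O


def macho_bits(image: bytes) -> int:
    """Return 64 or 32 based on the magic number, or raise ValueError."""
    if len(image) < 4:
        raise ValueError("image too short to hold a magic number")
    (magic,) = struct.unpack_from("<I", image, 0)  # little-endian uint32
    if magic == MH_MAGIC_64:
        return 64
    if magic == MH_MAGIC:
        return 32
    raise ValueError(f"not a little-endian Mach-O image: {magic:#x}")


# Hypothetical stand-in for the first four bytes at the kernel's base address.
header_bytes = struct.pack("<I", 0xFEEDFACF)
print(macho_bits(header_bytes))
```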

The mach_header and mach_header_64 structs (along with the other Mach-O-related structs mentioned in this post) are documented in the Mach-O File Format Reference, but we aren’t particularly interested in the header at the moment. I recommend having a look at the kernel image with MachOView to get the gist of where everything is and how it’s laid out.

This isn’t the load command we are looking for, so we have to iterate through all of them until we come across a segment with cmd of 0x19 (LC_SEGMENT_64) and segname of __LINKEDIT. In the debug kernel, this happens to be located at 0xffffff8000200e68:
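The walk over the load commands can be sketched in Python against a byte buffer. This is not the kernel-side code, just a minimal illustration: the struct layouts follow the Mach-O definitions of mach_header_64, segment_command_64 and symtab_command, while the helper names and every value in build_sample_header are hypothetical.

```python
import struct

LC_SYMTAB = 0x2
LC_SEGMENT_64 = 0x19
HEADER_SIZE = 32  # sizeof(struct mach_header_64)


def find_linkedit_and_symtab(image: bytes):
    """Walk the load commands; return (linkedit, symtab) dicts, None if absent."""
    (ncmds,) = struct.unpack_from("<I", image, 16)  # mach_header_64.ncmds
    offset, linkedit, symtab = HEADER_SIZE, None, None
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", image, offset)
        if cmd == LC_SEGMENT_64:
            segname, vmaddr, vmsize, fileoff, filesize = struct.unpack_from(
                "<16sQQQQ", image, offset + 8)
            if segname.rstrip(b"\0") == b"__LINKEDIT":
                linkedit = {"vmaddr": vmaddr, "vmsize": vmsize,
                            "fileoff": fileoff, "filesize": filesize}
        elif cmd == LC_SYMTAB:
            symoff, nsyms, stroff, strsize = struct.unpack_from(
                "<IIII", image, offset + 8)
            symtab = {"symoff": symoff, "nsyms": nsyms,
                      "stroff": stroff, "strsize": strsize}
        offset += cmdsize  # every load command records its own size
    return linkedit, symtab


def build_sample_header() -> bytes:
    """Build a tiny synthetic header (all values hypothetical) for the demo."""
    def segment(name, vmaddr, vmsize, fileoff, filesize):
        return struct.pack("<II16sQQQQiiII", LC_SEGMENT_64, 72, name,
                           vmaddr, vmsize, fileoff, filesize, 7, 1, 0, 0)
    cmds = (segment(b"__TEXT", 0xFFFFFF8000200000, 0x1000, 0, 0x1000)
            + segment(b"__LINKEDIT", 0xFFFFFF8000300000, 0x1000, 0x1000, 0x1000)
            + struct.pack("<IIIIII", LC_SYMTAB, 24, 0x1000, 2, 0x1020, 0x20))
    header = struct.pack("<IiiIIIII", 0xFEEDFACF, 0x01000007, 3, 2,
                         3, len(cmds), 0, 0)
    return header + cmds


linkedit, symtab = find_linkedit_and_symtab(build_sample_header())
print(hex(linkedit["vmaddr"]), symtab)
```

Stepping by cmdsize rather than by a fixed struct size is what lets the loop skip over command types it does not care about.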

The useful parts here are the symoff field, which specifies the offset in the file to the symbol table (start of the __LINKEDIT segment), and the stroff field, which specifies the offset in the file to the string table (somewhere in the middle of the __LINKEDIT segment). Why, you ask, did we need to find the __LINKEDIT segment as well, since we have the offset here in the LC_SYMTAB command? If we were looking at the file on disk we wouldn’t have needed to, but as the kernel image we’re inspecting has already been loaded into memory, the binary segments have been loaded at the virtual memory addresses specified in their load commands. This means that the symoff and stroff fields are not correct any more. However, they’re still useful, as the difference between the two helps us figure out the offset into the __LINKEDIT segment at which the string table exists:
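As a worked example of that arithmetic, here is a small Python sketch. It assumes, as described above, that the symbol table sits at the very start of __LINKEDIT; the function name and the numbers plugged in are hypothetical.

```python
def string_table_address(linkedit_vmaddr: int, symoff: int, stroff: int) -> int:
    """In-memory address of the string table.

    Assumes the symbol table starts at the beginning of __LINKEDIT, so
    (stroff - symoff) is the string table's offset into the segment.
    """
    return linkedit_vmaddr + (stroff - symoff)


# Hypothetical values: __LINKEDIT mapped at 0xffffff8000300000, with the
# string table 0x20 bytes past the symbol table in the file.
addr = string_table_address(0xFFFFFF8000300000, symoff=0x1000, stroff=0x1020)
print(hex(addr))
```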

The n_un.n_strx field there specifies the offset into the string table at which the string corresponding to this symbol exists. If we add that offset to the address at which the string table starts, we’ll see the symbol name:

The n_value field there (0xffffff8000cb5ca0) is the virtual memory address at which the symbol’s data/code exists. _allproc is not a great example as it’s a piece of data, rather than a function, so let’s try it with a function:
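The lookup loop itself can be sketched in Python over a snapshot of the segment's bytes. This is only an illustration, not the sample code: resolve_symbol and build_sample_linkedit are made-up names, the buffer assumes the symbol table at offset 0, and _some_function with its address is invented (only the _allproc address comes from the GDB session above).

```python
import struct

NLIST_64_SIZE = 16  # sizeof(struct nlist_64)


def resolve_symbol(linkedit: bytes, nsyms: int, strtab_offset: int, name: bytes):
    """Return n_value for `name`, or None if it is not in the symbol table.

    `linkedit` is a bytes snapshot of the segment, with the symbol table at
    offset 0 and the string table at `strtab_offset` (both assumptions).
    """
    for i in range(nsyms):
        n_strx, n_type, n_sect, n_desc, n_value = struct.unpack_from(
            "<IBBHQ", linkedit, i * NLIST_64_SIZE)
        start = strtab_offset + n_strx
        end = linkedit.index(b"\0", start)  # symbol names are NUL-terminated
        if linkedit[start:end] == name:
            return n_value
    return None


def build_sample_linkedit():
    """Two-entry synthetic symbol table followed by its string table."""
    strtab = b"\0_allproc\0_some_function\0"
    symbols = (
        # n_strx=1 points at "_allproc"; n_value taken from the text above
        struct.pack("<IBBHQ", 1, 0x0F, 1, 0, 0xFFFFFF8000CB5CA0)
        # n_strx=10 points at "_some_function"; the address is made up
        + struct.pack("<IBBHQ", 10, 0x0F, 1, 0, 0xFFFFFF8000500000)
    )
    return symbols + strtab, 2, len(symbols)


linkedit, nsyms, strtab_offset = build_sample_linkedit()
print(hex(resolve_symbol(linkedit, nsyms, strtab_offset, b"_allproc")))
```

A linear scan is slow for a table with thousands of entries, but it only has to run once per symbol, so caching the resolved addresses is usually enough.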

Update: It was brought to my attention that I was using a debug kernel in these examples. Just to be clear - the method described in this post, as well as the sample code, works on a non-debug, default install >=10.7.0 (xnu-1699.22.73) kernel as well, but the GDB inspection probably won’t (unless you load up the struct definitions etc, as they are all stored in the DEBUG kernel). The debug kernel contains every symbol from the source, whereas many symbols are stripped from the distribution kernel (e.g. sLoadedKexts). Previously (before 10.7), the kernel would write out the symbol table to a file on disk and jettison it from memory altogether. I suppose when kernel extensions were loaded, kextd or kextload would resolve symbols from within that on-disk symbol table or from the on-disk kernel image. These days the symbol table memory is just marked as pageable, so it can potentially get paged out if the system is short of memory.

I hope somebody finds this useful. Shoot me an email or get at me on twitter if you have any questions. I’ll probably sort out comments for this blog at some point, but I cbf at the moment.