Introduction

This is a write-up for the Pwnable 250 level from the Ghost in the Shellcode capture the flag competition. It describes a return-to-libc attack and the steps for solving the level using the original binary from the competition.

Low-level

The assembly code responsible for checking the indices can be viewed below.

As you can see, there is no index check at all when we’re doing a read operation.

For the write operation the index is checked, but the conditional jump after the comparison is jle, which interprets the result of the comparison as signed. The instruction jbe, which interprets it as unsigned, should have been used in this case. You can find more on this wiki article. Probably the original code looks something like this:

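int i = atoi(str);   /* signed: negative input stays negative */
if (i > 9) {         /* signed compare (jle): only the upper bound is checked */
    error();
    exit(1);
}
do_stuff();          /* the table is accessed at index i, even when i < 0 */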

One way to correct the above code is to have an unsigned comparison or check for negative values. Both would work in this case, but then we couldn’t solve this level :-).

In short, the index checking is broken. We can use any index for the read operation, and for the write operation any index that is at most 9 as a signed integer, which includes every negative index. When you can write anything to almost any address of a program, the rest is just implementation.

The exploit

As explained in the previous section, we can modify memory at almost any address in our vulnerable program. To choose the right way to exploit the vulnerability, we should gather more information about the environment.

Can we inject our own code? The short answer is no - there is no RWE section in the binary, so we cannot modify memory that will be executed later. Maybe we can put our exploit in some region and then make this region executable. This means that we should be able to call mprotect or mmap, and we would have to do this without injecting code, only by changing non-executable data - e.g. stack values. One idea is to use a return-oriented programming (ROP) approach but, as you will see in a later section, our program doesn’t use mprotect or mmap (from libc), so calling those functions means that we would have to figure out their offsets in libc first. If we can do that, there is a more straightforward approach: calling the system function directly.

Is ASLR enabled?

It is safe to assume that ASLR is enabled. But because we will use some sort of ROP, we don’t care too much about this right now.

Where shall we write?

In order to modify the control flow of the program by changing only non-executable memory, we have to find an indirect jump and change the value stored at the address it jumps through. The GOT is the starting point for this.

The idea that comes to mind is to overwrite the GOT entry of a function which is called later. The GOT is always at the same place in memory (it resides in the binary), but recall that we’re writing relative to a buffer (the workspace table). So the next question that comes to mind is:

Do we know the address of the buffer?

There are three cases where the buffer might be located:

on the stack. If ASLR is enabled, figuring out its address can be done by reading a saved %ebp, which is possible because we can read memory relative to the buffer address;

on the heap. This is harder to get. But if our buffer is on the heap and we can alter structures that are used internally by the malloc function (and we can, because of the negative offset write), there is a way of exploiting it. We could do something similar to the exploitation of a double-free vulnerability - but it would be a tedious job;

declared global (.bss or .data section). The address of the buffer is the same as in the binary, no runtime hazards.

Probably because pwn250 is not the hardest level, the buffer is in the .data section.

Because our buffer is in the .data section and we can use negative indices for read and write, we have good control over the memory below our buffer. Moreover, you can see in the IDA screenshot above that there’s a math variable. The program is capable of switching from one operation (addition) to another (multiplication), and it does so by changing a pointer to a function. The pointer is in the .bss section.

I know that at this point one might argue that the authors of the program used this pointer to make the problem easier to solve. That’s true and I wouldn’t argue against it - it’s just a game.

So let’s state our idea: we will overwrite a pointer to a function which is called later; the function behind the pointer will be called whenever the math command is used.

Neat! But what are those numbers? We wrote at position -2147483634 value 286331153. The second number is the instruction pointer at which we want to jump with the math function. The first number is computed as follows

the base of our buffer (values) is at a fixed address 0x804c040

the address at which we want to write is 0x804c078

we need to write at position values+0x38

giving a positive index (0x38/4) will give an upper bound error

the negative index is -(2^31 - (0x38/4)) == -2147483634

you can test this by computing 2^33 + 0x804c040 - 4*(2^31 - (0x38/4)), which gives 0x804c078. Because of the way the buffer is addressed (4-byte values, scaled addressing), the overflow is ignored and the address wraps around. We need the wrap-around only when we try to access a value above the base address of the vector.

The instruction pointer is the value that we wrote: 0x11111111 in decimal is 286331153. So we’ve managed to modify the control flow of the program by doing a single write, and we’ve managed to do so in a predictable way.

Second PoC

We are in the following state: we’ve managed to make our program jump to any location. But where should it jump? Because we have no way of injecting code, we have to rely on the code that is already available: the program’s own code and the code of the dynamic libraries that are mapped into our address space.

Let’s inspect again our binary to see what is used from shared libraries.

Hmm, nothing useful: nothing to execute commands, nothing to modify the mappings. But hey, if the program uses those functions from libc, the loader has mapped libc into our address space, which means that the other functions from libc are there as well; the problem is that we don’t know where they are. A wild idea appears: if we knew where one function from libc is, we could compute the addresses of the rest by adding some offsets. There are two problems with this idea: how do we find the address of a used function, and how do we compute the offset of an unused function.

finding the address of a used function is simple: we can use the GOT and read the value of the pointer which has already been filled in by the dynamic linker. Because of lazy binding, we only have to be careful to choose a function which has been called previously. We will choose recv for this purpose.

finding the relative offset of the function that we want to jump to (e.g. system) is difficult. This offset depends on the version of libc that is used on the target system. To keep things simple, we will focus first on exploiting locally - meaning that we have access to our libc file. To compute the offset we only have to find the function entries in libc.

The offset is -613104. Note that it depends on the version of libc, hence the exploit isn’t too reliable. Let’s focus on exploiting locally for now and postpone the computation of the remote offset. We will write to the same address as in PoC1, but this time we will write the address of the system function, i.e. address_of_recv_function+OFFSET.

$ telnet localhost 31337
read
Input position to read from:
-32
Value at position -32: -1217696784
write
Input position to write to:
-2147483634
Input numeric value to write:
-1218309888
Value at position -2147483634: -1218309888
math
Result of math: -1

Reading at index -32 is equivalent to reading 32*4 bytes before our buffer. 0x804c040-32*4 is 0x804bfc0, which is the recv GOT entry. -1218309888 is -1217696784-613104.

Because we’re calling the system function, the first arguments of the resulting execve are set accordingly (sh -c), but the actual command, ((char **)$ecx)[2], is empty. You can have a look at the execve syscall parameters and the calling convention for it. Here we’re very lucky: the argument that is passed to system is our buffer with values, the initial table, so whatever we write into the table becomes the command. Let’s recap our approach:

get the address of recv function via GOT

set the pointer of math function to system by adding an offset to recv function address

set the parameters in the workspace table

trigger the exploit by using the math function

profit

Getting some output

The only problem was that the communication socket was file descriptor 4 while the output of the command went to file descriptor 1, but running the command with >&4 2>&4 appended did the trick for us.

The offset, the Achilles’ Heel of the exploit

Well, the exploit worked locally, but remotely it didn’t.

Recall that when computing the offset of the system function with respect to the recv function, we had access to the libc that was in use, because the target was our own machine. For the remote system we didn’t have it. A few ideas appeared:

try different offsets by gathering as many libcs as possible from well-known distros. After one hour of trying all the libc binaries from Ubuntu I started to wonder whether I was on the right track.

try random values - this didn’t work at all and it was time consuming (I was already tired and my thinking was bad)

get a copy of the libc in use - this is a problem, because we cannot call open; in the best case, we might do a send over the socket using the libc mapping as the input buffer.

hope for the best: use another challenge (which we had already exploited) to download its libc file, and hope that this system has the same one.

try to do a more intelligent search by matching function entries (push %ebp, mov %esp, %ebp etc.); this would require too much work.

use some magical tool/table that I’m not sure exists.

We used a previous level and were able to download its libc. This libc was identical to the one in use by the current challenge, so we were able to compute the offset for the remote system.

I don’t know of any method of doing a reliable return-to-libc attack without knowing the addresses of some functions. Maybe there’s a method of getting all the symbols after knowing the libc base, that would be neat.

Conclusion

We’ve presented a way of doing a return-to-libc attack. It is a primitive return-to-libc approach, since we only used a single function from libc. We also had to compute the address of that function from the address of another function - this makes the exploit unreliable.

In the end, it boils down to having the right skills for using the right tools; it’s nothing fancy.