Introduction to buffer overflows

Before starting

I know that buffer overflows are hardly a hot new topic, but the subject is so
vast that I really wanted to write something about it.

Thanks to the Most Expensive One-Byte Mistake, the NUL byte that marks the end
of strings opens a whole new world.
By taking advantage of naive functions like strcpy, we will be able to exploit
a famous security flaw.
This security hole is called a buffer overflow and it will be the topic of this
paper.

I'm writing these words more as a reminder for myself than as a fully
documented expert paper, but I hope it will help you.

The program above will be our example throughout this paper.
It simply asks for a username and displays it (useless, I know).

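Here, the problem is that strcpy doesn't check whether the source is bigger
than the destination.
It keeps copying until it reaches the NUL byte, so any input longer than the
buffer overwrites whatever sits after it on the stack: that is the buffer
overflow.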
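
To make the overrun concrete, here is a minimal Python simulation (not the C
program itself). The 16-byte buffer, the frame layout and the return address
0x08048456 are assumptions picked for illustration; a real frame also holds
saved registers between the buffer and the return address.

```python
import struct

BUF_SIZE = 16                # hypothetical destination size
RETURN_ADDRESS = 0x08048456  # hypothetical saved return address

def new_frame():
    """Simplified frame: the buffer, the saved return address, 4 spare bytes."""
    return bytearray(BUF_SIZE) + struct.pack("<I", RETURN_ADDRESS) + bytes(4)

def unchecked_copy(frame, src):
    """Mimic strcpy: copy src plus its NUL terminator, never checking BUF_SIZE."""
    data = src + b"\x00"
    frame[0:len(data)] = data

def saved_return(frame):
    """Read back the 4 bytes stored right after the buffer."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]

fits = new_frame()
unchecked_copy(fits, b"alice")
print(hex(saved_return(fits)))        # 0x8048456, untouched

overflowed = new_frame()
unchecked_copy(overflowed, b"A" * 20)
print(hex(saved_return(overflowed)))  # 0x41414141, replaced by four 'A's
```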

But first, we have to remove some protections.
Current operating systems and compilers add security mechanisms to prevent
this problem.
So, to be able to exploit our buffer overflow, I will disable them.

But don't think that, thanks to these protections, this security flaw is not
exploitable anymore! (see More?)
Our case is just a simple example to understand the process, but some buffer
overflow exploits are so well crafted that the protections don't stop them.

GNU_STACK describes the stack of the program and, as you can read, it is
flagged executable (E).
If it weren't, we couldn't have applied this method here.
Other techniques can work around a non-executable stack, and maybe I'll write
something about them later.
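
For readers who want to check this flag without reading the program headers by
eye, the sketch below parses a 32-bit little-endian ELF image and reports
whether its PT_GNU_STACK entry carries the execute bit. The function name
gnu_stack_is_executable is an assumption; the constants and field offsets come
from the ELF format.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type describing the stack
PF_X = 0x1                 # "execute" permission bit in p_flags

def gnu_stack_is_executable(elf_bytes):
    """True/False for the X flag of PT_GNU_STACK, None if the header is absent."""
    if elf_bytes[:4] != b"\x7fELF" or elf_bytes[4] != 1 or elf_bytes[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    # ELF32 header: e_phoff at byte 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", elf_bytes, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 42)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        # ELF32 program header: p_type at +0, p_flags at +24.
        (p_type,) = struct.unpack_from("<I", elf_bytes, base)
        (p_flags,) = struct.unpack_from("<I", elf_bytes, base + 24)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

It would be called on the raw bytes of the binary, for instance
gnu_stack_is_executable(open("./vuln", "rb").read()), where ./vuln is a
hypothetical file name.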

pset arg 'cyclic_pattern(300)' is a PEDA command which generates a pattern
string 300 characters long and passes it as the program's argument.
The wonderful thing here is that we don't have to brute force the length of the
input to find when the program crashes, because the command pattern_search will
tell us everything!
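
PEDA's own pattern generator is not reproduced here. As an illustration of the
idea only, the sketch below uses a simpler, well-known scheme (upper-case
letter, lower-case letter, digit) in which every 4-byte window is unique over
the lengths used here, so the value found in EIP after the crash identifies
exactly one position. The names make_pattern and pattern_offset are
assumptions, and the offsets this pattern yields are not interchangeable with
PEDA's.

```python
import string
import struct

def make_pattern(length):
    """Build Aa0Aa1Aa2... and cut it to the requested length."""
    triples = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                triples.append(upper + lower + digit)
                if len(triples) * 3 >= length:
                    return "".join(triples)[:length].encode("ascii")
    raise ValueError("requested length is too large for this simple scheme")

def pattern_offset(pattern, eip_value):
    """Find where the 4 bytes that ended up in EIP sit inside the pattern."""
    needle = struct.pack("<I", eip_value)  # x86 stores the value little-endian
    return pattern.find(needle)

pattern = make_pattern(300)
# Simulate the crash: pretend the 4 bytes at position 268 were loaded into EIP.
crashed_eip = struct.unpack_from("<I", pattern, 268)[0]
print(pattern_offset(pattern, crashed_eip))  # 268
```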

Looking at the registers, we can see that EBX has been overwritten, as have
EDI and EBP, that ESP now points into our pattern and, last but not least, that
EIP is overwritten too!

Well, in fact, we didn't overwrite EIP directly but the return address saved
on the stack, which is then loaded into EIP.
If your knowledge of the stack is light, you just have to remember one
thing.

When a program enters a function, it has to remember where it came from. So
the CALL instruction pushes the return address onto the stack, where ESP
points.
Then, when it leaves the function (when it reaches a RET instruction), that
value is popped from the top of the stack into EIP.
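
If that is still abstract, the toy model below mimics only this CALL/RET
bookkeeping. The TinyCPU class and every address in it are assumptions made up
for the illustration; a real CPU obviously does much more.

```python
class TinyCPU:
    """Toy model: a stack of 32-bit values and an instruction pointer."""

    def __init__(self):
        self.stack = []  # the end of the list is the top of the stack (ESP)
        self.eip = 0

    def call(self, target, return_address):
        """CALL: push the address of the next instruction, then jump."""
        self.stack.append(return_address)
        self.eip = target

    def ret(self):
        """RET: pop whatever is on top of the stack into EIP."""
        self.eip = self.stack.pop()

cpu = TinyCPU()
cpu.call(target=0x080484B0, return_address=0x08048501)  # hypothetical addresses
cpu.ret()
print(hex(cpu.eip))  # 0x8048501, back in the caller

cpu.call(target=0x080484B0, return_address=0x08048501)
cpu.stack[-1] = 0xDEADBEEF  # an overflow rewrote the saved return address
cpu.ret()
print(hex(cpu.eip))  # 0xdeadbeef, RET trusted the corrupted value
```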

But here, EIP doesn't contain a valid address but 0x64413963.
Knowing what's on the stack, we have to find how long the input must be to
overwrite the saved return address.

As shown above, EBX, EDI and EBP are overwritten by the As and then EIP is
overwritten by the Bs.
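
A probe of this shape can be sketched as follows, using the 268-byte offset
quoted in this paper. Only the variable names are assumptions.

```python
import struct

RET_OFFSET = 268  # bytes written before the saved return address is reached

# Filler up to the return address, then 4 recognizable bytes on top of it.
probe = b"A" * RET_OFFSET + b"B" * 4

# The 4 bytes that land on the saved return address, read as a 32-bit value.
expected_eip = struct.unpack_from("<I", probe, RET_OFFSET)[0]
print(len(probe), hex(expected_eip))  # 272 0x42424242
```

Seeing 0x42424242 in EIP and 0x41414141 in the saved registers confirms that
the offset is right.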

To exploit the buffer overflow, we will redirect EIP to a shellcode which will
open a new shell.
This shell will have the same rights as the owner of the program (in our case
root), thanks to chmod +s.

There are plenty of places where a shellcode can be stored. For instance, you
can save it in an environment variable.
Here, I will write it directly into the buffer and redirect EIP to its
beginning.

Shellcode

As we know, after writing 268 bytes in the buffer, the next 4 bytes will
overwrite the saved return address, and therefore EIP.
It's here that we will write the address pointing to the beginning of the
buffer.
We will pad the first bytes with '\x90' (i.e. the NOP instruction), write the
shellcode, pad with some NOPs again and finally write the address several
times.

I write 63 NOPs after the shellcode in order to pad the buffer but also to
align the address saved on the stack.
As you can see below, if I only write 62 NOPs, EIP will not point where we
want.
The last line of the shellcode is the address repeated 22 times.
If you're wondering why it is written backwards, it's because x86 uses
little-endian encoding: the least significant byte is stored first, so the
most significant byte ends up on the right.
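
The layout and the alignment issue can be checked with a few lines of Python.
This is only a sketch: the shellcode is not reproduced, so a 21-byte
placeholder stands in for it (21 is an assumption, chosen because
100 + 21 + 63 + 22 * 4 = 272 = 268 + 4), and the buffer address 0xbffff510 is
hypothetical.

```python
import struct

RET_OFFSET = 268          # from the text: bytes before the saved return address
NOPS_BEFORE = 100         # from the text
NOPS_AFTER = 63           # from the text
ADDRESS_REPEATS = 22      # from the text
SHELLCODE = b"\xcc" * 21  # ASSUMPTION: placeholder bytes, not a real shellcode
BUFFER_ADDR = 0xBFFFF510  # ASSUMPTION: hypothetical address inside the buffer

def build_payload(nops_after=NOPS_AFTER):
    """NOPs, shellcode, NOPs, then the little-endian address repeated."""
    address = struct.pack("<I", BUFFER_ADDR)  # b"\x10\xf5\xff\xbf"
    return (b"\x90" * NOPS_BEFORE + SHELLCODE
            + b"\x90" * nops_after + address * ADDRESS_REPEATS)

def value_loaded_into_eip(payload):
    """The 4 bytes that end up on the saved return address."""
    written = payload + b"\x00"  # strcpy also copies the terminating NUL
    return struct.unpack_from("<I", written, RET_OFFSET)[0]

aligned = build_payload()
print(len(aligned), hex(value_loaded_into_eip(aligned)))  # 272 0xbffff510

shifted = build_payload(nops_after=62)
print(hex(value_loaded_into_eip(shifted)))  # 0xbffff5, off by one byte
```

With one NOP missing, every copy of the address starts one byte too early, so
the 4 bytes sitting on the return address straddle two copies and decode to
garbage.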

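I asked some friends why our exploit works in GDB but not outside.
The reason is that the stack does not sit at exactly the same addresses in
both cases.
GDB runs the program in a slightly different environment (for instance, it
adds the LINES and COLUMNS variables and launches the program through its full
path), and since the environment and the arguments live at the top of the
stack, everything below them, our buffer included, is shifted.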

Moreover, if we hadn't disabled the protection from the operating system, the
address in the payload would have been wrong too, because the stack addresses
would have been randomized by ASLR.
Now you understand why I had to disable it before exploiting the buffer
overflow.
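
On Linux, the current ASLR mode can be read from
/proc/sys/kernel/randomize_va_space. The small helper below is a sketch for
checking it before a test run; the function names are assumptions, and it
simply returns None on systems where that file does not exist.

```python
from pathlib import Path

ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")

ASLR_MODES = {
    0: "disabled: addresses are the same on every run",
    1: "partial: stack, mmap regions and shared libraries are randomized",
    2: "full: same as 1, plus the heap (brk) is randomized",
}

def describe_aslr(value):
    """Translate the sysctl value into a short description."""
    return ASLR_MODES.get(value, "unknown setting")

def read_aslr(path=ASLR_SYSCTL):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

setting = read_aslr()
print(setting, describe_aslr(setting))
```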

Fix the shellcode

You might still be wondering why I wrote 100 NOPs at the beginning of the
buffer, and now it's time to tell you.

Here, we are just dealing with the shift introduced by GDB (ASLR disabled).
So one solution is to pad the beginning of the buffer with NOPs in order to
have a bigger area to aim at, and to change the address so that it points to
the middle of that area.
Then, we will extend our payload by repeating the address a little further.

With these modifications, the exploit tolerates the buffer being shifted
and/or resized.
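
To get a feel for how much slack this gives, here is a rough sketch of the
arithmetic. The sled start address is hypothetical and the helper names are
assumptions; only the 100-NOP sled length comes from the text.

```python
SLED_LENGTH = 100        # from the text: NOPs at the beginning of the buffer
SLED_START = 0xBFFFF510  # ASSUMPTION: hypothetical address seen under GDB

def aim_at_middle(sled_start, sled_length):
    """Pick the address in the middle of the NOP sled."""
    return sled_start + sled_length // 2

def still_works(target, sled_start, sled_length, shift):
    """True if target hits a NOP or the first shellcode byte after a shift."""
    actual_start = sled_start + shift
    return actual_start <= target <= actual_start + sled_length

target = aim_at_middle(SLED_START, SLED_LENGTH)
for shift in (0, 32, -50, 64):
    print(shift, still_works(target, SLED_START, SLED_LENGTH, shift))
```

With a 100-byte sled and the address aimed at its middle, the buffer can move
by up to 50 bytes in either direction before the jump misses.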