Last time I wrote about the strcpy() function and said that it’s unsafe. But why exactly is it unsafe? Let us see the details of what is going on under the hood when strcpy() is called. To do so, we will dive down to the machine level and have a look at what is happening in the stack memory.

In the C language, strings have no explicit length. Instead, the length is determined by a terminating NUL character. Therefore, strcpy() copies bytes until it sees a zero byte:

Consider the following function, which includes the common programming mistake of assuming that the input will fit into the buffer:

void func(char *input) {
char buf[256];
strcpy(buf, input);
}

Now let’s examine what happens at the machine level when this baby executes. When a subroutine is called, the CPU pushes the address of the next instruction onto the stack. This address is the return address. To return from a subroutine, the return address is popped off the stack and loaded as the current instruction pointer. Thus the program jumps back and resumes execution at the point right after the subroutine was called. Using a stack allows for nested subroutine calls.
Local variables and function parameters are placed on the stack as well. This is great because now when a subroutine ends, the local variables go out of scope as the stack frame is ‘cleaned up’ (in fact, the data is still there, but the stack pointer is moved).
This amounts to the following picture of a stack frame when we are executing the func:

You can see that an attacker can overwrite the return address when he supplies an input string that is longer than buf. Not only can he overwrite the return address, he can insert a specially crafted mini-program into buf. What many exploits do is put the return address as the address of buf so that the program will jump back to execute the payload in buf.

To prove that this actually works, we can do a little experiment. Write a small program that fills a buffer with garbage and overwrites the return address as described above. Guess what, it doesn’t work! The program gets killed by a run-time check, cleverly inserted by the compiler. Here is a postmortem stack trace:

As you can see, the compiler inserts a run-time check for the size of the buffer. On top of this, a second run-time check is made that checks the integrity of the stack. This technique is known as inserting a ‘stack canary’ and can be observed by studying a disassembly of our func below. Here you can see that 288 bytes (or in hexadecimal notation, 0x120) of space is taken from the stack for local variables. This is more than the 256 we actually requested. The remaining 32 bytes are used for the stack canary.
Then it calls strcpy_chk() rather than strcpy(). Finally, the stack canary is checked, and may result in stack_chk_fail() being called. Otherwise, the stack frame is cleaned up and the function returns normally.

So, the compiler does a lot of work for us in order to prevent simple buffer overflows. The canary is initialized with a random value before main() runs, so it practically can not be defeated. But beware, an attacker may still influence the program behavior in different ways and deliberately not touch the canary.

Other techniques that help prevent buffer overflow attacks are ASLR (address space layout randomization) and DEP (data execution prevention) or NX pages (non-executable memory pages). Although they make it more difficult, these too can be circumvented by trickery.

Mind ye that all of this is plain impossible if only you (or the high-level language itself!) always properly check the size of the buffer and array bounds. It is something that C normally does not do for you, so be mindful that the compiler will not always be able to save you from writing insecure code.