Stupid Smart Pointers in C

Managing memory in C is difficult and error-prone. C++ addresses this with smart pointers like std::unique_ptr and std::shared_ptr. This article demonstrates a proof-of-concept (aka stupid) smart pointer in C with very little code. Along the way we'll look at the layout of the 32-bit x86 call stack and write assembly in a C program.

In C, heap memory is allocated with a call to malloc and deallocated with a call to free. It is the programmer's responsibility to free allocated memory when no longer in use. Otherwise, memory leaks grow the program's memory usage, exhausting valuable system resources.
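As a rough sketch of how this plays out, consider a function f that acquires three resources and can fail at any step. The helper get_resource, the fail_at parameter, and the buffer size are hypothetical, added only so the failure paths can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns NULL when `ok` is 0 to simulate a failure. */
static char *get_resource(int ok) {
    return ok ? malloc(64) : NULL;
}

/* `fail_at` (hypothetical) selects which allocation fails; 0 means none. */
int f(int fail_at) {
    assert(fail_at >= 0 && fail_at <= 3);

    char *r1 = get_resource(fail_at != 1);
    if (r1 == NULL) {
        return -1;
    }

    char *r2 = get_resource(fail_at != 2);
    if (r2 == NULL) {
        free(r1);
        return -1;
    }

    char *r3 = get_resource(fail_at != 3);
    if (r3 == NULL) {
        free(r2);
        free(r1);
        return -1;
    }

    /* ... use the resources ... */

    free(r3);
    free(r2);
    free(r1);
    return 0;
}
```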

Each return must free everything previously allocated. The list of calls to free grows for every additional resource allocated. There are ways to organize this to reduce some redundancy. But the root of the problem remains: the lifetime of the allocated resource is bound to where f returns. Whenever f returns, we need to guarantee all of these resources are freed.

A nice solution in C is described in Eli Bendersky's article: Using goto for error handling in C. This uses the goto statement and places all free calls at the end of the function.
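A minimal sketch of the goto pattern, for a function f that acquires three resources, might look like the following. It is a single-label variant that relies on free(NULL) being a no-op, and is not necessarily the exact form used in that article. The helper get_resource and the fail_at parameter are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns NULL when `ok` is 0 to simulate a failure. */
static char *get_resource(int ok) {
    return ok ? malloc(64) : NULL;
}

/* `fail_at` (hypothetical) selects which allocation fails; 0 means none. */
int f(int fail_at) {
    assert(fail_at >= 0 && fail_at <= 3);

    int status = -1;
    char *r1 = NULL;
    char *r2 = NULL;
    char *r3 = NULL;

    r1 = get_resource(fail_at != 1);
    if (r1 == NULL) goto cleanup;

    r2 = get_resource(fail_at != 2);
    if (r2 == NULL) goto cleanup;

    r3 = get_resource(fail_at != 3);
    if (r3 == NULL) goto cleanup;

    /* ... use the resources ... */
    status = 0;

cleanup:
    /* Every exit path lands here; free(NULL) does nothing. */
    free(r3);
    free(r2);
    free(r1);
    return status;
}
```

All the free calls now live in one place, but they still have to be written and kept in sync by hand.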

The smart pointer will consist of only one function, free_on_exit, which frees the passed pointer when the current function returns. This will allow us to rewrite the example above without any calls to free.

Wherever f returns, it frees everything allocated before. But how can we possibly implement free_on_exit? How can we know when f returns and free all previous allocations? The trick is to manipulate the call stack. Instead of f returning to its original caller, we can manipulate the stack to have it return to our own custom function.

Let's refresh on what the call stack looks like. The layout of the call stack depends on the architecture. We'll use 32-bit x86 as our target architecture (which has a simpler stack layout and calling convention than 64-bit x86). Eli Bendersky has another great article, Where the top of the stack is on x86, with more depth, but the following is a brief overview.

Here's an example of what the stack looks like when function main calls function sum on 32-bit x86.

During a function call, the caller and callee split the responsibility for what gets pushed onto the stack. The caller main saves the current eip as the return address (the call instruction pushes it), while the callee sum saves the caller's ebp.
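The saved ebp can be observed from C. The sketch below is an assumption-laden illustration, not code from this article: it uses the GCC/Clang builtin __builtin_frame_address to read the frame base, and it assumes an x86-style frame where the first slot at the frame base holds the caller's saved frame pointer. The function names are hypothetical.

```c
#include <assert.h>

/* Callee: returns the value stored at its own frame base, which is the
   frame pointer it saved on behalf of its caller (the "saved ebp" slot). */
__attribute__((noinline)) static void *saved_frame_pointer(void) {
    void **base = __builtin_frame_address(0);
    return base[0];
}

/* Caller: compares its own frame base with the value the callee saved.
   Returns 1 when they match. */
__attribute__((noinline)) int callee_saved_our_frame(void) {
    void *mine = __builtin_frame_address(0);
    void *saved = saved_frame_pointer();
    return saved == mine;
}

/* Demo entry point: aborts if the layout assumption does not hold. */
void check_frame_link(void) {
    assert(callee_saved_our_frame());
}
```

Calling check_frame_link from main should return quietly on a platform where the assumption holds.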

But how can the stack be modified in a C program? One way is to use assembly to obtain stack addresses, and then change the values they point to. The following uses inline assembly to change a function's return address.

Once we have the value of ebp in base, we can use it just like any pointer.

*(base + 1) = (int) hijacked;

Since base is of type int*, adding one increments the address by the size of an int (4 bytes in this case). Therefore, this line changes the saved eip on the stack from an address in main to the address of the function hijacked.

Note that the program fails with an error after hijacked returns (on your machine it may be a segmentation fault). Next we'll see how to fix that error.

The previous example ended with an error. When hijacked returns, there is no valid return address on the stack for it to pop, so it jumps to an invalid address.

The caller is responsible for pushing the return address. When we jump directly to hijacked we bypass the usual calling convention.
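A read-only way to check that the slot at base + 1 really holds the return address is sketched below. It swaps the inline assembly for the GCC/Clang builtins __builtin_frame_address and __builtin_return_address, uses void** so that one step is one pointer-sized slot, and assumes an x86 or x86-64 frame set up with a frame pointer. The function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 when the slot just above the frame base holds the address
   this call will return to. Nothing on the stack is modified. */
__attribute__((noinline)) int return_address_is_above_base(void) {
    void **base = __builtin_frame_address(0); /* plays the role of ebp */
    void *saved_ip = *(base + 1);             /* one pointer-sized slot up */
    return saved_ip == __builtin_return_address(0);
}

/* Demo entry point: aborts if the layout assumption does not hold. */
void check_return_slot(void) {
    assert(return_address_is_above_base());
}
```

Overwriting that slot, as the line above does, is what redirects the return; this sketch only reads it.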

Instead we want hijacked to return back to the original return address in main. To do so we can use a pure assembly function to avoid the typical function call and return sequence of a compiled C function.

.section .text
.globl trampoline
.type trampoline, @function
trampoline:
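# Call hijacked. This pushes the address of the next instruction (the jmp).
# When hijacked returns, eax contains its return value,
# which is the address we want to resume at.
call hijacked
# Jump directly to the address in eax. AT&T syntax needs the * prefix
# for an indirect jump through a register.
jmp *%eax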

trampoline.S

This assembly function named trampoline bypasses the usual call sequence generated by compiling a C function. Instead of popping a return address to return to, we jmp directly to the value stored in eax. The value returned by hijacked is stored in eax. We modify hijacked and f as follows:

The free_on_exit above is only a single-use function. If called multiple times, it only frees the pointer passed in the most recent call. Fortunately, it's only another small step to make free_on_exit work with any number of repeated calls.

To do so, we can store a list of tracked pointers for each function call. We keep these lists on a stack: each time a new function calls free_on_exit, a new entry is pushed. When do_free is called, it frees the list of pointers in the topmost entry of the stack.
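The bookkeeping alone, without any stack manipulation, can be sketched as follows. This is an illustration under stated assumptions rather than the article's implementation: the names track_pointer, free_top_entry, and frame_key, and the two size limits, are hypothetical. Here frame_key is any value that identifies one function call, such as that call's ebp.

```c
#include <assert.h>
#include <stdlib.h>

#define STACK_SIZE 256    /* assumed limit on nested tracked calls */
#define MAX_PER_FRAME 32  /* assumed limit on pointers per call */

typedef struct {
    void *frame_key;               /* identifies one function call */
    void *tracked[MAX_PER_FRAME];  /* pointers to free when it returns */
    int count;
} tracked_entry_t;

static tracked_entry_t entries[STACK_SIZE];
static int depth = 0; /* number of entries currently in use */

/* Record ptr against the call identified by frame_key.
   Returns 1 if this pushed a new entry, 0 if it reused the top one. */
int track_pointer(void *frame_key, void *ptr) {
    int opened = 0;
    if (depth == 0 || entries[depth - 1].frame_key != frame_key) {
        assert(depth < STACK_SIZE);
        entries[depth].frame_key = frame_key;
        entries[depth].count = 0;
        depth++;
        opened = 1;
    }
    tracked_entry_t *top = &entries[depth - 1];
    assert(top->count < MAX_PER_FRAME);
    top->tracked[top->count++] = ptr;
    return opened;
}

/* Free every pointer in the topmost entry, then pop it.
   Returns the number of pointers freed. */
int free_top_entry(void) {
    assert(depth > 0);
    tracked_entry_t *top = &entries[depth - 1];
    int freed = top->count;
    for (int i = 0; i < top->count; i++) {
        free(top->tracked[i]);
    }
    depth--;
    return freed;
}
```

In a full version, the moment a new entry is pushed would presumably also be the moment to save the original return address and redirect the return, so that the cleanup runs exactly once per function call.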

At the risk of including too much code in this article, here is the full implementation in under one hundred lines of code:

In this article we've shown how to build a simple and incomplete smart pointer on a 32-bit x86 architecture. We've looked at the call stack, hijacked return addresses, and written some assembly in the process.

I recently discovered that this implementation of free_on_exit won't work when called directly from main if gcc aligns the stack. In this case, main adds padding between the saved eip and the saved ebp (example). I think this can be fixed with some tweaking, and I will update this article when it is.