April 1, 2014

Recently, there was a challenging situation I had faced. At first sight it looked like a common debugging problem that can be solved with some experiment but the more time I spent on it the more difficult the situation looked like.

The situation was the following. The below instruction reads memory.

00400000+006026de mov eax,dword ptr [ebp+4]

What is the EIP of the instruction that writes [ebp+4]? This is all I wanted to know that stage.

Note, while looking at the instruction it looks like ebp+4 reads a stack address -- it reads actually a heap address.

First I was looking at the function if I can find the instruction in it that writes [ebp+4]. It wasn't there so I investigated the caller functions, and their callers, and so on. Again, it wasn't there but noticed something. The functions passed a pointer to a context as a parameter containing many variables including [ebp+4].

At this point I had a good reason to believe the situation looked difficult because the context is likely to set by an initializer that may be on a completely different code path to the one I was investigating.

You may ask why I didn't use processor breakpoint too see what instruction writes [ebp+4]. It was a heap address kept changing on every execution and the address was not known to put breakpoint on.

I could have gone back to the point when the structure is allocated, and I could have set a breakpoint relative to the base address and see what code writes [ebp+4]. That sounded good and I would have gone to that direction if I hadn't had a better idea.

I thought I could write a PinTool that tracks write and read memory accesses. It adds all instructions writing memory to the list. When the instruction that reads memory is reached the program searches the list for instructions wrote that address. Of course this has to be thread safe.

It took me a day to develop the PinTool and find the EIP that writes [ebp+4].