Patching Binaries: Strings

Patching Binaries will be a series of articles about how to extract information and modify program behavior. It focuses on the Mac Mach-O executable format for the x86-64 architecture, but the techniques are similar for other formats.

One of the first things to look at in a binary, when trying to determine information about it, is the constant strings saved within it. In Mach-O binaries these can be found in section __cstring of segment __TEXT.
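To make the section lookup concrete, here is a minimal sketch of how a script could locate __cstring itself by walking the Mach-O load commands. It assumes a thin (non-universal), little-endian 64-bit file. The function names find_section and cstrings are made up for this illustration; the struct layouts follow the mach_header_64, segment_command_64 and section_64 definitions from <mach-o/loader.h>.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic of a thin, little-endian 64-bit Mach-O
LC_SEGMENT_64 = 0x19       # load command that describes a 64-bit segment


def find_section(data, segname, sectname):
    """Return (file_offset, size) of a section, or None if it is absent."""
    magic, _cpu, _sub, _ftype, ncmds = struct.unpack_from("<5I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")
    pos = 32  # sizeof(struct mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, pos)
        if cmd == LC_SEGMENT_64:
            # In segment_command_64, nsects is at byte 64 and the
            # section_64 records start right after the 72-byte header.
            nsects = struct.unpack_from("<I", data, pos + 64)[0]
            for i in range(nsects):
                sec = pos + 72 + i * 80  # sizeof(struct section_64) == 80
                sname, gname, _addr, size, offset = struct.unpack_from(
                    "<16s16sQQI", data, sec)
                if (gname.rstrip(b"\0") == segname
                        and sname.rstrip(b"\0") == sectname):
                    return offset, size
        pos += cmdsize
    return None


def cstrings(data):
    """Split the contents of (__TEXT,__cstring) at the NUL terminators."""
    found = find_section(data, b"__TEXT", b"__cstring")
    if found is None:
        return []
    offset, size = found
    blob = data[offset:offset + size]
    return [s.decode("utf-8", "replace") for s in blob.split(b"\0") if s]
```

Calling cstrings(open("test", "rb").read()) on the compiled example should list the same constants that the dedicated tools report, provided the binary matches the assumptions above.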

Compile the example program “test.cc” as “test” and run it:

% ./test
std::string example.
And a char* string, too.

At least two tools can list a program’s strings for us: otool and strings.
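For intuition about what a strings-style tool does, the following is a rough sketch and not the real implementation: it scans the raw bytes for runs of at least four printable ASCII characters and reports each run with its file offset. The name printable_runs and the exact character range are assumptions made for this example; the real tool has more options and may treat characters such as tabs differently.

```python
import re


def printable_runs(data, min_len=4):
    """Return (offset, text) for every run of printable ASCII bytes.

    Only bytes 0x20..0x7e count as printable here, which is a
    simplification compared to the real tool.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii"))
            for m in pattern.finditer(data)]


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        for offset, text in printable_runs(f.read()):
            print(f"{offset:x}\t{text}")
```

Saved under a file name of your choice and run with the path of a binary as its only argument, the script prints one hexadecimal offset and string per line. Unlike a lookup restricted to __cstring, a scan like this also picks up printable runs that are not real strings, such as bytes of machine code that happen to be ASCII.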

Keep in mind that this is a very simple example and nothing like a real-world program. It merely illustrates a technique for finding constant strings in a binary, which can be put to use in different ways.

Lastly, we will briefly look at how the offsets of the constant strings relate to the machine code instructions of the executable. We will do this by using the debugger LLDB to find the spot where the password of the previous example is loaded into memory.

The above excerpt shows that the "Password: " text is loaded at 0x10000070c, and that the input read from std::cin at 0x100000724 is compared with the real password "AK9FJ31P" at 0x100000743. But let’s look closer at that last address:

We know that "AK9FJ31P" resides at offset 1f33, yet the instruction uses 17e9. Why? The operand uses RIP-relative addressing, meaning the displacement is added to the instruction pointer. The leaq (load effective address) instruction sits at 743 (the address 0x100000743 minus the image base 0x100000000), and adding 17e9 to that gives 1f2c. That’s still not the correct offset! However, while an instruction executes, the instruction pointer already points to the next instruction, and the assembly code shows that the leaq occupies 7 bytes. Now the equation fits: 743+7+17e9=1f33!
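The same calculation can be sketched in a few lines of Python. The byte sequence below is a hypothetical encoding of the instruction: the displacement 17e9 and the length of 7 bytes come from the text, but the destination register (%rsi here) is an assumption made only for illustration. The 7 bytes are one REX prefix, one opcode byte, one ModRM byte and a 4-byte signed displacement.

```python
import struct


def rip_relative_target(insn_addr, insn_len, disp):
    """Address referenced by a RIP-relative operand.

    The displacement is added to the address of the *next*
    instruction, which is insn_addr + insn_len.
    """
    return insn_addr + insn_len + disp


def leaq_displacement(code):
    """Extract the signed 32-bit displacement of a 7-byte RIP-relative leaq."""
    is_rex_w = (code[0] & 0xF8) == 0x48        # REX prefix with W bit set
    is_lea = code[1] == 0x8D                   # LEA opcode
    is_rip_rel = (code[2] & 0xC7) == 0x05      # ModRM: mod=00, rm=101
    if len(code) != 7 or not (is_rex_w and is_lea and is_rip_rel):
        raise ValueError("not a REX.W-prefixed RIP-relative lea")
    return struct.unpack_from("<i", code, 3)[0]


# Hypothetical encoding of `leaq 0x17e9(%rip), %rsi` (register assumed).
code = bytes.fromhex("488d35e9170000")
disp = leaq_displacement(code)

# Relative to the start of the image, as in the text: 743+7+17e9.
print(hex(rip_relative_target(0x743, len(code), disp)))          # 0x1f33
# With the full virtual address that the debugger shows.
print(hex(rip_relative_target(0x100000743, len(code), disp)))    # 0x100001f33
```

Because the displacement is signed, the same helper also covers references to data located before the instruction.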

Note that we used an unstripped binary for this example, which means it still contains useful information such as symbol names. A real-world binary, however, will most likely be stripped of symbols.

Let’s remove the symbols:

% strip password

Now when we try to hook into main(), it is evident that we can’t, because the symbol is no longer known:

The backtrace shows two function calls in our password executable. Taking a look at frame 8 shows the same location we used previously to reason about the address of the constant string being loaded: