This presented a dilemma. Because the binary was statically linked, all
library functions were compiled into the program rather than being
resolved by the dynamic linker at run time. Because the binary was
stripped, its symbol table had been removed.

Symbols and dynamically linked functions can both be used to identify
the names of functions in an assembly listing. Either would have allowed
us to read significant portions of the code as assembly intermixed with
named calls to known functions, making the task of reverse engineering
significantly easier.

Running ‘strings’ on the binary revealed the following useful piece of
information:

@(#) The Linux C library 5.3.12
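
As a rough illustration of this step, the sketch below mimics what ‘strings’ does and keeps only the runs carrying the ‘@(#)’ version marker. It is not the tool we used; the function name what_strings and the min_len parameter are assumptions made for the example.

```python
import re

def what_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for printable ASCII runs containing '@(#)'."""
    # A run is min_len or more consecutive printable ASCII bytes,
    # which is roughly what the 'strings' utility reports.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        if b"@(#)" in match.group():
            yield match.start(), match.group().decode("ascii")

# Small synthetic buffer standing in for the real binary image.
sample = b"\x00\x01junk\x00@(#) The Linux C library 5.3.12\x00\x7fELF"
for offset, text in what_strings(sample):
    print(hex(offset), text)
```

Against the real file, the same function could be fed the result of open("the-binary", "rb").read().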

It was decided to try identifying the library functions by byte
comparison against binaries from libc 5.3.12. A copy of
libc-5.3.12.bin.tar.gz was retrieved and extracted. An arbitrary object
file from the libc package, socket.o, was used to test this theory.

Using the command ‘objdump -xd socket.o’, the following output was
printed for the libc function ‘socket’:

   0:  55               push %ebp
   1:  89 e5            mov  %esp,%ebp
   3:  83 ec 0c         sub  $0xc,%esp
   6:  53               push %ebx
   7:  8b 55 0c         mov  0xc(%ebp),%edx
   a:  8b 4d 10         mov  0x10(%ebp),%ecx
   d:  8b 45 08         mov  0x8(%ebp),%eax
  10:  89 45 f4         mov  %eax,0xfffffff4(%ebp)
  13:  89 55 f8         mov  %edx,0xfffffff8(%ebp)
  16:  89 4d fc         mov  %ecx,0xfffffffc(%ebp)
  19:  ba 01 00 00 00   mov  $0x1,%edx
  1e:  8d 4d f4         lea  0xfffffff4(%ebp),%ecx
  21:  b8 66 00 00 00   mov  $0x66,%eax
  26:  89 d3            mov  %edx,%ebx

The byte values on the left were entered into the hex search in
Ultra-Edit and were found at exactly one location, file offset 0xECF4.
This gave us a byte-for-byte match for the function, based on the
information provided by objdump. Comparing against the header output of
‘objdump’ for our binary image, we could see that this location was in
the .text section: offset 0xECF4 lies between the file offsets of the
.text (0x00000090) and __libc_subinit (0x0001f5cc) sections. When the
binary image is loaded into memory, this file offset is mapped to the
address 0x08048090 + 0xECF4 - 0x90 (VMA + byte match offset - File off).

Sections:

Idx Name            Size      VMA       LMA       File off  Algn
  0 .init           00000008  08048080  08048080  00000080  2**4
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text           0001f53c  08048090  08048090  00000090  2**4
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 __libc_subinit  00000004  080675cc  080675cc  0001f5cc  2**2
                    CONTENTS, ALLOC, LOAD, READONLY, DATA
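
The search and the offset-to-address conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Perl tool from the archive: the helper names find_all and file_offset_to_vma are made up for the example, and the section rows are copied by hand from the objdump header output.

```python
def find_all(haystack: bytes, needle: bytes):
    """Return every file offset at which needle occurs in haystack."""
    offsets, start = [], 0
    while True:
        pos = haystack.find(needle, start)
        if pos == -1:
            return offsets
        offsets.append(pos)
        start = pos + 1  # keep looking, so overlapping matches are reported too

def file_offset_to_vma(offset, sections):
    """Map a file offset to (section name, load address)."""
    for name, size, vma, file_off in sections:
        if file_off <= offset < file_off + size:
            return name, vma + (offset - file_off)
    raise ValueError("offset %#x is not inside any listed section" % offset)

# (name, size, VMA, file offset) rows taken from the section headers.
SECTIONS = [
    (".init", 0x00000008, 0x08048080, 0x00000080),
    (".text", 0x0001F53C, 0x08048090, 0x00000090),
    ("__libc_subinit", 0x00000004, 0x080675CC, 0x0001F5CC),
]

# Leading bytes of 'socket' as printed by objdump for socket.o.
SOCKET_PROLOGUE = bytes.fromhex("5589e583ec0c538b550c8b4d108b4508")

name, address = file_offset_to_vma(0xECF4, SECTIONS)
print(name, hex(address))  # .text 0x8056cf4
```

With the real file available, find_all(open("the-binary", "rb").read(), SOCKET_PROLOGUE) would be expected to return the single offset reported by the hex search. Bytes that are patched by relocations at link time will not match the object file, so a signature should be restricted to bytes that carry no relocation entries.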

Evaluating the address expression above in ‘calc.exe’ (scientific mode)
gave 0x8056cf4. Running ‘objdump -dx the-binary | grep 0x8056cf4’ back
on our Linux machine, we found the following:

 8048262:  e8 8d ea 00 00   call 0x8056cf4
 8048906:  e8 e9 e3 00 00   call 0x8056cf4
 8048fa9:  e8 46 dd 00 00   call 0x8056cf4
 8049213:  e8 dc da 00 00   call 0x8056cf4
 8049657:  e8 98 d6 00 00   call 0x8056cf4
 8049ac7:  e8 28 d2 00 00   call 0x8056cf4
 8049e22:  e8 cd ce 00 00   call 0x8056cf4
 804a602:  e8 ed c6 00 00   call 0x8056cf4
 804ebf8:  e8 f7 80 00 00   call 0x8056cf4
 804eee6:  e8 09 7e 00 00   call 0x8056cf4
 8055312:  e8 dd 19 00 00   call 0x8056cf4
 8063baf:  e8 40 31 ff ff   call 0x8056cf4
 806435e:  e8 91 29 ff ff   call 0x8056cf4
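
Each of these lines is a five-byte ‘call rel32’ instruction: the opcode e8 followed by a signed little-endian displacement measured from the end of the instruction. The sketch below, with a function name chosen only for this example, shows how the displayed target follows from the raw bytes.

```python
import struct

def call_target(addr: int, insn: bytes) -> int:
    """Return the destination of a 5-byte x86 'call rel32' located at addr."""
    if len(insn) != 5 or insn[0] != 0xE8:
        raise ValueError("not an e8 call rel32 instruction")
    (disp,) = struct.unpack("<i", insn[1:])  # signed little-endian displacement
    return (addr + 5 + disp) & 0xFFFFFFFF    # relative to the next instruction

# A forward call and a backward call taken from the listing.
print(hex(call_target(0x8048262, bytes.fromhex("e88dea0000"))))  # 0x8056cf4
print(hex(call_target(0x8063BAF, bytes.fromhex("e84031ffff"))))  # 0x8056cf4
```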

We were on our way to producing an unstrip tool! The same approach
allowed us to identify arbitrary functions in the program by their byte
signatures. At this point we began to automate the process in Perl; the
eventual result of these efforts can be found in the archive
accompanying this text.

The resulting disassembly, with the names of almost all of the libc
functions in use substituted into the text, was then manually reviewed
and annotated with comments, and notes were taken. These notes can also
be found in the accompanying archive. As the binary was not packed or
encrypted, this was an arduous but relatively simple task.
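
The name substitution can be pictured with the following minimal sketch. It assumes a table mapping addresses to names has already been built by signature matching; the annotate function and the names dictionary are hypothetical, and the output format of the actual tool may differ.

```python
import re

# Matches the operand of a direct call as printed by objdump.
CALL_RE = re.compile(r"(call\s+)0x([0-9a-f]+)")

def annotate(disassembly: str, names: dict) -> str:
    """Replace 'call 0xADDR' operands with known function names."""
    def substitute(match):
        addr = int(match.group(2), 16)
        if addr in names:
            return match.group(1) + names[addr]
        return match.group(0)  # unknown target: leave the line untouched
    return CALL_RE.sub(substitute, disassembly)

# Hypothetical table holding the one match worked through in the text.
names = {0x8056CF4: "socket"}
print(annotate("8048262: e8 8d ea 00 00 call 0x8056cf4", names))
```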

During the entire process of disassembly the binary was not executed.
The disassembly tool, tld.pl, was created specifically for the reverse
challenge, during the period of the challenge.