Monitoring Function Calls

Overview

There are tools to monitor the system calls an application makes, but what
about monitoring the functions you wrote yourself, from inside the program?
What if we want to know when a function is entered, which arguments it was
called with, when it exits, and what value it returned?
This article presents a proof-of-concept tool that does this without
modifying the application's code.

While the gcc compiler will instrument the code for us, some of the details
left to the programmer are both compiler-version dependent and CPU-dependent
- namely retrieving the function arguments and return values. Thus, the
discussion here is based on experiments with gcc compiler suites 4.1
and 4.2, Intel processors, and binutils 2.18.

Code instrumentation

We want to address the following points:

when a function/method is entered and exited

what the call arguments were when the function is entered

what the return code was when the function is exited

optionally, where the function was called from

The first one is easy: if requested, the compiler will instrument functions
and methods, so that when a function/method is entered, a call to an
instrumentation function is made, and when the function is exited, a similar
instrumentation call is made:
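The two hooks that gcc calls are __cyg_profile_func_enter() and
__cyg_profile_func_exit(); each receives the address of the instrumented
function and the address of its call site. What follows is only a minimal
sketch of a tracer built on them, not the code that accompanies this
article: the trace_depth counter, the NO_INSTRUMENT macro, and the output
format are assumptions made for illustration.

```c
#include <stdio.h>

/* The hooks must not be instrumented themselves, or they would recurse. */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

/* Hypothetical helper state: current nesting level, used for indentation. */
static int trace_depth;

void __cyg_profile_func_enter(void *this_fn, void *call_site) NO_INSTRUMENT;
void __cyg_profile_func_exit(void *this_fn, void *call_site) NO_INSTRUMENT;

/* Called by the instrumented code right after a function is entered. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    fprintf(stderr, "%*senter %p (called from %p)\n",
            2 * trace_depth, "", this_fn, call_site);
    trace_depth++;
}

/* Called by the instrumented code just before a function returns. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    if (trace_depth > 0)
        trace_depth--;
    fprintf(stderr, "%*sexit  %p (called from %p)\n",
            2 * trace_depth, "", this_fn, call_site);
}
```

At this stage the trace contains raw addresses only; turning them into
names is a separate step.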

This is achieved by compiling the code with the -finstrument-functions flag.
The above two functions can be used, for instance, to collect data for
coverage or for profiling. We will use them to print a trace of
function calls. Furthermore, we can isolate these two functions and the
supporting code in an interposition library of our own. This library can
be loaded when and if needed, thus leaving the application code
basically unchanged.

Since these two instrumentation functions only know about addresses, and we
want the trace to be readable by humans, we also need a way to
resolve symbol addresses to symbol names: this is what libtrace_resolve()
does.

Binutils and libbfd

First, we have to have the symbol information handy. To achieve this,
we compile our application with the '-g' flag. Then, we can map addresses
to symbol names. This would normally require writing some code
that is aware of the ELF format.

Luckily, there is the binutils package, which comes with a library that
does just that - libbfd - and with a tool - addr2line. addr2line is a good
example of how to use libbfd, and I have simply adapted it into a wrapper
around libbfd. The result is the libtrace_resolve() function. For details,
please refer to the README in the code accompanying this article.
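Conceptually, resolving an address means finding the symbol with the
highest start address that does not exceed it. The toy sketch below shows
only that idea. It is not libbfd and not libtrace_resolve(): the sym
structure, the demo_syms table (with made-up addresses), and
lookup_symbol() are all hypothetical stand-ins for what libbfd reads from
the executable's symbol table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record: start address and name. */
struct sym {
    uintptr_t   addr;
    const char *name;
};

/* Made-up table, sorted by ascending address. */
static const struct sym demo_syms[] = {
    { 0x08048400u, "main"       },
    { 0x08048460u, "parse_args" },
    { 0x080484c0u, "compute"    },
};

/*
 * Binary search for the last symbol whose start address is <= addr.
 * Returns "??" (the placeholder addr2line prints) if addr lies below
 * the first symbol.
 */
const char *lookup_symbol(const struct sym *tab, size_t n, uintptr_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].addr <= addr)
            lo = mid + 1;   /* candidate; look for a later one */
        else
            hi = mid;
    }
    return lo == 0 ? "??" : tab[lo - 1].name;
}
```

A real resolver also has to know where each symbol ends and, with debug
information, can report the source file and line as well.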

Since the instrumentation functions are isolated in a stand-alone module,
we tell this module the name of the instrumented executable through an
environment variable (CTRACE_PROGRAM) that we set before running the program.
This is needed to initialize libbfd properly, so that it can search for symbols.
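Reading the variable could look roughly like the sketch below. Only the
name CTRACE_PROGRAM comes from the article; the traced_program_path()
helper and its warning message are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns the path of the instrumented executable
 * as given in CTRACE_PROGRAM, or NULL if the variable is unset or empty.
 */
const char *traced_program_path(void)
{
    const char *path = getenv("CTRACE_PROGRAM");

    if (path == NULL || path[0] == '\0') {
        fprintf(stderr, "ctrace: CTRACE_PROGRAM is not set; "
                        "addresses cannot be resolved to names\n");
        return NULL;
    }
    return path;
}
```

The returned path would then be handed to whatever code opens the
executable and loads its symbols.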

Note: binutils is a work in progress. I have used version 2.18. It does
an amazingly good job, although function inlining affects its precision.

Stack Layout

Addressing the first point required no architecture-specific work.
(Actually, libbfd is aware of the architecture, but things are hidden
behind its API.) However, to retrieve function arguments and return
values, we have to look at the stack, write a bit of
architecture-specific code, and exploit some gcc quirks. Again, the
compilers I have used were gcc 4.1 and 4.2; earlier or later versions
might work differently. In short:

x86 dictates that the stack grows down

GCC dictates how the stack is used - a "typical"
stack is depicted below.

each function has a stack frame marked by the ebp (base pointer)
and esp (stack pointer) registers.
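The points above can be sketched in code. The layout in the comment is the
conventional IA-32 frame when a frame pointer is kept; the slot names and
the two accessor functions are hypothetical, introduced here only to make
the offsets concrete. A frame is treated as an array of machine words
starting at the address held in ebp.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed IA-32 frame layout with a frame pointer
 * (higher addresses first; the stack grows down):
 *
 *   ebp + 8 + 4*i   argument i
 *   ebp + 4         return address
 *   ebp + 0         caller's saved ebp
 *   below ebp       locals and temporaries, down to esp
 */
enum {
    SLOT_SAVED_EBP   = 0,
    SLOT_RETURN_ADDR = 1,
    SLOT_FIRST_ARG   = 2
};

/* Return address stored in the frame that starts at 'frame'. */
uintptr_t frame_return_address(const uintptr_t *frame)
{
    return frame[SLOT_RETURN_ADDR];
}

/* Word-sized argument number 'index' (0-based) of that frame. */
uintptr_t frame_argument(const uintptr_t *frame, size_t index)
{
    return frame[SLOT_FIRST_ARG + index];
}
```

Inside an instrumentation hook, gcc's __builtin_frame_address(1) is one
possible way to obtain the instrumented function's frame pointer, but it
depends on frame pointers being present, so treat it as an assumption to
verify against your compiler and flags.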

In an ideal world, the code the compiler generates for an instrumented
function exit would first set the return value and push the
CPU registers on the stack (so that the instrumentation
function cannot affect them). Then it would call the instrumentation
function, and finally pop the registers. This sequence would ensure that
we always have access to the return value in the instrumentation function.
The code generated by the compiler is a bit different...

Also, in practice, many of gcc's flags affect the stack layout and
register usage. The most obvious ones are:

-fomit-frame-pointer. This flag affects the stack offset where the
arguments are to be found.

The optimization flags (e.g., '-Ox'); each of these flags aggregates
a number of optimizations. These flags did not affect the stack, and,
quite amazingly, arguments were always passed to functions through
the stack regardless of the optimization level. One would have
expected that some arguments would be passed through registers - in
which case getting these arguments would have proven difficult or
even impossible. However, these flags did complicate recovering the
return code. Note that on some architectures, these flags will "suck in"
the -fomit-frame-pointer optimization.

In any case, be wary: the flags you use to compile your application
may hold hidden surprises.

Function arguments

In my tests with the compilers, all arguments were invariably passed
through the stack. Hence, this is trivial business, affected to a small
extent by the -fomit-frame-pointer flag - this flag will change the offset
at which arguments start.

How many arguments does a function have, and how many of them are on the stack?
One way to infer the number of arguments is from the function's signature
(for C++, beware of the hidden "this" argument), and this is the technique
used in __cyg_profile_func_enter().

Once we know the offset where the arguments start on the stack and how many
of them there are, we just walk the stack to retrieve their values:
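As a rough illustration of both steps, the sketch below counts the
parameters in a signature string and then formats that many stack words.
It is not the code shipped with this article: count_args() and
format_args() are hypothetical helpers, the signature is assumed to be
text such as "foo(int, char*)", and every argument is assumed to occupy
exactly one machine word (which is not true for, say, a double on IA-32).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Count top-level parameters in a signature; -1 if there is no '('. */
int count_args(const char *sig)
{
    const char *p = strchr(sig, '(');
    int depth = 0, n = 1;

    if (p == NULL)
        return -1;
    p++;
    while (*p == ' ')
        p++;
    if (*p == ')')
        return 0;                       /* "foo()" */
    if (strncmp(p, "void", 4) == 0) {
        const char *q = p + 4;
        while (*q == ' ')
            q++;
        if (*q == ')')
            return 0;                   /* "foo(void)" */
    }
    for (; *p != '\0'; p++) {
        if (*p == '(' || *p == '<') {
            depth++;                    /* nested parameter or template list */
        } else if (*p == '>') {
            if (depth > 0)
                depth--;
        } else if (*p == ')') {
            if (depth == 0)
                break;                  /* end of the parameter list */
            depth--;
        } else if (*p == ',' && depth == 0) {
            n++;
        }
    }
    return n;
}

/* Write n stack words as "0x1, 0x2, ..." into buf; -1 if it does not fit. */
int format_args(const uintptr_t *slots, int n, char *buf, size_t len)
{
    size_t used = 0;
    int i;

    if (len == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < n; i++) {
        int w = snprintf(buf + used, len - used, "%s0x%lx",
                         i ? ", " : "", (unsigned long)slots[i]);
        if (w < 0 || (size_t)w >= len - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

For a C++ method, one more word would have to be read to account for the
hidden "this" pointer, and the starting slot shifts when the code is built
with -fomit-frame-pointer.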

Note how the return code is moved into the ebx register (a bit unexpected
since, traditionally, the eax register is used for return codes) before
the instrumentation function is called. This lets us retrieve the return
value, but to keep the ebx register from being clobbered inside the
instrumentation function, we save it upon entering the function and
restore it when we exit.

When the compilation is done with some degree of optimization (-O1...3;
shown here is -O2), the code changes:

[ Make sure you use binutils 2.18, or you will be missing some important header files (Debian Etch
currently only has binutils 2.17). You can try the code without installing binutils 2.18: the
Makefile already accesses the binutils build directory (just change the path to wherever
you unpacked the sources).
Please note that the code is intended for the 32-bit Intel (IA32) platform.
We tried to run it on an x86_64 system with some modifications, but decided to leave
the code as it is. If you port the examples to the AMD x86_64 platform,
please send patches to the author.
-- René ]

Aurelian is a software programmer by trade. Sometimes he programs
Windows, sometimes Linux, and sometimes embedded systems. He discovered
Linux in 1998 and has enjoyed using it ever since. He is currently settled
with Debian.