This might sound silly, but I've heard of many programmers who claim they
do not need a debugger. They simply don't create bugs. Well, one thing
is sure - either they've no idea what they are saying, or they just never
put their code to a real test. Or maybe they're indeed as gifted as they
claim. Unfortunately, most of us tend to have bugs in our code. We could
use print statements to test our code, or we could use a debugger. Many
times our code might seem to work correctly, because we didn't test it
under enough scenarios. Other times we know there's a bug, but we can't
spot it by just reading the code. Thus, we should develop a habit
of launching a debugger when we get into trouble. It shouldn't replace
making an effort to write correct code, or adding checks to the code
for invalid function arguments, NULL pointers, etc. But when we're in
trouble, it's probably our best shot.
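
To make the point about in-code checks concrete, here is a minimal sketch of
a function that tests its arguments before using them. The function name
"checked_length" and its return convention are made up for this illustration;
they are not part of the "debug_me.c" program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from debug_me.c): stores the length of 'text'
 * in '*len_out' and returns 0, or returns -1 if either pointer is NULL. */
int checked_length(const char *text, size_t *len_out)
{
    if (text == NULL || len_out == NULL) {
        return -1;  /* reject invalid arguments instead of crashing later */
    }
    *len_out = strlen(text);
    return 0;
}
```

A caller that checks the return value finds out about a NULL pointer right
where it was passed, which often saves a debugging session altogether.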

The explanations given here are specific to the "gdb" debugger, since there
are no real standards regarding the activation and usage of debuggers, but
once you know what features to expect from a debugger, it's not too
hard to adapt your knowledge to different debuggers.

Before invoking the debugger, make sure you compiled your program (all its
modules, as well as during linking) with the "-g" flag. Otherwise, life will
be tough. Let's compile the "debug_me.c" program,
and then invoke "gdb" to debug it:

gcc -g debug_me.c -o debug_me
gdb debug_me

Note that we run gdb from the same directory the program was compiled in,
otherwise gdb won't find the source file, and thus won't be able to show
us where in the code we are at a given point. It is possible to ask gdb to
search for extra source files in some directory after launching it, but
for now, it's easier to just invoke it from the correct directory.

Once we've invoked the debugger, we can run the program using the command
"run". If the program requires command line parameters (like our debug_me
program does), we can supply them to the "run" command of gdb. For
example:

run "hello, world" "goodbye, world"

Note that we used quotation marks to denote that "hello, world" is
a single parameter, rather than two separate parameters (the debugger
assumes white-space separates the parameters).

The problem with just running the code is that it keeps on running until
the program exits, which is usually too late. This is what breakpoints are
for. A breakpoint is an instruction to the debugger to stop the
execution of the program before executing a specific source line. We can set
breakpoints using two methods:

Specifying a line of code to stop at:

break debug_me.c:9

This will insert a breakpoint right before checking the command
line arguments in our program (see the file supplied with this
tutorial).

Specifying a function name, to break every time it is called:

break main

This will set a breakpoint right when starting the program (as the
function "main" gets executed automatically at the beginning of any
C or C++ program).

Note that gdb may print some warnings when the program starts. You won't
always get the warnings I got - it just goes to show you how lousy my system
setup is. In any case, these warnings are not relevant to our code, as we do
not intend to debug any shared libraries.

Now we want to start running the program slowly, step by step. There are two
options for that:

"next" - causes the debugger to execute the current line,
and stop again, showing the next line in the code to be
executed. If the current line contains a function call, the whole call
is executed as a single step.

"step" - causes the debugger to execute the current line, and if it
contains a function call - break at the beginning of that function. This is
useful for debugging nested code.

Now is your time to experiment with these options with our debug program, and
see how they work. It is also useful to read the debugger's help, using the
commands "help break" and "help breakpoints", to learn how to set several
breakpoints, how to see what breakpoints are set, how to delete breakpoints,
and how to apply conditions to breakpoints (i.e. make them stop the program
only if a certain expression evaluates to "true" when the breakpoint is
reached).

Without being able to examine the contents of variables during program
execution, the whole idea of using a debugger is quite lost. You can print
the contents of a variable with a command like this:

print i

And then you'll get a message like:

$1 = 0

which means that "i" contains the number "0".
Note that this requires "i" to be in scope, or you'll get a message such as:

No symbol "i" in current context.

For example, if you break inside the "print_string" function and try to
print the value of "i", you'll get this message.

You may also try to print more complex expressions, like "i*2", or
"argv[3]", or "argv[argc]", and so on. In fact, you may
also use type casts, call functions found in the program, and whatever your sick
mind could imagine (well, almost). Again, this is a good time to try this
out.
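
One of the expressions above, "argv[argc]", deserves a short note: the C
standard guarantees that it is a NULL pointer, so printing it should show a
null value, assuming "argc" and "argv" still hold the values "main" received.
The sketch below relies on that guarantee to count the entries of an
argv-style array. The helper name "count_args" is made up for this
illustration.

```c
#include <stddef.h>

/* Hypothetical helper: counts the entries of an argv-style array by
 * walking it until the terminating NULL pointer is found. */
int count_args(char *argv[])
{
    int count = 0;

    while (argv[count] != NULL) {
        count++;
    }
    return count;
}
```

Called with the unmodified "argv" of "main", this should return the same
number as "argc".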

Once we've hit a breakpoint and examined some variables, we might also
wish to see "where we are". That is, what function is being executed now,
which function called it, and so on. This can be done using the
"where" command. At the gdb command prompt, just type "where",
and you'll see one line for each function on the call stack, starting
with the innermost one.

When stopped inside "print_string", for example, the output shows that the
currently executing function is "print_string", at file
"debug_me.c", line 7. The function that called it is "main". We also see which
arguments each function had received. If there were more functions in the call
chain, we'd see them listed in order. This list is also called "a stack
trace", since it shows us the structure of the execution stack at this point
in the program's life.

Just as we can see contents of variables in the current function, we can
see contents of variables local to the calling function, or to any other
function on the stack. For example, if we want to see the contents of
variable "i" in function "main", we can type the following two commands:

frame 1
print i

The "frame" command tells the debugger to switch to the given stack frame
('0' is the frame of the currently executing function). At that stage, any
print command invoked will use the context of that stack frame. Of course, if
we issue a "step" or "next" command, the program will continue at the top
frame, not at the frame we requested to see. After all, the debugger cannot
"undo" all the calls and continue from there.

It might be that we'll want to debug a program that cannot be launched
from the command line. This may be because the program is launched from
some system daemon (such as a CGI program on the web), and we are too lazy
to make it possible to run it directly from the command line. Or perhaps
the program takes a very long time to run its initialization code, and
starting it with a debugger attached will make this startup time much
longer. There are also other reasons, but hopefully you got the point. To
debug such an already running process, we launch the debugger in this way:

gdb debug_me 9561

Here we assume that "debug_me" is the name of the program executed, and
that 9561 is the process id (PID) of the process we want to debug.

What happens is that gdb first tries looking for a "core" file named "9561"
(we'll see what core files are in the next section), and when it doesn't find
one, it assumes the supplied number is a process ID, and tries to attach to it.
If the process executes exactly the same program whose path we gave to
gdb (not a copy of the file - it must be the exact same file that the process
runs), gdb will attach to the process, pause its execution, and let us
continue debugging it as if we had started the program from inside the
debugger. Doing a "where" right when we get gdb's prompt will show us the
stack trace of the process, and we can continue from there. Once we exit the
debugger, it will detach itself from the process, and the process will
continue execution from where we left it.
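
A practical difficulty with attaching is catching the process at an
interesting point before it runs past it. One common trick, sketched below,
is to have the program print its PID and wait until a flag is cleared from
inside the debugger. The names "wait_for_debugger" and "pause_for_debugger"
are made up for this illustration, and the time-out is only an assumption
about what would be convenient.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag: while non-zero, the process keeps waiting. It is
 * 'volatile' so the loop re-reads it after the debugger changes it. */
volatile int wait_for_debugger = 1;

/* Prints the PID, then sleeps in one-second steps until the flag is
 * cleared or 'max_seconds' steps have passed. Returns the number of
 * steps waited. */
int pause_for_debugger(int max_seconds)
{
    int waited = 0;

    fprintf(stderr, "PID %ld is waiting for a debugger\n", (long)getpid());
    while (wait_for_debugger && waited < max_seconds) {
        sleep(1);
        waited++;
    }
    return waited;
}
```

After attaching to the printed PID, typing "set var wait_for_debugger = 0"
and then "continue" at the gdb prompt should let the program go on.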

One of the problems with debugging programs has to do with Murphy's law:
A program will crash when least expected. This phrase just means
that after you release the program as production code, it will crash. And
the bugs won't necessarily be easy to reproduce. Luckily, there is some
aid for us, in the form of "core files".

A core file contains the memory image of a process, and (assuming the program
within the process contains debug info) its stack trace, contents of variables,
and so on. A program is normally set to generate a core file containing
its memory image when it crashes due to signals such as SIGSEGV or SIGBUS.
Provided that the shell invoking the program was not set to limit the size of
this core file, we will find this file in the working directory of the process
(either the directory from which it was started, or the directory it last
switched to using the chdir system call).
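
The core file size limit mentioned above can also be adjusted from inside
the program, using the POSIX getrlimit and setrlimit calls. The following is
a minimal sketch, assuming a POSIX system; the helper name "allow_core_dumps"
is made up for this illustration.

```c
#include <sys/resource.h>

/* Hypothetical helper: raises the soft core-file size limit of this
 * process to its hard limit. Returns 0 on success, -1 on failure. */
int allow_core_dumps(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return -1;
    }
    limit.rlim_cur = limit.rlim_max;  /* the soft limit may rise up to the hard limit */
    if (setrlimit(RLIMIT_CORE, &limit) != 0) {
        return -1;
    }
    return 0;
}
```

Note that if the hard limit itself is zero, this call succeeds but still no
core file will be written, so it is not a substitute for checking the limits
of the shell that launches the program.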

Once we get such a core file, we can look at it by issuing the following
command:

gdb /path/to/program/debug_me core

This assumes the program was launched using this path, and the core file is
in the current directory. If it is not, we can give the path to the core
file. When we get the debugger's prompt (assuming the core file was successfully
read), we can issue commands such as "print", "where" or "frame X". We cannot
issue commands that imply execution (such as "next", or the invocation of
function calls). In some situations, we will be able to see what caused
the crash.

One should note that if the program crashed due to an invalid
memory address access, this may mean that the memory of the program
was already corrupt before the crash. In that case the core file reflects
this corruption, and may contain bad memory contents, invalid stack
frames, etc. Thus, we should see the core file's contents as one possible
past, out of many probable pasts (this makes core file analysis rather
similar to quantum theory. Almost).

It is now probably time to go play around with your programs and your debugger.
It is suggested to try "help" inside gdb, to learn more about its commands.
Pay special attention to the "dir" command, which enables debugging programs
whose source code is split over several directories.

Once you feel that gdb is too limiting, you can try out any of various
graphical debuggers. Check if you have "xxgdb" installed - this is
a graphical interface running on top of gdb. If you find it too ugly,
you can try out "ddd". Its main advantage over xxgdb is that it allows you to
graphically view the contents of pointers, linked lists and other complex
data structures. It might not be installed on your system, in which case
you'll need to download it from the network.

If you're running on a SunOS or Solaris environment, there is a program named
"gcore", that allows taking the core of a running process, without stopping it.
This is useful if the process is running in an infinite loop, and you want to
take a core file to keep aside, or you want to debug a running process without
interrupting it for too long.