Introduction

When we use a debugger to debug an application, we expect that it will allow us
to step through the source code, set breakpoints in source files, inspect values
of various variables, including even complex user-defined types. But we also know
that a typical executable is nothing more than a sequence of raw bytes (mostly
machine language instructions, plus various headers and tables that help
the operating system to run the executable). When the executable is loaded
by the operating system and running, it uses additional pieces of memory
for various purposes (stack, heap, etc.). But again, these are all raw bytes.
How can the debugger know what source file and line corresponds to the currently
executing CPU instruction? Or how can it know what address in stack memory corresponds
to local variable X in function Y? The answer is debug information, the glue that
links the expressions of a high level programming language to the raw bytes of
the running program.

Now let's stop and try to determine what kinds of information about the program
the debugger needs to do its job. For simplicity, let's limit our discussion
to the existing Microsoft debuggers and the Intel x86 platform. The following table lists
all the kinds of debug information that I am currently aware of.

Kind of information

Description

Public functions and variables

This kind of information describes functions and variables that are visible across
several compilation units (source files). For every function or variable, debug information
stores its location and name.

Private functions and variables

This kind of information describes all functions and variables, including those that are not visible across
several compilation units (therefore it also describes static functions, static and local
variables, function parameters). For every function or variable, debug information
stores its location, size and name.

Source file and line information

This kind of information maps every line of every source file to the corresponding
location in the executable. Some source lines (a comment line, for example) have no
corresponding location at all; such lines are not present in debug information.

Type information

For every function or variable, debug information can store additional information about
its type. For a variable or a function parameter, this information will tell the debugger
whether it is an integer, or a string, or a user defined type, and so on. For a function,
it will tell the number of parameters, calling convention, and the type of the function’s
return value.

FPO information

For a function compiled with FPO (frame pointer omission) optimisation, debug information
also stores some data that helps the debugger to determine the size of the function’s stack
frame, even when the frame pointer is not available. Without this information, debuggers
would not be able to show correct call stacks when debugging optimised applications.

Edit and Continue information

This kind of information helps Visual Studio implement the Edit and Continue feature.

The term “location” can have different meanings.
For functions, it is always the address of the first byte of the function.
For global and static variables, it is the address of the first byte of the variable
in memory. For local variables and function parameters, it is usually the offset of
the first byte of the variable from some predefined place in the function’s stack frame.
Other types of locations are also possible (e.g. register, TLS slot, metadata token).
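
To make the idea of source file and line information more concrete, here is a minimal Python sketch of how a debugger might map the address of the current instruction back to a source line. The table contents, the addresses and the name find_source_line are hypothetical and exist only for illustration; real debug information formats store this mapping in a far more compact form.

```python
from bisect import bisect_right

# Hypothetical line table: (start address, source file, line number),
# sorted by address. Line 12 of main.cpp is assumed to be a comment,
# so it has no entry of its own.
LINE_TABLE = [
    (0x401000, "main.cpp", 10),
    (0x401005, "main.cpp", 11),
    (0x401012, "main.cpp", 13),
    (0x401020, "util.cpp", 4),
]


def find_source_line(address, table=LINE_TABLE):
    """Return (file, line) of the last entry starting at or before address."""
    starts = [entry[0] for entry in table]
    index = bisect_right(starts, address) - 1
    if index < 0:
        return None  # the address lies before the first known instruction
    _, source_file, line = table[index]
    return source_file, line
```

An address in the middle of an instruction range resolves to the line whose code starts that range, which is why a binary search over start addresses is enough for this sketch.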

By this time we already know what kinds of information debuggers use. Now we will explore
how debug information is stored. In the past decade, Microsoft development tools made use
of several different formats of packaging debug information. Here we will talk about COFF,
CodeView and Program Database formats, which are the most widely used. For every format
we are going to discuss, the following characteristics are especially interesting:

What kinds of debug information can be stored using this format?

Where is the debug information stored (in the executable itself, or separately)?

Is the format documented or not?

COFF

This is the oldest of all debug information formats discussed here. It can contain
only three kinds of debug information: public functions and variables, source file
and line information, and FPO information.
COFF debug information is always stored in the executable itself; it cannot be stored
in a separate file.
The format is documented; the documentation can be found in the
Microsoft Portable Executable and Common Object File Format Specification.

CodeView

This is a newer, and more complex, format. It can store all available kinds of debug information,
except for Edit and Continue data.
CodeView information is usually stored in the executable itself, but it can also be extracted from
the executable and stored in a separate file (usually with .DBG extension).
CodeView format is partially documented; the documentation can be found in
Visual C++ 5.0 Symbolic Debug Information Specification document available on MSDN.

Program Database

This is the newest debug information format of the three. It can store all possible kinds
of debug information, including Edit and Continue data. It also supports incremental linking,
which is not possible with other formats.
Program Database information is always stored separately from the executable, in a file
(usually with .PDB extension).
Program Database format is not documented. Instead, special programming interfaces
(DbgHelp and DIA) are available to work with it.
There are currently two versions of Program Database format. The first version, often
called PDB 2.0, is used by Visual Studio 6.0. The second and newer version (called PDB 7.0)
is used by Visual Studio.NET. PDB 7.0 format is not backward compatible – Visual Studio 6.0
debugger cannot read it.
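
As a rough illustration of telling the two versions apart, the following Python sketch looks at the first bytes of a .PDB file. Because the format is not documented, the two signature strings below are assumptions based on what is commonly observed at the start of such files, and the function names are hypothetical.

```python
# Assumed signature prefixes (not taken from official documentation).
PDB_SIGNATURES = {
    b"Microsoft C/C++ program database 2.00": "PDB 2.0",
    b"Microsoft C/C++ MSF 7.00": "PDB 7.0",
}


def pdb_version_from_header(header):
    """Guess the PDB version from the leading bytes of a file, or None."""
    for magic, version in PDB_SIGNATURES.items():
        if header.startswith(magic):
            return version
    return None


def pdb_version(path):
    """Read the start of the file at path and guess its PDB version."""
    with open(path, "rb") as pdb_file:
        return pdb_version_from_header(pdb_file.read(64))
```

For real work, the DbgHelp and DIA interfaces mentioned above remain the supported way to read these files; a signature check like this is only a quick diagnostic.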

Build process

The build process of a typical executable consists of two steps – compiling and linking.
First, the compiler parses source files and generates machine instructions, which are stored
in object files. Next, the linker combines all available object files into the final executable.
In addition to object files, the linker can also be asked to use libraries (which are actually
collections of object files). The whole process is shown in the following figure.

If we want to produce debug information for the executable, we also have to perform two steps.
First, we ask the compiler to produce debug information for every source file. Next, we ask
the linker to combine debug information for every file into the final set of debug information
for the executable. This process is shown in the following figure.

By default, compiler and linker do not produce debug information. Therefore at every step
we should be able to specify proper compiler and linker options to let them know that
we want debug information to be generated. We can also specify what kinds of debug information
should be produced, what debug information formats should be used, and where to store
the resulting debug information.

Below we will discuss the corresponding compiler and linker options of the two popular
development systems – Microsoft Visual C++ 6.0 and Microsoft Visual C++.NET (2002 and 2003).

Compiler (Visual C++ 6.0)

We can ask the compiler to produce debug information for a source file using one
of the following options: /Zd, /Z7, /Zi, /ZI (all options can also be configured in the IDE).
The /Zd option asks the compiler to produce debug information in COFF format and store it in
the resulting object file.
The /Z7 option asks the compiler to produce debug information in CodeView format and store
it in the resulting object file.
The /Zi option asks the compiler to produce debug information in Program Database format and
store it in a separate file with .PDB extension.
The /ZI option is almost identical to /Zi, but the debug information produced
with /ZI also contains Edit and Continue data.
The name of the .PDB file used by the /Zi and /ZI options is VC60.PDB by default, but
it can be changed using the /Fd compiler option.

Summary information is collected in the following table.

Option  Format            Storage     Contents

/Zd     COFF              .OBJ file   Public functions and variables
                                      Source file and line information
                                      FPO information

/Z7     CodeView          .OBJ file   Public functions and variables
                                      Private functions and variables
                                      Source file and line information
                                      Type information
                                      FPO information

/Zi     Program Database  .PDB file   Public functions and variables
                                      Private functions and variables
                                      Source file and line information
                                      Type information
                                      FPO information

/ZI     Program Database  .PDB file   Public functions and variables
                                      Private functions and variables
                                      Source file and line information
                                      Type information
                                      FPO information
                                      Edit and Continue data

Linker (Visual C++ 6.0)

The following options can be used to configure the linker to produce debug information
for the executable: /debug, /debugtype, /pdb, /pdbtype (all options can be configured in the IDE).

/debug option asks the linker to generate debug information for the executable.
If this option is not specified, debug information is not produced and other
options have no effect.

The /debugtype option specifies the format of the generated debug information.
The following uses of the option are possible:

/debugtype:coff   use COFF format
/debugtype:cv     use CodeView or Program Database format (depends on /pdb option)
/debugtype:both   use both COFF and CodeView/Program Database formats

One important limitation of /debugtype:coff option is that the resulting debug
information will not contain file and line information, even if it was included
in the debug information produced for object files. If you need source file and
line information, use /debugtype:cv or /debugtype:both.

/pdb option tells the linker whether CodeView or Program Database format should be used.
/pdb:none asks the linker to use CodeView format, while /pdb:filename asks the linker
to use Program Database format and also specifies the name of the .PDB file.
If /debugtype:coff option is also specified, /pdb option has no effect.

The /pdbtype option makes sense only if debug information for one or more object files
and libraries is also stored in a .PDB file. If that’s the case, the /pdbtype:sept option
asks the linker to leave the debug information in the original .PDB files
and not to copy it into the final .PDB file for the executable. As a result,
the link process completes faster, but all the .PDB files are needed to successfully
debug the executable. If it is not desirable to keep debug information spread across
several .PDB files, the /pdbtype:con option should be used. With this option, the linker
copies the contents of all .PDB files into the final .PDB file for the executable.

If you find it difficult to follow all the available combinations of linker options,
I must say that I did too. To make it easier to see how the options
should be used, I have summarised them in the following table.

/debugtype  /pdb                           Format                     Storage

coff        /pdb:none (has no effect)      COFF                       In the executable
coff        /pdb:filename (has no effect)  COFF                       In the executable
cv          /pdb:none                      CodeView                   In the executable
cv          /pdb:filename                  Program Database           In .PDB file
both        /pdb:none                      COFF and CodeView          In the executable
both        /pdb:filename                  COFF and Program Database  COFF information in the executable,
                                                                      Program Database information in .PDB file
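
The combinations of /debugtype and /pdb described above can also be written down as a small decision function. This is only a sketch in Python that restates the rules of this section; the function name and its return format are hypothetical, and it does not call the linker.

```python
def vc6_linker_debug_output(debugtype, pdb):
    """Return (format, storage) for a /debugtype and /pdb combination.

    debugtype is 'coff', 'cv' or 'both'; pdb is 'none' or a .PDB file name.
    """
    if debugtype not in ("coff", "cv", "both"):
        raise ValueError("unknown /debugtype value: " + debugtype)

    if debugtype == "coff":
        # With /debugtype:coff the /pdb option has no effect.
        return ("COFF", "in the executable")

    uses_pdb = pdb.lower() != "none"
    if debugtype == "cv":
        if uses_pdb:
            return ("Program Database", "in .PDB file")
        return ("CodeView", "in the executable")

    # debugtype == "both"
    if uses_pdb:
        return ("COFF and Program Database",
                "COFF in the executable, Program Database in .PDB file")
    return ("COFF and CodeView", "in the executable")
```

For example, vc6_linker_debug_output("cv", "app.pdb") reports the Program Database format stored in a .PDB file, while the same call with "none" reports CodeView stored in the executable.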

Compiler (Visual C++.NET)

We can ask the compiler to produce debug information for a source file using one of
the following options: /Zd, /Z7, /Zi, /ZI (all options can be configured in the IDE;
/Zd option is not supported by Visual C++ 2005).

/Z7 option asks the compiler to produce debug information in CodeView format
and store it in the resulting object file.
/Zd, /Zi and /ZI options ask the compiler to produce debug information in Program Database
format and store it in a file with .PDB extension. The difference between these three options
is in the contents of the generated debug information (see summary table below for details).
The name of the .PDB file used by /Zd, /Zi and /ZI options is VC70.PDB, VC71.PDB or VC80.PDB by default
(depending on the version of Visual Studio), but it can be changed using /Fd compiler option.
Note also that these new versions of the compiler do not produce debug information in COFF format.

Summary information

Option  Format            Storage     Contents

/Z7     CodeView          .OBJ file   Public functions and variables
                                      Private functions and variables
                                      Source file and line information
                                      Type information
                                      FPO information

/Zd     Program Database  .PDB file   Public functions and variables
                                      Source file and line information
                                      FPO information

/Zi     Program Database  .PDB file   Public functions and variables
                                      Private functions and variables
                                      Source file and line information
                                      Type information
                                      FPO information

/ZI     Program Database  .PDB file   Public functions and variables
                                      Private functions and variables
                                      Source file and line information
                                      Type information
                                      FPO information
                                      Edit and Continue data

Linker (Visual C++.NET)

The following options are used to configure the linker to produce debug information for
the executable: /debug, /pdb, /pdbstripped (all options can be configured in the IDE).

/debug option asks the linker to generate debug information for the executable.
If this option is not specified, debug information is not produced and other options
have no effect. The format of the resulting debug information is always Program Database,
and therefore it is always stored in a file with .PDB extension. By default, the linker
uses the executable’s file name as the name for the .PDB file. The .PDB file can include
all available contents of debug information.

The /pdb option specifies a non-default name for the .PDB file.

The /pdbstripped option produces an additional .PDB file whose contents are restricted to the following:

public functions and variables

FPO information

Note that COFF and CodeView formats are no longer supported by the linker.

The process of producing debug information for static libraries is much simpler than for executables,
because there is no link step. Regardless of the compiler version, we can use one
of the available /Z* options (/Zd, /Z7, /Zi, /ZI) to ask the compiler to produce debug information
for the static library.

One important consideration is where to store the debug information. When /Z7 or /Zd option is used,
debug information will be stored in the resulting .LIB file. When /Zi or /ZI option is used,
debug information is stored in a separate .PDB file (its name can be specified using /Fd option).

If we generate debug information for an executable, how will the executable’s size be affected?
The answer depends on the place where we store the debug information, which in turn depends
on the debug information format we use.

Using COFF or CodeView formats usually means that debug information is stored in the executable.
In that case, the size of the executable can grow significantly (an executable with debug information
can become two times bigger than the same executable without debug information, or even more).

If Program Database format is used, debug information is stored in a separate file.
The size of the executable is almost unaffected; it increases only by several hundred
bytes, because the executable now contains a small header that helps debuggers
to locate the file with debug information.
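
The small header mentioned above can be pictured with the following Python sketch, which decodes such a record from a byte buffer. The layouts used here are assumptions based on commonly described structures, not on official documentation: an "RSDS" record (PDB 7.0) holding a GUID, an age and a file name, and an "NB10" record (PDB 2.0) holding an offset, a timestamp, an age and a file name. The function name is hypothetical, and locating the record inside the executable is not shown.

```python
import struct
import uuid


def parse_pdb_reference(data):
    """Decode an assumed RSDS or NB10 record; return a dict or None."""
    signature = data[:4]
    if signature == b"RSDS":
        # Assumed layout: signature, 16-byte GUID, 32-bit age, file name.
        guid_bytes, age = struct.unpack_from("<16sI", data, 4)
        identity = str(uuid.UUID(bytes_le=guid_bytes))
        path_start = 24
    elif signature == b"NB10":
        # Assumed layout: signature, offset, timestamp, age, file name.
        _offset, identity, age = struct.unpack_from("<III", data, 4)
        path_start = 16
    else:
        return None

    # The file name is assumed to be a zero-terminated string.
    path_end = data.find(b"\x00", path_start)
    if path_end < 0:
        path_end = len(data)
    pdb_path = data[path_start:path_end].decode("utf-8", errors="replace")
    return {
        "signature": signature.decode("ascii"),
        "identity": identity,
        "age": age,
        "pdb_path": pdb_path,
    }
```

A record of this kind occupies only a few dozen bytes plus the length of the file name, which fits the observation that the executable grows very little.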

One more step is needed to avoid unnecessary inflation of the executable. A side effect
of using /debug linker option is the change of the default /opt:ref option to /opt:noref.
As a result, enabling debug information generation will disable size optimisations performed
by the linker. To turn size optimisations on again, it is necessary to enable /opt:ref option
explicitly.
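
As a small illustration, a build script could assemble the linker options so that /opt:ref is always restored when /debug is requested. The Python sketch below only builds a list of option strings from the options discussed in this article; the function and parameter names are hypothetical, and how the list is passed to the linker is left to the build system.

```python
def debug_link_options(pdb_name=None, stripped_pdb_name=None,
                       keep_size_optimisation=True):
    """Build linker options that enable debug information generation."""
    options = ["/debug"]
    if pdb_name:
        options.append("/pdb:" + pdb_name)
    if stripped_pdb_name:
        options.append("/pdbstripped:" + stripped_pdb_name)
    if keep_size_optimisation:
        # /debug switches the default to /opt:noref, so ask for /opt:ref again.
        options.append("/opt:ref")
    return options
```

Calling debug_link_options("app.pdb") yields the /debug, /pdb:app.pdb and /opt:ref options, in that order.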

When we use CodeView format, the linker always stores debug information in the executable.
But with the help of a small utility called Rebase, it is possible to extract debug information
from the executable and put it into a separate file with .DBG extension.

Rebase is included with Visual Studio. It can be used for various purposes, but when it comes
to extracting debug information, its command line is pretty simple:

rebase -b BaseAddr -x SymbolDir [-p] ExeName

Option

Description

-b BaseAddr

Specifies the new base address of the executable. If you do not want to change the base address,
specify the same address as the one currently used by the executable.

-x SymbolDir

Specifies the directory where the .DBG file will be stored. It is also possible to specify . (dot),
which means the current directory.

-p

If this option is used, the .DBG file will contain only the following kinds of debug information:

public functions and variables

FPO information

Other kinds of debug information are discarded.

For example, the following command will extract debug information from the DLL and put it
into .DBG file in the current directory:

When we decide about using a particular debug information format in our application,
we must ensure that our favourite debugger can understand it. The following table
lists the most popular debuggers and their support of the debug information formats.