Functionality

This section collects notes on the basic functionality we expect the debugger to have, as well as more advanced features. Its purpose is to map out what should and could be supported, so as to indicate the extent of the work and the complexities we may have to deal with. This helps us arrive at a better design.

Basic functionality

The debugger needs to support kernel debugging as well as user space debugging. Both require support for:

Multi-threading.

Multiple load modules (kernel modules or shared libraries).

Live debugging (remote protocol/ptrace).

Retrospective debugging (core file).

Stack unwinding (backtraces).

Source correlation (DWARF).

Disassembly.

Stepping and/or running the target.

Modification of state when live debugging.

Advanced features

Multiple concurrent debugging sessions, with cross-session event handling. This can be used to test inter-process relationships and/or communications.

Cross-debugging, with each session a different ABI, OS and/or architecture.

Remote debugging, such as over a serial connection, but possibly also FireWire, USB and/or Ethernet.

More elaborate event handling (above and beyond breakpoints and watchpoints). One example is syscall tracing, so that one can stop the debuggee when it executes a certain syscall.

Injection of code that can test hypotheses and trigger events. This goes above and beyond conditional breakpoints and watchpoints.

Snapshots. These allow a debugging session to be saved and continued later.

Scripting.

Expressions, preferably using the same syntax and semantics as the source.

Components

The following sections describe the components or basic building blocks that a debugger can make use of.

ELF (libelf)

The fundamental file format is ELF. It is used for executables, shared libraries, kernels, modules and core files, for both user space and kernel. At this time only sparc64 does not use ELF-based kernel core dumps; normalizing on ELF means that sparc64 needs to change its kernel core file format. It also means that platforms that don't support kernel core files should implement them as ELF files. Note that amd64 uses relocatable object files as modules. While these are ELF, relocatable object files are not designed to be used as load modules, so this probably needs special treatment.
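As a rough illustration of how these ELF flavours can be told apart, the sketch below reads the e_type field that follows the 16-byte e_ident array. The numeric constants are the standard ELF values, but the X_ prefixed names, the function elf_kind and its return strings are hypothetical and only serve this example. A real implementation would presumably go through libelf rather than parse bytes by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard ELF values; names are prefixed to avoid clashing with <elf.h>. */
enum {
    X_EI_DATA = 5,       /* index of the byte-order byte in e_ident */
    X_ELFDATA2LSB = 1,   /* little-endian encoding */
    X_ELFDATA2MSB = 2,   /* big-endian encoding */
    X_ET_REL = 1,        /* relocatable object */
    X_ET_EXEC = 2,       /* executable */
    X_ET_DYN = 3,        /* shared object */
    X_ET_CORE = 4        /* core file */
};

/*
 * Hypothetical helper: classify an ELF image by its e_type field.
 * Returns NULL if the buffer does not start with a usable ELF header.
 */
const char *
elf_kind(const unsigned char *hdr, size_t len)
{
    uint16_t type;

    assert(hdr != NULL);
    if (len < 18 || memcmp(hdr, "\177ELF", 4) != 0)
        return NULL;

    /* e_type occupies bytes 16-17, stored in the file's byte order. */
    if (hdr[X_EI_DATA] == X_ELFDATA2LSB)
        type = (uint16_t)(hdr[16] | (hdr[17] << 8));
    else if (hdr[X_EI_DATA] == X_ELFDATA2MSB)
        type = (uint16_t)((hdr[16] << 8) | hdr[17]);
    else
        return NULL;

    switch (type) {
    case X_ET_REL:  return "relocatable";   /* e.g. an amd64 module */
    case X_ET_EXEC: return "executable";
    case X_ET_DYN:  return "shared object";
    case X_ET_CORE: return "core";
    default:        return "other";
    }
}
```

A loader for modules could use such a check to detect the relocatable case and route it to whatever special treatment is chosen.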

The ElfToolChain project has created a BSD-licensed libelf implementation.

DWARF (libdwarf)

Source-level debugging requires debug information that maps raw machine-level entities to source-level definitions and declarations. The de facto standard for this is DWARF. Significant work is expected to be required for the source correlation. A BSD-licensed DWARF library would be ideal, but lacking that, an LGPL library should be a good solution to start off with.

Unwinding (libunwind)

Disassembly of machine code (libdisasm)

Use an API that is based on VLIW hardware. Scalar or superscalar processors are just a special case in which the number of operations per instruction is 1. Therefore, disassembly of an instruction at a given address can return something like the following:

Length of the instruction in bytes (can be fixed or variable)

Number of independent operations

For each operation:

Length of operation in bits

Bit offset of operation within instruction

Format string
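The result described above could be represented along the following lines. This is only a sketch: the notes name struct instr but do not define it, so every field name, struct instr_op, the DISASM_MAX_OPS bound and the instr_is_consistent helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DISASM_MAX_OPS 8    /* assumed upper bound, not taken from the notes */

/* One independent operation within an instruction (hypothetical layout). */
struct instr_op {
    unsigned int bit_length;    /* length of the operation in bits */
    unsigned int bit_offset;    /* bit offset within the instruction */
    const char  *format;        /* format string used to print it */
};

/* Result of disassembling one instruction (hypothetical layout). */
struct instr {
    size_t          length;     /* instruction length in bytes */
    unsigned int    nops;       /* number of independent operations */
    struct instr_op ops[DISASM_MAX_OPS];
};

/*
 * Sanity check: every operation must be non-empty and lie entirely
 * within the bits of the instruction. Returns 1 if so, 0 otherwise.
 */
int
instr_is_consistent(const struct instr *in)
{
    unsigned int i;

    assert(in != NULL);
    if (in->nops == 0 || in->nops > DISASM_MAX_OPS)
        return 0;
    for (i = 0; i < in->nops; i++) {
        const struct instr_op *op = &in->ops[i];

        if (op->bit_length == 0 ||
            (size_t)op->bit_offset + op->bit_length > in->length * 8)
            return 0;
    }
    return 1;
}
```

With this layout, a scalar 32-bit instruction is a 4-byte instruction with one 32-bit operation at bit offset 0, while an ia64 bundle is a 16-byte instruction with three 41-bit operations at bit offsets 5, 46 and 87 (after the 5-bit template).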

The API needs callback functions so that the disassembler can fetch the instruction bytes as well as possibly obtain register contents. For this it seems logical to use the proc_services API.
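One possible shape for such callbacks is sketched below. None of these names come from the notes or from proc_services: the callback typedefs, struct disasm_callbacks, the in-memory backend and disasm_fetch are all hypothetical, and a real design would presumably wrap the corresponding proc_services routines behind the same kind of table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback types; both return 0 on success and -1 on failure. */
typedef int (*disasm_read_mem_fn)(void *cookie, uint64_t addr,
    void *buf, size_t len);
typedef int (*disasm_read_reg_fn)(void *cookie, int regno, uint64_t *val);

struct disasm_callbacks {
    void               *cookie;     /* opaque target handle */
    disasm_read_mem_fn  read_mem;   /* fetch instruction bytes */
    disasm_read_reg_fn  read_reg;   /* optional, may be NULL */
};

/* Example backend: a fixed in-memory image standing in for a target. */
struct mem_image {
    uint64_t             base;
    const unsigned char *bytes;
    size_t               size;
};

int
mem_image_read(void *cookie, uint64_t addr, void *buf, size_t len)
{
    const struct mem_image *img = cookie;
    uint64_t off;

    assert(img != NULL && buf != NULL);
    if (addr < img->base)
        return -1;
    off = addr - img->base;
    /* Reject reads that start or end outside the image. */
    if (off > img->size || len > img->size - off)
        return -1;
    memcpy(buf, img->bytes + off, len);
    return 0;
}

/* What a disassembler would call to obtain the bytes at an address. */
int
disasm_fetch(const struct disasm_callbacks *cb, uint64_t addr,
    void *buf, size_t len)
{
    return cb->read_mem(cb->cookie, addr, buf, len);
}
```

Keeping the target behind a cookie and function pointers means the same disassembler could serve a live process, a core file or a remote target without change.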

disasm API:

uint64_t disasm_get_ipmask() - returns the mask needed to get the address of the instruction, rather than the address of the operation. It is assumed that the IP or PC register holds the address of the operation. On ia64, for example, the IP contains the slot number in the lowest 4 bits. The mask is used to filter out the operation (or slot number on ia64) so that the address of the instruction (or bundle on ia64) is left. This is the address that the disassembler uses.
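To make the masking concrete, here is a minimal sketch that assumes an ia64-style implementation in which the lowest 4 bits of the IP hold the slot number, as described above. The body of disasm_get_ipmask and the helper ip_split are illustrative assumptions, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ia64-style mask: clear the lowest 4 bits (the slot number). */
uint64_t
disasm_get_ipmask(void)
{
    return ~(uint64_t)0xf;
}

/*
 * Hypothetical helper: split an IP value into the instruction (bundle)
 * address and the operation (slot) selected within it.
 */
void
ip_split(uint64_t ip, uint64_t *instr_addr, unsigned int *op)
{
    uint64_t mask = disasm_get_ipmask();

    assert(instr_addr != NULL && op != NULL);
    *instr_addr = ip & mask;            /* address the disassembler uses */
    *op = (unsigned int)(ip & ~mask);   /* operation within the instruction */
}
```

On a scalar architecture the mask would simply have all bits set, so the instruction address equals the IP and the operation is always 0.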

struct instr *disasm(uint64_t) - ...

User interface

The following sections collect thoughts about a user interface and the set of commands the user may want to see implemented. This covers both basic functionality and advanced features.

Views

While debugging, the user may want to view (inspect) any of the following:

Machine level information:

disassembly

register contents

threads

backtrace

raw memory dump

OS level information:

memory map

load modules

open files

syscall trace

shared memory

environment

Source level information:

source code

source files

static/global functions

local/static/global variables

thread-local storage (TLS)

datatypes

Debugger information:

breakpoints/watchpoints

sessions

Actions

While debugging, the user may want to perform any of the following actions: