Debugging Memory on Linux

All programs use memory, even ones that do nothing. Memory misuse
accounts for a good portion of fatal program errors, such as
abnormal termination and unexpected behavior.

Memory is a device for handling information. Program memory
is usually associated with the amount of physical memory a computer
has but can also reside on secondary storage, such as disk drives,
when not in use. Memory for users is managed by two parties: the
kernel itself and the program, through calls to memory functions
such as malloc().

Kernel Memory

The operating system kernel manages all the memory
requirements for a particular program, or instances of a program
(because operating systems can execute several instances of a
program simultaneously). When a user executes a program, the kernel
allocates an area of memory for the program. This program then
manages the area of memory by splitting it into several
areas:

Text—where only the read-only parts of the program
are stored. This is usually the actual instruction code of the
program. Several instances of the same program can share this area
of memory.

Static Data—the area where preknown memory is
allocated. This is generally for global variables and static C++
class members. The operating system allocates a copy of this memory
area for each instance of the program.

Memory Arena (also known as break space)--the area
where dynamic runtime memory is stored. The memory arena consists
of the heap and unused memory. The heap is where all user-allocated
memory is located. The heap grows up from a lower memory address to
a higher memory address.

Stack—whenever a program makes a function call,
the current function's state needs to be saved onto the stack. The
stack grows down from a higher memory address to a lower memory
address. A unique memory arena and stack exist for each instance
of the program.

Figure 1. Memory Associated with an Instance of a Program

User Memory

User-allocatable memory is located in the heap in the memory
arena. The memory arena is managed by the routines malloc(),
realloc(), free() and calloc(). They are part of libc. However, it
is possible to substitute these functions with another
implementation that may provide better performance for a particular
use. See the sidebar for a list of alternate memory functions.

On Linux systems, programs expand the size of the memory
arena in precalculated increments, usually one memory page in size
or aligned to a page boundary. Once the heap requires more than what
is available in the memory arena, the memory routines call the
brk() system call, which requests additional memory from the kernel.
The sbrk() library call wraps brk() and takes the size of the
increment, in bytes, as its argument.

To view the current stack and memory arena of any process,
look at the contents of /proc/<pid>/maps, where pid is the
process ID (see Listing 1).

Each time new memory is allocated with malloc(), a little
more memory is obtained than requested. The memory routines use
this extra memory for maintenance. To obtain the real amount of
memory allocated for user manipulation, use the function call
malloc_usable_size(). The real memory chunk is usually eight bytes
larger.

Each memory chunk stores its size twice: once immediately before
the chunk's data and once immediately after it (see Figure 2). The
size value also carries a bit flag that indicates whether the memory
chunk immediately before the current one is in use.

Figure 2. The Memory Chunk Structure

The memory routines in GNU libc use bins to store memory
chunks of similar size. This improves performance and limits
fragmentation, where unused memory gaps are scattered throughout
the memory arena. These memory routines are also thread-safe.
Though these routines are quick and stable, there may be room for
improvement in areas such as speed and memory coverage.