Wednesday, 4 December 2013

Debugging with Registers (Part 1)

Registers are very important in debugging, and can provide information which might otherwise seem cryptic and hard to decipher. In this blog post, and in future blog posts, I'm going to explain what these registers mean, and how they can help us with our debugging efforts.

General Purpose Registers

Let's begin with a simple overview of the different types of register and their general purpose. We can view the register state for the last running context with the r command.

We can see all the registers saved for the last context, together with the last instruction executed before the crash took place. The registers from eax to edi are general purpose registers, and can be used to store most forms of data, such as memory addresses and integer values. Floating point values are normally held in the separate x87 and SSE registers instead.

These general purpose registers are 32 bits wide, and the lower 16 bits of eax, ebx, ecx and edx can each be treated as a single 16-bit register called AX, BX, CX and DX. Each of these 16-bit registers can be divided further to form two 8-bit registers called: AH and AL; BH and BL; CH and CL; DH and DL. On 64-bit processors operating in 64-bit mode (a sub-mode of Long Mode), the registers are widened to 64 bits and named rax, rbx, rcx, rdx, rsi, rdi, rsp and rbp, with rip as the instruction pointer, and the registers from r8 to r15 are available as additional general purpose registers.

The H and L refer to High and Low. These sub-registers still refer to parts of the same 32-bit register; they are mainly used to remain compatible with older 16-bit instructions, and to hold small data types.

Now, let's move onto the more important registers, which are the eip, esp and ebp registers. The eip register is the instruction pointer and holds the address of the next instruction to be executed, the esp register holds the stack pointer, which points to the top of the current call stack (or run time stack), and the ebp register holds the base pointer of the current stack frame.
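As a rough sketch of how the sub-registers overlap, the masks and shifts below pull AX, AH and AL out of a 32-bit value. The function name and the sample value are made up for illustration; the CPU does this in hardware, and Python is only used here to show the bit layout.

```python
# Illustrative only: model a 32-bit register as a Python int and
# extract its sub-registers with masks and shifts.

def sub_registers(eax: int) -> dict:
    eax &= 0xFFFFFFFF        # keep only the 32 bits of eax
    ax = eax & 0xFFFF        # AX is the low 16 bits of eax
    ah = (ax >> 8) & 0xFF    # AH is the high byte of AX
    al = ax & 0xFF           # AL is the low byte of AX
    return {"eax": eax, "ax": ax, "ah": ah, "al": al}


parts = sub_registers(0x12345678)
print({name: hex(value) for name, value in parts.items()})
# {'eax': '0x12345678', 'ax': '0x5678', 'ah': '0x56', 'al': '0x78'}
```

Note that there is no name for the upper 16 bits of eax; only the lower half is split any further.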

This is illustrated with the example below:

Difference between Real Mode and Protected Mode

When the CPU is running in Protected Mode, access to memory is restricted according to privilege, and therefore areas of memory can be isolated from malicious or faulty programs. Real Mode gives all programs access to all memory. At boot, the CPU is running in Real Mode, which allows backwards compatibility with older x86 processors. For the CPU to switch to Protected Mode, the PE (Protection Enable) bit in the CR0 register must be set to 1. I'll explain these special registers later, but for now just remember that the CR0 register contains a bit called PE. The GDT (Global Descriptor Table) and the LDT (Local Descriptor Table) are the two descriptor tables used in Protected Mode; the GDT must be filled and loaded before Protected Mode can be used, whereas the LDT is optional.
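The PE check can be sketched as a simple bit test. PE is bit 0 of CR0; the helper names and the sample CR0 values below are only illustrative, and in practice the value would come from reading cr0 in the debugger.

```python
# Illustrative only: test and set the PE (Protection Enable) bit, which is bit 0 of CR0.

CR0_PE = 1 << 0


def protected_mode_enabled(cr0: int) -> bool:
    """Return True if the PE bit is set in the given CR0 value."""
    return bool(cr0 & CR0_PE)


def enable_protected_mode(cr0: int) -> int:
    """Return the CR0 value with PE set; every other bit is left untouched."""
    return cr0 | CR0_PE


print(protected_mode_enabled(0x80050031))  # True
print(protected_mode_enabled(0x00000010))  # False
print(hex(enable_protected_mode(0x00000010)))  # 0x11
```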

The Global Descriptor Table holds the segment descriptors which are available to all programs and are loaded through the segment registers (cs, ss, ds, es, fs, gs), whereas the Local Descriptor Table holds segment descriptors for a specific task or program. A task is defined as a unit of execution, and thus can apply to an entire program or to a smaller piece of work such as an interrupt handler. In Protected Mode, these segment registers are not used to form part of a physical address directly, but instead hold selectors which index into the GDT and the LDT.

Real Mode enables access to all the hardware, I/O addresses and memory in the system. In Real Mode, memory is addressed through segments of up to 64K each. The physical address is calculated by taking the value in the segment register, multiplying it by 16, and then adding a 16-bit offset; the 16-bit offset is what limits each segment to 64K. In Protected Mode, each descriptor table entry is 8 bytes, and therefore consecutive entries will appear as 00h, 08h, 10h and so on in WinDbg. The Base Address of each loaded segment is stored in a Segment Descriptor Cache.
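The Real Mode address calculation can be written out as a short sketch. The function name and the sample segment:offset pairs are just examples chosen for illustration.

```python
# Illustrative only: Real Mode physical address = segment * 16 + offset.

def real_mode_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment value and a 16-bit offset into a physical address."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


print(hex(real_mode_address(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
```

Shifting left by 4 is the same as multiplying by 16, which is why a segment value of 0x1234 starts at physical address 0x12340.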

However, there is another mechanism involved in deciding which descriptor table to use (GDT or LDT), and that is the TI (Table Indicator) bit of the selector. If the TI bit is set (1) then the LDT is used, otherwise if the bit is clear (0) then the GDT is used.
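To make the selector layout a little more concrete, here is a minimal sketch which splits a selector into its parts. It assumes the standard x86 layout: bits 0 and 1 hold the RPL (Requested Privilege Level), bit 2 is the TI bit, and the remaining bits are the index into the table. The function name and the sample selectors are only for illustration.

```python
# Illustrative only: split a 16-bit segment selector into index, TI and RPL.

def decode_selector(selector: int) -> dict:
    selector &= 0xFFFF
    index = selector >> 3              # which entry in the table
    return {
        "index": index,
        "table": "LDT" if selector & 0x4 else "GDT",  # the TI bit
        "rpl": selector & 0x3,         # Requested Privilege Level
        "byte_offset": index * 8,      # each entry is 8 bytes
    }


for sel in (0x08, 0x10, 0x2B, 0x0F):
    print(hex(sel), decode_selector(sel))
```

For example, the selector 0x2B decodes to index 5 in the GDT with an RPL of 3, and the descriptor sits 28h bytes into the table.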

The LDTR (Local Descriptor Table Register) stores the linear address of the LDT, and the GDTR (Global Descriptor Table Register) stores the linear address of the GDT. Protected Mode introduces the concept of CPU Ring Levels and paging.

Segment Registers

I've discussed these before, but I'm going to go over these briefly again, just to keep everything together in one post, and to explain the dg command properly, since I've learned what each part of the command means.

I'll begin by explaining the dg command, which is used to display the segment descriptor for a selector.

The Sel field indicates the selector which has been loaded into the segment register, in this case we are looking at the gs segment register.

The Base field indicates the beginning of the segment in linear memory; this is a 32-bit value. The Limit field indicates the end of the segment. The Type is split into three separate parts. The first part indicates what kind of information the segment holds, which here is Data. The second part shows that the Data Segment is Readable and Writable (RW); Data Segments are always going to be readable. The third part, Ac, shows that the segment has been accessed; the CPU sets this bit when the segment is accessed, so that the operating system can tell which segments are in use, for example when deciding what to swap out to the disk. The Pl field shows the DPL of the segment, or the Descriptor Privilege Level, which can range from 0 to 3 (CPU Rings).

The Gran field shows the granularity of the segment, which determines whether the Limit is counted in bytes or in 4K pages. The Pres field shows whether the segment is present in memory, or has been swapped out to the disk; P means the segment is Present. The Long field indicates whether the segment is used in Long Mode: Nl means 32-bit and Lo means 64-bit.
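The fields above all come from the 8-byte descriptor itself, so as a rough sketch we can unpack one by hand. This follows the standard x86 layout for code and data descriptors (system descriptors are not handled), and the key names are my own labels rather than anything WinDbg prints. The two sample values are common textbook flat descriptors, not values taken from a real dump.

```python
# Illustrative only: unpack an 8-byte x86 code/data segment descriptor,
# given as a 64-bit integer, into the kind of fields that dg displays.

def decode_descriptor(raw: int) -> dict:
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)            # 20-bit limit
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)  # 32-bit base
    access = (raw >> 40) & 0xFF    # access byte: P, DPL, S and Type bits
    flags = (raw >> 52) & 0xF      # flag nibble: G, D/B, L and AVL bits
    return {
        "base": base,
        "limit": limit,
        "kind": "Code" if access & 0x08 else "Data",
        "rw": bool(access & 0x02),          # writable (data) or readable (code)
        "accessed": bool(access & 0x01),    # Ac
        "dpl": (access >> 5) & 0x3,         # Pl
        "present": bool(access & 0x80),     # Pres
        "long": bool(flags & 0x2),          # Long
        "page_granular": bool(flags & 0x8), # Gran: limit counted in 4K pages
    }


print(decode_descriptor(0x00CF93000000FFFF))  # flat 32-bit data segment
print(decode_descriptor(0x00AF9B000000FFFF))  # 64-bit code segment
```

The base and limit are scattered across the descriptor for historical reasons, which is why they have to be stitched back together from several pieces.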

Check this article for more information about the different fields; I've had to try and substitute the letters for the bit values shown in the article. I'm going to list the general uses of the segment registers here: