Tutorial – Line by Line – M3’s Boot Loader

In this tutorial, which I refer to as Line by Line, we’ll walk through the source code for the M3 boot sector, explaining things line by line. If you manage to stay awake through our walkthrough, you will be rewarded with a link to the entire source file.

Again – our boot loader has four main goals:

Load the kernel code into memory.

Enable the A20 gate so that we can address memory above 1MB.

Enter 32-bit protected mode.

Hand over control to the kernel.

Remember, this code is for Nasm. So without further ado, let’s start the walkthrough:

; M3 Boot Sector/Loader
; Assembled from various sources around the internet by Peter C. de Tagyos
; March 2009
;
; For more information, visit the M3 Design and Development blog:
; https://m3os.wordpress.com
;
; It is assumed that the target PC sports at least an 80386 processor.
; I mean, c'mon - it's 2009! I don't want to waste time or code on anything less.
;

We’ll start by defining some constant values that we’ll use in the code. First, we’ll define the address that we want to load our kernel code to. This is the address that we will jump to when the boot loader has finished its work. The next two constants define segment descriptor IDs for the code and data segments. These are set up in the GDT (see the data section of the code at the very bottom) and are used to reference segments while in protected mode.
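As a rough sketch of what such definitions can look like in NASM (the constant names here are made up for illustration, and the 0x1000 load address is inferred from the 0000:1000 destination mentioned later in the walkthrough, so treat both as assumptions rather than the actual M3 source):

```nasm
; Sketch only: names are hypothetical, values are inferred from the walkthrough.
[BITS 16]               ; the BIOS hands control to us in 16-bit real mode
[ORG 0x7C00]            ; the BIOS loads the boot sector at 0000:7C00

KERNEL_ADDR equ 0x1000  ; where the kernel gets loaded, and where we jump at the end
CODE_SEG    equ 0x08    ; selector for GDT entry 1 (1 * 8 bytes per entry)
DATA_SEG    equ 0x10    ; selector for GDT entry 2 (2 * 8 bytes per entry)
```

The selector values fall out of the GDT layout: each entry is 8 bytes, entry 0 is the mandatory null segment, so the code segment sits at offset 8 and the data segment at offset 16.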

The first chunk of code here loads the kernel from disk into memory. To achieve this, it uses two different functions of BIOS interrupt 13h: reset (function 0) and read sectors (function 2). The first four lines following the reset_drive label handle the resetting of the drive. Basically, we call int 13h function 0 until the drive indicates success (by returning zero in AH).

The next three lines set up the destination address that we’ll be writing the kernel code to -> 0000:1000

The last chunk of code uses int 13h function 02 to read 8 sectors off the disk. We’ll start reading at sector 2, since the first sector on the disk (sectors are numbered from 1) is taken up by this boot sector code. When all the parameters to the interrupt are set up correctly, we call the interrupt to do the work.
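Here is a minimal sketch of that sequence, following the description above. The register choices are the standard int 13h calling convention; the cylinder and head values (0/0), the retry on a failed read, and the reliance on the BIOS leaving the boot drive number in DL are my assumptions, not something taken from the M3 source.

```nasm
; Sketch only: reset the drive, then read 8 sectors to 0000:1000.
[BITS 16]
[ORG 0x7C00]

reset_drive:
    mov ah, 0x00        ; int 13h function 00h: reset disk system
    int 0x13            ; DL is assumed to still hold the BIOS boot drive number
    or  ah, ah          ; AH = 0 means the reset succeeded
    jnz reset_drive     ; anything else: try again

    mov ax, 0x0000
    mov es, ax          ; ES:BX is the destination buffer...
    mov bx, 0x1000      ; ...so the kernel lands at 0000:1000

    mov ah, 0x02        ; int 13h function 02h: read sectors
    mov al, 8           ; number of sectors to read
    mov ch, 0           ; cylinder 0 (assumed)
    mov cl, 2           ; start at sector 2; sector 1 is this boot sector
    mov dh, 0           ; head 0 (assumed)
    int 0x13
    or  ah, ah          ; AH = 0 means the read succeeded
    jnz reset_drive     ; assumed error handling: reset and retry
```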

; Now we want to enable the A20 gate on the system bus. This will allow
; us to address memory above 1MB, and will get us ready to enter
; protected mode.

call enable_A20

Once we’ve gotten the kernel loaded into memory, we’ll work on the next task – enabling the A20 gate. We need to do this before we enter protected mode. We call the routine to enable the gate here. We’ll talk about what happens inside the routine when we get to that code.

; OK - now we want to enter protected mode. We need to turn off the
; interrupts, load up the Global Descriptor Tables that we've declared
; down in the data section of our code, and set the first bit of the
; CR0 register. Finally, we'll need to do a far jump to clear out the
; prefetch queue.
; ref: http://www.osdever.net/tutorials/brunmar/tutorial_02.php

It’s kind of surprising to see how simple it is to get the processor into 32-bit protected mode. You need to disable interrupts, so that an interrupt doesn’t disturb you during this sensitive code, then load the GDT descriptor. The GDT is defined in the data section of the code, down at the bottom. Next, you need to set bit 0 of the CR0 register (the lowest bit, known as the PE or protection enable flag). This is what tells the processor to run in protected mode. Once you’ve got that bit set, you need to clear the instruction prefetch queue. To do this, execute a far jump. The address used here looks a bit different, because to make it a far jump, we need to include the segment descriptor ID. According to the GDT we defined, this is 0x08. I realize now that we have a constant defined for this, so we could (and probably should) use that here. Oh well. See, code reviews are always helpful!
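The steps just described can be sketched roughly as follows. The label gdt_desc is a hypothetical name for the GDT descriptor, and both it and clear_q are stubbed out at the bottom only so that the fragment assembles on its own; the real versions are covered further on in the walkthrough.

```nasm
; Sketch only: the switch into protected mode.
[BITS 16]
[ORG 0x7C00]

    cli                     ; no interrupts while we switch modes
    lgdt [gdt_desc]         ; load the GDT descriptor (size and address of the GDT)
    mov eax, cr0
    or  eax, 1              ; set bit 0 (PE) of CR0
    mov cr0, eax            ; the CPU is now in protected mode
    jmp 0x08:clear_q        ; far jump: reloads CS and flushes the prefetch queue

[BITS 32]
clear_q:
    jmp $                   ; stub; the real 32-bit code goes here

gdt_desc:                   ; stub; a real descriptor must point at a valid GDT
    dw 0
    dd 0
```

With an empty descriptor like the stub above the far jump would fault on real hardware, so this only shows the shape of the sequence.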

This is the enable_A20 routine. It will enable the A20 gate on the system bus, which will allow us to address memory over 1MB (something that is obviously very useful these days). Basically, the code enables the gate by interacting with the keyboard controller. By sending special command values to the keyboard controller’s command port at 0x64 (with data going through port 0x60), and waiting for the controller to report that it is ready between steps, the A20 gate is enabled. The Wikipedia reference indicated in the source code is especially helpful when figuring out exactly what’s going on.
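For reference, here is one common way to write such a routine. It is the widely circulated keyboard-controller sequence, not necessarily the exact code in the M3 source, and the helper labels a20_wait_input and a20_wait_output are names I picked for this sketch.

```nasm
; Sketch only: enable A20 through the keyboard controller's output port.
[BITS 16]

enable_A20:
    cli
    call a20_wait_input
    mov  al, 0xAD           ; command: disable the keyboard
    out  0x64, al
    call a20_wait_input
    mov  al, 0xD0           ; command: read the controller output port
    out  0x64, al
    call a20_wait_output
    in   al, 0x60           ; fetch the current output port value
    push ax
    call a20_wait_input
    mov  al, 0xD1           ; command: write the controller output port
    out  0x64, al
    call a20_wait_input
    pop  ax
    or   al, 2              ; bit 1 of the output port is the A20 gate
    out  0x60, al
    call a20_wait_input
    mov  al, 0xAE           ; command: re-enable the keyboard
    out  0x64, al
    call a20_wait_input
    sti
    ret

a20_wait_input:             ; wait until the input buffer is empty
    in   al, 0x64           ; read the status register
    test al, 2              ; status bit 1 set = controller still busy
    jnz  a20_wait_input
    ret

a20_wait_output:            ; wait until the controller has data for us
    in   al, 0x64
    test al, 1              ; status bit 0 set = output buffer full
    jz   a20_wait_output
    ret
```

Reading the output port first and only setting bit 1 leaves the other bits as the controller had them, which is a little safer than writing a fixed value.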

;==============================================================================
; 32-bit code - used once CPU is in protected mode
;==============================================================================

[BITS 32]

clear_q:
; The first order of business is to set up the registers properly
; Set up the stack and data segments, and point at the screen buffer

; According to the GDT we defined, the Code Segment descriptor is 8 (0x08)
; and the data/stack segment descriptor is 16 (0x10). Point all segment
; registers except for cs at the data segment.

This code is 32-bit, since we jump to this code once protected mode is enabled. The first thing that we want to do is set up the segment registers properly with the correct segment descriptor IDs. The GDT that we have defined sets up a flat memory model with only two segments, a code segment and a data segment. Therefore, we set up all the segment registers, except for the code segment register, with the ID for the data segment.

Next, we designate some stack space at the top of memory, and point the stack pointer register at it.

That’s it for the boot loader. Its job is nearly done. The last order of business is to hand over control to the kernel. We jump to the address in the code segment where we’ve loaded our kernel code.
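Put together, the 32-bit part might look something like the sketch below. The selectors 0x08 and 0x10 come from the comments above and the 0x1000 kernel address from the load destination; the stack address 0x90000 is purely an assumed value for illustration, since the walkthrough only says the stack sits near the top of memory.

```nasm
; Sketch only: segment setup, stack, and the jump to the kernel.
[BITS 32]

clear_q:
    mov ax, 0x10            ; data/stack segment selector (GDT entry 2)
    mov ds, ax              ; point every segment register except CS...
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax              ; ...at the flat data segment

    mov esp, 0x90000        ; assumed stack top; the real value may differ

    jmp 0x08:0x1000         ; hand control to the kernel at code segment offset 0x1000
```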

We’ve defined all the data used by our boot sector at the bottom of our code. The most important thing here is the GDT or Global Descriptor Table. As you can see from the comments, this particular GDT sets up two segments, a code segment and a data segment. Each of these segments starts at offset 0 and has access to the entire linear address space (all 4GB of it). We also define a null segment – this is a requirement imposed upon us by the processor. You can read more in the Intel document I referenced above in the comments.

Then we set up our GDT descriptor, which describes the size of the GDT and contains the address where it starts. Lucky for us, NASM figures all this out for us based on the labels that we’ve used. If you remember from the code that sets up protected mode, we hand this descriptor to the lgdt instruction.
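To make the layout concrete, here is a sketch of a flat two-segment GDT and its descriptor. The label names are hypothetical, and the access and flag bytes are the usual values for ring 0 flat code and data segments rather than values copied from the M3 source.

```nasm
; Sketch only: null, code, and data entries covering the full 4GB, plus the descriptor.
[BITS 16]
[ORG 0x7C00]

gdt:
gdt_null:                   ; entry 0: the null segment the processor requires
    dd 0
    dd 0

gdt_code:                   ; entry 1: selector 0x08
    dw 0xFFFF               ; limit bits 15:0
    dw 0x0000               ; base bits 15:0
    db 0x00                 ; base bits 23:16
    db 10011010b            ; access: present, ring 0, code, readable
    db 11001111b            ; flags: 4KB granularity, 32-bit; limit bits 19:16 = 0xF
    db 0x00                 ; base bits 31:24

gdt_data:                   ; entry 2: selector 0x10
    dw 0xFFFF               ; limit bits 15:0
    dw 0x0000               ; base bits 15:0
    db 0x00                 ; base bits 23:16
    db 10010010b            ; access: present, ring 0, data, writable
    db 11001111b            ; flags: 4KB granularity, 32-bit; limit bits 19:16 = 0xF
    db 0x00                 ; base bits 31:24
gdt_end:

gdt_desc:                   ; this is what gets handed to lgdt
    dw gdt_end - gdt - 1    ; size of the GDT in bytes, minus one
    dd gdt                  ; address where the GDT starts
```

A limit of 0xFFFFF with 4KB granularity is what gives each segment the full 4GB, and NASM resolves gdt and gdt_end to real addresses at assembly time.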