x86 History and Memory Models

In case you have any questions or suggestions, you can leave comments
HERE . Thanks!

8080

Released in 1974, Intel 8080 is a 8-bit CPU, which means it processes 8 bits of
information at one time. However, it has 16 address lines coming out, which
means it can access or 64K bytes.

The operating system most used with 8080 processor was CP/M-80 (CP/M short for
Control Program for Microcomputers).

8086, 8088, 80186

Intel 8086 (1978), 8088 (8086 variant, 1979), 80186(1982) are called 16-bit
because their internal registers are 16 bits in size. These processors had 20
address lines, numbered A0 to A19; with these, the processor can access
bytes or 1 megabyte.

However, to make 8086 backward compatible with CP/M-80 softwares run on 8080,
the 8086 was set up in a way such that a program could take a portion of 64K
byte segment within the megabyte of memory and run entirely inside it as if it
runs in a 8080 system.

The design was achieved by using segment registers, they are

CS, code segment register, it contains the segment address of the code
segment of the currently executing instruction

DS, data segment register

SS, stack segment register

ES, extra segment register

FS, GS, only in 386 and later x86 CPUs

All segment registers are 16 bits in size, regardless of the CPU type. This is
also true for 32-bit CPUs.

A segment in 8086 may start at any memory address that is a multiple of 16
(paragraph) and have a fixed size of 64K bytes. Note that two segments could
overlap with each other, since there is no memory protection mechanism yet.

Since the internal address registers of these processors only had 16 bits. To
access a 20-bit address space, an external memory reference was made up of a
16-bit Offset address added to a 16-bit Segment number, shifted 4 bits so as
to produce a 20-bit physical address. The resulting address is equal to Segment
* 16 + Offset.

80286

The intel 80286 had a 24-bit address bus and was able to address up to
16 MB of RAM, compared to 1 MB for its predecessor.

The 286 was the first of the x86 CPU family to support protected mode. All
the x86 CPUs before 80286 worked in real mode1.

A20 Line: A historical bug is now a routine

As we said earlier, the 80286’s predecessor, 8086, 8088, 80186 processors had
only 20 address lines, numbered A0 to A19 and can access 1 megabyte.

When IBM designed the IBM PC AT machine, it decided to use the new
higher-performance Intel 80286 microprocessor, since it
could address up to 16 megabytes of system memory in protected mode.

The 80286 CPU was supposed to emulate an 8086’s behavior in real
mode, which is its mode at system start up, so it could run
operating systems and programs that were not written for protected mode (again
for compatibility reason).

However, the 80286 had a bug: it failed to force the A20 line to zero in real
mode.

For its predecessors, the segment:offset address will be rounded to the begining
of the memory address when the value segment*16+offset is larger than 0xFFFFF
(1M-1), so that the combination F800:8000 will be wrapped around to address
0x00000h, the memory’s first byte. But in 80286, for example, would no longer
point to the physical address 0x00000000 but the correct address 0x00100000.

Enabling the Gate-A20 line is one of the first steps a protected
mode x86 operating system does in the boot up process, often before control has
been passed onto the kernel from the bootstrap (in the case of Linux, for
example)2.

Intel no longer supports the A20 gate starting
with Haswell. Page 271 of the Intel System
Programmers Manual Vol. 3A from June 2013 #cite_note-states:
“The functionality of A20M# is used primarily by older operating systems and
not used by modern operating systems. On newer Intel 64 processors, A20M# may
be absent.”3

Protected Mode

As we remembered in the real mode, each logical address points directly into
physical memory location, every logical address consists of two 16 bit parts:
The segment part of the logical address contains the base address of a segment
with a granularity of 16 bytes.

In protected mode, the segment_part is replaced by a 16-bit segment
selector, in which the 13 upper bits (bit 3 to bit 15) contain the index of
an entry inside a descriptor table. The next bit (bit 2), Table Indicator
(TI) specifies whether the operation is used with the GDT or the LDT. The lowest
two bits (bit 1 and bit 0) of the selector, Request Privilege Level (PRL) are
combined to define the privilege of the request, where the values of 0 and 3
represent the highest and the lowest priority, respectively. This means that the
byte offset of descriptors in the descriptor table is the same as the 16-bit
selector, provided the lower three bits are zeroed.

In protected mode, the segmentation registers aforementioned, i.e. CS, DS, SS are
now holding the segment selectors instead of the segment addresses in the real
mode.

The Global Descriptor Table(GDT) is an array of Segment Descriptors. It is a
data structure used by Intel x86-family processors starting with the 80286 in
order to define the characteristics of the various memory areas used during
program execution, including the base address, the size and access privileges
like executability and writability.

The first segment descriptor of the GDT is always set to 0. This ensures that
logical addresses with a null Segment Selector will be considered invalid, thus
causing a processor exception.

A segment descriptor is 8 bytes long. Suppose the GDT table is
at 0x00020000 (the value stored in the gdtr register) and
the index specified by the Segment Selector is 2, the address of the
corresponding Segment Descriptor is 0x00020000 + (2 × 8), or 0x00020010.
After the Segment Descriptor is located, the segment base address and limit can
be retrieved into corresponding registers.

To enter protected mode, the GDT must first be created with a minimum of three
entries: a null descriptor, a code segment descriptor and data
segment descriptor. In an IBM-compatible machine, the A20 line(21st address
line) also must be enabled to allow the use of all the address lines so that the
CPU can access beyond 1 megabyte of memory (Only the first 20 are allowed to be
used after power-up, to guarantee compatibility with older software written for
the Intel 8088-based IBM PC and PC/XT models). After performing those two steps,
the PE bit must be set in the CR0 register and a far jump must be
made to clear the pre-fetch input queue.

BIOS uses Real Mode, since there is no GDT nor LDT yet in the memory at
startup, and the code that initializes the GDT, LDT and paging tables must run
in read mode.

An example of simple boot loader4, you will see the parts about GDT, A20,
etc:

80386

When 80386(1986-2007) and later Intel CPUs came out, they can access 4GB of
memory (16-bit segment register, 32-bit index register and 32-bit instruction
pointer), but again, to maintain backward compatibility with 8086 and 8088,
they are able to limit themselves to what the older chips could perform.

The 80386 added a 32-bit architecture and a paging translation unit, which
made it much easier to implement operating systems that used virtual memory. It
also offered support for register debugging5.

For 386 and its successors, there are control registers that changes or
controls the general behavior of a CPU.

The CR0 register is 32 bits long on the 386 and higher processors. On x86-64
processors in long mode, it (and the other control registers) is 64 bits long.
CR0 has various control flags that modify the basic operation of the processor.
The two bits related with segmentation and paging are bit 0 PE and bit 31
PG6.

When Protected Mode Enable (PE) is set to 1, system is in protected mode;
otherwise system is in real mode. This is usually done in the boot loader before
kernel has been loaded.

If PG = 1, enable paging and use CR3 register else paging will be disabled.
This is performed after kernel has been loaded.

Typically, the upper 20 bits of CR3 become the page directory base
register(PDBR), which stores the physical address of the first page
directory entry.

Regular Paging in Hardware

The physical address of the Page Directory in use is stored in the control
register CR3.

Since the 80386, the Intel processors paging unit handles pages of size 4KB.

The 32 bits linear address is divided into 3 parts:

Directory, the most significant 10 bits, so there are entries
(Page Tables) in a Page Directory; each entry is 4 bytes.

Table, the 10 bits after Directory part; it means there are
entries in a page table; each entry is 4 bytes.

Offset, the least significant 12 bits, so that it can specify
different addresses, thus a page is 4KB.

Extended Paging

Starting with Pentium, Intel added to x86 4MB pages instead of 4KB.

80486

Just as in the 80386, a simple flat 4 GB memory model could be implemented by
setting all “segment selector” registers to a neutral value in protected mode,
or setting (the same) “segment registers” to zero in real mode, and using only
the 32-bit “offset registers” (x86-terminology for general CPU registers used as
address registers) as a linear 32-bit virtual address bypassing the segmentation
logic. Virtual addresses were then normally mapped onto physical addresses by
the paging system except when it was disabled. (Real mode had
no virtual addresses.) Just as with the 80386, circumventing memory
segmentation could substantially improve performance in some operating systems
and applications.

On a typical PC motherboard, either four matched 30-pin (8-bit) SIMMs or one
72-pin (32-bit) SIMM per bank were required to fit the 486’s 32-bit data bus.
The address bus used 30-bits (A31..A2) complemented by four byte-select pins
(instead of A0,A1) to allow for any 8/16/32-bit selection. This meant that the
limit of directly addressable physical memory was 4 gigabytes as
well,(230 32-bitwords = 232 8-bit words)7.