Segmentation is a memory-management mechanism in x86. In this article, I'll cover why x86 needs segmentation and how segmentation works in detail.
Why we need segmentation in x86 Architecture
To trace the origin of segmentation, we need to talk about its ancestor, the 8086, first. Several design decisions in the x86 instruction set architecture can be traced back to the 8086, which is one of the reasons why several ugly software hacks have been developed for backward compatibility.
As we know, the data bus of the 8086 is 16 bits wide. One unique feature of the 8086 was that it had 20 address lines, with the address bus multiplexed with the 16-bit data bus. This means the address space it could access is up to 2^20 bytes = 1 MB. However, being a 16-bit processor, the 8086 has a 16-bit instruction pointer (IP) and several 16-bit general-purpose registers, some of which have special uses in memory addressing. If one of these registers alone were used to address memory, the total amount of addressable memory would be restricted to just 2^16 bytes = 64 KB.
To address a 20-bit memory space with 16-bit registers, the ingenious engineers at Intel came up with a method called segmentation. This addressing scheme is what is now known as x86 real mode.
Real mode, originating from the 8086, is an addressing mode that all x86 processors remain compatible with. As we just said, the memory space in real mode is 2^20 bytes = 1 MB. This 1 MB address space is divided into 'segments', and each segment covers 64 KB of memory, which can be addressed by one 16-bit register in the 8086. We choose the starting point of the segment we want and add a 16-bit offset, and in this way we can reach any address in the 1 MB memory space. That is what the segment registers in x86 are for. The 8086 has 4 segment registers: CS, DS, ES and SS. Each is 16 bits wide and holds the starting address of its segment divided by 16. The logical address in x86 real mode is
Segment Base : Segment Offset
The address bus is 20 bits wide though, which means we need to give it a 20-bit address. To achieve this, the logical address is converted as follows into a linear address:
Segment Base * 16 + Segment Offset
which involves shifting the value in the segment register left by 4 bits and adding the offset to it. The base address of any segment thus always has its lowermost 4 bits set to 0.
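The shift-and-add rule above can be sketched in a few lines of Python. This is only an illustration: the function name `real_mode_linear` and the `address_bits` parameter are made up for this example, and wrapping the result to 20 bits is an assumption that models the 8086's 20 address lines.

```python
def real_mode_linear(segment: int, offset: int, address_bits: int = 20) -> int:
    """Translate a real-mode segment:offset pair into a linear address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both be 16-bit values")
    # Shift the segment value left by 4 bits (multiply by 16), then add the offset.
    linear = (segment << 4) + offset
    # Assumption: the sum wraps around at the width of the address bus.
    return linear & ((1 << address_bits) - 1)


# Two different segment:offset pairs can name the same byte,
# because consecutive segment values start only 16 bytes apart.
same_byte = real_mode_linear(0x1234, 0x0010) == real_mode_linear(0x1235, 0x0000)
```

Here `same_byte` is `True`: both pairs resolve to linear address 0x12350, which shows that real-mode segments overlap rather than tile the 1 MB space.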
There are several motivations for the emergence of protected mode in x86. First of all, in line with Moore's law, physical memory keeps getting both larger and cheaper, and a 1 MB address space is too small to keep pace with that growth. Second, real mode offers no memory protection. One program can access the whole 1 MB address space, so it may interfere with the data or code of another program, which is not safe. In addition, there is no support for multitasking, and any program can execute any CPU instruction, which may lead to unintended consequences (for example, a program bringing the whole system to a halt).
Starting with the 80286, x86 processors run in protected mode once the OS kernel image has been loaded into memory. In the 80286, the address bus is 24 bits wide, supporting a memory space of up to 2^24 bytes = 16 MB. In the 80386, both the address bus and the data bus are 32 bits wide, supporting a memory space of up to 2^32 bytes = 4 GB. This is why it makes no sense to install more than 4 GB of physical memory in a machine whose address bus is 32 bits wide. However, each processor needs to stay compatible with its predecessors so that existing software keeps running. Thus, Intel x86 processors from the 80386 onward support both real mode and protected mode, and this 32-bit architecture is also known as IA-32.
In protected mode, segmentation is quite different from that of real mode. Although the general-purpose, index and pointer registers were extended to 32 bits with the 80386, and later to 64 bits on x86-64 processors, the segment registers are still 16 bits wide. It is not realistic to store the starting point of a segment in a 16-bit register when the address is 32 bits wide, so protected mode uses a level of indirection. There is a table called the Global Descriptor Table (GDT), optionally accompanied by Local Descriptor Tables (LDTs), which stores the base address and other information about each segment. A segment register stores a 16-bit field called a Segment Selector, which contains the index of an entry, called a Segment Descriptor, in the GDT or an LDT. This is how the information about a given segment is found.
Global & Local Descriptor Table
The Global Descriptor Table (GDT) is a table in memory that defines the processor's memory segments. The GDT sets the behavior of the segment registers and helps to ensure that protected mode operates smoothly.
A Segment Descriptor is an 8-byte entry of either the GDT or an LDT. It contains important information about a segment, such as its base address, segment limit and access permissions.
[Figure: examples of segment descriptors]
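As a rough illustration of what such an 8-byte entry holds, the sketch below unpacks a descriptor given as a 64-bit integer. The bit positions follow the standard protected-mode layout for code and data segment descriptors; the names `SegmentDescriptor`, `decode_descriptor` and `effective_limit` are made up for this example, and only a few of the fields are extracted.

```python
from typing import NamedTuple


class SegmentDescriptor(NamedTuple):
    base: int         # 32-bit segment base address
    limit: int        # raw 20-bit limit field
    granularity: int  # G flag: 1 means the limit is counted in 4 KB units
    dpl: int          # descriptor privilege level (0 to 3)
    present: int      # P flag: 1 means the segment is present in memory


def decode_descriptor(raw: int) -> SegmentDescriptor:
    """Unpack selected fields from an 8-byte descriptor held as an integer."""
    # The base is split across bits 16-39 (low 24 bits) and 56-63 (high 8 bits).
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    # The limit is split across bits 0-15 (low 16 bits) and 48-51 (high 4 bits).
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    access = (raw >> 40) & 0xFF  # access byte: P, DPL, S and type bits
    flags = (raw >> 52) & 0xF    # flag nibble: G, D/B, L, AVL
    return SegmentDescriptor(
        base=base,
        limit=limit,
        granularity=(flags >> 3) & 1,
        dpl=(access >> 5) & 0b11,
        present=(access >> 7) & 1,
    )


def effective_limit(desc: SegmentDescriptor) -> int:
    """Return the highest valid offset, taking the G flag into account."""
    if desc.granularity:
        return (desc.limit << 12) | 0xFFF
    return desc.limit


# A commonly seen "flat" ring-0 code segment: base 0, covering the full 4 GB.
flat_code = decode_descriptor(0x00CF9A000000FFFF)
```

For this flat segment, `flat_code.base` is 0 and `effective_limit(flat_code)` is 0xFFFFFFFF. The split base and limit fields are a leftover of the 80286 descriptor format, which the 80386 extended without moving the existing bits.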
A Segment Selector is a 16-bit field, stored in a segment register, that specifies which Segment Descriptor corresponds to the segment. In Protected Mode, the segment selector consists of 3 parts:
- index, which is an index into the GDT or LDT and thus points to an entry of the table called a segment descriptor.
- table indicator (TI), which indicates whether the selector points to the GDT or an LDT.
- requestor privilege level (RPL), which is the privilege level of the CPU when the selector is loaded.
Bits 3 to 15 of a segment selector specify the index, bit 2 is the table indicator, and bits 0 and 1 hold the RPL. The index selects an entry in the table: specifying 0 for the index will access the first segment descriptor, 1 will access the second descriptor and so on. The index is multiplied by 8 (the size of a descriptor) to get the byte offset of the descriptor in the table.
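The three selector fields can be pulled apart with a few shifts and masks. The following is a minimal sketch; the names `Selector`, `split_selector` and `descriptor_offset` are chosen for this example only.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # bits 3-15: entry number in the descriptor table
    ti: int     # bit 2: 0 = GDT, 1 = LDT
    rpl: int    # bits 0-1: requestor privilege level


def split_selector(selector: int) -> Selector:
    """Split a 16-bit segment selector into its three fields."""
    selector &= 0xFFFF
    return Selector(index=selector >> 3, ti=(selector >> 2) & 1, rpl=selector & 0b11)


def descriptor_offset(selector: int) -> int:
    """Byte offset of the descriptor inside its table: index * 8."""
    return split_selector(selector).index * 8
```

For example, the selector 0x2B splits into index 5, TI 0 (GDT) and RPL 3, so its descriptor sits at byte offset 40 in the GDT. Because the index occupies the upper 13 bits, multiplying it by 8 gives the same number as clearing the low 3 bits of the selector.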
Every time the CPU needs to compute an address, it would first have to read the segment selector from the segment register and then use the index in the selector to look up the GDT or LDT, which are stored in main memory. As discussed in my previous article about caches, accessing main memory is slow, and the performance of the pipeline would be severely affected. Thus, each segment register is accompanied by a small non-programmable register called a "shadow register", which caches the segment descriptor of the current segment. Whenever a segment selector is loaded into one of the segment registers, the corresponding segment descriptor is fetched from the table in memory and loaded into this shadow register. Future references to the segment descriptor use the contents of the shadow register, so the memory access is avoided.
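The caching behavior can be modeled with a toy class. This is a sketch of the idea, not of the hardware: `SegmentRegister` and its `fetch_descriptor` callback are hypothetical names, with the callback standing in for the slow read of a descriptor from the GDT or LDT.

```python
class SegmentRegister:
    """Toy model of a segment register and its hidden shadow register."""

    def __init__(self, fetch_descriptor):
        # Hypothetical callable: takes a selector, returns its descriptor.
        # It stands in for the slow descriptor-table read from main memory.
        self._fetch_descriptor = fetch_descriptor
        self.selector = 0    # visible, programmable 16-bit part
        self._shadow = None  # hidden part that caches the descriptor

    def load(self, selector):
        """Loading a selector is the only point where the table is read."""
        self.selector = selector & 0xFFFF
        self._shadow = self._fetch_descriptor(self.selector)

    def descriptor(self):
        """Later address translations reuse the cached descriptor."""
        if self._shadow is None:
            raise RuntimeError("no selector has been loaded yet")
        return self._shadow
```

However many times `descriptor()` is called after a single `load()`, the callback runs only once. One consequence of this design is that changing a table entry in memory does not affect a segment register until the selector is loaded again.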
Calculating a linear address in Protected Mode Segmentation
Step 1: The segmentation unit examines the TI field of the Segment Selector to determine which Descriptor Table stores the Segment Descriptor. This field indicates that the Descriptor is either in the GDT (in which case the segmentation unit gets the base linear address of the GDT from the gdtr register) or in the active LDT (in which case the segmentation unit gets the base linear address of that LDT from the ldtr register).
Step 2: It computes the address of the Segment Descriptor from the index field of the Segment Selector. The index field is multiplied by 8 (the size of a Segment Descriptor), and the result is added to the content of the gdtr or ldtr register.
Step 3: It adds the offset of the logical address to the Base field of the Segment Descriptor, thus obtaining the linear address.
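The three steps can be put together in a small simulation. This is a sketch under simplifying assumptions: memory is a Python byte buffer, the table bases are passed in directly as `gdt_base` and `ldt_base` (made-up parameter names), and the limit and privilege checks that a real CPU performs are left out. The helper `encode_base` is hypothetical and only fills in the base bits of a descriptor for demonstration.

```python
import struct

DESCRIPTOR_SIZE = 8  # bytes per Segment Descriptor


def encode_base(base: int) -> int:
    """Hypothetical helper: build a descriptor with only its base bits set."""
    return ((base & 0xFFFFFF) << 16) | (((base >> 24) & 0xFF) << 56)


def descriptor_base(memory, table_base: int, index: int) -> int:
    """Read the descriptor at the given index and return its Base field."""
    address = table_base + index * DESCRIPTOR_SIZE
    (raw,) = struct.unpack_from("<Q", memory, address)  # little-endian, 8 bytes
    return ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)


def logical_to_linear(memory, gdt_base: int, ldt_base: int,
                      selector: int, offset: int) -> int:
    """Translate selector:offset into a 32-bit linear address."""
    # Step 1: the TI bit (bit 2) picks the GDT or the active LDT.
    table_base = ldt_base if (selector >> 2) & 1 else gdt_base
    # Step 2: index * 8 added to the table base locates the descriptor.
    index = selector >> 3
    base = descriptor_base(memory, table_base, index)
    # Step 3: add the offset to the Base field (wrapping at 32 bits).
    return (base + offset) & 0xFFFFFFFF


# Demo setup: a tiny "memory" with a GDT at 0x10 and an LDT at 0x20.
memory = bytearray(64)
struct.pack_into("<Q", memory, 0x10 + 1 * DESCRIPTOR_SIZE, encode_base(0x00400000))
struct.pack_into("<Q", memory, 0x20 + 0 * DESCRIPTOR_SIZE, encode_base(0x12345678))

via_gdt = logical_to_linear(memory, 0x10, 0x20, selector=0x08, offset=0x1234)
via_ldt = logical_to_linear(memory, 0x10, 0x20, selector=0x04, offset=0x8)
```

In this demo, selector 0x08 (index 1, TI 0) goes through the GDT and selector 0x04 (index 0, TI 1) goes through the LDT, so `via_gdt` is 0x00401234 and `via_ldt` is 0x12345680.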
Segmentation was first used in the 8086 to solve the problem that the address bus was wider than the 16-bit registers and data bus. As x86 grew, segmentation became an important method of memory management. There is another memory management method called paging, which I'll cover in the next post in this category. However, Linux uses segmentation in a very limited way. This reminds me of a classic piece of Unix philosophy: provide mechanism, not policy. x86 provides two mechanisms for memory management, segmentation and paging, while how the operating system uses them is the policy. In the Linux category, I'll post another article about segmentation and paging in Linux.