REN

Ph.D. in Computer Science at Rutgers University

Segmentation and Paging in Linux

Segmentation and paging are two mechanisms that the processor provides for managing physical and virtual memory. This post discusses how the Linux kernel uses these two mechanisms to manage RAM.


Segmentation and Paging in Hardware

I've posted two blogs on segmentation and paging in the Intel IA-32 architecture. Segmentation was introduced in early x86 processors because the registers are 16 bits wide while the address bus is 20 bits wide, so a single register cannot hold a full address. The paging mechanism separates the physical address space from users and programmers, so that each program has the illusion that it can use the whole physical memory. To learn more about hardware segmentation and paging, see those two posts.
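To make the 16-bit versus 20-bit point concrete, here is a minimal sketch of how real-mode segmentation on the original 8086 forms an address. The function name real_mode_address is my own and is used only for illustration.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address formed by 8086-style real-mode segmentation."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    # Shift the 16-bit segment left by 4 bits, add the 16-bit offset,
    # and keep 20 bits to model the 20 address lines of the 8086.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0xB800, 0x0010)))  # 0xb8010
print(hex(real_mode_address(0xFFFF, 0x0010)))  # 0x0 (wraps past 1 MB)
```

Two 16-bit values are enough to reach every byte of a 20-bit address space, which is the whole reason the scheme exists.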


Segmentation in Linux

Segmentation has been included in x86 microprocessors to encourage programmers to split their applications into logically related entities, such as subroutines or global and local data areas. However, Linux uses segmentation in a very limited way. In fact, segmentation and paging are somewhat redundant, because both can be used to separate the physical address spaces of processes: segmentation can assign a different linear address space to each process, while paging can map the same linear address space into different physical address spaces. Linux prefers paging to segmentation for two reasons. First, memory management is simpler when all processes use the same segment register values, that is, when they share the same set of linear addresses. Second, Linux aims to be portable to a wide range of architectures, and RISC architectures in particular have limited support for segmentation.
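As a toy illustration of the paging side of that comparison, the sketch below lets two processes use the same linear address while each lands in a different physical frame. The page tables, frame numbers, and the translate helper are all made up for the example; real page tables are multi-level structures walked by the MMU.

```python
PAGE_SIZE = 4096

# Hypothetical per-process page tables: linear page number -> physical frame number.
page_table_a = {0x08048: 0x00123}
page_table_b = {0x08048: 0x00456}


def translate(page_table, linear):
    """Translate a linear address with a flat, single-level toy page table."""
    page, offset = divmod(linear, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


linear = 0x08048ABC
print(hex(translate(page_table_a, linear)))  # 0x123abc
print(hex(translate(page_table_b, linear)))  # 0x456abc
```

Both processes see the same linear address space, and the isolation comes entirely from which table is active.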

The Linux kernel has four main segments: user code, user data, kernel code, and kernel data. The corresponding Segment Selectors are defined by the macros __USER_CS, __USER_DS, __KERNEL_CS, and __KERNEL_DS. All four segments have a base address of 0x00000000, so in Linux logical addresses coincide with linear addresses; that is, the value of the Offset field of a logical address is always the same as the value of the corresponding linear address.
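Here is a small sketch of what a Segment Selector contains and why a zero base makes the offset equal to the linear address. The numeric selector values are the ones I recall from 2.6-era 32-bit x86 headers, so treat them as an assumption; they differ across kernel versions and on x86-64. The helper names are my own.

```python
# Selector values as recalled from 2.6-era 32-bit x86 kernels (assumption).
SELECTORS = {
    "__KERNEL_CS": 0x60,
    "__KERNEL_DS": 0x68,
    "__USER_CS": 0x73,
    "__USER_DS": 0x7B,
}


def decode_selector(selector):
    """Split a 16-bit Segment Selector into (index, table indicator, RPL)."""
    index = selector >> 3      # bits 15..3: entry number in the GDT or LDT
    ti = (selector >> 2) & 1   # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 3         # bits 1..0: requested privilege level
    return index, ti, rpl


def logical_to_linear(segment_base, offset):
    """Segmentation unit: linear address = segment base + offset (32-bit)."""
    return (segment_base + offset) & 0xFFFFFFFF


for name, selector in SELECTORS.items():
    print(name, decode_selector(selector))

# With a base of 0, the offset passes through unchanged.
print(hex(logical_to_linear(0x00000000, 0xC0100000)))  # 0xc0100000
```

With these values the kernel selectors decode to RPL 0 and the user selectors to RPL 3, all pointing into the GDT.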

   

The picture above shows the GDT in Linux. In uniprocessor systems there is only one GDT, while in multiprocessor systems there is one GDT for every CPU in the system. Each GDT includes 18 segment descriptors and 14 null, unused, or reserved entries. Unused entries are inserted on purpose so that Segment Descriptors usually accessed together are kept in the same cache line. Remember that, except for the very early boot stage, everything runs in x86 protected mode, so the segment registers store the selector of each segment, and the gdtr and ldtr registers locate the GDT and the currently used LDT respectively.
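A quick back-of-the-envelope check of the cache line remark, as a sketch. The 8-byte descriptor size is standard for IA-32, but the 32-byte cache line and the GDT indices of the four main segments (12 to 15, as I recall from 2.6-era 32-bit kernels) are assumptions for illustration.

```python
DESCRIPTOR_SIZE = 8   # bytes per segment descriptor on IA-32
ENTRIES = 18 + 14     # descriptors in use + null, unused, or reserved entries
CACHE_LINE = 32       # assumed cache line size in bytes

gdt_bytes = ENTRIES * DESCRIPTOR_SIZE

# Assumed GDT indices of the four main segments.
main_segments = {
    "kernel code": 12,
    "kernel data": 13,
    "user code": 14,
    "user data": 15,
}


def cache_line_of(index, line_size=CACHE_LINE):
    """Cache line number, counted from the start of the GDT, holding an entry."""
    return (index * DESCRIPTOR_SIZE) // line_size


print(gdt_bytes)  # 256
for name, index in main_segments.items():
    print(name, cache_line_of(index))  # every segment lands on line 3
```

Under these assumptions the four descriptors the kernel touches most often share a single cache line, which is the effect the padding entries are meant to achieve.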

For more information on the Linux GDT and LDT, please see a terrific book: Understanding the Linux Kernel, 3rd Edition (ULK).


Paging in Linux

As we just discussed, Linux effectively bypasses the segmentation mechanism by setting every segment's base to 0x00000000 and its limit to 0xFFFFF (counted in 4 KB units, so each segment spans the whole 4 GB linear address space). Paging is therefore the mechanism Linux actually relies on. Linux is a general-purpose operating system that supports many platforms, and paging mechanisms vary from one platform to another, so Linux provides an abstract paging model that can be fitted to different architectures.
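The limit value only covers the whole address space because of the descriptor's granularity (G) flag, which the text above does not spell out. A minimal sketch of the arithmetic, with segment_size as a made-up helper name:

```python
def segment_size(limit, granularity):
    """Bytes covered by a 20-bit Limit field.

    granularity=1 means the limit counts 4 KB units; 0 means single bytes.
    """
    unit = 4096 if granularity else 1
    return (limit + 1) * unit


print(segment_size(0xFFFFF, 1) == 2**32)  # True: 4 GB, the full 32-bit space
print(segment_size(0xFFFFF, 0) == 2**20)  # True: only 1 MB without the G flag
```

So a base of 0 plus a limit of 0xFFFFF with G set means every offset from 0 to 0xFFFFFFFF is valid, and segmentation never gets in the way.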

We've talked in detail about the x86 and x64 paging mechanisms in my previous posts. For regular 32-bit paging, two levels are sufficient; with PAE enabled, three levels are needed. Up to version 2.6.10, the Linux paging model consisted of three paging levels. Starting with version 2.6.11, a four-level paging model has been adopted. The four types of page tables are called Page Global Directory, Page Upper Directory, Page Middle Directory, and Page Table.
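To see where the level counts come from, here is a sketch that splits a linear address into its per-level table indices for the three hardware modes mentioned above, assuming 4 KB pages. The LAYOUTS table and the split function are my own names, not kernel code.

```python
# Index widths in bits (top level first), then the page offset width.
LAYOUTS = {
    "IA-32 regular paging": ([10, 10], 12),
    "IA-32 with PAE": ([2, 9, 9], 12),
    "x86-64": ([9, 9, 9, 9], 12),
}


def split(address, index_bits, offset_bits):
    """Return ([index per level, top level first], page offset)."""
    offset = address & ((1 << offset_bits) - 1)
    address >>= offset_bits
    indices = []
    for bits in reversed(index_bits):   # peel off the lowest level first
        indices.append(address & ((1 << bits) - 1))
        address >>= bits
    return list(reversed(indices)), offset


for name, (index_bits, offset_bits) in LAYOUTS.items():
    indices, offset = split(0xC0100ABC, index_bits, offset_bits)
    print(name, [hex(i) for i in indices], hex(offset))
```

For 0xC0100ABC, regular paging gives indices 0x300 and 0x100, while PAE gives 0x3, 0x0, and 0x100, with the same 0xabc offset in both cases.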

   

The advantage of using a four-level paging model is flexibility. Four-level paging can accommodate most existing architectures, such as IA-32, x64, SPARC, ARM, MIPS, and Alpha. For IA-32 without PAE enabled, two-level paging is sufficient, so the Page Upper Directory and the Page Middle Directory are folded: the number of entries in each of them is set to 1, and these two entries are mapped into the proper entry of the Page Global Directory.
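Folding is easier to see with numbers. The sketch below models the four-level scheme as a tuple of bit widths, where a folded level simply gets 0 index bits and therefore exactly one entry. Layout, entries, and walk_indices are hypothetical names for this illustration, not kernel APIs, and the PAE row reflects my understanding that only the Page Upper Directory is folded in that mode.

```python
from collections import namedtuple

Layout = namedtuple("Layout", "pgd_bits pud_bits pmd_bits pte_bits page_bits")

IA32_NO_PAE = Layout(10, 0, 0, 10, 12)  # PUD and PMD folded
IA32_PAE = Layout(2, 0, 9, 9, 12)       # only PUD folded (my understanding)
X86_64 = Layout(9, 9, 9, 9, 12)         # all four levels in use


def entries(bits):
    """Number of entries in a table indexed by the given number of bits."""
    return 1 << bits


def walk_indices(address, layout):
    """Return the (pgd, pud, pmd, pte, offset) fields of a linear address."""
    fields = []
    for bits in reversed(layout):       # page offset first, PGD last
        fields.append(address & ((1 << bits) - 1))
        address >>= bits
    return tuple(reversed(fields))


print(entries(IA32_NO_PAE.pud_bits), entries(IA32_NO_PAE.pmd_bits))  # 1 1
print([hex(f) for f in walk_indices(0xC0100ABC, IA32_NO_PAE)])
# ['0x300', '0x0', '0x0', '0x100', '0xabc']
```

Generic code can always walk all four levels; on a folded level the index is always 0, so the walk just falls through to the next table.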

For detailed analysis and code on Linux paging, please go through ULK. I promise I'll finish them once I have time.