Virtual Memory Management on Linux on X86


Author: Zhou wake 2002-09-30 06:02:00 from: http://www.china-pub.com

Foreword

Linux supports many hardware platforms, the most common being Intel x86, Alpha, and SPARC. Features that cannot be made platform-independent must be implemented according to the characteristics of each hardware platform. This article briefly explores how Linux implements virtual memory management on the x86 in protected mode. For simplicity, the discussion is restricted as follows: the x86 processor is an 80486 or earlier; the processor runs in protected mode; physical address extension is not used (physical addresses are 32 bits); and extended pages are not used (the page size is 4KB). Anything unrelated to this restricted configuration is omitted, as are the parts of Linux virtual memory management that do not depend on the hardware platform. The Linux kernel source code quoted in this article is version 2.2.5.

The segmentation and paging mechanisms of the x86

I. The x86 segmentation mechanism and the corresponding architecture

The x86 segmentation mechanism divides the linear address space into many smaller spaces called segments, uses these segments to hold code and data, and protects code and data by protecting the segments. According to the role and content of each segment, the x86 distinguishes three types of process segments (code segments, data segments, and stack segments) and two types of system segments: the task-state segment (TSS) and the LDT segment. (Because the GDT is not accessed through a segment descriptor and segment selector, the x86 does not treat the GDT as a segment; for the same reason there is no IDT segment.)

In the segmentation mechanism, the x86 uses the following main data structures:

• Global Descriptor Table (GDT): stores the segment descriptors used by the system and those shared by all tasks; it may hold any of the descriptor types above; maximum table length 64KB;
• Local Descriptor Table (LDT): stores the segment descriptors private to one task; it may hold only descriptors for the three types of process segments and call gate descriptors; maximum table length 4GB according to its segment limit, although a selector can index only 8192 descriptors (64KB);
• Segment descriptor: 64 bits; describes a segment's base address (a linear address), its type, and its limit;
• Gate descriptor: 64 bits; a special descriptor that provides protected access to procedures or handlers at different privilege levels; there are four kinds: call gate descriptor, interrupt gate descriptor, trap gate descriptor, and task gate descriptor;
• Segment selector: 16 bits; used to index a segment descriptor in the GDT or LDT;
• Interrupt Descriptor Table (IDT): stores gate descriptors, and only interrupt gate, trap gate, and task gate descriptors; maximum table length 64KB.

The x86 also provides several registers to support segmentation:

• Global Descriptor Table Register (GDTR): 48 bits; 32 bits hold the base address of the GDT (a linear address) and 16 bits hold the limit of the GDT; the initial value of GDTR is base 0, limit 0xFFFF;
• Local Descriptor Table Register (LDTR): 80 bits; 16 bits hold the selector of the LDT segment and 64 bits hold the segment descriptor of the LDT segment;
• Interrupt Descriptor Table Register (IDTR): 48 bits; 32 bits hold the base address of the IDT (a linear address) and 16 bits hold the limit of the IDT; the initial value of IDTR is base 0, limit 0xFFFF;
• Task Register (TR): 80 bits; 16 bits hold the selector of the task-state segment and 64 bits hold the segment descriptor of the task-state segment;
• Six segment registers: each has a visible part and a hidden part; the visible part is a segment selector and the hidden part is the corresponding segment descriptor.

The six segment registers are CS, SS, DS, ES, FS, and GS; for details see [1], section 3.4.2, "Segment Registers".

When the x86 works in protected mode, a process uses 48-bit logical addresses. The high 16 bits of a logical address are the segment selector and the low 32 bits are the offset within the segment. The segment selector indexes a segment descriptor in the GDT or LDT, which yields the segment's base address; adding the offset to that base gives the linear address corresponding to the logical address. If paging is not used, the linear address is the physical address and can be used to access memory directly; otherwise the linear address must still be translated into a physical address. This is only a brief description of x86 segmentation; for the details of each data structure and register, and of the translation from logical to linear addresses, see [1].

II. The x86 paging mechanism and the corresponding architecture

The 32-bit linear address space can be mapped directly onto the physical address space, or mapped indirectly onto many smaller blocks of physical memory (or disk storage). This indirect mapping is the paging mechanism. The x86 supports page sizes of 4KB, 2MB, and 4MB (2MB and 4MB pages are available only on the Pentium and Pentium Pro processors, which are outside the scope of this article).
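The logical-to-linear translation described in section I can be sketched in C. This is only an illustrative model: the selector bit layout (13-bit index, table indicator, requested privilege level) follows the x86 architecture, but `simple_descriptor`, `decode_selector`, and `logical_to_linear` are hypothetical names, the descriptor is reduced to a base and a byte-granular limit, and the caller is assumed to have already picked the GDT or LDT according to the table indicator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a 16-bit segment selector. */
struct selector_fields {
    unsigned index; /* bits 15..3: descriptor index within the table */
    unsigned ti;    /* bit 2: table indicator, 0 = GDT, 1 = LDT      */
    unsigned rpl;   /* bits 1..0: requested privilege level          */
};

struct selector_fields decode_selector(uint16_t selector)
{
    struct selector_fields f;
    f.index = selector >> 3;
    f.ti = (selector >> 2) & 1u;
    f.rpl = selector & 3u;
    return f;
}

/* Hypothetical, simplified descriptor: a real one is 64 bits and also
 * carries type, privilege, and granularity fields. */
struct simple_descriptor {
    uint32_t base;  /* linear base address of the segment */
    uint32_t limit; /* highest valid offset, in bytes     */
};

/* Translate selector:offset into a linear address using one table.
 * Returns 0 on success, -1 if the index or the offset is out of range. */
int logical_to_linear(const struct simple_descriptor *table, size_t entries,
                      uint16_t selector, uint32_t offset, uint32_t *linear)
{
    struct selector_fields f = decode_selector(selector);

    assert(table != NULL && linear != NULL);
    if (f.index >= entries)
        return -1;
    if (offset > table[f.index].limit)
        return -1;
    *linear = table[f.index].base + offset; /* wraps modulo 2^32 */
    return 0;
}
```

For example, `decode_selector(0x18)` yields index 3, table indicator 0 (GDT), and privilege level 0. With a flat descriptor (base 0, limit 0xFFFFFFFF) the linear address returned equals the offset.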

In the paging mechanism, the x86 uses the following data structures:

• Page directory entry (PDE): a 32-bit structure; the high 20 bits are the base address (a physical address) of a page table, in units of 4KB, and the low 12 bits are the page table attributes; for a concrete decoding, see the initialization section later in this article;
• Page directory: stores page directory entries; it occupies one page and holds 1024 page directory entries;
• Page table entry (PTE): a 32-bit structure; the high 20 bits are the base address (a physical address) of a page and the low 12 bits are the page attributes;
• Page table: stores page table entries; it occupies one page and holds 1024 page table entries;
• Page: 4KB of contiguous address space.

To implement paging and to make address translation efficient, the x86 provides the following hardware structures:

• Paging flag (PG): when this flag is 1, paging is enabled; it is bit 31 of control register CR0;
• Translation lookaside buffers (TLBs): hold the most recently used PDEs and PTEs to speed up address translation;
• Page directory base register (PDBR): holds the base address (a physical address) of the page directory; it is control register CR3.

To translate a linear address into a physical address, the x86 interprets the 32-bit linear address as three parts: bits 31 to 22 are the offset in the page directory, used to index a page directory entry (which gives the base address of the page table); bits 21 to 12 are the offset in the page table, used to index a page table entry (which gives the base address of the page); bits 11 to 0 are the offset within the page. The physical address corresponding to a linear address is thus obtained through a two-level lookup plus the offset within the page. For a detailed description of the paging mechanism and its role, see [1].
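The three-part interpretation of a linear address, and the 20-bit base plus 12-bit attribute layout shared by PDEs and PTEs, can be written out as a few lines of C. The function and type names below are hypothetical and chosen only for illustration; the bit positions are the ones given above.

```c
#include <assert.h>
#include <stdint.h>

/* The three parts of a 32-bit linear address with 4KB pages. */
struct linear_parts {
    unsigned dir_index;   /* bits 31..22: index into the page directory */
    unsigned table_index; /* bits 21..12: index into the page table     */
    unsigned page_offset; /* bits 11..0:  offset within the page        */
};

struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.dir_index = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.page_offset = linear & 0xFFFu;
    return p;
}

/* High 20 bits of a PDE or PTE: a 4KB-aligned physical base address. */
uint32_t entry_base(uint32_t entry)
{
    return entry & 0xFFFFF000u;
}

/* Low 12 bits of a PDE or PTE: the attribute bits. */
unsigned entry_attrs(uint32_t entry)
{
    return entry & 0xFFFu;
}

/* Build an entry from a 4KB-aligned base and a 12-bit attribute field. */
uint32_t make_entry(uint32_t base, unsigned attrs)
{
    assert((base & 0xFFFu) == 0);
    assert(attrs <= 0xFFFu);
    return base | attrs;
}
```

Applied to the value 0x00102007 that appears in the initialization section of this article, `entry_base` gives 0x102000 and `entry_attrs` gives 0x007. Splitting the linear address 0xC0000000 (3GB) gives page directory index 768, page table index 0, and offset 0.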
Linux's segmentation strategy

Linux uses segmentation only minimally on the x86. The aims are to avoid the complexity of the segmentation mechanism, to improve Linux's portability to hardware platforms that do not support segmentation, and still to use x86 segmentation to isolate user code from kernel code. As a result, on Linux the offset in a logical address and the resulting linear address have the same value. Because the maximum length of the x86 GDT is 64KB and each segment descriptor is 8 bytes, the GDT can hold at most 8192 segment descriptors. For every process it creates, Linux adds two descriptors to the GDT: an LDT segment descriptor and a TSS descriptor. After subtracting the first 12 entries that Linux reserves in the GDT, the GDT can therefore accommodate at most (8192 - 12) / 2 = 4090 processes. The Linux kernel has its own code segment and data segment, whose descriptors are stored in entries 2 and 3 of the GDT (counting from 0). Each process likewise has its own code segment and data segment, whose descriptors are stored in its own LDT.
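The GDT capacity arithmetic above is easy to check mechanically. The following C sketch uses hypothetical constant and function names; the numbers themselves (64KB table, 8-byte descriptors, 12 reserved entries, 2 descriptors per process) are the ones stated in the text. The `table_limit` helper computes the "entries * 8 - 1" limit value that a descriptor table register holds, which is the same formula this article later quotes for the IDT and GDT set up in head.S.

```c
#include <assert.h>

enum {
    GDT_MAX_BYTES = 64 * 1024, /* maximum GDT length on the x86    */
    DESCRIPTOR_BYTES = 8,      /* size of one segment descriptor   */
    RESERVED_ENTRIES = 12,     /* entries Linux reserves up front  */
    ENTRIES_PER_PROCESS = 2    /* one LDT and one TSS descriptor   */
};

/* Largest number of descriptors a GDT can hold. */
unsigned gdt_max_entries(void)
{
    return GDT_MAX_BYTES / DESCRIPTOR_BYTES;
}

/* Largest number of processes whose descriptors fit in the GDT. */
unsigned gdt_max_processes(void)
{
    return (gdt_max_entries() - RESERVED_ENTRIES) / ENTRIES_PER_PROCESS;
}

/* Limit value (last valid byte offset) for a table with `entries` slots. */
unsigned table_limit(unsigned entries)
{
    assert(entries > 0);
    return entries * DESCRIPTOR_BYTES - 1;
}
```

With these definitions, `gdt_max_entries()` is 8192 and `gdt_max_processes()` is 4090. For a 256-entry table `table_limit` gives 0x7FF, and for 12 + 2 * 512 entries it gives 0x205F.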

See Schedule 1 and Schedule 2.

In Linux, every user process can address a 4GB linear address space. The 3GB from 0x00000000 to 0xBFFFFFFF is user space, which a user-mode process can access directly. The 1GB from 0xC0000000 to 0xFFFFFFFF is kernel space; it holds the code and data accessed by the kernel, and a user-mode process cannot access it directly. When a user process enters the kernel through an interrupt or a system call, the x86 performs a privilege-level transition (from privilege level 3 to privilege level 0), that is, a switch from user mode to kernel mode.

Linux's paging policy

Standard Linux paging uses a three-level page table structure: in addition to the page directory and page table supported by the x86, there is an intermediate level called the page middle directory. A linear address is therefore interpreted as four parts (rather than the three recognized by the x86), the extra part being the index into the page middle directory. When running on the x86, Linux defines the number of entries in the page middle directory as 1 and supplies a set of macros that stand in for the page middle directory, so that the three-level decomposition folds exactly onto the two-level paging used by the x86. The kernel's main page-handling code, which assumes a four-part linear address, therefore does not need to change. See include/asm/pgtable.h and include/asm/page.h in the Linux source.

The kernel virtual space from 3GB to 3GB+4MB is mapped to physical addresses 0x0 to 0x3FFFFF (4MB) through entry 768 of each process's page directory. Therefore, when a process is in kernel mode, it can access the low 4MB of physical memory by accessing 3GB to 3GB+4MB. The linear space from 3GB to 4GB is the same for all processes: the same page directory entries and the same page tables, mapped to the same physical memory.
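To make the 3GB to 3GB+4MB mapping concrete, here is a small software model of the two-level lookup. It is a sketch under stated assumptions, not kernel code: the arrays and function names are hypothetical, the values 0x00102007 and 0x007 are taken from the head.S description in the initialization section of this article, and the model knows about a single page table only. Bit 0 of an entry is the x86 "present" bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENTRIES 1024u
#define MODEL_PG0_PHYS 0x00102000u /* physical address of page table pg0 */
#define MODEL_ATTRS 0x007u         /* attribute bits used for every entry */

static uint32_t model_pgdir[MODEL_ENTRIES]; /* stands in for the page directory */
static uint32_t model_pg0[MODEL_ENTRIES];   /* stands in for page table pg0     */

/* Fill the model: pg0 maps page n to physical page n, and page
 * directory entries 0 and 768 both point to pg0. */
void model_init(void)
{
    uint32_t i;

    for (i = 0; i < MODEL_ENTRIES; i++) {
        model_pgdir[i] = 0;
        model_pg0[i] = (i << 12) | MODEL_ATTRS;
    }
    model_pgdir[0] = MODEL_PG0_PHYS | MODEL_ATTRS;
    model_pgdir[768] = MODEL_PG0_PHYS | MODEL_ATTRS;
}

/* Two-level lookup. Returns 0 and stores the physical address on
 * success, or -1 if the address is not mapped in this model. */
int model_translate(uint32_t linear, uint32_t *phys)
{
    uint32_t pde, pte;

    assert(phys != NULL);
    pde = model_pgdir[linear >> 22];
    if (!(pde & 1u))
        return -1; /* page table not present */
    if ((pde & 0xFFFFF000u) != MODEL_PG0_PHYS)
        return -1; /* the model contains only pg0 */
    pte = model_pg0[(linear >> 12) & 0x3FFu];
    if (!(pte & 1u))
        return -1; /* page not present */
    *phys = (pte & 0xFFFFF000u) | (linear & 0xFFFu);
    return 0;
}
```

After `model_init()`, translating 0xC0001234 and translating 0x00001234 both yield physical address 0x1234, while 0xC0400000 (just past 3GB+4MB) is reported as unmapped because page directory entry 769 is empty.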
Because the 3GB to 4GB region is identical in every process, Linux lets processes running in kernel mode share kernel code and data.

Initialization of segmentation and paging in Linux

However a Linux system is booted, whether from a zImage (see arch/i386/boot/bootsect.S) or through LILO, control eventually jumps to arch/i386/boot/setup.S (loaded at SETUPSEG, physical address 0x90200). setup.S obtains the hardware parameters of the computer system from the BIOS (such as hard disk parameters), stores them in a parameter area in memory (a temporary location), and performs some initial state checks in preparation for entering protected mode. For the details of the boot process and of setup.S, see [2]. The kernel initialization code that runs in protected mode starts executing at physical address 0x100000. The code and data structures starting at that address correspond to arch/i386/kernel/head.S; see Schedule 3. Its main function is to initialize the relevant registers, the IDT, the GDT, the page directory, the page tables, and so on. The summary below ignores the details of how head.S executes and lists its main initialization functions.

1. Initialization of some registers: the segment registers DS, ES, and FS are initialized to __KERNEL_DS (0x18, include/asm-i386/segment.h); from the earlier description of segment registers and segment selectors, this selects entry 3 of the GDT (the kernel data segment) with a requested privilege level of 0. The PG bit of CR0 is set, and AM, WP, NE, and MP are set according to the CPU model. CR3 is initialized to the page directory swapper_pg_dir. SS:ESP is loaded with the selector __KERNEL_DS (0x18) and the offset init_user_stack + 8192. LDTR is initialized to 0.
2. Initialization of the IDT: this is only a temporary initialization; further setup is done in start_kernel. The variable that represents the IDT (idt_table[]) is defined in arch/i386/kernel/traps.c, and its type (desc_struct) is defined in include/asm-i386/desc.h. The IDT has IDT_ENTRIES (256) interrupt descriptors, each with attribute word 0x8E00 and each pointing to the same interrupt service routine, ignore_int. ignore_int merely prints the message int_msg ("Unknown interrupt"). IDTR is loaded with the instruction lidt idt_descr. From idt_descr in head.S, the base address of the IDT is the address of idt_table and its limit is IDT_ENTRIES * 8 - 1 (0x7FF).
3. Initialization of the GDT: the GDT has GDT_ENTRIES segment descriptors, where GDT_ENTRIES = 12 + 2 * NR_TASKS. The 12 stands for the 12 entries that Linux reserves in the GDT, as mentioned earlier, and NR_TASKS (512) is the number of processes the system can accommodate, defined in include/linux/tasks.h. The storage for the GDT (labeled gdt_table) is allocated in head.S. The GDT after initialization is shown in Schedule 1. GDTR is loaded with the instruction lgdt gdt_descr.
From gdt_descr in head.S, the base address of the GDT is the address of gdt_table and its limit is GDT_ENTRIES * 8 - 1 (0x205F).
4. Initialization of the page directory: the page directory is represented by the variable swapper_pg_dir and has 1024 page directory entries.

Entries 0 and 768 both point to pg0 (page table 0); their initial value is 0x00102007. The high 20 bits of this value are 0x102, and 0x102 * 4KB = 0x102000, so pg0 is located at physical address 0x102000. It follows that virtual addresses 0x0 and 0xC0000000 (3GB) are both mapped through pg0. All other page directory entries are initialized to 0x0.
5. Initialization of pg0: entry n corresponds to page n, with attributes 0x007; that is, the high 20 bits of the initial value of entry n are n and the low 12 bits are 0x007. pg0 therefore maps the low 4MB of physical memory.
6. Initialization of empty_zero_page: the first 2KB of this page holds the system hardware parameters that setup.S obtained from the BIOS and saved in the memory parameter area; the following 2KB is used as the command-line buffer.

head.S finally calls start_kernel (init/main.c) to continue initialization, mainly by calling the functions that initialize the kernel's various data structures. The calls related to the x86 architecture (and to this article) are listed below.

1. setup_arch() (arch/i386/kernel/setup.c): sets the range of physical addresses available to the kernel (memory_start to memory_end); sets the ranges in init_task.mm; calls request_region (kernel/resource.c) to reserve I/O space, see Schedule 4.
2. paging_init() (arch/i386/mm/init.c): removes the mapping of virtual address 0x0 onto the low 4MB of physical memory; initializes all page tables according to the actual size of physical memory.
3. trap_init() (arch/i386/kernel/traps.c): sets the various entry addresses in the IDT, such as the exception handler entries, the system call entry, and call gates.
Of these, traps 0 to 17 are the entries for the various exceptions (overflow, divide by zero, page fault, and so on), whose handlers are defined in arch/i386/kernel/entry.S; traps 18 to 47 are reserved; the system call (int 0x80) entry is set to system_call (arch/i386/kernel/entry.S); and the TSS segment descriptor and LDT segment descriptor of process 0 are set up in the GDT.
4. init_IRQ() (arch/i386/kernel/irq.c): initializes entries 0x20 to 0xFF of the IDT.
5. time_init() (arch/i386/kernel/time.c): reads the real-time clock and resets the entry of the interrupt service routine for the clock interrupt IRQ0.
6. mem_init() (arch/i386/mm/init.c): initializes empty_zero_page; marks the pages that are already in use.

Linux processes and segmentation and paging

Whenever a new process is started, Linux creates a process control block for it (task_struct, include/linux/sched.h).
