"Undocumented windows 2000 secrets" translation - Chapter 4 (1)

xiaoxiao2021-03-06 53

Chapter 4: Exploring the Memory Management Mechanism of Windows 2000

Translation: kendiv (fcczj@263.net)

Update:

Sunday, February 14, 2005

Disclaimer: When reposting, please credit the source and keep the article intact. All rights to the translation are reserved.

Memory management is one of the most important tasks of an operating system. This chapter gives a comprehensive overview of Windows 2000 memory management and of the structure of the 4 GB linear address space. As part of this, it explains the virtual memory addressing and paging capabilities of the Intel i386 CPU family, focusing on how the Windows 2000 kernel uses them. To help explore memory, the chapter provides a pair of programs: a kernel-mode driver that collects system information, and a user-mode application that queries the driver through device I/O control and displays the data in a console window. The spy driver module will be reused in the remaining chapters for other interesting tasks that must be executed in kernel mode. The first part of this chapter is demanding reading because it deals directly with the CPU hardware. Still, I hope you will not skip it: virtual memory management is an exciting topic, and understanding how it works gives insight into the mechanisms used by complex operating systems such as Windows 2000.

Intel i386 memory management mechanism

The Windows 2000 kernel relies on the protected-mode virtual memory management mechanisms provided by the Intel i386 CPU family. To understand how Windows 2000 manages its main memory, at least a minimal familiarity with the i386 architecture is important. Windows 2000 is designed for Pentium CPUs and above. However, the memory management model used by these newer processors still stems from the design of the 80386, with some important enhancements. For this reason, Microsoft usually labels the Intel versions of Windows NT and Windows 2000 "i386" or "x86". Don't let this confuse you: wherever 86 or 386 appears in this book, it refers to a CPU architecture, not to a specific processor version.

Basic memory layout

Windows 2000 presents a very simple memory layout to applications and system code. The 4 GB virtual address space provided by 32-bit Intel CPUs is divided into two equal parts. Addresses below 0x80000000 are used by user-mode modules, including the Win32 subsystem, and the remaining 2 GB are reserved for the kernel. Windows 2000 Advanced Server also supports an alternative memory model, commonly referred to as 4GT RAM Tuning, which was introduced with the Enterprise Edition of Windows NT 4.0 Server. This model provides 3 GB of user address space and reserves the other 1 GB for the kernel. It is enabled by adding the /3GB option in boot.ini.
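The split described above amounts to a single comparison against the first kernel address. The following is a minimal sketch only: the function and constant names are illustrative and do not come from any Windows header, and the 0xC0000000 boundary for the /3GB case is derived from the "3 GB user, 1 GB kernel" description rather than quoted from the book.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants, not taken from any Windows header. */
#define KERNEL_BASE_DEFAULT 0x80000000u  /* 2 GB user / 2 GB kernel        */
#define KERNEL_BASE_3GB     0xC0000000u  /* 3 GB user / 1 GB kernel (/3GB) */

/* Returns true if the linear address lies in the user-mode part of the
   4 GB space, given the first address that belongs to the kernel. */
bool is_user_address(uint32_t linear, uint32_t kernel_base)
{
    assert(kernel_base == KERNEL_BASE_DEFAULT ||
           kernel_base == KERNEL_BASE_3GB);
    return linear < kernel_base;
}
```

The sketch deliberately ignores any guard regions or reserved ranges near the boundary; it only shows where the dividing line sits in each configuration.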

Windows 2000 Advanced Server and Datacenter Server also support the Physical Address Extension (PAE) memory option, which is enabled by adding the /PAE option in boot.ini. This option uses a feature of some Intel CPUs (for example, the Pentium Pro) that allows more than 4 GB of physical memory to be mapped into the 32-bit address space. In this chapter, I will ignore this special configuration. For more information, see Microsoft Knowledge Base article Q171793 (Microsoft 2000c), Intel's Pentium manuals (Intel 1999a, 1999c), and the Windows 2000 DDK documentation (Microsoft 2000f).

Memory segmentation and demand paging

Before going into the technical details of the i386 architecture, let's go back to 1978, when Intel released the mother of all PC processors: the 8086. I will limit the discussion to the major milestones. If you want to know more, Robert L. Hummel's 80486 programmer's reference (Hummel 1992) is a great place to start. It may seem a bit outdated because it does not cover the new features of the Pentium family, but it still contains a great deal of basic information on the i386 architecture. Although the 8086 could address 1 MB of RAM, an application could never "see" the entire physical address space at once, because the CPU's address registers were only 16 bits wide. This means that the contiguous linear address space accessible to an application was only 64 KB. With the help of 16-bit segment registers, however, this 64 KB memory window could be moved up and down within the physical address space: the linear address within the 64 KB logical space is used as an offset and added to the base address supplied by a 16-bit segment register, yielding a valid 20-bit address. This ancient memory model is still supported by the latest Pentium CPUs. It is called real-address mode, or real mode for short.
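The real-mode address calculation can be written out in a few lines. This is a small sketch with a hypothetical function name; it relies on the well-known 8086 rule that the segment value is multiplied by 16 before the offset is added, a detail the text above does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address formation: the 16-bit segment value is shifted left
   by 4 bits (multiplied by 16) and the 16-bit offset is added. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    uint32_t address = ((uint32_t)segment << 4) + offset;

    /* The 8086 has only 20 address lines, so the sum wraps at 1 MB. */
    address &= 0xFFFFFu;
    assert(address <= 0xFFFFFu);
    return address;
}
```

For example, segment 0x1234 with offset 0x5678 yields 0x12340 + 0x5678 = 0x179B8. Moving the 64 KB window simply means loading a different segment value.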

The 80286 CPU introduced another mode called protected virtual address mode, or simply protected mode. In this mode, physical addresses are no longer formed by simply adding a linear address to a segment base. To remain backward compatible with the 8086 and 80186, the 80286 still uses segment registers, but after switching to protected mode they no longer contain physical segment addresses. Instead, they hold a selector, which consists of an index into a descriptor table. Each entry in the descriptor table defines a 24-bit physical base address, allowing access to 16 MB of RAM, an incredible amount at the time. However, the 80286 is still a 16-bit CPU, so the linear address space is still limited to 64 KB.

In 1985, the 80386 CPU broke through this limit. The chip finally cut the ties of 16-bit addressing, pushing the linear address space to 4 GB by introducing 32-bit linear addresses while retaining the basic selector/descriptor architecture. Fortunately, the 80286 descriptor structure had some spare bits left. With the move from 16-bit to 32-bit addresses, the size of the CPU's data registers doubled as well, and powerful new addressing modes were added. True 32-bit data and addressing are a real convenience for programmers. In fact, it took several years before Microsoft's Windows platform fully supported the 32-bit model. The first version of Windows NT, released on July 26, 1993, was the first true implementation of the Win32 API. Windows 3.x programmers, by contrast, still had to deal with 64 KB memory chunks made up of separate code and data segments. Windows NT provides a flat 4 GB address space in which all code and data can be addressed with simple 32-bit pointers, with no need for segmentation. Internally, of course, segmentation is still at work, as mentioned earlier, but all responsibility for managing segments has moved to the operating system.

Another new feature of the 80386 is hardware support for paging, or more precisely, demand-paged virtual memory. This technique allows memory to be backed by a storage medium other than RAM, for example a hard disk. With paging enabled, the CPU can access more memory than is physically available by swapping the least recently accessed memory contents out to backing storage, making room for new data. In theory, a contiguous 4 GB linear address space can be accessed this way even if only very little physical memory is installed, provided the backing medium is large enough. Of course, paging is not the fastest way to access memory, and it is always best to install as much physical memory as possible. Nevertheless, it is an excellent way to work with large amounts of data, even when the data exceeds the available physical memory. For example, graphics and database programs require large blocks of working memory, and without paging some of them could not run on low-end PC systems.

In the 80386 paging scheme, memory is divided into pages of 4 KB or 4 MB. The operating system designer is free to choose between the two or to mix pages of both sizes. Later, I will describe the mixed-size scheme adopted by Windows 2000: 4 MB pages are used for the operating system, and 4 KB pages for the remaining code and data. The pages are managed by a hierarchical page-table tree, which records which pages are currently mapped and whether each page is actually present in physical memory. If a page has been swapped out to disk and some module touches an address within it, the CPU generates a page fault exception (this is similar to an interrupt generated by peripheral hardware). The page fault handler in the operating system kernel then tries to bring the page back into physical memory, which may require writing other memory contents to disk to make room. Typically, the system uses a least-recently-used (LRU) algorithm to decide which page can be swapped out. Now you can see why this process is called demand paging: physical memory contents are moved to and from backing storage on demand, according to the memory needs of the operating system and its applications.
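The interplay of page faults and least-recently-used replacement can be illustrated with a toy simulation. This is only a sketch of the general LRU idea, not the algorithm Windows 2000 actually implements: the frame count, the function name, and the bookkeeping are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_COUNT 3   /* hypothetical number of physical page frames */

/* Simulates demand paging with least-recently-used replacement and
   returns the number of page faults for a sequence of page references. */
int count_page_faults(const int *pages, size_t count)
{
    int frame_page[FRAME_COUNT];     /* page held by each frame, -1 = free  */
    size_t last_use[FRAME_COUNT];    /* time of last access, 0 = never used */
    int faults = 0;

    for (int f = 0; f < FRAME_COUNT; f++) {
        frame_page[f] = -1;
        last_use[f] = 0;
    }

    for (size_t t = 0; t < count; t++) {
        int hit = -1;
        int victim = 0;

        assert(pages[t] >= 0);       /* -1 is reserved for "free frame" */

        for (int f = 0; f < FRAME_COUNT; f++) {
            if (frame_page[f] == pages[t]) {
                hit = f;             /* page is already resident */
                break;
            }
            /* Free frames have last_use 0, so they are chosen first;
               otherwise the least recently used frame is the victim. */
            if (last_use[f] < last_use[victim])
                victim = f;
        }

        if (hit < 0) {               /* page fault: load page into victim */
            faults++;
            frame_page[victim] = pages[t];
            hit = victim;
        }
        last_use[hit] = t + 1;       /* record the access time */
    }
    return faults;
}
```

With three frames, the reference sequence 1, 2, 3, 1, 4, 2 causes five faults: the first three accesses fill the free frames, the fourth is a hit, page 4 evicts page 2 (the least recently used), and the final access to page 2 faults again and evicts page 3.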

The indirect addressing provided by the page tables has two interesting implications. First, there is no predetermined relationship between the addresses used by a program and the addresses on the physical address bus. If you know that a data structure in your program is at a certain address, say 0x00140000, you still know nothing about the physical address of this data unless you examine the page-table tree. It is up to the operating system to determine the mapping between these addresses. Even the currently valid address translation is unpredictable, partly because of the randomness inherent in the paging mechanism. Fortunately, most applications never need to know physical addresses. Developers of hardware drivers, however, do need some knowledge in this area. The second implication of paging is that the address space does not have to be contiguous. Depending on the contents of the page tables, the 4 GB space can contain large "holes" that are mapped neither to physical memory nor to backing storage. If an application tries to read or write such an address, it is immediately aborted. Later, I will explain in detail how Windows 2000 spreads the available memory over the 4 GB address space.

The segmentation and paging mechanisms of the 80486 and Pentium CPUs are similar to those of the 80386, except for some special addressing features such as the Physical Address Extension (PAE) mechanism of the Pentium Pro. Along with higher clock frequencies, another feature of the Pentium CPU is its dual instruction pipeline, which allows it to execute two instructions at the same time, as long as they do not depend on each other. For example, if instruction A modifies the value of a register and the adjacent instruction B needs the modified value for its computation, B cannot be executed before A has completed. However, if instruction B uses a different register, the CPU can execute both instructions simultaneously. The Pentium family's many optimization features leave a lot of room for compiler optimization. If you are interested in this topic, see Rick Booth's "Inner Loops" (Booth 1997).

In i386 memory management, three types of addresses are involved. Their names (logical, linear, and physical addresses) are the terms used in Intel's system programming manual (Intel 1999c).

1. Logical addresses: This is the most precise specification of a memory address, usually written in hexadecimal as XXXX:YYYYYYYY, where XXXX is a selector and YYYYYYYY is a linear offset into the segment addressed by the selector. Instead of a numeric value for XXXX, the name of a segment register holding the selector may be used, such as CS (code segment), DS (data segment), ES (extra segment), FS (additional data segment #1), GS (additional data segment #2), or SS (stack segment). This notation derives from the old "segment:offset" style used in 8086 real mode to specify far pointers.

2. Linear addresses: Most applications and kernel drivers disregard virtual addresses. They are only interested in the offset part of a virtual address, which is usually referred to as a linear address. This type of address assumes a default segmentation model, determined by the current values of the CPU's segment registers. Windows 2000 uses flat segmentation, in which the CS, DS, ES, and SS registers all refer to the same linear address space, so programs can use code, data, and stack pointers interchangeably. For example, an address on the stack can be converted into a data pointer at any time without regard to the values of the corresponding segment registers.

3. Physical addresses: This type of address becomes interesting only when the CPU is working in paging mode. Essentially, a physical address is the voltage pattern that can be measured on the address pins of the CPU. The operating system maps linear addresses to physical addresses by setting up page tables. Some properties of the Windows 2000 page table layout are very useful to developers of debugging software and will be discussed later.

There is much confusion about the terms virtual address and linear address, and some documents use them interchangeably. I will try my best to use the terminology consistently. It is important to note that Windows 2000 assumes physical addresses to be 64 bits wide. An Intel i386 system typically has only a 32-bit address bus. However, some Pentium systems support more than 4 GB of physical memory. For example, a Pentium Pro CPU in PAE mode can extend physical addresses to 36 bits, so that up to 64 GB of physical memory can be accessed (Intel 1999c). Therefore, Windows 2000 API functions usually represent physical addresses with the data type PHYSICAL_ADDRESS, which is simply an alias for the LARGE_INTEGER structure, as shown in Listing 4-1. Both types are defined in the DDK header file ntdef.h. LARGE_INTEGER is a structured representation of a 64-bit signed integer, which can be interpreted either as a pair of 32-bit values (LowPart and HighPart) or as a single 64-bit value (QuadPart). The LONGLONG type is equivalent to the native Visual C/C++ type __int64. Its unsigned counterpart is called ULONGLONG or DWORDLONG, and both are based on the native type unsigned __int64.

Figure 4-1 shows the i386 memory segmentation model and illustrates the relationship between logical and linear addresses. For clarity, the descriptor table and the segment are drawn separately. In practice, 32-bit operating systems typically use the segmentation scheme shown in Figure 4-2, the so-called flat memory model, which uses a single 4 GB segment. A side effect of this scheme is that the descriptor table becomes part of the segment, so it can be accessed by any code with sufficient privileges.

typedef LARGE_INTEGER PHYSICAL_ADDRESS, *PPHYSICAL_ADDRESS;

typedef union _LARGE_INTEGER
{
    struct
    {
        ULONG LowPart;   // low 32 bits (unsigned)
        LONG  HighPart;  // high 32 bits (signed)
    };
    LONGLONG QuadPart;   // the complete 64-bit signed value
}
LARGE_INTEGER, *PLARGE_INTEGER;

Listing 4-1. Definition of PHYSICAL_ADDRESS and LARGE_INTEGER
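To see how the two views of the union relate, the following sketch rebuilds the structure with portable fixed-width types so that it compiles without the DDK. The type and function names are hypothetical stand-ins, the inner struct is given a name for portability, and the layout assumes a little-endian CPU such as the i386.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for the DDK's LARGE_INTEGER (hypothetical names; the
   real definition lives in ntdef.h). Assumes a little-endian CPU, where
   the low 32 bits come first in memory. */
typedef union {
    struct {
        uint32_t low_part;
        int32_t  high_part;
    } parts;
    int64_t quad_part;
} large_integer_sketch;

/* Builds a 64-bit physical address from its two 32-bit halves. */
large_integer_sketch make_physical_address(uint32_t low, int32_t high)
{
    large_integer_sketch address;

    assert(high >= 0);   /* physical addresses are never negative */
    address.quad_part = ((int64_t)high << 32) | low;
    return address;
}
```

With a 36-bit PAE address, the highest byte of a 64 GB physical space has a low part of 0xFFFFFFFF and a high part of 0xF, which is why a plain 32-bit integer is not wide enough to hold a physical address.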

Figure 4-1. Memory segmentation of the i386

Figure 4-2 shows the memory model Windows 2000 uses for its standard code, data, and stack segments, that is, for all logical addresses involving the CS, DS, ES, and SS segment registers. FS and GS are treated differently. Windows 2000 does not use the GS register, and the FS register is dedicated to addressing system data areas in the linear address space. Therefore, the base address of the FS segment is much greater than zero and its size is less than 4 GB. Interestingly, Windows 2000 maintains two different FS segments for user mode and kernel mode. We will discuss this later.

Figure 4-2. Flat 4 GB memory segment

In Figures 4-1 and 4-2, the selector of the logical address points into a descriptor table whose location is specified by a register named GDTR. This is the CPU's Global Descriptor Table Register, which the operating system can set to any suitable linear address. The first entry of the GDT (Global Descriptor Table) is reserved, and the corresponding selector is called the "null segment selector". Windows 2000 keeps its GDT at 0x80036000. A GDT can hold up to 8,192 64-bit entries, so its maximum size is 64 KB. Windows 2000 uses only the first 128 entries and limits the size of the GDT to 1,024 bytes. Along with the GDT, the i386 CPU provides a Local Descriptor Table (LDT) and an Interrupt Descriptor Table (IDT), whose locations are held in the LDTR and IDTR registers, respectively. The values of GDTR and IDTR are unique, that is, every task executed by the CPU uses the same values, whereas the value of LDTR is task-specific. The LDTR holds a 16-bit selector.
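The table sizes quoted above follow directly from the 64-bit descriptor width. A tiny sketch, using a hypothetical helper name, makes the arithmetic explicit:

```c
#include <assert.h>
#include <stdint.h>

#define DESCRIPTOR_SIZE 8u   /* one segment descriptor is 64 bits wide */

/* Size in bytes of a descriptor table holding the given number of entries. */
uint32_t descriptor_table_bytes(uint32_t entries)
{
    assert(entries <= 8192u);   /* the architectural maximum stated above */
    return entries * DESCRIPTOR_SIZE;
}
```

The maximum of 8,192 entries gives 65,536 bytes (64 KB), and the 128 entries used by Windows 2000 give 1,024 bytes.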

Figure 4-3 shows the more complex mechanism that translates linear addresses to physical addresses. The i386 memory management unit uses it when demand paging is enabled in 4 KB paging mode. The Page-Directory Base Register (PDBR) in the figure contains the physical address of the page directory. The PDBR is held in the i386 CR3 register. Only the upper 20 bits of the register are used for addressing, so the page directory is always page-aligned. The remaining bits of the PDBR are used as flags or reserved for future extensions. The page directory occupies one complete 4 KB page and consists of an array of 1,024 page-directory entries (PDEs), each 32 bits wide. Like the PDBR, each PDE is divided into a 20-bit page frame number (PFN) and a set of flags. The PFN addresses a page table. Each page table is a page-aligned array of 1,024 page-table entries (PTEs). The upper 20 bits of each PTE serve as a pointer to a 4 KB data page. Address translation works by splitting the linear address: the upper 10 bits select a PDE in the page directory, the next 10 bits select a PTE in the page table addressed by that PDE, and the remaining 12 bits specify the offset within the data page addressed by that PTE.

Figure 4-3. Two-level indirection (4 KB pages)
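The 10/10/12-bit split of a linear address in 4 KB paging mode can be sketched as follows. The structure and function names are made up for illustration. The code only performs the bit arithmetic, not the actual table walk, which needs access to physical memory.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t pde_index;   /* bits 31..22: entry in the page directory */
    uint32_t pte_index;   /* bits 21..12: entry in the page table     */
    uint32_t offset;      /* bits 11..0 : byte within the 4 KB page   */
} linear_parts_4k;

/* Splits a 32-bit linear address into its three translation fields. */
linear_parts_4k split_linear_4k(uint32_t linear)
{
    linear_parts_4k parts;

    parts.pde_index = linear >> 22;
    parts.pte_index = (linear >> 12) & 0x3FFu;
    parts.offset    = linear & 0xFFFu;
    assert(parts.pde_index < 1024u && parts.pte_index < 1024u);
    return parts;
}

/* Combines the 20-bit page frame number from a PTE with the page offset. */
uint32_t physical_from_pfn(uint32_t pfn, uint32_t offset)
{
    return (pfn << 12) | (offset & 0xFFFu);
}
```

For the address 0x00140000 mentioned earlier, the split yields PDE index 0, PTE index 0x140, and offset 0. The physical address then depends entirely on the PFN stored in that PTE.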

In 4 MB paging mode, things become much simpler because one level of indirection is eliminated, as shown in Figure 4-4. The PDBR still points to the page directory, but only the upper 10 bits of each PDE are used, because the target address is 4 MB aligned. Since there are no page tables, this address is already the base address of a 4 MB data page. The linear address therefore consists of only two parts: 10 bits that select the PDE and 22 bits that serve as the offset. The only memory overhead of the 4 MB scheme is the 4 KB occupied by the page directory itself. Each of the 1,024 PDEs can address one 4 MB page, which is enough to cover the entire 4 GB address space. The advantage of 4 MB paging is thus lower memory management overhead, at the price of coarser addressing granularity. Both the 4 KB and the 4 MB paging model have their advantages. Fortunately, operating system designers do not have to choose one or the other, since the two models can be mixed. For example, Windows 2000 uses 4 MB pages for the address range into which the kernel modules hal.dll and ntoskrnl.exe are loaded. The remaining linear addresses are managed in 4 KB pages. Intel recommends this mixed design to improve system performance, because 4 KB and 4 MB page entries are cached in separate translation lookaside buffers (TLBs) inside the i386 CPU (Intel 1999c, pp. 3-22f). The operating system kernel is usually large and must stay resident in memory, so storing it in many 4 KB pages would permanently use up valuable TLB space.

Figure 4-4. One-level indirection (4 MB pages)
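The 4 MB case reduces to a 10/22-bit split. The following sketch uses hypothetical helper names and again shows only the address arithmetic:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_4M 0x400000u   /* 4 MB = 2^22 bytes */

/* Upper 10 bits: index of the PDE that maps the 4 MB page. */
uint32_t pde_index_4m(uint32_t linear)
{
    return linear >> 22;
}

/* Lower 22 bits: byte offset within the 4 MB page. */
uint32_t page_offset_4m(uint32_t linear)
{
    return linear & (PAGE_SIZE_4M - 1u);
}

/* Combines a 4 MB-aligned physical page base with the offset. */
uint32_t physical_from_4m_page(uint32_t page_base, uint32_t linear)
{
    assert((page_base & (PAGE_SIZE_4M - 1u)) == 0);   /* must be aligned */
    return page_base | page_offset_4m(linear);
}
```

Because 1,024 entries times 4 MB equals exactly 4 GB, a single page directory is enough to describe the whole linear address space in this mode.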

Note that all steps of the address translation operate on physical memory. The PDBR and all PDEs and PTEs contain physical address pointers. The only linear address in Figures 4-3 and 4-4 is the one in the lower left corner, which is to be converted to an offset within a physical page. Applications, on the other hand, must work with linear addresses and know nothing about physical addresses. This gap can be bridged by mapping the page directory and all of its subordinate page tables into the linear address space. In Windows 2000 and Windows NT 4.0, all PDEs and PTEs are accessible in the linear address range 0xC0000000 to 0xC03FFFFF, a linear memory area 4 MB in size. The PTE associated with a linear address can be located simply by using the upper 20 bits of the address as an index into the array of 32-bit PTEs that starts at 0xC0000000. For example, the PTE for address 0x00000000 is located at 0xC0000000. For the linear address 0x80000000, shifting the address right by 12 bits yields 0x80000 (the upper 20 bits of the address). Because each PTE occupies 4 bytes, the address of the target PTE is 0xC0000000 + (4 * 0x80000) = 0xC0200000. This result is interesting: the linear address that divides the 4 GB address space into two equal halves maps to the PTE address that divides the PTE array into two equal halves.
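The PTE lookup described above is a one-line formula. The sketch below uses an illustrative function name and constant. It computes only the linear address at which the PTE can be found and does not read it, since that is possible only in kernel mode.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_ARRAY_BASE 0xC0000000u   /* start of the PTE array, per the text */

/* Linear address of the 4-byte PTE that describes the given address. */
uint32_t pte_address(uint32_t linear)
{
    uint32_t address = PTE_ARRAY_BASE + ((linear >> 12) * 4u);

    assert(address <= 0xC03FFFFCu);   /* last PTE in the 4 MB array */
    return address;
}
```

The two examples from the text come out as expected: address 0x00000000 maps to the PTE at 0xC0000000, and address 0x80000000 maps to the PTE at 0xC0200000.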

Now let us go one step further and compute the address of the entry in the PTE array that describes the PTE array itself. The general mapping formula is ((LinearAddress >> 12) * 4) + 0xC0000000. Setting LinearAddress to 0xC0000000 yields 0xC0300000. The entry at linear address 0xC0300000 therefore points to the start of the PTE array in physical memory. Now look at Figure 4-3 again: the 1,024 entries starting at address 0xC0300000 must be the page directory! This special PDE and PTE arrangement is used by several memory management functions exported by ntoskrnl.exe. For example, the documented API functions MmIsAddressValid() and MmGetPhysicalAddress() look up the PDE and, where applicable, the PTE of a linear address and examine their contents. MmIsAddressValid() simply checks whether the target page is present in physical memory. If the test fails, the linear address is either invalid or refers to a page that has been swapped out to backing storage (represented by the set of system page files). MmGetPhysicalAddress() first extracts the page frame number (PFN) for the linear address, which is the base address of the associated physical page divided by the page size. Next, it computes the offset within the physical page from the remaining 12 bits of the linear address, and finally adds the offset to the physical page base address indicated by the PFN to obtain the physical address that corresponds to the linear address.

A closer look at the implementation of MmGetPhysicalAddress() reveals another interesting feature of the Windows 2000 memory layout. The function first tests whether the linear address lies in the range 0x80000000 to 0x9FFFFFFF. As mentioned earlier, this is where hal.dll and ntoskrnl.exe reside, and it is also the address block for which Windows 2000 uses 4 MB pages. The interesting point is that if the given linear address is in this range, MmGetPhysicalAddress() does not look at any PDE or PTE at all. Instead, it simply sets the upper 3 bits of the linear address to zero, adds the byte offset, and returns the result as the physical address. This means that the physical address range 0x00000000 to 0x1FFFFFFF is mapped to the linear addresses 0x80000000 to 0x9FFFFFFF. Since ntoskrnl.exe is always loaded at linear address 0x80400000, the Windows 2000 kernel is always found at physical address 0x00400000, which is the base address of the second 4 MB page in physical memory. In fact, examining these memory areas confirms that this assumption is correct. The memory spy presented in this chapter will give you the opportunity to see this for yourself.
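Both observations can be expressed as short helper functions. This is a sketch under stated assumptions, not the actual implementation of MmGetPhysicalAddress(): the function names are invented, the PDE formula is derived from the statement that the page directory starts at 0xC0300000, and the shortcut covers only the directly mapped range described in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PDE_ARRAY_BASE 0xC0300000u   /* page directory, as derived above */

/* Linear address of the 4-byte PDE that covers the given address. */
uint32_t pde_address(uint32_t linear)
{
    return PDE_ARRAY_BASE + ((linear >> 22) * 4u);
}

/* True for the range that Windows 2000 maps with 4 MB pages. */
bool in_direct_mapped_range(uint32_t linear)
{
    return linear >= 0x80000000u && linear <= 0x9FFFFFFFu;
}

/* Shortcut translation: clear the upper 3 bits of the linear address. */
uint32_t direct_mapped_physical(uint32_t linear)
{
    assert(in_direct_mapped_range(linear));
    return linear & 0x1FFFFFFFu;
}
```

Applied to the load address of ntoskrnl.exe, 0x80400000, the shortcut returns 0x00400000, matching the physical address given in the text.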

Supplement:

This part is excerpted from "32-bit Assembly Language Programming in the Windows Environment".

X86 memory paging mechanism

When an x86 CPU runs in protected mode or virtual 8086 mode, all 32 address lines can be used to access 4 GB of memory. Because all general-purpose registers of the 80386 are 32 bits wide, any general-purpose register can address the entire 4 GB indirectly, so there is no need to divide the address space into segments.

But this does not mean that segment registers are no longer useful. In fact, they become even more useful. Although addressing is no longer subject to segment limits, protected mode raises protection questions: whether an address range may be written, what privilege level code needs in order to write it, whether execution is permitted, and so on. To resolve these questions, security attributes must be defined for an address range, and this is where segment registers come into play. However, these attributes and the other parameters of a segment in protected mode carry so much information that 64 bits of data are needed to represent them. We call this 64-bit attribute data a segment descriptor. The 80386 segment registers are 16 bits wide and cannot hold a 64-bit segment descriptor. How is this solved? The segment descriptors are stored one after another at a designated location in memory, forming a descriptor table, and the 16 bits in the segment register specify which descriptor in the table describes the segment. The value in the segment register is then no longer a segment address but a segment selector, which is used to "select" an entry in the descriptor table and thereby obtain all the information about the segment.

So where is the descriptor table stored? The 80386 introduces two new registers to manage descriptor tables: the 48-bit Global Descriptor Table Register (GDTR) and the 16-bit Local Descriptor Table Register (LDTR). Why are there two descriptor table registers?

GDTR points to the Global Descriptor Table (GDT). It contains the segment descriptors that are available to all tasks in the system, typically the descriptors of the code, data, and stack segments used by the operating system and the descriptors of the LDT segments of each task. There is only one global descriptor table.

LDTR points to the Local Descriptor Table (LDT). The 80386 is designed so that each task can have its own LDT. It contains the descriptors of the task's private code, data, and stack segments, as well as some gate descriptors used by the task, such as task gates and call gates.

The local descriptor tables of different tasks are located in different memory segments, and the descriptors of these memory segments are stored as system descriptors in the global descriptor table. The GDTR holds a memory address directly, whereas the LDTR, like the CS and DS segment selectors, holds only an index value that points to the entry in the global descriptor table describing the memory segment of the local descriptor table. On a task switch, simply changing the value of LDTR switches the current LDT as well, which makes it easy to isolate data between tasks. The GDT, however, does not change when tasks are switched.

How does the 16-bit segment selector choose between the global and the local descriptor table? In fact, only the upper 13 bits of the segment selector are the index value. Of the remaining 3 bits, bits 0 and 1 are the requested privilege level (RPL), and bit 2, the TI bit, indicates where the segment descriptor is located: TI = 0 means the GDT, and TI = 1 means the LDT.
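The selector layout can be decoded with a few shifts and masks. This is a small sketch with invented type and function names. The sample selector values used to exercise it are arbitrary illustrations of the bit layout, not values taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t index;      /* bits 15..3: entry number in the table      */
    bool     uses_ldt;   /* bit 2 (TI): false = GDT, true = LDT         */
    uint8_t  rpl;        /* bits 1..0: requested privilege level, 0..3  */
} selector_fields;

/* Splits a 16-bit segment selector into its three fields. */
selector_fields decode_selector(uint16_t selector)
{
    selector_fields fields;

    fields.index    = (uint16_t)(selector >> 3);
    fields.uses_ldt = (selector & 0x4u) != 0;
    fields.rpl      = (uint8_t)(selector & 0x3u);
    assert(fields.index < 8192u);   /* 13 bits can address 8,192 entries */
    return fields;
}
```

For instance, the selector value 0x001B decodes to index 3 in the GDT with RPL 3, and the value 0x0007 decodes to index 0 in the LDT with RPL 3. Since each descriptor is 8 bytes long, the byte offset of the descriptor within its table is simply the index multiplied by 8.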

The 80386 treats memory in units of 4 KB "pages". Each page of physical memory can be mapped to an arbitrary linear address by means of the page directory and page tables. In this way, physically discontiguous memory can be mapped together so that it appears contiguous in the linear address space. On the 80386, only instructions involving the CR3 register (which holds the physical address of the current page directory) use physical addresses. All other instructions address memory by linear addresses.

Whether paging is enabled is determined by bit 31 (the PG bit) of the 80386's CR0 register. If PG = 0, paging is disabled and the addresses used by all instructions (linear addresses) are the actual physical addresses. If PG = 1, the 80386 operates in paged memory management mode, and every linear address must go through page table translation to obtain the final physical address.

(To be continued)

When reposting, please cite the original address: https://www.9cbs.com/read-58768.html
