Kernel comparison: Improved memory management in the 2.6 kernel
From large memory pages to reverse mapping: greater stability and speed
Level: Introductory
Paul Larson (pl@us.ibm.com), Software Engineer, Linux Technology Center, IBM
April 2004
The 2.6 Linux kernel uses a number of techniques to improve its handling of large amounts of memory, making Linux better suited than ever to the enterprise. This article highlights some of the more important changes, including reverse mapping, larger memory pages, the storage of page table entries in high memory, and a more stable memory manager.
As the Linux kernel has developed and matured, more users expect Linux to run very large systems that handle scientific analysis applications or even massive databases. These enterprise applications typically need large amounts of memory to run well. The 2.4 Linux kernel could address fairly large amounts of memory, but the 2.5 kernel made many changes that allow it to handle larger amounts of memory more efficiently.

Reverse mapping

In the Linux memory manager, page tables keep track of the physical pages of memory used by a process; they map virtual pages to physical pages. Some of these pages may not have been used for a long time and are candidates to be swapped out. Before they can be swapped out, however, every process that maps a given page must be found so that the page table entry for that page can be updated in each of those processes. In the Linux 2.4 kernel this is a daunting task: to determine which processes map a page, the page tables of every process have to be walked. As the number of processes running on the system grows, so does the work required to swap out these pages.

Reverse mapping, or rmap, was implemented in the 2.5 kernel to solve this problem. Reverse mapping provides a mechanism for finding out which processes are using a given physical page of memory. Instead of walking the page tables of every process, the memory manager now keeps, for each physical page, a list of pointers to the page table entries (PTEs) of every process that currently maps that page. This list is called a PTE chain. The PTE chain greatly speeds up finding the processes that map a page, as shown in Figure 1.

Figure 1. Reverse mapping in 2.6

Of course, nothing comes for free: the performance gained through reverse mapping is paid for elsewhere. The most significant cost is memory overhead. Some amount of memory has to be used to keep track of all those reverse mappings. Each entry in a PTE chain uses 4 bytes to store the pointer to the page table entry and an additional 4 bytes to store the pointer to the next entry in the chain. This memory has to come from low memory, which is in short supply on 32-bit hardware. Sometimes this can be reduced to a single entry rather than a linked list, an approach known as pte-direct. If a page is mapped by only one process, a pointer labeled "direct" can be used in place of the list. This optimization can be made only while a page is mapped by exactly one process; if the page is later mapped by another process, a PTE chain has to be used again. A flag tells the memory manager when this optimization is in effect for a given page.

Reverse mapping also brings some other complications. When pages are mapped by a process, reverse mappings have to be established for all of those pages. Likewise, when a process unmaps pages, the corresponding reverse mappings have to be removed; this is especially common on exit. All of these operations must take place under a lock, which can be very expensive and add considerable overhead for applications that fork and exit frequently. Despite the trade-offs, reverse mapping has proven a worthwhile addition to the Linux memory manager: what used to be a serious bottleneck, finding every place a page is mapped, is reduced to a simple operation.

When large applications request large amounts of memory from the kernel, reverse mapping helps the system continue to perform and scale well. Further improvements to reverse mapping are being studied and may appear in future Linux kernel versions.
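To make the PTE chain idea concrete, here is a minimal sketch of the bookkeeping it implies. This is an illustration only, not the kernel's actual implementation; the structure and function names are invented for this example.

#include <stdlib.h>

/* Illustrative only: a per-physical-page chain of back-pointers to the
 * page table entries (PTEs) that map the page. Each node costs two
 * pointers, which is where the memory overhead described above comes from. */
typedef unsigned long pte_t;

struct pte_chain {
    pte_t *ptep;               /* one process's PTE for this page        */
    struct pte_chain *next;    /* next mapping of the same physical page */
};

struct page {
    struct pte_chain *chain;   /* head of the reverse-mapping chain      */
};

/* Record that *ptep now maps this page (called when a mapping is created). */
static void rmap_add(struct page *page, pte_t *ptep)
{
    struct pte_chain *node = malloc(sizeof(*node));
    if (node == NULL)
        return;
    node->ptep = ptep;
    node->next = page->chain;
    page->chain = node;
}

/* Visit every PTE that maps the page, for example to clear each one before
 * the page is swapped out. With reverse mapping this is a short walk of the
 * chain rather than a scan of every process's page tables. */
static void for_each_mapping(struct page *page, void (*fn)(pte_t *))
{
    for (struct pte_chain *c = page->chain; c != NULL; c = c->next)
        fn(c->ptep);
}

The real kernel also avoids allocating a chain at all while a page has only a single mapping, storing a direct pointer and a flag instead, which is the pte-direct optimization described above.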
Large memory pages

Typically, the memory manager handles memory in pages of 4 KB on x86 systems; the actual page size is architecture dependent. For most purposes, the memory manager manages memory most effectively in pages of that size. Some applications, however, make very heavy use of memory. Large databases are one common example. Because every page has to be mapped by every process that uses it, page table entries must be created to map the virtual addresses to the physical addresses.

If a process maps 1 GB of memory with 4 KB pages, it takes 262,144 page table entries to keep track of those pages. If each page table entry consumes 8 bytes, then 2 MB of overhead is needed for every 1 GB of memory mapped. That is considerable overhead on its own, but the problem becomes worse when multiple processes share the memory: every process that maps the same 1 GB of memory pays its own 2 MB price for the page table entries. With enough processes, the memory wasted on this overhead can exceed the amount of memory the application requested.

One way to relieve this problem is to use larger pages. Most newer processors support at least one small and one large page size. On x86, the large page size is 4 MB, or 2 MB on systems with physical address extension (PAE) enabled. Assuming a large page size of 4 MB, the same 1 GB of memory can be mapped with 256 page table entries instead of 262,144, which brings the overhead down from 2 MB to 2,048 bytes.
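As a rough sketch of this arithmetic, the following small program computes the page table overhead of a shared mapping at both page sizes. It assumes 8-byte page table entries, as in the figures above, and an arbitrary example of 50 processes sharing the region; neither number comes from the kernel itself.

#include <stdio.h>

/* Rough illustration of page table overhead for a shared mapping,
 * assuming 8-byte page table entries. */
int main(void)
{
    const unsigned long long region   = 1ULL << 30;            /* 1 GB region          */
    const unsigned long long pte_size = 8;                      /* bytes per page entry */
    const unsigned long long sizes[2] = { 4 << 10, 4 << 20 };   /* 4 KB and 4 MB pages  */
    const int processes = 50;                                   /* processes sharing it */

    for (int i = 0; i < 2; i++) {
        unsigned long long entries  = region / sizes[i];
        unsigned long long per_proc = entries * pte_size;

        printf("page size %7llu: %7llu entries, %8llu bytes of page tables "
               "per process, %9llu bytes for %d processes\n",
               sizes[i], entries, per_proc, per_proc * processes, processes);
    }
    return 0;
}

For a 1 GB region this reproduces the figures above, 2 MB of page tables per process with 4 KB pages against 2,048 bytes with 4 MB pages, and shows how quickly the 4 KB case grows once many processes share the same mapping.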
Using large pages can also improve performance by reducing the number of translation lookaside buffer (TLB) misses. The TLB is a kind of cache for the page tables that allows virtual-to-physical address translation to be performed more quickly for pages listed in it. Large pages cover more memory with fewer actual pages, so the more large pages are in use, the more memory can be referenced through the TLB than would be possible with a smaller page size.

Storing page table entries in high memory

Page tables on 32-bit machines are normally stored only in low memory. Low memory is limited to the first 896 MB of physical memory, and it also has to satisfy most of the rest of the kernel's needs. In cases where applications use large numbers of processes and map large amounts of memory, low memory can quickly run short.

There is now a configuration option in the 2.6 kernel, called Highmem PTE, that allows page table entries to be placed in high memory, freeing up more of the low memory area for the other kernel data structures that must live there. The price is that processes using these page table entries run slightly more slowly. However, for systems with large numbers of processes running, storing the page tables in high memory lets more be squeezed out of the low memory area.

Figure 2. Memory zones

Better stability

Better stability is another important improvement in the 2.6 memory manager. When the 2.4 kernel was released, users immediately began to run into stability problems related to memory management. Given the impact that memory management has on the whole system, stability is critical. Most of the problems were eventually resolved, but the solution required fundamentally discarding the original memory manager and writing a simpler one to replace it. That left plenty of room for Linux distributors to improve on the memory manager in their particular releases of Linux. The flip side of those improvements, however, was that the memory management components in 2.4 behaved differently depending on which release you used. To avoid a repeat of this, memory management became one of the most thoroughly worked-on areas of kernel development for 2.6. The new memory management code has been tested and tuned on everything from very low-end desktop systems to large, enterprise-class, multiprocessor systems.

Conclusion

The memory management improvements in the Linux 2.6 kernel go well beyond the features mentioned in this article. Many of the changes are subtle, yet quite important. Together they yield a memory manager in the 2.6 kernel that was designed for higher performance, efficiency, and stability.
Some of these changes, such as Highmem PTE and large memory pages, aim to reduce the overhead of memory management. Others, such as reverse mapping, improve performance in certain critical areas. These particular examples were chosen because they illustrate how the Linux 2.6 kernel has been tuned and enhanced to deal better with enterprise-class hardware and applications.

Resources
Martin Bligh and David Hansen's paper "Linux Memory Management on Larger Machines" was presented at the 2003 Linux Symposium.
Red Hat's Rik van Riel, in a presentation titled "Towards an O(1) VM" at the 2003 Ottawa Linux Symposium, discussed shortcomings of the virtual memory subsystem that kept it from working well on machines with many gigabytes of memory.
Mel Gorman's documentation gives a deeper look inside the Linux virtual memory manager; it has also been published in book form by Prentice Hall.
The Kernel Analysis-HOWTO has a section on Linux memory management.
There is an article on large page support in the Linux kernel, and another on the object-based reverse-mapping VM that has been merged into the 2.5 kernel.
In the IBM Systems Journal you can read about how software testing is done at IBM.
IBM offers performance management, measurement, and scalability services.
IBM's Linux Technology Center works directly with the Linux development community.
The Linux at IBM site provides Linux news and information from across IBM.
Before the 2.6 release, developerWorks previewed Linux 2.6 and examined some of the new kernel's important features, including the new scheduler and the Native POSIX Threading Library (NPTL) (developerWorks, September 2003).
Read about the reliability of Linux (developerWorks, December 2003).
Paul Larson also looks at 2.6 kernel development, from 2.4 to 2.6 (developerWorks, February 2004).
To get a sense of the rapid pace of 2.6 kernel development, read about Web services on 2.4 and 2.6 (developerWorks, February 2004).
More resources for Linux developers can be found in the developerWorks Linux zone.
You can find a wide selection of Linux books in the Linux section of the Developer Bookstore.

About the author

Paul Larson works on the Linux Test team at IBM's Linux Technology Center. Over the past year his projects have included the Linux Test Project, 2.5/2.6 kernel stability, and kernel code coverage analysis. You can contact him at pl@us.ibm.com.