This chapter covers paged virtual memory mapping and the address-translation process; the LRU and FIFO replacement algorithms; LRU stack analysis; set-associative cache address mapping and LRU block replacement; and performance analysis of virtual memory and the cache at a simple application level. This is a key chapter. The basic concepts required are: LRU, FIFO, fully associative mapping, direct mapping, set-associative mapping, fast table (TLB), hit rate, address translation, page, segment, segment-page management, virtual memory, cache, and so on.
First, the principle of locality of reference
A computer requires its memory to be fast, large in capacity, and low in price.
One regularity obtained from a large body of statistics is that 90% of memory accesses are confined to 10% of the storage space, while the remaining 10% of accesses are spread over the other 90% of the space. This is the locality principle usually spoken of. Locality of reference has two aspects:
1. Temporal locality: if a storage item is accessed, it is likely to be accessed again soon.
2. Spatial locality: if a storage item is accessed, that item and its neighboring items are likely to be accessed soon.
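As a miniature illustration of these two kinds of locality (not from the text; the addresses and loop are invented for the sketch), consider the access trace of a simple summation loop: one accumulator address is touched on every pass (temporal locality), while the array elements are touched at consecutive addresses (spatial locality).

```python
# Illustrative sketch: the access trace of a small summation loop.
# The base addresses 1000 and 2000 are hypothetical, chosen for the example.
from collections import Counter

def access_trace():
    """Return the list of addresses touched by a 100-iteration loop."""
    array_base = 1000          # hypothetical base address of the array
    accumulator = 2000         # hypothetical address of the running total
    trace = []
    for i in range(100):
        trace.append(array_base + i)   # spatial locality: neighbouring addresses
        trace.append(accumulator)      # temporal locality: same address reused
    return trace

counts = Counter(access_trace())
# One single address receives half of all 200 accesses:
print(counts[2000], len(access_trace()))
```

Here one address out of 101 distinct addresses absorbs 50% of the traffic, a small-scale version of the 90/10 rule above.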
To resolve the contradiction between memory capacity and speed, designers apply the locality principle and organize the storage system as a hierarchy that meets both requirements. This hierarchical storage system generally consists of registers, cache, main memory, and auxiliary storage (hard disks, etc.). Registers are the highest level of the hierarchy, with the smallest capacity and the fastest speed. Registers are not transparent to the programmer: they must be accessed by register name rather than by address.
Second, the basic principles of the storage system
Since the storage system uses a hierarchical structure, how data moves between the layers is quite important. Management functions are generally distributed among the levels: each level's storage management controller controls data access within that layer and to the adjacent layers. The unit of data transferred between layers is called a block or a page.
The hit rate is the number of hits divided by the total number of accesses; the miss rate is the number of misses divided by the total number of accesses. The hit time includes the time needed to judge whether the access hits plus the time needed to access the upper-level memory. The miss penalty includes the access time of the lower-level memory plus the time needed to transfer the data block from the lower level to the upper level (the transfer time).
The goal of memory design is to reduce the average access time, not merely to increase the hit rate. In other words, the speed indicator of a hierarchical memory is the average access time. Other indicators include bandwidth, the storage cycle, and so on.
Average access time = hit time + miss rate × miss penalty
Three problems that a hierarchical storage system must solve:
1. Placement problem: at which position in the upper-level memory is a block stored, and how is the block determined and found? This is the identification and addressing problem of the block, generally solved with mapping tables. 2. Replacement problem: on a miss, a data block must be brought up from the lower level. If the upper level is full, which block of upper-level data should be replaced, and by what rule? This is the replacement-policy problem. 3. Update problem: on a write access, when should the result produced in the upper level be written back to the lower-level memory? Because the upper-level data is newer than the lower-level data, the update policy must keep the two levels consistent.
The content of this chapter revolves around these three problems. Once they are solved, the main issues of hierarchical storage system management are solved.
Third, the cache
This section explains how the cache solves the three problems above; the management of the other layers is basically similar.
The cache is a high-speed storage subsystem located between the CPU and main memory. Its main purpose is to raise the average access speed of the memory, so that the memory speed matches the speed of the CPU.
1. Basic working principle and structure of the cache
A cache usually consists of two parts: the block table (directory) and the fast memory itself. The basic structure of the cache can be seen in Figure 7.4 of the textbook. The working principle is as follows: the processor issues a main-memory address, and the high-order part of that address is checked against the directory by the main-memory-to-cache address mapping mechanism to determine whether the addressed memory cell is in the cache. If it is, the cache hits, and the cache is accessed using the cache address. Otherwise, the cache misses: main memory must be accessed, and the corresponding data block is transferred from main memory into the cache. If the cache is already full, one block in the cache must be replaced and the related address-mapping entries modified. From this working principle we can see that two problems are involved: first placement, then replacement.
The presence of the cache is transparent to the programmer. Its address translation and block-replacement algorithms are implemented in hardware. Nowadays the cache is usually integrated into the CPU to increase access speed.
2. How address mapping and translation are done in the cache
Because the processor issues main-memory addresses, and the cache is much smaller than main memory, how can the content be accessed in the cache, and at which cache location is it? This requires address mapping: addresses in main memory are mapped onto addresses in the cache, so that each cache block corresponds to certain blocks in main memory, and when a main-memory address is accessed it is known which cache address it corresponds to. There are three address-mapping methods: direct mapping, fully associative mapping, and set-associative mapping.
Direct mapping maps each main-memory address to one specified cache address. At any time, the data of a given main-memory storage unit can only be placed at one fixed location in the cache. If that location already holds data, a conflict occurs, and the original block is unconditionally replaced.
Fully associative mapping allows any main-memory block to be mapped to any cache address. In this mode, the data of a main-memory storage unit can be placed at any location in the cache. A block conflict occurs only after all the blocks in the cache are full.
Set-associative mapping divides the cache into several sets: the mapping between main memory and the sets is direct, while the mapping of blocks within a set is fully associative.
The three address-mapping modes are compared below.
Direct mapping. Translation process: (1) the main-memory address is divided into area code (tag), block number, and in-block address; (2) the block-number and in-block fields of the main-memory address are taken directly as the cache address; (3) the directory table is accessed with the block number, and the stored area code is read out and compared with the area code of the main-memory address; (4) if they are equal, the access hits; (5) if they are not equal, the block misses, cache access stops, main memory is accessed, and the block is brought in. Directory table: one entry per cache block; entry width is the area-code field (main-memory address bits minus cache address bits). Advantages: (1) least hardware, small directory table, low cost; (2) the cache and the directory table can be accessed simultaneously. Disadvantages: (1) high block-conflict probability; (2) low cache space utilization.

Fully associative mapping. Translation process: (1) the main-memory address is divided into main-memory block number and in-block address; (2) the block number is compared associatively with all directory entries; (3) if a match is found, the cache block number is read out and concatenated with the in-block address to form the cache address; (4) if no match is found, a block miss occurs and the block is brought in. Directory table: one entry per cache block; entry width is (main-memory block-number bits + cache block-number bits), searched associatively. Advantages: lowest block-conflict probability and highest cache space utilization. Disadvantages: (1) the mapping table is long; (2) the associative search is slow and costly.

Set-associative mapping. Translation process: (1) the main-memory address is divided into area code, set number, block number, and in-block address; (2) the set number selects a set; (3) the area code and block number are compared associatively within that set; (4) if no match is found, the block misses and is brought in; (5) if a match is found, the cache block number is read out and concatenated with the set number and the in-block address to form the cache address. Directory table: one entry per cache block; entry width is the (area code + block number) field, compared associatively within a set. Advantages: combines the advantages of fully associative and direct mapping and makes up for their disadvantages; block conflicts are fewer than with direct mapping, and utilization is higher than with direct mapping. Disadvantages: block conflicts are still more frequent than with fully associative mapping, utilization is lower than with fully associative mapping, and the directory is more complex than with direct mapping.
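The direct-mapping address split and lookup can be sketched in a few lines. This is a minimal model, assuming an illustrative 16-byte block and 64 cache lines (neither figure is from the text):

```python
# Minimal direct-mapped cache model (assumed geometry: 16-byte blocks, 64 lines).
BLOCK_SIZE = 16
NUM_LINES = 64

def split_address(addr):
    """Split a main-memory address into (tag, line index, offset within block)."""
    offset = addr % BLOCK_SIZE
    block_number = addr // BLOCK_SIZE
    index = block_number % NUM_LINES     # the cache-address (block number) field
    tag = block_number // NUM_LINES      # the 'area code' kept in the directory
    return tag, index, offset

directory = [None] * NUM_LINES           # directory table: one tag per line

def access(addr):
    """Return True on a hit; on a miss, load the block, replacing unconditionally."""
    tag, index, _ = split_address(addr)
    if directory[index] == tag:
        return True
    directory[index] = tag               # direct mapping: fixed slot, forced replace
    return False

print(access(0x1234))   # first touch of this block: miss
print(access(0x1234))   # same block again: hit
```

Note how the replacement decision is trivial here: the index field alone fixes the victim line, which is exactly why direct mapping needs no replacement algorithm.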
3. Replacement policy and main-memory update policy
There are three ways to fetch data blocks from main memory: demand fetch, prefetch, and selective fetch. Each has its advantages and disadvantages; pay attention to the comparison. The textbook notes that "shared data should be kept in main memory rather than in the cache, especially in multiprocessor systems": because shared data is often rewritten by other processors, placing it in a cache raises data-consistency problems, whereas keeping it only in main memory guarantees a single copy and avoids data-consistency errors.
In a hierarchical storage system, when the contents of a lower-level memory are brought up to the upper-level memory, the capacity of the upper level is always smaller than that of the lower level, so copying into the upper level may require replacing an existing data block there. If the replaced block contains newly written data (such as computation results), that data must first be written back to the corresponding block of the lower-level memory; this is where the update policy comes in.
In direct mapping there is no block-replacement algorithm, because each block's position is fixed: the destination in the upper level is determined directly by the address. The other two mapping modes do face the replacement-policy problem, that is, choosing which cache block to replace; this is the replacement algorithm.
The basis for choosing a replacement algorithm is the overall performance of the memory, chiefly the hit rate of the upper-level memory. Several replacement algorithms are compared below:
Random (RAND): uses a software or hardware random-number generator to produce the number of the block to be replaced in the upper level. Easy to implement, but it uses no history information from the upper-level memory and does not reflect program locality, so the hit rate is low.
First-in first-out (FIFO): selects the earliest-loaded block as the one to replace. Easy to implement, and it uses loading-history information, but it does not correctly reflect program locality; the hit rate is not high, and an anomaly may occur.
Least recently used (LRU): selects the least recently accessed block as the one to replace. It correctly reflects program locality and uses access-history information, so the hit rate is higher, but it is more complex to implement.
Optimal (OPT): replaces the block that will not be used for the longest time in the future. It gives the highest hit rate and can serve as a yardstick for measuring other replacement algorithms, but it is only an idealized algorithm.
For block replacement you must master the replacement process of FIFO and especially of the LRU algorithm, and be able to draw their allocation tables and analyze them.
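The FIFO and LRU replacement processes described above can be sketched as small hit-counting simulations (the reference trace below is an invented example, not from the text):

```python
# Sketches of the FIFO and LRU replacement policies, counting hits on a trace.
from collections import OrderedDict, deque

def fifo_hits(pages, capacity):
    """Count hits when the earliest-loaded page is evicted on a miss."""
    resident, queue, hits = set(), deque(), 0
    for p in pages:
        if p in resident:
            hits += 1                               # hit does NOT refresh FIFO order
        else:
            if len(resident) == capacity:
                resident.discard(queue.popleft())   # evict the oldest-loaded page
            resident.add(p)
            queue.append(p)
    return hits

def lru_hits(pages, capacity):
    """Count hits when the least recently used page is evicted on a miss."""
    cache, hits = OrderedDict(), 0
    for p in pages:
        if p in cache:
            cache.move_to_end(p)                    # a hit refreshes recency
            hits += 1
        else:
            if len(cache) == capacity:
                cache.popitem(last=False)           # evict least recently used
            cache[p] = True
    return hits

trace = [1, 2, 3, 1, 4, 1, 2]
print(fifo_hits(trace, 3), lru_hits(trace, 3))      # FIFO: 1 hit, LRU: 2 hits
```

On this trace LRU wins because the re-referenced page 1 is kept resident by its recent use, while FIFO evicts it as the oldest-loaded page: exactly the difference in how the two policies use history information.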
To keep the data in the cache consistent with the data in main memory, there are two general update policies:
Write-back: on a CPU write, the information is written only into the cache; a rewritten cache block is sent back to main memory only when it must be (when the block is replaced). Advantage: saves many unnecessary writes of intermediate results to main memory. Disadvantage: a modified ("dirty") flag must be kept per cache block, adding to the cache's complexity.
Write-through: on a CPU write, the data is written to the cache and to main memory at the same time. Advantage: the cache and main memory stay consistent and the scheme is simple. Disadvantage: every write goes to main memory, costing time and bandwidth.
In addition, on a write miss (that is, when the block to be written cannot be found in the cache), there are two solutions. One is the no-write-allocate (write-around) method: write directly to main memory without bringing the block into the cache. The other is the write-allocate method: first load the block from main memory into the cache, then write. Generally, the write-back policy is paired with write-allocate, and the write-through policy with no-write-allocate.
4. Data cache, instruction cache, and unified cache
A cache can be split into two caches that store data and instructions separately, or a unified cache can store both data and instructions. In general, the split caches improve the hit rate.
5. Performance analysis of the cache (simple application)
The hit rate of the cache strongly affects computer speed. Practice has shown that the smaller the cache, the greater the impact of the address-mapping method and the replacement policy.
For a given set size, the larger the cache capacity, the higher the hit rate. When the cache size is fixed, the size of the set (or block) also affects the hit rate: because mapping within a set is fully associative, the larger the set, the higher the hit rate. As noted earlier, the speed of a storage system is measured by the average access time, calculated with the average-access-time formula.
TA = Hc × Tc + (1 − Hc) × Tm, where Hc is the cache hit rate, Tc is the hit time, and Tm is the main-memory (miss) access time. For a multi-level cache this formula can be applied recursively: the access time of each level is its hit time plus the miss-weighted time of accessing the level below.
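The formula can be applied directly with numbers. The values below (a 95% hit rate, a 2 ns cache, a 60 ns main memory) are illustrative assumptions, not figures from the text:

```python
# Applying TA = Hc*Tc + (1 - Hc)*Tm with assumed example values.
def average_access_time(hit_rate, hit_time, miss_time):
    """Average access time of a two-level (cache + main memory) system."""
    return hit_rate * hit_time + (1 - hit_rate) * miss_time

# Assumed: Hc = 0.95, Tc = 2 ns, Tm = 60 ns.
ta = average_access_time(0.95, 2.0, 60.0)
print(ta)   # 0.95*2 + 0.05*60 = 4.9 ns
```

Note that the average access time (4.9 ns) is much closer to the cache's 2 ns than to main memory's 60 ns, which is exactly the point of the hierarchy: a high hit rate hides the slow lower level.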
Fourth, methods of widening the main-memory bandwidth
We have studied the cache, the high-speed level of the hierarchical storage system; now we discuss main memory.
The main performance indicators of a memory are capacity, speed, and price. The speed indicators include access time, storage cycle time, and bandwidth. The main measures for improving main-memory bandwidth are:
1. Increase the data width of the memory (i.e., increase the number of data bits). 2. Use multi-bank interleaving.
Multi-bank interleaving is an effective way to improve memory data bandwidth. Here is my own explanation of high-order and low-order interleaved memory:
We know that each storage unit of a memory is given an address by which it is accessed. A parallel memory consists of multiple banks, and parallel access can raise access speed, but this depends on the addressing method. For example, suppose main memory has 8 storage units (2 to the 3rd power; set small for convenience, since the actual number of units is far larger). The whole address space is then covered by 3-bit binary numbers from 000 to 111. If this space is implemented with two banks, there are two addressing methods.

One is high-order interleaving: the leading bit of the address code selects the bank, 0 for the first bank and 1 for the second (with four banks, the leading two bits would be used, and so on). The units in the first bank then have the addresses 000, 001, 010, 011 (note the leading 0), and the four units of the second bank have the addresses 100, 101, 110, 111. Thus, when accessing the data of two adjacent memory cells, such as the data at 110 and 111, both accesses fall in the second bank: only that bank works, while the first bank sits idle. Data is generally stored in areas of consecutive addresses, and we can now see why high-order interleaving suits multiprocessor systems: each processor usually accesses its own region of data, so the banks can work at the same time and speed is gained.

The other method is low-order interleaving. In the example above, the last bit of the address code is the bit assigned to select the bank. The memory cells in the first bank are 000, 010, 100, 110 (the last bit is always 0), and those in the second bank are 001, 011, 101, 111. This method distributes memory cells with adjacent addresses across different banks, so when accessing the data of neighboring units, multiple banks can work in parallel at the same time.
Low-order interleaving is therefore more suitable for high-speed data access within a single process.
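The two addressing schemes from the 3-bit example above can be sketched directly, showing which bank each of the 8 addresses falls into:

```python
# Sketch of high-order vs. low-order interleaving for 8 words across 2 banks,
# matching the 3-bit example in the text.
NUM_BANKS = 2
ADDRESSES = range(8)     # 000 .. 111

def high_order_bank(addr):
    """High-order interleaving: the top address bit selects the bank."""
    return addr >> 2     # 8 words / 2 banks = 4 consecutive words per bank

def low_order_bank(addr):
    """Low-order interleaving: the bottom address bit selects the bank."""
    return addr % NUM_BANKS

print([high_order_bank(a) for a in ADDRESSES])  # [0, 0, 0, 0, 1, 1, 1, 1]
print([low_order_bank(a) for a in ADDRESSES])   # [0, 1, 0, 1, 0, 1, 0, 1]
```

The output makes the trade-off visible: under high-order interleaving, consecutive addresses (such as 110 and 111) land in the same bank, while under low-order interleaving they alternate between banks and so can be accessed in parallel.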
Increasing the memory data width is another way of widening bandwidth, generally using the single-bank multi-word method. The actual bandwidth of multi-bank interleaving is higher than that of a single bank.
V. Virtual memory
Virtual memory is an extension of main memory. The capacity of virtual memory depends on the computer's addressing capability rather than on the actual size of main memory, and the actual storage space can be smaller than the virtual address space. From a programmer's point of view, what is seen is a logical store, and the addresses used are logical (virtual) addresses. Virtual memory gives the storage system both the capacity of auxiliary storage and an access speed close to that of main memory. Virtual-memory access also involves mapping between virtual and real addresses, replacement algorithms, and so on, much like the cache; but whereas in the cache the unit of address mapping is the block, in virtual memory it is the page. The indicators to consider when designing a virtual storage system are the utilization of main-memory space and the hit rate.
Virtual memory shares many features with cache management: both require an address-mapping scheme and an address-translation mechanism. But the two also differ; pay attention to the comparison.
Virtual memory has three management methods: classified by the way storage is partitioned, they are segment management, page management, and segment-page management. The basic principles of these management methods are similar.
Segment management: a storage management method in which main memory is allocated by segments. It is a modular storage management method: each program module can be assigned one segment, and a module can only access the main-memory space of the segments assigned to it. Segment length can be set arbitrarily, and segments can be enlarged and shrunk.
The location of each segment in main memory is specified by a segment table, whose entries include the segment number, load bit, starting position, and segment length. The segment table itself is also a segment. Segments are generally divided according to program modules.
Page management: virtual storage space and actual main-memory space are divided into pages of fixed size, and each virtual page can be loaded into a different actual page location in main memory. Under page storage, the processor's logical address consists of two parts, the virtual page number and the in-page address; the real address likewise has two parts, the real page number and the in-page address; and the address-mapping mechanism converts the virtual page number into the real page number in main memory.
Page management uses a page table, which records for each page its page number and its starting position in main memory. The page table is a mapping table from virtual page numbers to physical page numbers. Page management is performed by the operating system and is transparent to the application.
Segment-page management: a combination of the two methods above. Storage space is divided into segments by logical module, and each segment is divided into several pages; access proceeds through a segment table and several page tables. The length of a segment must be an integer multiple of the page size, and the starting point of a segment must be the starting point of a page.
Current operating systems generally use segment-page management. The three management methods are compared below:
Segment management. The multi-user (per-module) address is divided into three parts: user number, segment number, and segment offset. The address-translation process is as follows: (1) use the user (program) number to find the corresponding segment-table base register, which holds the segment-table address and the segment-table length; (2) check the segment number against the segment-table length; if it is in range, go to (3); (3) use the segment-table address and the segment number to find the corresponding entry in the segment table, which contains the main-memory address, load bit, access bits, segment length, auxiliary-storage address, and so on; (4) check whether the load bit is "1" (in main memory); if "1", go to (5), otherwise a segment-fault interrupt is generated and the segment is transferred in from auxiliary storage; (5) after checking the offset against the segment length, form the real physical address from the main-memory address and the segment offset. Advantages: (1) programs are divided into segments by module, so multiple programmers can compile in parallel, shortening programming time; (2) each segment is relatively independent, and modifying or expanding it does not affect other segments; (3) virtual storage is realized; (4) sharing and protection are easy. Disadvantages: (1) because segments vary in length, main-memory utilization is not high and much fragmentation arises; (2) forming an effective address requires multiple memory accesses, reducing speed; (3) allocating and reclaiming free areas is relatively complex; (4) the address and segment-length fields are long, reducing table-lookup speed.

Page management. The user logical address is divided into: user number, virtual page number, and page offset. The process is as follows: (1) use the user number to find the corresponding page-table base register, which holds the page-table address; (2) use the page-table address and the page number to find the corresponding entry in the page table; (3) check whether the load bit is "1" (in main memory); if "1", go to (4), otherwise a page-fault interrupt is generated; (4) form the effective address from the main-memory page number and the page offset. Advantages: (1) page-table entries are short, reducing lookup cost; (2) there is little fragmentation; (3) address formation is fast. Disadvantages: (1) paging is forced, so pages have no logical meaning, which is not conducive to storage protection and expansion; (2) generating one effective address requires multiple memory accesses, lowering speed.

Segment-page management. The user logical address is divided into: user number, segment number, page number, and page offset. The process is as follows: (1) use the user number to find the segment-table base register; (2) check the segment number against the segment-table length; (3) use the segment-table address and segment number to find the corresponding entry in the segment table; (4) perform the load-bit and other checks on the segment entry; (5) use the page-table address and page number to find the corresponding entry in the page table; (6) perform the load-bit and other checks on the page entry; (7) the real page number and the page offset form the effective address. It has the advantages of both segments and pages; its disadvantage is that forming one effective address requires three memory accesses, which is slow. Please understand these management methods by comparing them.
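The core step shared by all three methods, turning a virtual page number plus offset into a physical address via a table with a load (valid) bit, can be sketched as follows. The page size and the page-table contents are illustrative assumptions:

```python
# A minimal page-table walk, assuming a 4 KB page (an illustrative size).
PAGE_SIZE = 4096

# Hypothetical page table: virtual page number -> (load bit, physical page number).
page_table = {0: (1, 5), 1: (1, 9), 2: (0, None)}

def translate(virtual_addr):
    """Return the physical address; raise to model a page-fault interrupt."""
    vpn = virtual_addr // PAGE_SIZE          # virtual page number
    offset = virtual_addr % PAGE_SIZE        # in-page address, copied unchanged
    load_bit, ppn = page_table.get(vpn, (0, None))
    if not load_bit:
        raise RuntimeError("page fault: page must be loaded from auxiliary storage")
    return ppn * PAGE_SIZE + offset          # real page number + page offset

print(hex(translate(0x1234)))   # VPN 1 maps to PPN 9, offset preserved -> 0x9234
```

Segment-page management simply chains two such lookups (segment table, then page table) before the final concatenation, which is why it costs three memory accesses per effective address.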
Paged virtual memory structure and its implementation: the main problems to be solved are the handling of page faults and the speed of virtual-address translation. There is also the problem of protecting virtual memory.
During address translation in virtual memory, the virtual page number must be translated into the real page number in main memory, which is generally done through a page table kept in main memory. If the page is not loaded (the entry is invalid), the auxiliary-storage address of the page must be looked up and the page transferred in. Raising the access speed of the page table is therefore the key to improving address-translation speed. Based on the locality principle, a small portion of the page-table entries with a high probability of use is placed in fast hardware, while the whole table remains in main memory; this leads to the concepts of the fast table (TLB) and the slow table. During translation, the fast table and the slow table are searched simultaneously. The fast table is transparent to all programmers.
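The fast-table / slow-table lookup can be sketched as follows. The two-entry fast table, the table contents, and the crude eviction rule are all illustrative assumptions for the sketch (real fast tables are associative hardware, and real lookups search both tables in parallel rather than sequentially):

```python
# Sketch of the fast-table / slow-table idea with a tiny 2-entry fast table.
TLB_CAPACITY = 2

slow_table = {0: 5, 1: 9, 2: 7}   # complete page table, kept in main memory
fast_table = {}                   # high-probability entries, kept in fast hardware

def lookup(vpn):
    """Return (physical page number, whether the fast table hit)."""
    if vpn in fast_table:
        return fast_table[vpn], True
    ppn = slow_table[vpn]                        # slow path: table in main memory
    if len(fast_table) >= TLB_CAPACITY:
        fast_table.pop(next(iter(fast_table)))   # crude eviction, for the sketch only
    fast_table[vpn] = ppn
    return ppn, False

print(lookup(1))   # first access: fast table misses, slow table answers
print(lookup(1))   # second access: the fast table now hits
```

By locality, most translations after the first hit in the small fast table, so the slow main-memory page table is rarely consulted, which is the whole point of the scheme.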
Protection of virtual memory is indispensable for multiprogramming and multi-user systems. Storage-system protection includes the protection of storage areas and of access modes. The protection methods for virtual memory include the mapping-table protection method, the key protection method, and the ring protection method.