Kernel diary
I read "Linux kernel source code analysis" (Maude), I feel that I have written it yet, and it is too detailed that I can't have a grasp of the whole country. (I have seen a book of the kernel written in the country. " It seems to give several encodings, add some annotations. I feel that the kernel is also a software, and it should be understood by software engineering. The software engineering design method is of course: demand, overall design, detailed design, code, etc. I think the book of the old Mao is basically at the coding explanation level, can't have a clear idea of macro grasp. I feel that I want to read the source code, I should first understand it, that is, I understand the design concept, and the second is the specific coding understanding. So I turned to the overall two days. Read now
I have a question that I have not thought through clearly. Please advise:
For a process to enter the TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state, what conditions must hold, and how is it woken up again (who wakes it, and who puts it back on the run queue)?
I have thought about this question some more, and here is what I think:
A process goes to sleep either voluntarily or because it is forced to.
In the voluntary case it mainly calls sleep_on() and friends; the wakeup then relies on a timer function, which puts the corresponding sleeping process back on the run queue and so wakes it up.
The forced case is mainly waiting for an event. My understanding of "waiting for an event" is waiting for a semaphore to become available. The kernel function checks on behalf of the process whether the semaphore is available; if it is not, the kernel function hangs the process on a wait queue (the semaphore's own queue, it seems; strictly speaking it does not even have to be on a wait queue, as long as the process is taken off the run queue), so it can no longer be scheduled. When some process releases the resource represented by the semaphore, or the event occurs, it calls something like the V operation of a PV pair; this V operation puts the waiting process back on the run queue and so wakes it up.
That is my understanding of how processes sleep and are woken; I do not know whether it is correct. I would welcome hearing everyone's own impressions of the kernel.
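(A minimal sketch for reference, using the standard 2.4-era wait-queue pattern that both sleep_on() and the semaphore code are built on. The queue name and the condition variable here are made up for illustration, and signal handling is omitted; this is not the kernel's exact code.)

#include <linux/sched.h>
#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(my_queue);   /* illustrative queue */
static volatile int condition;              /* illustrative "event has happened" flag */

/* sleeper side: the classic add-to-wait-queue / set-state / schedule pattern */
void wait_for_condition(void)
{
        DECLARE_WAITQUEUE(wait, current);

        add_wait_queue(&my_queue, &wait);
        set_current_state(TASK_INTERRUPTIBLE);  /* or TASK_UNINTERRUPTIBLE */
        if (!condition)
                schedule();                     /* give up the CPU until someone wakes us */
        set_current_state(TASK_RUNNING);
        remove_wait_queue(&my_queue, &wait);
}

/* waker side: wake_up() moves the sleepers back to TASK_RUNNING and onto
   the run queue, which is what "waking up" amounts to */
void signal_condition(void)
{
        condition = 1;
        wake_up(&my_queue);
}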
Let's exchange ideas and make progress together.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
How is the memory data of a traced process examined?
First, let me say that I accidentally posted this in the wrong place, under "Kernel source code study", so I am posting it again here.
If you want to reply, please reply in the kernel diary thread. I am sorry.
Today's diary:
I want to ask another question:
To enter TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, must a process be on a wait queue?
My understanding is: some processes enter the sleep state without being hung on any wait queue.
The reason: take timed sleeps, for example (the kernel functions sleep_on_timeout() and interruptible_sleep_on_timeout()). These functions set up a dynamic timer for the process and point the timer's data field at the process, so when the timer expires, that is what wakes the process up; no wait queue is needed for this.
I do not know whether this understanding is correct; corrections welcome.
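(For reference, a rough sketch of that mechanism, loosely modeled on how schedule_timeout() works in 2.4: a dynamic timer whose data field records the sleeping task, and whose handler simply calls wake_up_process(). The function names here are illustrative and error handling is omitted.)

#include <linux/sched.h>
#include <linux/timer.h>

/* timer handler: put the recorded task straight back on the run queue */
static void wake_sleeper(unsigned long data)
{
        wake_up_process((struct task_struct *)data);
}

/* the caller is assumed to have already set current->state to
   TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, as with schedule_timeout() */
signed long my_schedule_timeout(signed long timeout)
{
        struct timer_list timer;
        unsigned long expire = jiffies + timeout;

        init_timer(&timer);
        timer.expires  = expire;
        timer.data     = (unsigned long)current;  /* remember who to wake */
        timer.function = wake_sleeper;

        add_timer(&timer);
        schedule();                /* sleep until the timer (or something else) wakes us */
        del_timer_sync(&timer);

        return expire > jiffies ? expire - jiffies : 0;
}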
Today I looked at the semaphore mechanism and at memory management (the buddy algorithm and the slab allocator; there are still a few problems, but I will sort them out with more reading), and then at how system calls are traced. Some questions about tracing:
The hardware provides two mechanisms for tracing: the trace flag (TF) in EFLAGS, and the debug register set.
Linux provides: the PF_TRACESYS flag in the process control block, used to intercept system calls.
(I do not know whether there are other mechanisms?)
When process A is being traced by process B, B is A's parent process (that part should be fine). When A hits a breakpoint it is stopped and enters the TASK_STOPPED state, and B receives a SIGCHLD signal; this wakes B up to run, and B then examines A. All of A's information is held in A's hardware context, A's memory data, and A's process control block. (Is there anything else worth looking at?)
What I want to ask now is: when B looks at this information through system calls, how does the kernel service the request?
Here are my thoughts; additions welcome:
1. For B to see everything held in A's registers and process control block, the kernel only needs to extract it from A's process control block.
2. For B to see A's memory data, B passes an address as a parameter of the system call, and the kernel then has to locate that data in A's address space. How does it find it? (This is the crux.) I think the kernel uses A's page global directory and walks it to compute the physical address of the data, then uses the kernel's fixed mapping to obtain a linear address in kernel space (an address above 3 GB), through which it can access the data.
I do not know whether my understanding is correct?
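(For what it is worth, a rough sketch, 2.4-style, of the page-table walk idea described above; there is no locking and no check for missing or swapped-out entries. The real kernel does this through access_process_vm()/get_user_pages(), which also handle faults.)

#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/highmem.h>
#include <asm/pgtable.h>

/* read one word from another process's address space by walking its page tables */
static unsigned long read_word_from(struct task_struct *child, unsigned long addr)
{
        pgd_t *pgd = pgd_offset(child->mm, addr);  /* the child's page global directory */
        pmd_t *pmd = pmd_offset(pgd, addr);
        pte_t *pte = pte_offset(pmd, addr);        /* 2.4 name; later kernels differ */
        struct page *page = pte_page(*pte);        /* the physical page frame */
        unsigned long val;
        char *vaddr;

        vaddr = kmap(page);                        /* map the frame into kernel linear space (> 3 GB) */
        val = *(unsigned long *)(vaddr + (addr & ~PAGE_MASK));
        kunmap(page);
        return val;
}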
__________________
Bao Jianfeng is grinding out, plum blossoms come from bitter
2003.04.19 Kernel-reading diary:
Looked at memory management today, and felt deeply how insufficient my knowledge is!
Memory management has three main sets of data structures: the buddy algorithm, the slab allocator, and vm_area_struct; of these, slab is the hardest to understand.
Some things I still do not quite understand; please advise:
What are the differences between kmalloc, vmalloc and malloc?
The kernel manages physical memory with the buddy system, and on top of it the slab allocator provides allocation services, so the memory the kernel needs is available both in large blocks and in small pieces.
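(As a partial point of reference for the question above, a minimal kernel-side illustration, error handling omitted: kmalloc() hands out small, physically contiguous memory through the slab/buddy layers, while vmalloc() builds a large, virtually contiguous area out of page frames that need not be contiguous. malloc() is a user-space C library routine, which is what the following posts go on to discuss.)

#include <linux/slab.h>
#include <linux/vmalloc.h>

void allocator_example(void)
{
        /* small, physically contiguous allocation from the slab/buddy layers */
        char *small = kmalloc(128, GFP_KERNEL);

        /* large, virtually contiguous allocation stitched together from
           possibly scattered page frames via page tables */
        char *big = vmalloc(1 << 20);

        kfree(small);
        vfree(big);
}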
What about the memory space of a user process? Through vm_area_struct the kernel implements large-block allocation of the user's linear space, e.g. the code area, data area, heap area and stack area. My question now: when a process needs small pieces of memory, who manages that? I know that users normally allocate small fragments of memory through malloc.
But how does malloc get these small pieces of memory: through a system call, or without one? I went through the system calls and there does not seem to be a corresponding one, nor did I find any algorithm in the kernel for fine-grained memory management of user space. So I suspect that malloc does not rely on a system call for every allocation; but can it really manage without one? Thinking it over, if malloc does not need a system call each time, it must manage the memory itself, which means it has to build its own management data structure. What data structure? The simplest is to string the allocated blocks together on a linked list; a more elaborate approach could manage them the way the slab allocator does.
I do not know whether anyone has a clear picture of malloc. If you have the C library source code, please share it.
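(In the meantime, a minimal user-space sketch of the idea suspected above; this is not glibc's actual algorithm. The allocator grows the heap with sbrk()/brk() only when needed, and otherwise manages small blocks itself with its own data structure, here a simple first-fit free list.)

#include <unistd.h>
#include <stddef.h>

struct block {                  /* header kept in front of every block */
        size_t size;
        struct block *next;     /* links all blocks together */
        int free;
};

static struct block *head;

void *my_malloc(size_t size)
{
        struct block *b;

        /* first fit: reuse a freed block if one is big enough */
        for (b = head; b; b = b->next)
                if (b->free && b->size >= size) {
                        b->free = 0;
                        return b + 1;           /* memory starts after the header */
                }

        /* nothing suitable: grow the heap; this is the only system call involved */
        b = sbrk(sizeof(*b) + size);
        if (b == (void *)-1)
                return NULL;
        b->size = size;
        b->free = 0;
        b->next = head;
        head = b;
        return b + 1;
}

void my_free(void *p)
{
        if (p)
                ((struct block *)p - 1)->free = 1;  /* just mark it reusable */
}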
Oh, and one more question:
How is disassembly of a library done?
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
More questions about malloc
brk can only change the size of the heap, i.e. move its boundary, and it does so at page granularity.
malloc must use brk, but malloc also has to manage small objects within a page, carving the page into small blocks; brk alone cannot do that.
So I suspect malloc has to implement small-object management itself.
But I have not confirmed this yet. I have downloaded the C library source code, though I probably will not understand it for a while.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
2003.04.20 Reading diary - the future architecture of Linux
Today I browsed VFS, and I really admire it. My understanding of it is that it is like a
PCI slot: independently designed concrete file systems can be plugged into it, which makes the whole thing extensible.
So it is not so much a general file system as it is a technical specification.
With this specification in place, different pieces of software can be connected seamlessly.
That reminds me of POSIX: is it not exactly a specification for OS design? So I wonder: will there one day be a Linux in which
every part is likewise designed around a specification, just like an assembled PC? Unfortunately, so far only
VFS has realized such a specification, between the generic file system and the concrete file systems. I think the next step should be to specify memory management in the same way, so that the other parts of Linux obtain and return memory only through a standard function interface, covering
physical memory allocation and reclaim, process address-space management, and buffer management. Speaking of buffer management, I think the Linux kernel has two broad kinds: one is pools of fixed-size objects that are allocated and recycled (buffer pools), such as the slab small-object allocation mechanism; the other is caches proper, such as read-ahead management of disk blocks, page aging, and so on. Both kinds of buffer management should also be standardized. Once that is agreed, we could design memory management as a module, just as a concrete file system is designed today; then, without recompiling the kernel, you could swap the original memory-management module for a newly designed one with something like an ins_mm_module function. I have also thought about what replacement would involve: the old and new modules would have to hand over their work, that is, the old module's bookkeeping records would have to be converted into the new module's format, so that the new module could take over managing the kernel's memory. Our goal: Linux should become something you can assemble like a fool-proof kit, with many memory-management, process-management and file-system plug-ins to choose from to suit your specific situation.
But the current goal is to read the kernel, not to talk big. (I have only learned a little these past two days; the later parts of the book feel more scattered.)
I hope everyone will talk about their own reading experience, so that others can avoid taking detours like these.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
2003.04.21 Reading diary - Linux is inefficient when it initializes the buddy algorithm's data structures
Today I read the buddy memory-allocation algorithm. The algorithm itself is really good, but Linux is quite inefficient when it initializes the buddy algorithm's data structures.
Let me start with a metaphor. You are given a big bowl full of rice and asked to transfer the rice
into another bowl. Do you pour the rice straight across, or move it one grain at a time from one bowl to the other? The answer is obvious: pour it straight across.
Yet when Linux initializes the buddy algorithm's data structures, it chooses to move the rice one grain at a time.
OK, now let's look at how the system actually initializes the buddy data structures.
When reclaiming pages, the buddy system normally uses free_pages() and free_page(); both try to merge buddies, up to order 9 at most.
In mem_init(), all of dynamic memory is scanned and free_page() is called to insert the free pages into the buddy system one page at a time.
But dynamic memory is in fact one big contiguous region, so a recursive function could finish the job
in a small number of calls.
I wrote one:
buddy_init(start_addr, end_addr, order)
/* start_addr, end_addr: start and end of the memory block to be released;
   order: the buddy order (slot number) it should be inserted at */
{
        unsigned long mask = (~0UL) << order;
        unsigned long start, end, tmp;          /* the order-aligned middle piece */

        if (order == 0) {                       /* recursion exit: only one page left to release */
                free_page(start_addr);
                return;
        }
        /* compute the order-aligned middle section [start, end) */
        start = (start_addr + (1 << order) - 1) & mask;
        end = end_addr & mask;
        for (tmp = start; tmp < end; tmp += (1 << order))
                free_pages(tmp, order);         /* the middle is released on order-sized boundaries */

        if (start_addr != start)                /* if equal there is no leftover piece on the left, no recursion needed */
                buddy_init(start_addr, start, order - 1);
        if (end_addr != end)                    /* if equal there is no leftover piece on the right, no recursion needed */
                buddy_init(end, end_addr, order - 1);
}
The function is not written rigorously, but the idea is to use free_pages() to insert large blocks directly.
Each call can then release up to 512 pages at once, which is far more efficient than going one page at a time.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
2003.4.21 Reading-notes diary
I wrote this in Notepad and pasted it in, but for some reason part of the content did not come through, so I pasted it again; it is the same as the post above.
Very sorry.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

I studied the buddy algorithm quite a while ago and even wrote a program for it, but I never read its initialization, so there are just a couple of small things I noticed (I may be remembering wrongly). For example, in your
start = (start_addr + (1 << order) - 1) & mask;
shouldn't there be a PAGE_SHIFT added after the (1 << order)? Of course that is a minor point. As for the inefficiency you mention: yes. If you look at the initialization procedure, the system does end up laid out by order from 9 downward: the big pieces go into order 9 first, the odds and ends into order 8, the remaining scraps into order 7, and so on.
PS: Could it be that at system-initialization time there is a worry that a stretch of physical memory might be bad, making the memory space discontiguous, and that this is why pages are inserted and merged one at a time?
PS2: The reason your program did not post properly is probably that this forum, for security or encoding reasons, mangles certain special characters. For example, if you put a pair of quotes around each "<<", your program will post normally.
__________________
I still remember the words I said to you that night: a kind of burning in the darkness; I still remember the tears that slipped from your eyes: a kind of blurred illusion in the confusion....
This post was last edited by Blueflame on 2003-04-22 09:18.

BlueFlame: You clearly have more experience; you saw things I had not noticed. I understand it now as well. You read this page directly as HTML text with "view source"; is that how you do it?
When free_page() releases a page it starts at order 0, changes the corresponding map bit, and then checks whether the block can be merged upward (i.e. it locates its buddy through the corresponding position in the map). If it cannot, the block is inserted at order 0 and the function returns; if it can merge with its buddy, it moves up to order 1, and if that can merge again it keeps going, up to order 9 at most. So if you use free_page() to initialize the buddy system, a huge number of merges take place (because dynamic memory at that moment is one big contiguous block, so in the end everything except the odd pieces at the two ends belongs to order 9). If instead you use free_pages(), the blocks are inserted straight at order 9, order by order, and no merging happens at all. As for the odd pieces at the two ends, they are broken down and inserted into the corresponding slots, and no merging happens for them either, because any attempted merge is bound to fail.
As for checking whether physical memory is damaged, I think that should be handled separately; there is no need to scan every page while initializing the buddy system just for that. In short, doing it with free_page() costs too much. (Though, to be fair, it does mean no extra code has to be written.)
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

2003.04.22 Reading diary - the slab algorithm in the 2.4 kernel is more concise

Today I read the slab-allocator part of "Understanding the Linux Kernel" (first edition, 2.2 kernel). It explains the ideas behind kmem_cache_t and kmem_slab_t very clearly, but when it gets to kmem_bufctl_t it never quite manages to explain how it cooperates with the first two structures. (Of course that is not really the author's fault; in the 2.2 version the algorithm itself was not yet polished.) In the end I had to look at the source code (version 2.4.18) and found that the implementation of the object descriptor has improved a lot:
1. When the slab descriptor is inside the slab, it has moved from the tail of the slab to the front, and the object descriptors follow immediately behind it.
2. The object descriptors could in principle be inside or outside, but Linux now only puts them inside, never outside, so their addresses no longer have to be recorded: the address of an object descriptor can be computed from kmem_cache_t, kmem_slab_t and the object's own address. The object descriptor's role is also simpler: it only chains the free objects together; the descriptor of an allocated object serves no purpose at all.
Once that was clear I also looked at the 2nd edition of Understanding the Linux Kernel, which covers this too, but the analysis there is basically just the code, and the book has only two figures for it: Figure 2.8 shows the 2.2 version, and I cannot tell what Figure 2.9 is trying to express. So my suggestion is that when you study the buddy and slab algorithms you read both the first edition and the second edition of Understanding the Linux Kernel; they explain things very well, and the detail is a great help in understanding the algorithms. Reading the different versions and thinking about how Linux grew up: is that not a pleasure in itself?
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

Yesterday I posted about the slab algorithm. Thinking it over during the night I found a small problem, so I got up early and analysed the source code again. The correction is as follows: in Linux 2.4 the object descriptors can still be outside the slab, but they follow the same rule as the slab descriptor; that is, they always sit immediately after the slab descriptor, so as long as you know the slab descriptor's address you can compute the object descriptors' addresses. In short: if the slab descriptor is outside, they are outside; if the slab descriptor is inside, they are inside. When the slab descriptor is inside the slab area it comes right after the colouring area. When the slab descriptor is outside, then, because the slabs of different caches need different numbers of object descriptors, the size of "slab descriptor plus object-descriptor array" is computed, and that is the real size requested when the slab descriptor is allocated; so slab descriptors kept outside may be allocated from different general caches among cache_sizes.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
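(To make the layout described in the two slab posts above concrete, here is a simplified model; these are not the kernel's exact definitions, and locking and most fields are omitted. The point is that the array of object descriptors sits immediately after the slab descriptor, so its address is computed rather than stored, and a free object's descriptor just holds the index of the next free object.)

typedef unsigned int kmem_bufctl_t;

typedef struct slab_s {
        void          *s_mem;    /* first object in the slab (after colouring) */
        unsigned int   inuse;    /* objects currently allocated from this slab */
        kmem_bufctl_t  free;     /* index of the first free object */
} slab_t;

/* the object-descriptor array follows the slab descriptor */
#define slab_bufctl(slabp)  ((kmem_bufctl_t *)(((slab_t *)(slabp)) + 1))

/* allocating one object from a slab, greatly simplified */
static void *alloc_one_obj(slab_t *slabp, unsigned int objsize)
{
        void *objp = (char *)slabp->s_mem + slabp->free * objsize;

        slabp->inuse++;
        /* a free object's descriptor holds the index of the next free object;
           once the object is allocated its descriptor is simply unused */
        slabp->free = slab_bufctl(slabp)[slabp->free];
        return objp;
}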
Blueflame: Thank you for your concern. Actually I have been staring at Linux all day long, and sometimes it does get tiresome; then I download a movie or sleep, and that is about it.
I divide learning Linux into three phases. The first phase is to grasp, from the whole, the relationships and working mechanisms between the important data structures and the various parts, and at the same time to learn system programming (I call this the overall design of Linux); I can finish it this coming semester. The second phase is to go through Linux from beginning to end, mainly completing the analysis of the source code (I call this the physical design); I estimate it will take two months. The third phase is basically to keep up with the continuing development of the Linux kernel and to be capable of making some modifications of my own. Which were the books you mentioned on the net, and how do you intend to tackle them?
(Not craving the glory of spring, she only heralds its coming; when the mountain flowers are in full bloom, she smiles among them.)
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

2003.04.25 Kernel-reading diary - thank you for the concern

Thank you all for your concern; I will keep reading and keep posting. Actually I just wanted to rest for a day or two. One needs to alternate tension and relaxation, watch a movie, take it easy.
Through these days of reading the kernel's memory management, I feel that the data structures used for memory and for buffers are nothing mysterious. To manage a group of objects more effectively, the tricks are nothing more than the following:
1. Add levels of indexing. Take the slab allocator: it uses three levels. The highest is kmem_cache_s, next is kmem_slab_t, then kmem_bufctl_s, and finally the object itself. (Think about why three levels are used when in principle a single level, a hash or one list, would do; in the end you still reach an object, and the intermediate levels are hidden.) Once I had seen through this, whenever I meet a data structure in the future I will first look at how many levels of indexing it has; from that I can infer its basic organisation, that is, how the different levels are tied to each other.
2. Within one level, the forms are nothing more than: a linked list, an array, a hash table, or a tree (such as an AVL tree). The first two are the basic forms; the latter two exist only to speed things up. Also note the different strengths of arrays and linked lists: an array can be indexed by moving a pointer, while a linked list can only be followed address by address.
3. When a space that gets carved up is managed, there are two basic strategies: either manage the free blocks, that is, string the discrete free blocks together (typical examples: buddy and slab), or string together the blocks that have already been allocated (typical examples: vm_area_struct and the like). As for VFS, which I will get to later, its fundamental purpose is access by name. (To improve efficiency Linux has kept refining these things for years, so do not expect to match other people's efficiency the moment you learn something. For a piece of software, working at all is the first step; improving it is the second. Has the CPU not also been improved for many years, and is it not still being improved? Look at Linux and you see that Linus is not a god either: the algorithms in his earlier versions were also quite poor, and were improved later.) In one sentence, the essence is hierarchical management. If you used a single-level index, would looking up a file not be as slow as an ox? So it is split into several levels, by logical disk, by process, and so on. Once I saw through this, the whole set of data structures came alive. Please add your own views! And thanks again for the concern.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

2003.04.28 Kernel-reading diary - understanding the cache data structures

Today I read Chapter 14 of "Understanding the Linux Kernel", on the disk caches, and basically understood it, which gives me a lot of confidence for reading VFS. My current understanding is as follows (I originally wanted to draw a diagram, but that felt like too much work):
1. There are three caches: the directory (dentry) cache, the buffer cache, and the page cache. Incidentally, I think the things the book calls "buffer" and "buffer_head" deserve better names, for example block object and block descriptor.
2. The overall data structure of the buffer cache:
1) Creating "buffers" and "buffer heads". The two come from different sources. The buffers come from buddy: buddy hands over a page frame, and the page is divided into several equal-sized buffers. The buffer heads come from the slab allocator. Every buffer must have a buffer head describing it. The buffers inside one page frame can be called brothers; they are strung together on a list.
2) After creation they are idle, and the idle ones must be managed as "working capital". How? Because different devices may define different block sizes, the buffers' block sizes are not uniform, so the method is to chain together the buffer descriptors (that is, the buffer heads; I still prefer "buffer descriptor") of the same size: 512, 1024, 2048, 4096, 8192, 16384, 32768, giving seven chains in all, kept together in the array free_list. A block cannot exceed one page, though, so the PC uses only the first four chains. Good, the idle blocks are now managed: when one is needed it is taken from here, and if there are not enough, more are created.
3) Managing the allocated ones: because these buffers may be clean, dirty, or waiting to be written to disk, they are divided into three chains, kept in the array lru_list. Good, the allocated ones are managed too. To find a buffer quickly, however, a hash table is added, hash_table; the retrieval key for a buffer is the device number and the block number. As for the individual functions, once this data-structure framework is in your head, you will understand them.
3. The page cache's data structure is comparatively simple, so I will not go into it, except to note that it uses the inode and the file offset as its hash key. Note also that a buffer page is in the end divided into several buffers, and buffer descriptors are assigned to these block buffers; the actual disk operations are completed through the buffers' read and write-back functions (because the disk only deals in blocks, not in pages). There are several good figures in the book that really help in understanding these structures.
Corrections welcome.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
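(To make the lookup side of item 2 above concrete, here is a simplified model; the struct fields, names and hash function are illustrative, not the kernel's exact code. A buffer is found through a hash table keyed by device number and block number, and a miss means the caller must take a buffer from free_list or create one.)

#define NR_HASH 1024

struct buffer_head_s {
        unsigned short        b_dev;      /* device number */
        unsigned long         b_blocknr;  /* block number on that device */
        unsigned long         b_size;     /* block size in bytes */
        struct buffer_head_s *b_next;     /* next entry in the same hash chain */
};

static struct buffer_head_s *hash_table[NR_HASH];

static unsigned int bh_hash(unsigned short dev, unsigned long block)
{
        return (dev ^ block) % NR_HASH;   /* illustrative hash, not the kernel's */
}

struct buffer_head_s *find_buffer(unsigned short dev, unsigned long block,
                                  unsigned long size)
{
        struct buffer_head_s *bh;

        for (bh = hash_table[bh_hash(dev, block)]; bh; bh = bh->b_next)
                if (bh->b_dev == dev && bh->b_blocknr == block && bh->b_size == size)
                        return bh;        /* hit: the block is already cached */
        return NULL;                      /* miss: take a buffer from free_list or create one */
}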
I have the same feeling. Sometimes I have not done anything and yet the hard disk is grinding away; it must be swapping, even though I did not ask it to do anything. I have been feeling a little tired, and I have not been exercising, so there are plenty of small complaints. So I want to change my approach: I want to study system and module programming and observe the behaviour of the Linux kernel from the outside, and at the same time digest the parts of the kernel I have already studied before pushing on. This current part is not going well; I have nearly forgotten the earlier material on buffer heads, so it takes real effort. Ah, reading the kernel really is hard. It is simply a test of one's willpower.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

2003.04.30 Kernel-reading diary - the structure of the FAT file system

The data structures of the FAT file system deserve to be written up in a little more detail; here they are for reference.
The overall layout is: IPL (boot area), FAT table 1, FAT table 2, root directory, data area. FAT table 2 is identical to FAT table 1 and exists mainly as a backup. The data area is allocated with the cluster as its basic unit. The IPL boots the system, but it also holds data similar to a superblock, so the superblock information should be taken from here. The FAT table plays two roles: it records which clusters are free, and it acts as the chain pointers of a file's data. The FAT is really an array; each entry (12 bits, i.e. 1.5 bytes, in FAT12, and 2 bytes in FAT16) records the usage of the corresponding cluster. A value of 0 means the cluster is free; a non-zero value is the cluster number of the file's next cluster (with a few special values, e.g. in FAT16 0xFFFF marks the end of the file, meaning there is no next cluster). The cluster is the basic allocation unit, like the block in ext2. From this structure you can see that FAT is much simpler than ext2: a file's blocks are linked through the FAT, rather than, as in ext2, recorded in the inode or inside the file itself (for the blocks reached through the double and triple indirect indexes).
The specific structures:
1. Structure of the IPL (Initial Program Loader), one sector:
offset (hex)  length (bytes)  content
=====================================
00            3               short jump instruction
03            8               OEM manufacturer name and version
0B            2               bytes per sector
0D            1               sectors per cluster
0E            2               number of reserved sectors
10            1               number of FATs
11            2               number of root directory entries
13            2               total number of logical sectors
15            1               media descriptor
16            2               sectors per FAT
18            2               sectors per track
1A            2               number of heads
1C            2               number of hidden sectors
-------------------------------------
1E                            IPL boot program
-------------------------------------
                              4 partition table entries
2. Values in the FAT table (FAT12 and FAT16):
FAT12 value   FAT16 value    meaning
=====================================
000           0000           free cluster
001           0001           reserved
FF7           FFF7           bad cluster (it may contain bad sectors)
FF8 - FFF     FFF8 - FFFF    end-of-file mark
002 - FF6     0002 - FFF6    number of the next cluster
=====================================
3. Structure of a directory entry (32 bytes each):
offset (hex)  length (bytes)  content
=====================================
00-07         8               file name
08-0A         3               extension
0B            1               file attributes
0C-15         10              reserved area
16-17         2               time of last modification
18-19         2               date of last modification
1A-1B         2               starting cluster number
1C-1F         4               file size
Special meanings of the first byte of the file name:
00  the entry has never been used
E5  the entry has been deleted
2E  if the second byte is also 2E, the entry refers to the parent directory (..); if not, it refers to the directory itself (.)
File attribute bits: 0 read-only, 1 hidden, 2 system, 3 volume label, 4 subdirectory, 5 archive.
4. Calculation of some useful values (note: cluster numbering starts at 2):
sector offset of a cluster from the start of the data area = (cluster number - 2) * sectors per cluster
starting sector of the data area = reserved sectors + (number of FATs) * (sectors per FAT) + root directory sectors
number of root directory sectors = (number of root entries * 32) / (bytes per sector)
maximum cluster number = number of clusters + 1
number of clusters <= 4085: FAT12 is used; >= 4086: FAT16 is used
If there are mistakes, please point them out. I would also welcome anyone who knows the FAT32 or NTFS structures to write them up.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.
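(As a small worked example of the calculations in item 4 above; the boot-sector values are made up but typical, and the variable names are illustrative.)

#include <stdio.h>

int main(void)
{
        /* example boot-sector values for a typical FAT16 volume (illustrative) */
        unsigned bytes_per_sector    = 512;
        unsigned sectors_per_cluster = 4;
        unsigned reserved_sectors    = 1;
        unsigned num_fats            = 2;
        unsigned sectors_per_fat     = 200;
        unsigned root_entries        = 512;

        unsigned root_dir_sectors = (root_entries * 32) / bytes_per_sector;
        unsigned data_start = reserved_sectors + num_fats * sectors_per_fat
                              + root_dir_sectors;

        unsigned cluster = 10;   /* cluster numbering starts at 2 */
        unsigned first_sector_of_cluster =
                data_start + (cluster - 2) * sectors_per_cluster;

        printf("root directory sectors : %u\n", root_dir_sectors);
        printf("data area starts at    : sector %u\n", data_start);
        printf("cluster %u starts at   : sector %u\n", cluster, first_sector_of_cluster);
        /* FAT type is decided by the cluster count: <= 4085 -> FAT12, >= 4086 -> FAT16 */
        return 0;
}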
About the 2003.04.28 post on the cache data structures: you say that in the buffer cache a page frame may hold several buffers, and there are also those seven chains, which seems to be about buffering file data. But you also say that the third cache, the page cache, is used for caching files. I have always been fuzzy about these caches. The buffer cache is a cache of buffers; it should not be the one used for buffering file data, should it? The mapping field of such a page should be NULL.
__________________
I still remember the words I said to you that night: a kind of burning in the darkness; I still remember the tears that slipped from your eyes: a kind of blurred illusion in the confusion....

The buffer cache caches buffers, and the page cache caches pages; the file caching you mention is the page cache, because files are read and written in units of pages, so caching a file means caching pages. As for the two English terms, I feel both are usually just understood as "caches of buffer memory", regardless of size; but the "buffer" in buffer cache here is a buffer of disk-block size, which deals directly with disk reads and writes: after a disk block is read in by DMA it first lands in a buffer of the buffer cache. So I feel this buffer cache is easier to understand if you read it as a block cache. A page of a file may correspond to several disk blocks (because one page is larger than one block), and those blocks may not be contiguous on the disk; so when a page of the file is read into memory, the page frame must be divided into several blocks (the blocks carved out of one page are obviously contiguous), and when the several disk blocks are read in they are placed one after another in this page frame, that is, into several consecutive blocks. In this way the mapping from discontiguous disk blocks to contiguous memory blocks is completed, and at the same time the mapping of a page of the file into memory. So the page cache ultimately uses the buffer cache to complete the reading and writing of file data; the hierarchy is: a page of the file on disk (decomposed into several, possibly non-contiguous, disk blocks) -> the disk blocks -> the buffer blocks in memory -> a page of the file in memory (made up of the several consecutive buffers into which one page frame is divided). The address_space structure used in 2.4 is actually nothing new; it just gathers this management data together more tidily, so once you understand version 2.2, reading 2.4 is not hard. My powers of expression are limited and I always feel I have not put it clearly; the first edition of Understanding the Linux Kernel (2.2 kernel) and the second edition (2.4 kernel) explain it quite thoroughly.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

I have read the article Blueflame recommended, but it seems to have problems in several places:
"We know that the outer cylinders of a disk move faster, and on each rotation the head can cover a larger area when reading and writing, which means the outer cylinders give better performance. So when partitioning we should put the partitions that are accessed most often, and that affect system performance most, on the outer part of the disk."
1. Are a hard disk's tracks numbered from the inside out, or from the outside in? I have not checked the references, but a CD is written from the inside out; you can see that on a CD itself. So this is worth checking. (He says the numbering starts from the outside, and I have my doubts.)
2. I think his partitioning advice is useless, because the disk's rotation speed is not uniform: as the head moves from the outside inward, the speed is adjusted continuously so that the linear speed at which data is read stays basically unchanged. So I suspect that his claim that the head reads much faster when it is on the outside is a misunderstanding; please check and confirm.
3. He talks about software RAID across partitions of the same disk. I do not think that improves performance; at most it increases the disk's redundancy and thereby its reliability. The partitions of one disk necessarily lie at different radii (this is a completely different matter from using different hard disks, which can operate in parallel), the same disk can only seek serially, and moving between the different partitions forces extra seeks. So I doubt the claimed performance improvement; please check and confirm.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.

Do not expect the swap algorithm to satisfy you: if it satisfies you, it will dissatisfy someone else. If you want it to satisfy you, you have to tilt the algorithm's balance toward yourself. Reading Understanding the Linux Kernel, I finally understood why:
"As a matter of fact, the hardest job of a developer working on the virtual memory subsystem consists of finding an algorithm that ensures acceptable performances both to desktop machines and to high-level machines like large database servers. Unfortunately, finding a good page frame reclaiming algorithm is a rather empirical job, with very little support from theory. The situation is somewhat similar to evaluating the parameters that achieve good system performance, without asking too many questions about why it works well. Often, it's just a matter of 'let's try this approach and see what happens...' An unpleasant side effect of this empirical approach is that the code changes quickly, even in the even-numbered versions of Linux, which are supposed to be stable."
So read this part of the algorithm in the spirit of "let's try this approach and see what happens...", to understand the author's intent; and if it does not suit you, change it to fit your own appetite. For example, if you are continuously writing files, then obviously not caching them at all and writing them straight back to the hard disk is the best algorithm for you.
__________________
The sword's edge comes from grinding; the plum blossom's fragrance comes from bitter cold.