Introduction
Speaking of the task schedule, I think that all of us are very familiar, any operating system tutorial, all described very detailed, but most of them are standing in theoretical height, but specifically to a real machine, Among the actual operating systems, they specifically use the code to implement, the description is relatively small. This report will stand in a way to write the actual operating system, go to study how to achieve task schedules on the Intel 386 CPU structure.
The task schedule of this experiment uses the simplest time slice rotation algorithm, as it is not to study what algorithm scheduling is more effective, but how to deal with it. I think, after I know how the CPU hardware supports the task scheduling, you can use any of your favorite algorithms to complete such scheduling.
This experiment will also be based on the PYOS system, you can find all the source code and information on http://purec.binghua.com (pure C forum).
Task Scheduling Overview
Now the CPU has strong computing power, the operation is very fast, but the external equipment is far from the speed gap, for example, a process needs to read a file from the disk, then this process needs to send one by CPU After reading the disk command, after the disk controller gets this command, the disk controller needs to start the disk drive, find the head, track, sector, and then send a read command to read the data to the buffer of the disk controller. Finally, inform the CPU to the buffer to remove the data, we can call the time of this disk drive as data preparation time, and during this paragraph data preparation time, the current process is not allowed to wait for data, and then The CPU has nothing to do so, so that the efficiency of the CPU is extremely low. So, we use a better strategy, first call up this process from the CPU, and transfer a new process into the CPU. When the original process is waiting for the data, the CPU can start executing this new process. After the original process data is ready, the CPU will then turn out the process that is now executed, re-transferring the original process, so that the CPU does not have an empty waiting, the utilization of the CPU will also be greatly improved.
Thus, now the CPU can start multiple processes at the same time, but only one process is being executed by the CPU, while the remaining processes are waiting for the CPU to re-execute because of such or such reasons. The way is to assign a certain amount of time slice given by the operating system. This time film indicates how long the process can be used once. When the process is used, the operating system will call it CPU, simultaneously transfer another process into the CPU, so that the time slice of each process is small, usually dozens of milliseconds, so a second can make a lot of processes to perform one or two times, This fast speed will make people feel that the handover is performed. It is that people think that it is running at the same time. In fact, it is a concept of macroscopic and non-micro.
After toning a process from the CPU, one process should be transferred to the CPU, which is a scheduling algorithm, usually the operating system selects a process that will be transferred according to such an algorithm, which is the operating system Things, and the CPU does not know how the operating system chooses this process, it only knows that the operating system command is now transferred into a new process.
To transfer a new process, you need to call the original process. Because I can return to the original process, I must save the current CPU state, which is very like the protection in interrupt processing. On-site, then the CPU needs to protect those information? Where is this information? At this time, the TSS (task status structure) appeared. (Note: The general book is called "task status segment", but I think this call will fill the Southam in the code segment, data segment. Therefore, here I call it "task status structure" ", In fact it is essentially a space for storing task-related information.) TSS (task status structure) structure
Let's take a look at the structure of TSS:
The above figure, the most basic structure of TTS, behind it, the operating system can additionally add several bytes to store some additional data, but the CPU only uses the most basic total 104-byte space. From the above structure we can see that there is almost all the information you need to use by the process run, let's take a detailed study of this structure.
In the above figure, the above TSS structure has been divided into three parts, and we can do not have to pay attention to the "unused" part, and there are two parties: "CPU Auto Update Zone" and "CPU Read-only area ", the so-called" automatic update "means that when the CPU is switched, the current task is automatically stored in the corresponding position of the TSS, so that the CPU saves the current task. "Read-only area" means that the CPU reads related information from the task switch, but when the task is switched out, it will not save them, so the "read-only area" information is created by the operating system. When you specify, the CPU just reads without modifying them.
From the above figure, we know where the CPU saves the relevant information about the current task, but this TSS is too big! It cannot be placed in the CPU, but can only be placed in memory, so we also need a pointer to only TSS in memory, so that the CPU can find the corresponding TSS through this pointer, such a pointer has two kinds, We are called "TSS Descriptors" and "Task Gate" respectively.
The structure and use of "TSS Descriptor" and "Task Gate"
Let's first take a look at the "TSS Descriptor" structure:
The above figure is the "TSS Descriptor" structure, which we can see from the figure, which gives the memory location of TSS and the size of TSS. It should be noted here that we can know that a TSS basic structure cannot be less than 104 bytes from the front TSS basic structure. Therefore, the size limit of the TSS here is 103 (the size of the minimum structure of the TSS 104 - 1) . In addition, it is also particularly noticed that "B" bit in the figure, which is used to mark a task is a busy task. If the B bit is cleared, then we say this task is a "available task". If the B bit is set 1, we say that this task is a "busy task", the so-called "busy task" means it is running, Or it is a task that is attached to the task chain. It understands the "B" bit is very important in the actual programming, but it is not intended to be described in detail, and the detailed description is left to the following text.
From the above figure, we can see that a "TSS descriptor" refers to a TSS structure, which is completely sufficient, but used for Intel allows you to switch when interrupted, so we You can use the interrupt handler as a dedicated task, however, it is only a door descriptor in the interrupt descriptor table, and the top "TSS descriptor" is not a door descriptor, so it cannot be placed In the interrupt descriptor table, Intel has also defined a "task door", which actually points to a "TSS descriptor", but because it is a door descriptor, it can be placed in an interrupt description In the table, the CPU is queried by the interrupt number query interrupt descriptor table when an interrupt is interrupted, and if it is found to find the corresponding "TSS descriptor" by finding it. The corresponding "TSS structure" is found by the corresponding "TSS descriptor". In fact, I always feel that the "task door" is a bit more, but Intel has been doing this, we have to take it. Below, let's take a look at the structure of "Task Gate": The above figure is the structure of "task door", which is very small in which intel is really wasteful! The P-bit is the same as the corresponding position in the previous "TSS descriptor" in front "TSS descriptor", which is not multi-mentioned, and the remaining "TSS selectors" will be said.
From the previous TSS descriptor we know, a "TSS Descriptor" refers to a TSS structure, through which we can know the location of a TSS structure in memory. So how do we get a "TSS descriptor"? Where is it placed?
In the operating system, such TSS descriptors are accessed by the CPU, interrupt service, other processes, so it can only be placed in the Global Descriptor Table ("Have" Transportation Bureau Descriptor Table "In" operating system boot Inquiry "in a detailed description). So we need to use an index to point out the "TSS Descriptor" in the global descriptor table, so we can find the corresponding "TSS descriptor", this index is called "TSS selection", as the names It is used to select "TSS Descriptor" in the Global Description Table.
Below, we look at how the CPU is handover, this place is more interesting is that "Task Gate" and "TSS Descriptor" are placed in the Global Descriptor Table, and they need one Indexes indicate their location in the table, and this index is a selector, called "Task Gate Select" and "TSS Select", respectively.
CPU task switching behavior overview
The above figure is a schematic diagram of a task switching, let's take a detailed explanation.
As we can see from the figure, the CPU has a task switching in the three cases:
1. Use the JMP, CALL instructions, and the operator of the instruction, that is, the target address points to a "TSS descriptor" or a "task door descriptor". (This task door descriptor actually points to a "TSS Descriptor").
2. Generate an interrupt, and the interrupt vector points to a "task door descriptor" in the interrupt vector table.
3. Using the IRET instruction and the NT bit in the EFLAGS register is set 1 when executing this instruction.
In these three cases, the task switching CPU has different actions. Let us take a look in detail.
1. If this switch is caused by JMP instructions, the CPU will first perform privilege check, and check if the B bit (busy bit) of the target task is 0, then the CPU will set the B position of the target task to 1, put the current task The B bit clears, then the CPU presses the status information of the current task into the corresponding TSS structure, and removes the corresponding information from the TSS structure of the target task, so that a task switch is completed. 2. If this switch is caused by the CALL instruction or interrupt, then the CPU will also check if the B bit of the target task is 0 after the privilege check, and then the CPU sets NT in the EFLAGS of the target task 1. And put the "TSS Descriptor" of the current task into the LINK field of the target task, then the CPU presses the status information of the current task into the corresponding TSS structure, and remove the corresponding information from the TSS structure of the target task, complete the task Switch.
3. If this switch is caused by the IRET instruction (note, the NT bit in the eflags of the current task is set 1), then the CPU checks if the B bit of the target task is 1 after the privilege detection is performed. (Note, here is the required 1, not previous 0), then clear the B bit of the current task, and clear the NT bit in the eflags of the current task, then, the CPU is pressed into the current task information In the corresponding TSS structure, the corresponding information is removed from the TSS structure of the target task, and the task switch is completed.
From the CPU behavior characteristics of the CPU, we are not difficult to find that JMP instructions are just a very simple jump instruction. From a task to another task, the CPU does not do more operations, while in the CALL instruction In the task switching of the interrupt, the CPU will put the "TSS Descriptor" of the current task into the "TSS Descriptor" of the target task, and set the corresponding status bit, so that it will be the same as a linked list. The task is linked to the target task, so when using the IRET instruction, you can return from the target task to the original task. This feature is especially useful in interrupt processing, because you can use the IRET instruction directly to return to the original task from the interrupt service task to continue running.
So much, there are so many tasks, so many "TSS descriptors" exist at the same time, how does the CPU know which "TSS Descriptor" referred to is the current task currently being executed? In fact, there is a TR (task register) inside the CPU, which is stored in the "TSS descriptor" of the task currently executing, and the CPU will automatically carry the "TSS Descriptor" of the new task when the task is handover. Where it is.
Ok, the basic knowledge used in this experiment has been introduced. Here we will see how this is achieved in the actual operating system.
Go! ~~
PYOS This experimental related issues overview
Through the previous experiments, you may be familiar with PYOS, but as a system under development, constantly modifications and perfection are essential, in this experiment, Pyos has made it in structure. Amazing adjustment. The PYOS used in this experiment report is 2004_05_31_10_00 version, you can find all of its source code on http://purec.binghua.com (pure C forum).
Since PYOS currently does not complete disk-drive and file management, the two processes used in this experiment, the A process and the B process are directly written to the specified location in the image file, and after the system is started by Setup The program directly reads the memory specified position. Let's take a look at the code of these two processes:
#include "video" video.h "
EXTERN "C" void user_main ()
{
Static I = 0;
For (;;) {
IF ( i == 0xfffffff) {// delay
Class_pyos_video :: Print ("_A_");
i = 0;
}
}
}
The above code is the code of the A process, the code of the B process is almost exactly exactly, but the output is not "_a_" but "-b-".
First we compile them separately by using pyos_gcc:
OUT / PYOS_GCC.EXE SOURCE / A_PROCESS.CPP 0x80000 OUT / A_PROCESS.BIN
OUT / PYOS_GCC.EXE SOURCE / B_PROCESS.CPP 0x90000 OUT / B_PROCESS.BIN
The 0x80000, means that the A process will be placed in memory, the same 0x90000 is the position where the B process is placed in memory. When startup, they will be read by the setup.asm program to the above-specified memory. In the address. There is a detailed description of this compile content in the source code package in the source code package, this is not much to say, interested friends can look at the relevant source code.
Let's take a look at the kernel code directly below:
/ * Kernel main function * /
EXTERN "C" void pyos_main ()
{
/* system initialization */
Class_pyos_system :: init ();
/ * Clean screen, print welcome information * /
Class_pyos_video :: clear ();
Class_pyos_video :: Print ("Pyos Task Switch Experiment / N");
/ * Install the clock interrupt * /
Class_pyos_system :: installcputimeInterrupt ();
/ * License Clock Interrupt * /
Class_pyos_system :: enablecputimeInterrupt ();
/ * Open interrupt * /
Class_pyos_interrupt :: openinterrupt ();
For (;;);
}
The kernel code is very simple, and the comments are also very detailed. Here is the focus. When the kernel runs, the clock interrupt will open, in the clock interrupt, the system will count, see how many interrupts have occurred, and each clock is interrupted, and it is a most basic time film in PYOS. Subsequently, the interrupt service program compares the time tablets owned by the currently running process, see if the process's time slice uses light, if used, the clock interrupt service program is task to switch. Below, let's take a look at this code:
/ * Clock interrupt handler * /
Void class_pyos_system :: handlecputimeInterrupt ()
{
// Clock interruption, ready for task switching
Static I = 0;
Unsigned int processno = class_pyos_process :: currentprocessnumber;
IF (ProcessNo == 0) {
// If it is now a kernel process, switch to the next process and ensure that no longer switch to the kernel process processno = processno% 2 1;
}
Else {
/ / If it is not a kernel process, time film 1
i;
IF (i == class_pyos_process :: processqueue [processno] .cputime) {
// If a process of CPU is time, and switch to the next process
ProcessNo = ProcessNo% 2 1;
// Recons calculate the time film
i = 0;
}
}
Class_pyos_process :: currentprocessnumber = processno;
Unsigned int base;
// First-line decline processing, obtain the TSS descriptor of previous tasks
Base = m_gdt.tss [3] .BASE_0_15;
Base | = (m_gdt.tss [3] .BASE_16_23 << 16);
Base | = (m_gdt.tss [3] .BASE_24_31 << 24);
Struct_pyos_tss * ptss = (struct_pyos_tss *) base;
Base = PTSS-> LINK;
// Tagged the previous task as a non-busy task
Base = (unsigned int) & m_gdt;
Struct_pyos_tssitem * p = (struct_pyos_tssitem *) base;
P-> b = 0;
// Will be executed the task descriptor mark as a busy task
Class_pyos_system :: m_gdt.tss [processno] .b = 1;
// Put the current task (the clock interrupt handler's LINK is changed to the task selection sub)
PTSS-> LINK = Class_pyos_Process :: processqueue [processno] .tssselector;
Return;
}
Here, the clock interrupt first performs a resolution process, as we know in the previous description, in the task switch caused by the interrupt, the TSS descriptor of the original task is placed in the LINK field in the TSS of the interrupt service program, so When returning in the middle, you can return to the original task. Now because you need to switch to another task in the interrupt service program, you cannot return directly to the original task of the interrupt (because if it is still returned Continue to run in the original task of the interrupted, you can't complete the task schedule). Therefore, it is necessary to first release the relationship between the task and the interrupt service program that is interrupted, then established a link relationship with the task prepared to schedule, put the TSS descriptor of the prepared task into the TSS of the interrupt service program, so When executing the IRET instruction is returned, it will return to the task of preparation schedule, which completes the task scheduling.
In the above code, many structures are used, they are all defined in the file of Process.h, which defines a class to handle tasks:
// Process Management Class
Class class_pyos_process {
PUBLIC:
Static struct_pyos_process processqueue [4]; // process queue
Static struct_pyos_tss tss [4]; // task status segment
Static void init (); // process management initializes static unsigned int currentprocessNumber; // The process number currently being executed
}
In the PYOS in this experiment, four processes are defined.
No. 0 process: kernel process
No. 1 process: A process
No. 2 process: B process
3 process: clock interrupt service process (also task scheduling process)
Struct_pyos_process This structure defines the status information you need to use, that is, the process structure that is often mentioned in the operating system tutorial, which is used to identify a process:
// Define the process structure
Struct struct_pyos_process {
UNSIGNED INT CPUTIME; // The CPU Time Size assigned by the process
UNSIGNED INT TSSSELECTOR; // TSS segment selector
}
Since only the above information of the process is used in this experiment, this structure is very simple.
Below, let's take a look at the process initialization function, its main job is to initialize the process structure and the corresponding TSS structure and TSS descriptor for the process of running, in order to call at any time:
// Process initialization function
Void class_pyos_process :: init ()
{
Struct_pyos_tss * tss;
Struct_pyos_tssitem * TSSITEM;
UNSIGNED INT TSSADDR;
/ * Set the current process number is 0 * /
CurrentProcessNumber = 0;
/ * Task segment used by the initialization No. 0 process * /
TSS = & class_pyos_process :: TSS [0];
TSS-> CR3 = 0;
TSS-> LDT = 0;
/ * Task segment descriptor used by the initialization No. 0 process * /
TSSADDR = (unsigned int) & class_pyos_process :: TSS [0];
TSSITEM = & class_pyos_system :: m_gdt.tss [0];
TSSITEM-> b = 0;
TSSITEM-> BASE_0_15 = TSSADDR;
TSSITEM-> BASE_16_23 = TSSADDR >> 16;
TSSITEM-> BASE_24_31 = TSSADDR >> 24;
TSSITEM-> DPL = 0;
TSSITEM-> g = 1;
TSSITEM-> LIMITLENGTH_0_15 = 103;
TSSITEM-> LIMITLENGTLENGTH_16_19 = 0;
TSSITEM-> P = 1;
TSSITEM-> Saved_00 = 0;
TSSITEM-> Saved_010 = 2;
TSSITEM-> Saved_1 = 1;
/ * Load No. 0 process is also the selection of TSS of the current process to TR (task register) * /
__ASM __ ("MOVW $ 0x28,% AX");
__ASM __ ("LTR% AX");
/ / Set the process queue parameters of the 1 process
Class_pyos_process :: processqueue [1] .cputime = 200; // Assign 200 time films
Class_pyos_process :: processqueue [1] .tssSselector = 0x30; / * Initialization Task Section used by the 1st Process (A process) * /
TSS = & class_pyos_process :: TSS [1];
TSS-> CR3 = 0;
TSS-> CS = 0x8;
TSS-> DS = 0x10;
TSS-> EAX = 0;
TSS-> EBP = 0;
TSS-> EBX = 0;
TSS-> ECX = 0;
TSS-> EDI = 0;
TSS-> EDX = 0;
TSS-> EIP = 0x80000;
TSS-> ES = 0x10;
TSS-> ESI = 0;
TSS-> ESP = 0x8fff;
TSS-> FS = 0x10;
TSS-> GS = 0x10;
TSS-> LDT = 0;
TSS-> SS = 0x10;
TSS-> EFLAGS = 0x202; // Specify EFLAGS, where broken flag is set to open interrupt
/ * Task section descriptor used by initialization 1 process * /
TSSADDR = (unsigned int) & class_pyos_process :: TSS [1];
TSSITEM = & class_pyos_system :: m_gdt.tss [1];
TSSITEM-> b = 0;
TSSITEM-> BASE_0_15 = TSSADDR;
TSSITEM-> BASE_16_23 = TSSADDR >> 16;
TSSITEM-> BASE_24_31 = TSSADDR >> 24;
TSSITEM-> DPL = 0;
TSSITEM-> g = 1;
TSSITEM-> LIMITLENGTH_0_15 = 103;
TSSITEM-> LIMITLENGTLENGTH_16_19 = 0;
TSSITEM-> P = 1;
TSSITEM-> Saved_00 = 0;
TSSITEM-> Saved_010 = 2;
TSSITEM-> Saved_1 = 1;
// Set the process queue parameters of the 2 process
Class_pyos_process :: processqueue [2] .cputime = 400; // Assign No. 2 process 400 time films
Class_pyos_process :: processqueue [2] .tssselector = 0x38;
/ * Task segment used by initialization 2 process (B process) * /
TSS = & class_pyos_process :: TSS [2];
TSS-> CR3 = 0;
TSS-> CS = 0x8;
TSS-> DS = 0x10;
TSS-> EAX = 0;
TSS-> EBP = 0;
TSS-> EBX = 0;
TSS-> ECX = 0;
TSS-> EDI = 0;
TSS-> EDX = 0;
TSS-> EIP = 0x90000;
TSS-> ES = 0x10;
TSS-> ESI = 0;
TSS-> ESP = 0x9fff; TSS-> FS = 0x10;
TSS-> GS = 0x10;
TSS-> LDT = 0;
TSS-> SS = 0x10;
TSS-> EFLAGS = 0x202;
/ * Task section descriptors used in the initialization of the 2nd process * /
TSSADDR = (unsigned int) & class_pyos_process :: TSS [2];
TSSITEM = & class_pyos_system :: m_gdt.tss [2];
TSSITEM-> b = 0;
TSSITEM-> BASE_0_15 = TSSADDR;
TSSITEM-> BASE_16_23 = TSSADDR >> 16;
TSSITEM-> BASE_24_31 = TSSADDR >> 24;
TSSITEM-> DPL = 0;
TSSITEM-> g = 1;
TSSITEM-> LIMITLENGTH_0_15 = 103;
TSSITEM-> LIMITLENGTLENGTH_16_19 = 0;
TSSITEM-> P = 1;
TSSITEM-> Saved_00 = 0;
TSSITEM-> Saved_010 = 2;
TSSITEM-> Saved_1 = 1;
/ * Task segment used by initialization 3 process (clock interruption) * /
TSS = & class_pyos_process :: TSS [3];
TSS-> CR3 = 0;
TSS-> CS = 0x8;
TSS-> DS = 0x10;
TSS-> EAX = 0;
TSS-> EBP = 0;
TSS-> EBX = 0;
TSS-> ECX = 0;
TSS-> EDI = 0;
TSS-> EDX = 0;
TSS-> EIP = (unsigned int) pYOS_ASM_INTERRUPT_HANDLE_FOR_CPU_TIME;
TSS-> ES = 0x10;
TSS-> ESI = 0;
TSS-> ESP = 0x7fff; // The stack area used to // clock disrupts (the clock is interrupted separately, using its own stack area, so as not to appear stack errors)
TSS-> FS = 0x10;
TSS-> GS = 0x10;
TSS-> LDT = 0;
TSS-> SS = 0x10;
TSS-> EFLAGS = 0x202;
/ * Task section descriptors used by the initialization 3 process * /
TSSADDR = (unsigned int) & class_pyos_process :: TSS [3];
TSSITEM = & class_pyos_system :: m_gdt.tss [3];
TSSITEM-> b = 0;
TSSITEM-> BASE_0_15 = TSSADDR;
TSSITEM-> BASE_16_23 = TSSADDR >> 16;
TSSITEM-> BASE_24_31 = TSSADDR >> 24;
TSSITEM-> DPL = 0;
TSSITEM-> g = 1;
TSSITEM-> LIMITLENGTH_0_15 = 103; TSSITEM-> LimitLength_16_19 = 0;
TSSITEM-> P = 1;
TSSITEM-> Saved_00 = 0;
TSSITEM-> Saved_010 = 2;
TSSITEM-> Saved_1 = 1;
}
The code is very simple, and there is a very detailed note, there is not much to say, some details can be found in the actual code.
Finally, let's take a look at what the TSS descriptor and task doors are stored in what order in the global descriptor table, which is the key to the descriptor similar to 0x28, 0x30, and this code. It is in system.h.
/ * GDT table * /
Struct struct_pyos_gdt {
Struct_pyos_gdtitem gdtnull; // empty segment, Intel reserved
Struct_pyos_gdtitem gdtsystemcode; // System code segment
Struct_pyos_gdtitem gdtsystemDate; // System data segment
/ * System call door * /
Struct_pyos_invokegate invokegate [2];
// Task Section (TSS) Descriptor
Struct_pyos_tssitem TSS [4]; // 0 (0x28) No. 1 (0x30) number to A process for the No. 0 process
// 2 (0x38) to the B process 3 (0x40) to the clock interrupt
// Task Gate
Struct_pyos_tssgate tssgate; // 0x48
}
The whole PYOS is running just like the figure below in this experiment:
Experimental supplementary instructions
The code and procedures of this experiment are relatively simple, so you can write more complex scheduling algorithms on this basis, or modify the size of the time slice of different processes in the program to see if it is executed Speed effect.
When it is description, this experiment is intended to explore a method of actually handling task scheduling in an operating system, "Yes maybe, rather than it should be". As can be seen from the code, PYOS is a fairly snail system because it performs multiple task switches when dispatting a task, and each switching CPU will save 104 bytes of data, this operation is quite time consuming. . Therefore, if it is an actual system that excellent performance, such an operation should be avoided, which can be achieved using a soft switch rather than through the CPU hardware. Don't always think that the hardware is so fast, because the software can be adjusted as needed, you can do very small, you can only save the most important data, so that the actual switching will be much faster. However, PYOS does not aim to become a high-efficiency system, it is just an experimental system, with it to verify the learned, experiment.
Reference:
"Pentium Family User's Manual Volume3: Architecture and Programming Manual" .intel. 1994
"80x86 assembly language programming tutorial". Yang Jiwen. Tsinghua University Press, June 1998
Original: http://purec.binghua.com/Article/showArticle.asp? ArticleId = 217