Linux 0.11 process scheduling and its hardware basis: GDT and IDT

xiaoxiao2021-03-06 35

Written with reference to Zhao Bo's complete annotation of the kernel.

Personally, I think that to understand how processes are scheduled you must first know some x86 basics. I hope the material below proves useful. The write-up is not finished yet.

I will continue when I have time to study further.

Basic knowledge

System address registers:

The system address registers manage the system tables used in protected mode. The system tables for memory management are the global descriptor table (GDT), the local descriptor table (LDT), and the interrupt descriptor table (IDT). The matching system registers (unless stated otherwise, "system registers" in this chapter means only those used for memory management) are the global descriptor table register (GDTR), the local descriptor table register (LDTR), and the interrupt descriptor table register (IDTR).

GDTR: a 48-bit register that stores the 32-bit linear base address of the GDT and the 16-bit limit of the GDT. It is loaded by the LGDT instruction.

IDTR: also a 48-bit register. It stores the 32-bit linear base address of the IDT and the 16-bit limit of the IDT. It is loaded by the LIDT instruction.

LDTR: this register is special. It is divided into a user-visible part (a 16-bit selector) and an invisible part (which caches the base address and limit of the LDT), and it is loaded by the privileged instruction LLDT.

The setup and format of the IDT are introduced in the process management part.
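
To make the 48-bit layout of GDTR and IDTR concrete, here is a minimal sketch in C. The type and function names are my own, purely for illustration; only the layout (16-bit limit followed by 32-bit base) comes from the description above.

```c
#include <stdint.h>

/* Hypothetical names. The 6-byte image has the limit in bytes 0-1
 * and the base in bytes 2-5, both little-endian. */
typedef struct {
    uint16_t limit;  /* table size in bytes, minus 1 */
    uint32_t base;   /* 32-bit linear address of the table */
} table_reg_t;

/* Pack the register value into its 6-byte memory image. */
static void pack_table_reg(table_reg_t r, uint8_t out[6])
{
    out[0] = (uint8_t)(r.limit & 0xff);
    out[1] = (uint8_t)(r.limit >> 8);
    out[2] = (uint8_t)(r.base & 0xff);
    out[3] = (uint8_t)((r.base >> 8) & 0xff);
    out[4] = (uint8_t)((r.base >> 16) & 0xff);
    out[5] = (uint8_t)(r.base >> 24);
}

/* Number of 8-byte descriptors that a table with this limit can hold. */
static unsigned table_entries(table_reg_t r)
{
    return ((unsigned)r.limit + 1u) / 8u;
}
```

With a limit of 256*8-1 = 0x7ff, table_entries() gives 256 descriptors.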

Segment management, specific to x86:

After segment translation the virtual address of a program becomes a linear address, and after page translation the linear address becomes a physical address. The physical address is driven onto the address bus to reach the correct data or instruction.

The 386 has six segment registers: CS, DS, ES, SS, FS, and GS. In real address mode they hold 16-bit segment addresses. As we all know, CS:IP forms a 20-bit physical address: CS is shifted left by 4 bits and IP is added, which exactly fills a 20-bit address bus.

In protected mode the role of these registers changes. Under protected-mode segment management every program segment has a segment descriptor (one entry of the GDT or LDT; the format is given later). Whether the GDT or the LDT is used, and which entry, is decided by the segment register (CS, DS, and so on). The meaning of CS:EIP changes accordingly: bit [2] of CS says whether the LDT or the GDT is used, and bits [3-15] give the index into that descriptor table. The base address taken from the selected descriptor is added to EIP to form a 32-bit linear address, which page translation then turns into a 32-bit physical address.

Since the LDT and GDT live in memory, an extra memory access on every reference would be an unacceptable overhead. So whenever a segment register is loaded, the processor copies the selected descriptor into a hidden cache attached to that register. Later references use the cached base address and limit, so the descriptor table does not have to be read from memory each time. The segment translation is shown below:
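
The two address calculations above can be sketched in a few lines of C. The function names are hypothetical; the bit positions are the ones just described (bits 15-3 index, bit 2 table indicator, bits 1-0 requested privilege level).

```c
#include <stdint.h>

/* Real mode: physical address = segment * 16 + offset, kept to 20 bits. */
static uint32_t real_mode_address(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}

/* Protected mode: split a 16-bit selector into its three fields. */
static unsigned sel_index(uint16_t sel) { return sel >> 3; }         /* bits 15-3 */
static unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 1u; }  /* bit 2: 0 = GDT, 1 = LDT */
static unsigned sel_rpl(uint16_t sel)   { return sel & 3u; }         /* bits 1-0 */
```

For example, the selector 0x10 decodes to index 2 in the GDT with RPL 0, and 0x17 decodes to index 2 in the LDT with RPL 3. Both values appear in the system call code later in this article.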

The format of a descriptor in the LDT and GDT (8 bytes in total), from the low byte to the high byte:

Byte 0: segment limit 7-0
Byte 1: segment limit 15-8
Byte 2: base address 7-0
Byte 3: base address 15-8
Byte 4: base address 23-16
Byte 5: P, DPL, S, TYPE
Byte 6: G, D/B, 0, AVL, segment limit 19-16
Byte 7: base address 31-24

S = 1 marks a non-system segment: when such a descriptor is selected, the base address and segment limit in the descriptor are used directly. Non-system segments are divided into data segment descriptors and code segment descriptors.

S = 0 marks a system segment, such as an LDT descriptor or a TSS descriptor, which are used during task switching. The details are in the process management chapter.

When the privileged instruction lldt %dx loads LDTR, the selector in DX does not index the LDT but the GDT. While executing this instruction the processor loads DX into the visible part of LDTR, then uses DX to fetch the LDT descriptor from the GDT and loads the base address and limit found there into the invisible part of LDTR. Later, when a segment selector refers to the LDT, this cached base address is used to locate the descriptor inside the LDT.
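
As an illustration of the 8-byte descriptor format, the following C sketch pulls the scattered base and limit fields back together. The names are made up for this example, and the descriptor is held as two 32-bit halves (bytes 0-3 and bytes 4-7), which is an assumption of the sketch rather than a quote from the kernel.

```c
#include <stdint.h>

/* lo holds bytes 0-3 of the descriptor, hi holds bytes 4-7. */
typedef struct { uint32_t lo, hi; } desc_t;

static uint32_t desc_base(desc_t d)
{
    return (d.lo >> 16) | ((d.hi & 0xffu) << 16) | (d.hi & 0xff000000u);
}

/* Raw 20-bit limit field: bits 15-0 from lo, bits 19-16 from hi. */
static uint32_t desc_limit(desc_t d)
{
    return (d.lo & 0xffffu) | (d.hi & 0x000f0000u);
}

static unsigned desc_present(desc_t d) { return (d.hi >> 15) & 1u; }  /* P   */
static unsigned desc_dpl(desc_t d)     { return (d.hi >> 13) & 3u; }  /* DPL */
static unsigned desc_s(desc_t d)       { return (d.hi >> 12) & 1u; }  /* S   */
static unsigned desc_g(desc_t d)       { return (d.hi >> 23) & 1u; }  /* G   */

/* Segment size in bytes: the limit counts bytes (G = 0) or 4KB units (G = 1). */
static uint64_t desc_size(desc_t d)
{
    uint64_t n = (uint64_t)desc_limit(d) + 1u;
    return desc_g(d) ? n * 4096u : n;
}
```

Taking the sample value 0x00c09a0000000fff (lo = 0x00000fff, hi = 0x00c09a00), the sketch yields base 0, S = 1, DPL = 0, G = 1, and a size of 16MB.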

X86 paging mechanism:

The paging mechanism converts a linear address into a physical address. When analyzing the Linux source code you must always be clear about which kind of address you are looking at: a virtual address (not yet segment-translated), a linear address (not yet page-translated), or a physical address (the real address driven onto the address bus). Segment translation in real mode is very simple: taking CS:IP as an example, the translated address is (CS << 4) + IP. In protected mode, again taking CS:EIP as the example, CS is now a segment selector; the base address is fetched from the GDT or LDT according to the content of CS, and adding EIP gives the linear address. From this we know that at least the GDT must be set up before entering protected mode, otherwise segment translation is bound to go wrong. Whether paging is enabled is indicated by CR0.PG. The page translation process is described below.

The Intel x86 processors offer several paging modes, selected by CR4[PSE] and CR4[PAE]:

PAE = 0, PSE = 0: 4KB paging with a 32-bit physical address bus
PAE = 0, PSE = 1: 4KB and 4MB paging with a 32-bit physical address bus
PAE = 1, PSE = 0: 4KB paging with a 36-bit physical address bus
PAE = 1, PSE = 1: 4KB and 2MB paging with a 36-bit physical address bus

I mainly introduce the 4KB and 4MB paging modes with a 32-bit physical address bus.

(1) 4KB paging method

A 32-bit linear address spans 4GB, and Intel uses a two-level page table to map the whole linear space:

[Figure: CR3 points to the page directory; each page directory entry points to a page table of 1024 entries; each page table entry points to a physical page in physical memory]

Under 4KB paging the linear address is laid out as:

page directory index (bits 31-22) | page table index (bits 21-12) | offset address (bits 11-0)
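
The split of a linear address under 4KB paging can be written out as a small C sketch. The helper names are my own. The field widths follow from 1024 entries per table (10 bits each) and a 4KB page (12 bits).

```c
#include <stdint.h>

static unsigned pde_index(uint32_t linear)   { return linear >> 22; }             /* bits 31-22 */
static unsigned pte_index(uint32_t linear)   { return (linear >> 12) & 0x3ffu; }  /* bits 21-12 */
static unsigned page_offset(uint32_t linear) { return linear & 0xfffu; }          /* bits 11-0  */

/* Combine the 20-bit frame address held in a page table entry with the offset. */
static uint32_t phys_4k(uint32_t pte, uint32_t linear)
{
    return (pte & 0xfffff000u) | page_offset(linear);
}
```

For the linear address 0x00403025 this gives directory entry 1, table entry 3, and offset 0x025.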

(2) 4MB paging method

When CR4[PSE] = 1, paging can be either 4KB or 4MB; the individual page directory entry decides: if bit [7] = 1 the entry maps a 4MB page, otherwise it points to a 4KB page table. A page directory has 1024 entries, each covering 4MB of address space, so one page directory is exactly enough for the 32-bit linear address space. There is only one level of table, so translation needs only one memory access, which improves efficiency. Naturally the page base address in the page directory entry shrinks to 10 bits, instead of the 20 bits used with 4KB pages. The address translation is as follows:

Linear address: page directory index (bits 31-22) | offset address (bits 21-0)

[Figure: CR3 locates the page directory; the selected entry supplies the high 10 bits of the 32-bit physical address, and the 22-bit offset of the linear address supplies the rest]
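
A corresponding sketch for 4MB pages, again with hypothetical helper names. Bit 7 of the page directory entry selects the 4MB form, and the physical address is the entry's high 10 bits joined with the low 22 bits of the linear address.

```c
#include <stdint.h>

#define PDE_PS 0x80u  /* bit 7 of a page directory entry: 1 = 4MB page */

static int pde_is_4mb(uint32_t pde) { return (pde & PDE_PS) != 0; }

/* High 10 bits come from the directory entry, low 22 bits from the linear address. */
static uint32_t phys_4m(uint32_t pde, uint32_t linear)
{
    return (pde & 0xffc00000u) | (linear & 0x003fffffu);
}
```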

Linux 0.11 supports only 64 processes. Why 64 and not some other number? The reasoning here is my own guess; I have not confirmed it with Linus. Linux 0.11 uses only the 32-bit physical addressing of the x86 chip, so its linear addresses are also 32 bits wide and the whole address space is 4GB. Linux 0.11 uses the Minix file system, whose files cannot exceed 64MB, so the virtual space occupied by each process is at most 64MB, and 64 processes together need at most 64 x 64MB = 4GB. This is important: the virtual address spaces of all processes together never exceed 4GB, so the whole system needs only one page directory to provide page translation for all processes.
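
The arithmetic behind this guess is easy to check. The sketch below assumes, as described above, that each task slot gets its own 64MB window of the linear address space, one after another; the macro and function names are illustrative only.

```c
#include <stdint.h>

#define NR_TASKS   64
#define TASK_SPACE 0x4000000u  /* 64MB of linear address space per task */

/* Start of the linear address range for task slot nr (0 <= nr < NR_TASKS). */
static uint32_t task_linear_base(unsigned nr)
{
    return nr * TASK_SPACE;
}

/* Total linear space needed by all tasks; 64-bit so that 4GB does not wrap to 0. */
static uint64_t total_task_space(void)
{
    return (uint64_t)NR_TASKS * TASK_SPACE;
}
```

The last slot (63) starts at 0xFC000000 and ends exactly at the 4GB boundary.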

Memory management in Linux 0.11 lives mainly in two files, memory.c and page.s; memory.c is described in detail below. To save physical memory, Linux 0.11 uses copy-on-write. When a child process is created with fork, the child does not copy the physical pages of the parent; it copies only the parent's page directory entries and page tables, and the page table entries are marked read-only. When the child or the parent later tries to modify such a page, an int 14 exception is raised, and only then is a physical page really allocated and copied, after which the TLB is invalidated.

page.s handles the int 14 page fault. There are two causes of this fault: one is that the physical page for the accessed linear address is not in memory; the other is copy-on-write, which raises the fault the first time a process writes to a shared page.
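
The two causes can be told apart from the error code that the CPU pushes for a page fault: bit 0 is clear when the page was not present and set when the access violated page protection. The following C sketch shows the decision only; the enum and function names are my own, and the real handler in page.s is written in assembly.

```c
enum fault_kind { FAULT_NO_PAGE, FAULT_WRITE_PROTECT };

/* Bit 0 (P) of the page fault error code:
 * 0 = page not present, 1 = protection violation on a present page. */
static enum fault_kind classify_page_fault(unsigned long error_code)
{
    return (error_code & 1ul) ? FAULT_WRITE_PROTECT : FAULT_NO_PAGE;
}
```

A write to a read-only shared page therefore takes the write-protect path, which is where the copy in copy-on-write happens.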

Basic knowledge

Interrupt Descriptor Table IDT and Interrupt Register IDTR:

The interrupt descriptor table can contain three kinds of descriptors: task gates, interrupt gates, and trap gates. Since Linux makes no use of task gates, and a trap gate has the same layout as an interrupt gate, only the interrupt gate is introduced here.

When the instruction INT N is executed, the CPU takes the base address of the interrupt descriptor table from IDTR and uses N as the index into the table. From the selected gate it loads the 16-bit segment selector into CS and the 32-bit offset into EIP.
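
To show where the selector and offset sit inside a gate, here is a minimal C sketch that builds an 8-byte gate and reads the two fields back. All names are hypothetical. The layout assumed is the standard one: offset 15-0 and selector in the low half, offset 31-16 plus P, DPL, and type in the high half.

```c
#include <stdint.h>

typedef struct { uint32_t lo, hi; } gate_t;

/* Build a present gate. type 14 = interrupt gate, type 15 = trap gate. */
static gate_t make_gate(uint16_t selector, uint32_t offset, unsigned type, unsigned dpl)
{
    gate_t g;
    g.lo = ((uint32_t)selector << 16) | (offset & 0xffffu);
    g.hi = (offset & 0xffff0000u) | 0x8000u | ((dpl & 3u) << 13) | ((type & 0xfu) << 8);
    return g;
}

/* What the CPU loads into CS and EIP when the gate is taken. */
static uint16_t gate_selector(gate_t g) { return (uint16_t)(g.lo >> 16); }
static uint32_t gate_offset(gate_t g)   { return (g.hi & 0xffff0000u) | (g.lo & 0xffffu); }

/* Address of gate n in an IDT that starts at idt_base: each gate is 8 bytes. */
static uint32_t gate_address(uint32_t idt_base, unsigned n) { return idt_base + 8u * n; }
```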

x86 task switching in detail

A task switch can take four forms:

(1) The current program, task, or process executes an LJMP or CALL instruction that refers to a TSS descriptor in the GDT. (Direct task switch; Linux 0.11 uses LJMP.)

(2) The current program, task, or process executes a JMP or CALL instruction that refers to a task gate descriptor in the GDT or in the current LDT. (Indirect task switch; not used in Linux 0.11.)

(3) An interrupt or exception vector points to a task gate descriptor in the IDT. (Indirect task switch; not used in Linux 0.11.)

(4) The current task executes IRET (or IRETD in 32-bit programs) while the flag EFLAGS.NT is set. (Direct task switch.)

JMP, CALL, IRET, interrupts, and exceptions are all control transfer mechanisms. Whether one of them causes a task switch is decided by a TSS descriptor, by a task gate (when calling or jumping to a task), or by the state of the NT flag (when executing IRET). The first and fourth forms can be represented by the following figure:

On a JMP or CALL, if the CPU finds that the target selector corresponds to a TSS descriptor in the GDT, it does not actually jump to the address given in the JMP or CALL; it performs a task switch instead. The details of the switch are described later.

The second and third forms are represented by the following figure:

These forms do not occur in Linux 0.11 and are not used in later versions either.

Task register TR: a 16-bit register that stores the selector of a TSS descriptor. During a task switch, the context of the old task is saved and the context of the new task is restored according to the selector in TR. In Linux 0.11, TSS descriptors are placed only in the GDT.

System call (mode switch)

A system call is the only way for a user program to access the kernel, as shown below:

User space and kernel space

User space and kernel space are distinguished by the privilege level at which the CPU is running. When the CPL is 0 the process is in kernel space; when the CPL is 3 the process is in user space. In Linux 0.11, process 0 is switched from kernel space to user space by a single macro, move_to_user_mode(). Let us look at what this short macro does.

We can clearly see that when iret executes, the current CPL of the CPU is 0, while the CS value on the stack is 0000 1111b, that is, its RPL is 3. So besides popping EIP, CS, and EFLAGS from the stack, iret also pops ESP and SS. The next four lines set DS and the other data segment registers. This completes the switch from kernel space to user space. We can also see that process 0 uses the same stack in user space that it used in kernel space. When process 1 is created, it copies the kernel stack content of process 0, so the kernel stack of process 0 must be "clean". That is why process 0 does not go on using this stack in kernel space but turns it into its user-space stack, a neat sleight of hand.

System call procedure:

1. When the process executes int 0x80, the CPU reads the interrupt descriptor. The current CPL differs from the privilege level of the handler's code segment, so the CPU switches stacks (the kernel stack pointer is taken from the TSS): it pushes SS and ESP onto the kernel stack of the process, followed by EFLAGS, CS, and EIP. The new values of CS and EIP come from the interrupt descriptor, as described in the basics part of this chapter. At this point the kernel stack holds the first group of entries below.

2. Execution enters kernel space, that is, the system call handler. The handler pushes DS, ES, FS, EDX, ECX, and EBX onto the kernel stack. The stack then holds both groups:

User SS      (pushed by the CPU)
User ESP     (pushed by the CPU)
EFLAGS       (pushed by the CPU)
CS           (pushed by the CPU)
EIP          (pushed by the CPU)
DS           (pushed by the handler)
ES           (pushed by the handler)
FS           (pushed by the handler)
EDX          (pushed by the handler)
ECX          (pushed by the handler)
EBX          (pushed by the handler)
...
task_struct

The handler then sets up the segment registers:

movl $0x10,%edx   # we are already in kernel space, so
mov %dx,%ds       # set the data segment selectors to point to GDT[2]
mov %dx,%es       # CS was already set on entry, so it need not be set here; with code and data segments in place the handler can run
movl $0x17,%edx   # fs points to the data segment descriptor in the LDT
mov %dx,%fs       # this lets the kernel read and write user-space data; from now on, user-space data must be accessed through the FS selector

3. After the work above is done, the instruction call _sys_call_table(,%eax,4) jumps to the corresponding system call function, which then executes.
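
In C terms, call _sys_call_table(,%eax,4) is an indexed call through an array of function pointers. The sketch below uses three stand-in handlers with made-up names and return values; it does not reproduce the real table. The bounds check is added here to keep the sketch safe and is not taken from the text above.

```c
typedef int (*syscall_fn)(void);

/* Hypothetical stand-ins for kernel handlers. */
static int demo_setup(void) { return 100; }
static int demo_exit(void)  { return 101; }
static int demo_fork(void)  { return 102; }

static syscall_fn demo_call_table[] = { demo_setup, demo_exit, demo_fork };

/* Call entry nr of the table; the entry lives at table base + nr * sizeof(pointer). */
static int demo_dispatch(unsigned nr)
{
    unsigned count = sizeof(demo_call_table) / sizeof(demo_call_table[0]);
    if (nr >= count)
        return -1;              /* number beyond the end of the table */
    return demo_call_table[nr]();
}
```

On the 386 a function pointer is 4 bytes, which is where the scale factor 4 in the assembly operand comes from.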

4. System call return

When the system call returns, the IRET instruction is executed. If the privilege level of the return target is the same as the current privilege level, the CPU pops only EIP, CS, and EFLAGS from the stack. If the return target is less privileged than the current level, the CPU also pops ESP and SS, which switches the stack of the process from the kernel stack back to the user stack.
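
The rule iret follows can be stated as a one-line comparison. This is a simplified sketch with a name of my own choosing; it looks only at the privilege comparison and ignores the other checks the processor performs.

```c
#include <stdint.h>

/* Number of 32-bit words iret pops: EIP, CS, EFLAGS for a same-level return,
 * plus ESP and SS when returning to a less privileged level. */
static int iret_words_popped(unsigned cpl, uint16_t saved_cs)
{
    unsigned rpl = saved_cs & 3u;   /* privilege level being returned to */
    return (rpl > cpl) ? 5 : 3;
}
```

With CPL 0 and a saved CS of 0x0f (RPL 3), five words are popped, which matches the move to user mode described earlier.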

Process scheduling

Process scheduling is among the most complex code in an operating system and is often hard to understand. I hardly know where to begin. I hope the earlier sections have given you some foundation. Below I start from the main function and look at what actually happens in process scheduling.

void main(void)
{
    ......
    mem_init(main_memory_start, memory_end);    /* initialize memory */
    trap_init();                                /* initialize the various interrupt gates */
    blk_dev_init();
    chr_dev_init();
    tty_init();
    time_init();
    sched_init();                               /* initialize process 0 */
    buffer_init(buffer_memory_end);
    hd_init();
    floppy_init();
    sti();                                      /* enable interrupts */
    move_to_user_mode();                        /* move process 0 from kernel space to user space */
    if (!fork()) {    /* we count on this going ok; fork here is an inline function */
        init();
    }
    for (;;) pause();
}

To understand process scheduling and management, we start with process 0 and follow it until it forks process 1. Then I discuss how processes are scheduled. By that point you should know how a process is forked, how it is scheduled, and why, when a process forks a child, the return value is the child's process number in the parent but 0 in the child. I hope this wish comes true. After that I also introduce interruptible_sleep_on(), sleep_on(), and wake_up(), which hide an implicit linked list.

sched_init():

Process 0 is really hard to describe, because unlike other processes it is not created by a call to fork. For the others, once fork has run, the child process exists and its creation is complete. The fields of process 0's task_struct are filled in by hand and are fixed when the program is written. See the INIT_TASK macro used in sched.c; it supplies the initial content of process 0's task_struct.

Before sched_init() runs, process 0 cannot really be called a complete process: it has no TSS descriptor in the GDT yet, so the task register TR holds no valid value. Without TR and the TSS descriptor, task switching is impossible.

void sched_init(void)
{
    int i;
    struct desc_struct *p;
    ..........
    set_tss_desc(gdt+FIRST_TSS_ENTRY, &(init_task.task.tss));
    /* set the TSS descriptor, used for process switching */
    set_ldt_desc(gdt+FIRST_LDT_ENTRY, &(init_task.task.ldt));    /* set the LDT descriptor, used when process 0 moves from kernel space to user space */
    p = gdt+2+FIRST_TSS_ENTRY;
    for (i = 1; i < NR_TASKS; i++) {
        /* clear the task slot and the remaining TSS and LDT descriptors in the GDT */
        task[i] = NULL;
        p->a = p->b = 0;
        p++;
        p->a = p->b = 0;
        p++;
    }
    __asm__("pushfl ; andl $0xffffbfff,(%esp) ; popfl");    /* clear the NT flag */
    ltr(0);     /* load the TR register, used for process switching */
    lldt(0);
    ......
}
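
The arguments of ltr(0) and lldt(0) are task numbers, which have to be turned into GDT selectors. A sketch of that calculation follows, under the assumption that FIRST_TSS_ENTRY is 4 and that each task owns two consecutive GDT entries (TSS first, then LDT), as the loop above suggests. The function names are mine.

```c
#include <stdint.h>

#define FIRST_TSS_ENTRY 4                      /* assumed GDT index of task 0's TSS */
#define FIRST_LDT_ENTRY (FIRST_TSS_ENTRY + 1)

/* Each task owns two 8-byte GDT entries, so consecutive tasks are 16 bytes apart. */
static uint32_t tss_selector(unsigned n) { return ((uint32_t)n << 4) + (FIRST_TSS_ENTRY << 3); }
static uint32_t ldt_selector(unsigned n) { return ((uint32_t)n << 4) + (FIRST_LDT_ENTRY << 3); }

/* What pushfl / andl $0xffffbfff / popfl does: clear NT, bit 14 of EFLAGS. */
static uint32_t clear_nt(uint32_t eflags) { return eflags & 0xffffbfffu; }
```

Under these assumptions task 0 gets TSS selector 0x20 and LDT selector 0x28, and task 1 gets 0x30 and 0x38.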

Process 0 calls fork to create process 1

When process 0 reaches fork, a child process is created. fork is a system call. System calls were described above; here the flow of the whole call is traced. We also assume that a process switch takes place during the call, in order to explain how processes are switched.

When the process executes int 0x80, the address of the instruction following it, to which the call returns, is pushed onto the kernel stack. The state of the kernel stack is shown below.

if (!fork()) {    /* we count on this going ok */
    init();
}

Kernel stack of process 0: at this point process 0 is already in kernel space and starts executing the following instructions.

call _sys_call_table(,%eax,4)   /* jump to the real system call function. The system call number was placed in EAX by process 0 before it executed the int instruction. (,%eax,4) means EAX x 4, because each function address is a 32-bit (4-byte) address. This is a near call, so only EIP is pushed; execution continues in sys_fork */
pushl %eax                      /* read sys_fork before looking at this instruction. For fork, EAX holds the child process number */
movl _current,%eax
cmpl $0,state(%eax)             # state
jne reschedule
cmpl $0,counter(%eax)           # counter
je reschedule                   /* here we assume the parent has used up its time slice, so the scheduler runs and picks the child; see schedule */
/* otherwise the parent continues to run below */
......
3: popl %eax                    /* pop the saved EAX; after iret executes, EAX carries the return value, which here is the child process number */
popl %ebx
popl %ecx
popl %edx
pop %fs
pop %es
pop %ds
iret

_sys_fork:
call _find_empty_process        /* returns a free process number, or a negative number if there is none */
testl %eax,%eax
js 1f
push %gs
pushl %esi
pushl %edi
pushl %ebp
pushl %eax
call _copy_process              /* set up the task_struct of the new process */
/* copy_process returns the child process number; by convention the value is in EAX */
addl $20,%esp                   /* discard the 5 registers pushed above */
1: ret                          /* return to the caller */

int copy_process(int nr, long ebp, long edi, long esi, long gs, long none,
        long ebx, long ecx, long edx,
        long fs, long es, long ds,
        long eip, long cs, long eflags, long esp, long ss)
{
    p = (struct task_struct *) get_free_page();
    /* allocate one physical page: its start holds the task_struct, the rest serves as the kernel stack */
    *p = *current;                       /* copy the parent's task_struct */
    /* then modify some fields of the task_struct */
    p->state = TASK_UNINTERRUPTIBLE;     /* mark the process as uninterruptible */
    p->pid = last_pid;                   /* set the process number */
    p->father = current->pid;            /* set the parent process number */
    p->tss.esp = esp;
    p->tss.ss = ss & 0xffff;             /* the child's user stack is shared with the parent for now; the first write to the stack causes a page fault, and from then on the two go their separate ways */
    p->tss.eip = eip;                    /* this is the return address of the parent's int 0x80, so parent and child return to the same virtual address */
    p->tss.eax = 0;                      /* this is the return value in the child; together with the EIP above it explains why the child returns 0 */
    p->tss.es = es & 0xffff;             /* all segment register values were pushed earlier by the parent */
    return last_pid;                     /* return the child process number; the return value is in EAX */
}

void schedule(void) and switch_to

{
    ......    /* first the signals are checked and the state of processes that received a signal is changed */
    /* then a process is selected to run according to priority; we assume it is the child that was just forked */
    /* next is the number of the process to run next */
    switch_to(next);    /* this is a macro; its definition is not listed here */
}
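
The selection step can be sketched as a small self-contained C function. This is a simplified model under my own assumptions: the task table is a plain array, state 0 means runnable, the runnable task with the largest remaining time slice (counter) wins, and when every runnable task has counter 0 all counters are recharged from the priority. The names and the table size are illustrative, not the kernel's.

```c
#define SIM_TASKS   4
#define SIM_RUNNING 0

struct sim_task { int used; int state; long counter; long priority; };

/* Returns the slot to run next; slot 0 is the fallback when nothing else is runnable.
 * Note: the loop assumes at least one runnable task has a non-zero priority. */
static int sim_pick_next(struct sim_task t[SIM_TASKS])
{
    for (;;) {
        long c = -1;
        int next = 0;
        for (int i = SIM_TASKS - 1; i > 0; i--) {   /* slot 0 is never a candidate */
            if (!t[i].used)
                continue;
            if (t[i].state == SIM_RUNNING && t[i].counter > c) {
                c = t[i].counter;
                next = i;
            }
        }
        if (c)          /* c > 0: a task was found; c == -1: nothing runnable, use slot 0 */
            return next;
        /* all runnable tasks have used up their slices: recharge every task */
        for (int i = SIM_TASKS - 1; i > 0; i--)
            if (t[i].used)
                t[i].counter = (t[i].counter >> 1) + t[i].priority;
    }
}
```

In the scenario of this article the parent has counter 0 and the freshly forked child still has a full slice, so the child is picked.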

Stack layout in the figure (top first):

tmp.a       EIP
tmp.b       CS
EAX         child process number
EBX
ECX
EDX
EIP
EFLAGS
User ESP
User SS

Task switching:

When the CPU finds that the segment selector points to a TSS descriptor, it ignores the offset (EIP) and performs a task switch instead of a jump. The steps are:

1. Save the state of the current task. The processor finds the base address of the current TSS from the content of TR and saves all general registers, the segment registers, EFLAGS, and EIP into the TSS of the current task.

2. Load the selector of the new task into TR, so that the next task switch knows where to save the context. The processor registers are then loaded from the TSS of the new task: the general registers, EFLAGS, EIP, and the segment registers.

From the copy_process function we know that the EIP saved for the child is the address of the instruction that follows the fork call. The child's return value is in EAX, which copy_process set to 0, and so we finally know why fork returns 0 in the child process.
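
The two different return values can be modelled in a few lines of C. This is a toy model with names of my own: the struct keeps only the fields that matter here, and the process number is handed out inside the function for brevity.

```c
struct sim_tss  { long eax; long eip; };
struct sim_proc { int pid; int father; struct sim_tss tss; };

static int sim_last_pid = 0;

/* Model of the fields filled in for the child.
 * The return value is what the parent sees in EAX. */
static int sim_copy_process(const struct sim_proc *parent, struct sim_proc *child, long eip)
{
    *child = *parent;              /* start from a copy of the parent */
    child->pid = ++sim_last_pid;
    child->father = parent->pid;
    child->tss.eip = eip;          /* the child resumes at the parent's return address */
    child->tss.eax = 0;            /* loaded into EAX when the child is first switched in */
    return child->pid;
}
```

The parent receives the child's process number as an ordinary function result, while the child's 0 comes from its saved TSS at its first task switch.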

sleep_on and wake_up

When a process requests a buffer or the disk, the data may well not be available at once. The process then suspends itself by calling sleep_on, and another process is scheduled. Suppose there is one buffer and three processes P1, P2, and P3 try to access it and find it unavailable. Each of the three calls sleep_on and waits on the wait queue of this buffer, which is in fact an implicit queue. The result is shown in the figure below (the state of P1, P2, and P3 is TASK_UNINTERRUPTIBLE):

wake_up wakes the process that the queue head points to.

void sleep_on(struct task_struct **p)
{
    struct task_struct *tmp;

    if (!p)
        return;
    if (current == &(init_task.task))    /* process 0 must never sleep */
        panic("task[0] trying to sleep");
    tmp = *p;
    *p = current;
    current->state = TASK_UNINTERRUPTIBLE;    /* uninterruptible sleep */
    schedule();                               /* run some other process */
    if (tmp)                                  /* wake the process that was waiting before us */
        tmp->state = 0;
}

void wake_up(struct task_struct **p)
{
    if (p && *p) {
        (**p).state = 0;    /* set the state of the process to runnable */
        *p = NULL;          /* must be set to NULL, otherwise the consequences are unimaginable */
    }
}
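
Because sleep_on blocks inside schedule(), the implicit queue is easiest to see in a simulation that splits it into the part before the sleep and the part after it. In this sketch (all names hypothetical) the field tmp stands in for the local variable tmp that each sleeping task keeps on its own kernel stack.

```c
#include <stddef.h>

enum { SIM_RUNNABLE = 0, SIM_UNINTERRUPTIBLE = 2 };

struct sim_sleeper {
    int state;
    struct sim_sleeper *tmp;   /* models the local tmp on this task's kernel stack */
};

/* First half of sleep_on: remember the old head, become the head, go to sleep. */
static void sim_sleep_begin(struct sim_sleeper **head, struct sim_sleeper *self)
{
    self->tmp = *head;
    *head = self;
    self->state = SIM_UNINTERRUPTIBLE;
}

/* Second half of sleep_on, run once the task is scheduled again:
 * wake the task that was waiting before this one. */
static void sim_sleep_resume(struct sim_sleeper *self)
{
    if (self->tmp)
        self->tmp->state = SIM_RUNNABLE;
}

static void sim_wake_up(struct sim_sleeper **head)
{
    if (head && *head) {
        (*head)->state = SIM_RUNNABLE;
        *head = NULL;
    }
}
```

If P1, P2, and P3 go to sleep in that order, the head points to P3, and wake_up makes only P3 runnable. When P3 runs again it wakes P2, and P2 in turn wakes P1, so the chain unwinds one task at a time.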

When reposting, please cite the original address: https://www.9cbs.com/read-66402.html
