A main application of the memory segmentation mechanism is to implement multitasking in the operating system, which gives applications two key abstractions: an independent logical control flow and a private address space. This article analyzes and experiments with process creation and scheduling in order to understand the segmentation mechanism more deeply. For the debugging environment used in this section, see the earlier article on how the boot code moves around in memory in Linux 0.11.
Process scheduling initialization (the sched_init function)
After the boot code has run, execution jumps to the main function, which performs a series of initializations. One of them sets up task 0; it is part of the sched_init function in kernel/sched.c:
void sched_init(void)
{
	int i;
	struct desc_struct *p;

	if (sizeof(struct sigaction) != 16)
		panic("Struct sigaction MUST be 16 bytes");
	/* Set up the TSS and LDT descriptor entries of task 0 in the GDT */
	set_tss_desc(gdt + FIRST_TSS_ENTRY, &(init_task.task.tss));
	set_ldt_desc(gdt + FIRST_LDT_ENTRY, &(init_task.task.ldt));
	p = gdt + 2 + FIRST_TSS_ENTRY;
	for (i = 1; i
The set_tss_desc macro is defined in include/asm/system.h:
#define _set_tssldt_desc(n, addr, type) \
__asm__ ("movw $104,%1\n\t" \
	"movw %%ax,%2\n\t" \
	"rorl $16,%%eax\n\t" \
	"movb %%al,%3\n\t" \
	"movb $" type ",%4\n\t" \
	"movb $0x00,%5\n\t" \
	"movb %%ah,%6\n\t" \
	"rorl $16,%%eax" \
	::"a" (addr), "m" (*(n)), "m" (*(n+2)), "m" (*(n+4)), \
	 "m" (*(n+5)), "m" (*(n+6)), "m" (*(n+7)) \
	)
/* 0x89 is the attribute of the TSS descriptor, 0x82 is the attribute of the LDT descriptor */
#define set_tss_desc(n, addr) _set_tssldt_desc(((char *) (n)), addr, "0x89")
#define set_ldt_desc(n, addr) _set_tssldt_desc(((char *) (n)), addr, "0x82")
The assembly code above fills in the individual bytes of an 8-byte descriptor entry in the GDT. After it runs, each descriptor has the following contents:
System segment descriptor
Byte 7             Bytes 6-5           Bytes 4-2           Bytes 1-0
addr high 8 bits   0x0089 or 0x0082    addr low 24 bits    0x0068 (segment limit)
0x0089 marks an available 386 TSS (type 0x9), and 0x0082 marks an LDT (type 0x2).
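The byte layout above can be sketched in plain C. This is only an illustration, not kernel code: pack_system_desc, desc_low and desc_high are hypothetical helper names, and the two 32-bit halves correspond to what the Bochs register dumps later in the article print as dl and dh.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the kernel): fills an 8-byte system
 * descriptor byte by byte, mirroring what _set_tssldt_desc does in assembly. */
void pack_system_desc(uint8_t desc[8], uint32_t base, uint8_t type)
{
    desc[0] = 0x68;                         /* limit, low byte: 104 */
    desc[1] = 0x00;                         /* limit, high byte */
    desc[2] = (uint8_t)(base & 0xff);       /* base bits 7..0 */
    desc[3] = (uint8_t)((base >> 8) & 0xff);  /* base bits 15..8 */
    desc[4] = (uint8_t)((base >> 16) & 0xff); /* base bits 23..16 */
    desc[5] = type;                         /* 0x89 = TSS, 0x82 = LDT */
    desc[6] = 0x00;
    desc[7] = (uint8_t)((base >> 24) & 0xff); /* base bits 31..24 */
}

/* Low 32 bits of the descriptor (bytes 0..3, little-endian). */
uint32_t desc_low(const uint8_t desc[8])
{
    return (uint32_t)desc[0] | ((uint32_t)desc[1] << 8) |
           ((uint32_t)desc[2] << 16) | ((uint32_t)desc[3] << 24);
}

/* High 32 bits of the descriptor (bytes 4..7, little-endian). */
uint32_t desc_high(const uint8_t desc[8])
{
    return (uint32_t)desc[4] | ((uint32_t)desc[5] << 8) |
           ((uint32_t)desc[6] << 16) | ((uint32_t)desc[7] << 24);
}
```

Packing base 0x0001847c with type 0x89 gives the pair 0x847c0068 / 0x00008901, which is the shape of the TR values seen in the debugger output further down.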
init_task is a global variable that is initialized with the INIT_TASK macro, which is defined as follows:
#define INIT_TASK \
/* state etc */	{ 0,15,15, \
/* signals */	0,{{},},0, \
/* ec,brk... */	0,0,0,0,0,0, \
/* pid etc.. */	0,-1,0,0,0, \
/* uid etc */	0,0,0,0,0,0, \
/* alarm */	0,0,0,0,0,0, \
/* math */	0, \
/* fs info */	-1,0022,NULL,NULL,NULL,0, \
/* filp */	{NULL,}, \
	{ \
		{0,0}, \
/* ldt */	{0x9f,0xc0fa00}, /* code: length 640K, limit granularity 4K bytes, base 0x0 */ \
		{0x9f,0xc0f200}, /* data: length 640K, limit granularity 4K bytes, base 0x0 */ \
	}, \
/* tss */	{0,PAGE_SIZE+(long)&init_task,0x10,0,0,0,0,(long)&pg_dir, \
	 0,0,0,0,0,0,0,0, \
	 0,0,0x17,0x17,0x17,0x17,0x17,0x17, \
	 _LDT(0),0x80000000, \
		{} \
	}, \
}
In this code we care about the LDT and TSS settings. The LDT of each task has three entries: the first is unused, the second describes the code segment, and the third describes the data segment. In the GDT entries for task 0, the address fields of the TSS0 and LDT0 descriptors hold the addresses of these two structures. The ltr and lldt macros load the selector of the corresponding GDT entry into the TR and LDTR registers. Taking task 0 as an example, after the two macros have run, the TR register holds 4 * 8 = 0x20 and the LDTR register holds 5 * 8 = 0x28.
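The selector arithmetic can be written out as a small sketch. The function names tss_selector and ldt_selector are hypothetical; the entry numbers 4 and 5 come from the text above, and the assumption is that every task occupies two consecutive GDT entries (TSS first, then LDT), as the kernel's _TSS(n) and _LDT(n) macros do.

```c
#include <stdint.h>

#define FIRST_TSS_ENTRY 4                       /* GDT index of TSS0 */
#define FIRST_LDT_ENTRY (FIRST_TSS_ENTRY + 1)   /* GDT index of LDT0 */

/* Each task uses two 8-byte GDT entries, so task n is 16 bytes further on. */
uint16_t tss_selector(unsigned n)
{
    return (uint16_t)((n << 4) + (FIRST_TSS_ENTRY << 3));
}

uint16_t ldt_selector(unsigned n)
{
    return (uint16_t)((n << 4) + (FIRST_LDT_ENTRY << 3));
}
```

For task 0 this yields 0x20 and 0x28; for task 1 it yields 0x30 and 0x38, the values that appear in TR and LDTR after the task switch examined later.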
Let us now debug the sched_init function to check the analysis above. First, find the address of the function in the System.map file generated when the kernel is compiled: 0x72bc. Start bochsdbg and set a breakpoint at 0x72bc; the command-line output is as follows:
(0) BreakPoint 1, 0x72bc in ?? ()
Next at t = 16800742
(0) [0x000072BC] 0008: 000072BC (UNK. CTXT): PUSH EBP; 55
......
000072ce: (): mov word ptr ds:0x5cd8, 0x68 ; 66c705d85c00006800
000072d7: (): mov word ptr ds:0x5cda, ax ; 66a3da5c0000
000072dd: (): ror eax, 0x10 ; c1c810
000072e0: (): mov byte ptr ds:0x5cdc, al ; 8805dc5c0000
000072e6: (): mov byte ptr ds:0x5cdd, 0x89 ; c605dd5c000089
000072ed: (): mov byte ptr ds:0x5cde, 0x0 ; c605de5c000000
000072f4: (): mov byte ptr ds:0x5cdf, ah ; 8825df5c0000
000072fa: (): ror eax, 0x10 ; c1c810
000072fd: (): add eax, 0xffffffe8 ; 83c0e8
00007300: (): mov word ptr ds:0x5ce0, 0x68 ; 66c705e05c00006800
00007309: (): mov word ptr ds:0x5ce2, ax ; 66a3e25c0000
0000730f: (): ror eax, 0x10 ; c1c810
00007312: (): mov byte ptr ds:0x5ce4, al ; 8805e45c0000
00007318: (): mov byte ptr ds:0x5ce5, 0x82 ; c605e55c000082
0000731f: (): mov byte ptr ds:0x5ce6, 0x0 ; c605e65c000000
00007326: (): mov byte ptr ds:0x5ce7, ah ; 8825e75c0000
0000732c: (): ror eax, 0x10 ; c1c810
......
0000737a: (): mov eax, 0x20 ; b820000000
0000737f: (): ltr ax ; 0f00d8
00007382: (): mov eax, 0x28 ; b828000000
00007387: (): lldt ax ; 0f00d0
......
(0) Breakpoint 2, 0x7387 in ?? ()
Next at t = 16801469
(0) [0x00007387] 0008:00007387 (unk. ctxt): lldt ax ; 0f00d0
......
CS: s=0x8, dl=0x7ff, dh=0xc09a00, valid=1
SS: s=0x10, dl=0xfff, dh=0xc09300, valid=7
DS: s=0x10, dl=0xfff, dh=0xc09200, valid=7
ES: s=0x10, dl=0xfff, dh=0xc09300, valid=5
FS: s=0x10, dl=0xfff, dh=0xc09300, valid=1
GS: s=0x10, dl=0xfff, dh=0xc09300, valid=1
LDTR: s=0x28, dl=0x84640068, dh=0x8201, valid=1
TR: s=0x20, dl=0x847c0068, dh=0x8901, valid=1
GDTR: base=0x5cb8, limit=0x7ff
IDTR: base=0x54b8, limit=0x7ff
......
The instructions from 0x000072ce to 0x0000732c fill in the TSS0 and LDT0 entries of the GDT. The instructions from 0x0000737a to 0x00007387 load the TR and LDTR registers. Once this setup is complete, take the CS segment register value 0x8 as an example: bits 0-1 give the privilege level, here 0; bit 2 is the TI field, here 0, which means that the upper 13 bits are an index into the GDT (if the TI field were 1, the upper 13 bits would be an index into the LDT).
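The selector fields described above can be pulled apart with a few shifts and masks. The sketch below is illustrative only; the struct and function names are hypothetical, and descriptor_address simply assumes 8-byte descriptors starting at the table base reported in GDTR (0x5cb8 in the output above).

```c
#include <stdint.h>

/* Fields of a 16-bit segment selector. */
struct selector_fields {
    unsigned rpl;    /* bits 0-1: requested privilege level */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT */
    unsigned index;  /* bits 3-15: descriptor index in the table */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.rpl = sel & 0x3u;
    f.ti = (sel >> 2) & 0x1u;
    f.index = sel >> 3;
    return f;
}

/* Address of the descriptor a selector refers to: table base + index * 8. */
uint32_t descriptor_address(uint32_t table_base, uint16_t sel)
{
    return table_base + (uint32_t)(sel & 0xfff8u);
}
```

Decoding 0x8 gives privilege level 0, TI 0 and index 1, while descriptor_address(0x5cb8, 0x20) gives 0x5cd8, the address the first mov instruction in the listing writes to.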
Starting task 0 (the move_to_user_mode macro)
After completing a series of initializations, the kernel switches to user mode and continues execution as task 0. This is done by the move_to_user_mode macro, whose code is as follows:
#define move_to_user_mode() \
__asm__ ("movl %%esp,%%eax\n\t" \
	"pushl $0x17\n\t" \
	"pushl %%eax\n\t" \
	"pushfl\n\t" \
	"pushl $0x0f\n\t" /* push the code segment (CS) selector */ \
	"pushl $1f\n\t" /* push the address of label 1 as the return address for iret */ \
	"iret\n" /* switch to task 0 and start executing its instruction sequence */ \
	"1:\tmovl $0x17,%%eax\n\t" \
	"movw %%ax,%%ds\n\t" \
	"movw %%ax,%%es\n\t" \
	"movw %%ax,%%fs\n\t" \
	"movw %%ax,%%gs" \
	:::"ax")
The base addresses of the code and data segments in task 0's LDT are both 0, the same as the base addresses of the kernel code and data segments. The return address pushed onto the stack is therefore simply an address inside the kernel's own instruction sequence. The key difference from the preceding kernel execution is that the selectors in the segment registers now index task 0's own LDT and no longer index the GDT. The base address of the LDT is found through the LDTR register: during initialization or a task switch, the selector of the task's LDT descriptor is loaded into LDTR, and the processor uses this selector (the visible part of LDTR) to fetch the corresponding descriptor from the GDT and save the LDT base address, limit, and attributes in the invisible cache part of LDTR.
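The five values that move_to_user_mode pushes before iret can be pictured as a small structure. This is only a sketch of the stack layout: iret_frame and make_user_frame are hypothetical names, and the fields are listed from the lowest address (the last value pushed) upwards.

```c
#include <stdint.h>

/* Stack contents consumed by iret on a privilege-level change,
 * lowest address first (the reverse of the push order). */
struct iret_frame {
    uint32_t eip;     /* pushl $1f   : address of label 1 */
    uint32_t cs;      /* pushl $0x0f : LDT code segment, RPL 3 */
    uint32_t eflags;  /* pushfl */
    uint32_t esp;     /* pushl %eax  : the current stack pointer */
    uint32_t ss;      /* pushl $0x17 : LDT data segment, RPL 3 */
};

/* Builds the frame the macro creates; all three arguments are
 * placeholders for values the CPU holds at that moment. */
struct iret_frame make_user_frame(uint32_t current_esp, uint32_t eflags,
                                  uint32_t resume_eip)
{
    struct iret_frame f;
    f.eip = resume_eip;
    f.cs = 0x0f;
    f.eflags = eflags;
    f.esp = current_esp;
    f.ss = 0x17;
    return f;
}
```

Because the low two bits of the CS value in the frame are 3, iret treats the return as a transition to privilege level 3 and also pops ESP and SS.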
Next, the move_to_user_mode macro is tested. First find the address of the main function, 0x664c, in the System.map file, then locate the move_to_user_mode macro by reading the disassembled instruction stream. The command-line output is as follows:
(0) Breakpoint 1, 0x664c in ?? ()
Next at t = 16769622
(0) [0x0000664c] 0008:0000664c (unk. ctxt): push ebp ; 55
......
00006753: (): mov eax, esp ; 89e0
00006755: (): push 0x17 ; 6a17
00006757: (): push eax ; 50
00006758: (): pushfd ; 9c
00006759: (): push 0xf ; 6a0f
0000675b: (): push 0x6761 ; 6861670000
00006760: (): iretd ; cf
00006761: (): mov eax, 0x17 ; b817000000
00006766: (): mov ds, ax ; 668ed8
00006769: (): mov es, ax ; 668ec0
0000676c: (): mov fs, ax ; 668ee0
0000676f: (): mov gs, ax ; 668ee8
00006772: (): add esp, 0xc ; 83c40c
......
(0) Breakpoint 1, 0x6761 in ?? ()
Next at t = 16878984
(0) [0x00006761] 000f:00006761 (unk. ctxt): mov eax, 0x17 ; b817000000
......
EIP: 0x6761
CS: s=0xf, dl=0x9f, dh=0xc0fa00, valid=1
SS: s=0x17, dl=0x9f, dh=0xc0f200, valid=1
DS: s=0x0, dl=0x0, dh=0x0, valid=0
ES: s=0x0, dl=0x0, dh=0x0, valid=0
FS: s=0x0, dl=0x0, dh=0x0, valid=0
GS: s=0x0, dl=0x0, dh=0x0, valid=0
LDTR: s=0x28, dl=0x84640068, dh=0x8201, valid=1
TR: s=0x20, dl=0x847c0068, dh=0x8901, valid=1
GDTR: base=0x5cb8, limit=0x7ff
IDTR: base=0x54b8, limit=0x7ff
......
Three things in this debugging output are worth noting. First, EIP points to the instruction that follows iretd. Second, the code segment selector is now 0xf: bits 0-1 give privilege level 3, and bit 2 (the TI field) is 1, which means that the upper 13 bits select entry 1 (counting from 0) of the LDT. Finally, LDTR: s=0x28 means that the LDT descriptor is entry 0x28 / 8 = 5 of the GDT. Combining dl and dh, the 8-byte descriptor is 0x00 0x0082 0x018464 0x0068, so the LDT is located at 0x00018464 (linear address). Examining memory at that address shows the code segment and data segment descriptors; the command-line output is as follows:
[bochs]:
0x00018464 0000009F 0x00c0fa00
0x00018474 00C0F200
These are exactly the values of init_task.task.ldt analyzed earlier.
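To check that a pair such as 0x9f / 0xc0fa00 really describes a 640K segment at base 0, the two halves of a code or data segment descriptor can be decoded as follows. This is a sketch under the assumption that dl and dh are the low and high 32 bits of the descriptor, as printed by Bochs; the function names are hypothetical.

```c
#include <stdint.h>

/* Base address: bits 16-31 of dl, bits 0-7 and bits 24-31 of dh. */
uint32_t seg_base(uint32_t dl, uint32_t dh)
{
    return (dl >> 16) | ((dh & 0xffu) << 16) | (dh & 0xff000000u);
}

/* Segment size in bytes: the 20-bit limit is counted in 4K pages
 * when the granularity bit (bit 23 of dh) is set. */
uint64_t seg_size_bytes(uint32_t dl, uint32_t dh)
{
    uint64_t limit = (dl & 0xffffu) | (dh & 0x000f0000u);
    if (dh & 0x00800000u)
        return (limit + 1) << 12;
    return limit + 1;
}

/* Descriptor privilege level: bits 13-14 of dh. */
unsigned seg_dpl(uint32_t dh)
{
    return (dh >> 13) & 0x3u;
}
```

For 0x9f / 0xc0fa00 this gives base 0, size 0xa0000 bytes (640K) and DPL 3, matching the comments in the INIT_TASK macro.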
Creating a child process (the fork function)
fork is the system call that creates a child process. All processes in Linux descend from process 0. How system calls are invoked will be covered in a future article; here only the functions behind fork are analyzed. They are located in kernel/fork.c. The copy_process function creates the child and copies the parent's code segment, data segment, and environment; its code is as follows:
int copy_process(int nr, long ebp, long edi, long esi, long gs, long none,
		long ebx, long ecx, long edx,
		long fs, long es, long ds,
		long eip, long cs, long eflags, long esp, long ss)
{
	struct task_struct *p;
	int i;
	struct file *f;

	/* Get a free page for the new task structure */
	p = (struct task_struct *) get_free_page();
	if (!p)
		return -EAGAIN;
	task[nr] = p;
	*p = *current;	/* NOTE! this doesn't copy the supervisor stack */
	/* ... initialization code omitted ... */
	if (last_task_used_math == current)
		__asm__("clts ; fnsave %0" :: "m" (p->tss.i387));
	/* Set the code and data segment base addresses and limits of the new task and copy the page tables */
	if (copy_mem(nr, p)) {
		task[nr] = NULL;
		free_page((long) p);
		return -EAGAIN;
	}
	for (i = 0; i
int copy_mem(int nr, struct task_struct *p)
{
	unsigned long old_data_base, new_data_base, data_limit;
	unsigned long old_code_base, new_code_base, code_limit;

	/* Get the limits and base addresses of the code and data segments from the current task's LDT */
	code_limit = get_limit(0x0f);
	data_limit = get_limit(0x17);
	old_code_base = get_base(current->ldt[1]);
	old_data_base = get_base(current->ldt[2]);
	if (old_data_base != old_code_base)
		panic("We don't support separate I&D");
	if (data_limit
The Linux 0.11 kernel divides the entire 4G linear address space into 64 blocks, one for each of up to 64 processes. Because the base addresses of the child's code and data segments differ from those of the parent, the two processes are logically separate (although paging and copy-on-write may leave the code or data of task 0 and task 1 in the same physical pages).
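The division of the address space can be expressed as a one-line calculation. The sketch below assumes 64 equal slots, which makes each slot 4G / 64 = 64M (0x4000000 bytes); task_linear_base is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

#define NR_TASKS       64           /* number of task slots */
#define TASK_SLOT_SIZE 0x4000000u   /* 64M of linear address space per task */

/* Linear base address of the code and data segments of task nr,
 * assuming the slots are laid out back to back from address 0. */
uint32_t task_linear_base(unsigned nr)
{
    return (uint32_t)nr * TASK_SLOT_SIZE;
}
```

Task 1 then starts at 0x04000000, which agrees with the base encoded in the segment descriptors (dh=0x4c0fa00) shown in the register dump after the task switch below.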
Process scheduling (the schedule function)
The schedule function implements process scheduling. It is located in kernel/sched.c, and its code is as follows:
void schedule(void)
{
	int i, next, c;
	struct task_struct **p;

	/* check alarm, wake up any interruptible tasks that have got a signal */
	for (p = &LAST_TASK; p > &FIRST_TASK; --p)
		if (*p) {
			if ((*p)->alarm && (*p)->alarm
#define switch_to(n) { \
struct {long a, b;} __tmp; \
__asm__("cmpl %%ecx,_current\n\t" \
	"je 1f\n\t" \
	"movw %%dx,%1\n\t" \
	"xchgl %%ecx,_current\n\t" \
	"ljmp %0\n\t" /* complete the task switch */ \
	"cmpl %%ecx,_last_task_used_math\n\t" \
	"jne 1f\n\t" \
	"clts\n" \
	"1:" \
	::"m" (*&__tmp.a), "m" (*&__tmp.b), \
	"d" (_TSS(n)), "c" ((long) task[n])); \
}
The specific operation of the task switch is shown below:
Figure 1: Schematic diagram of task switching operation (taken from Linux kernel full notes)
Next, the switching process is tested. First find the address of schedule in the System.map file: 0x6b8c. Start bochsdbg and locate the ljmp instruction of the switch_to macro; the command-line output is as follows:
(0) Breakpoint 1, 0x6b8c in ?? ()
Next at t = 16886214
(0) [0x00006b8c] 0008:00006b8c (unk. ctxt): push ebp ; 55
......
00006c6b: (): cmp dword ptr ds:0x1919c, ecx ; 390d9c910100
00006c71: (): jz 0x6c8a ; 7417
00006c73: (): mov word ptr ss:[ebp+0xfffffffc], dx ; 668955fc
00006c77: (): xchg dword ptr ds:0x1919c, ecx ; 870d9c910100
00006c7d: (): jmp far ss:[ebp+0xfffffff8] ; ff6df8
00006c80: (): cmp dword ptr ds:0x191a0, ecx ; 390da0910100
00006c86: (): jnz 0x6c8a ; 7502
00006c88: (): clts ; 0f06
......
(0) Breakpoint 2, 0x6c7d in ?? ()
Next at t = 16886886
(0) [0x00006c7d] 0008:00006c7d (unk. ctxt): jmp far ss:[ebp+0xfffffff8] ; ff6df8
The target of the jmp far instruction at 0x00006c7d is a segment selector:offset pair stored at ss:[ebp+0xfffffff8], that is, at ebp - 8. The offset occupies bits 0 to 31 of this operand, and the segment selector is the 16 bits that follow it. If the segment selector refers to a task state segment (TSS), the CPU performs a task switch automatically: it saves all registers into the TSS of the current task, which is the one selected by the task register TR, then loads the register values stored in the TSS selected by the new selector, and the system starts running the newly selected task. The debugging output is as follows:
......
EBP: 0x1915c
......
EIP: 0x6c7d
......
LDTR: s=0x28, dl=0x84640068, dh=0x8201, valid=1
TR: s=0x20, dl=0x847c0068, dh=0x8901, valid=1
GDTR: base=0x5cb8, limit=0x7ff
IDTR: base=0x54b8, limit=0x7ff
......
00019148 [00019148] 0003
0001914c [0001914c] 0000
00019150 [00019150] 0ffc
00019154 [00019154] 1f248
00019158 [00019158] 0030
0001915c [0001915c] 1f248
00019160 [00019160] 6ca4
00019164 [00019164] 743b
00019168 [00019168] 0003
0001916c [0001916c] 3e400
00019170 [00019170] 1fae4
00019174 [00019174] 0017
00019178 [00019178] 0017
0001917c [0001917c] 0017
00019180 [00019180] 67a1
00019184 [00019184] 000f
The EBP register holds 0x1915c, so the operand at ebp - 8 = 0x19154 gives the ljmp target 0x30:0x1f248. The base address in GDTR is 0x5cb8, so the address of the segment descriptor is 0x5cb8 + 0x30 = 0x5ce8. The descriptor is fetched:
[bochs]:
0x00005CE8
This debugging output tells us that the descriptor is a 386 TSS descriptor with base address 0x00fff2e8 and segment limit 0x68, so the CPU performs a task switch automatically. It loads the state of the task to be run from the 0x68 bytes of TSS content starting at address 0x00fff2e8:
[bochs]:
0x00FFF2E8 0x00000000
0x00FFF2F8
0x00FFF308 0000677C 0x00000616 0x00000000 0x0003E400
0x00FFF318 0001f248 0x0001f248
0x00FFF328 0x0000000F
0x00FFF338 0x00000017
0x00FFF348
Arranging this debugging output in the order of the TSS fields gives the following table:
Essential part of the Task State Segment (for the selector fields, bits 31-16 are reserved and read as 0)
Field    Offset   Data
LINK     00H      0x00000000
ESP0     04H      0x01000000
SS0      08H      0x00000010
ESP1     0CH      0x00000000
SS1      10H      0x00000000
ESP2     14H      0x00000000
SS2      18H      0x00000000
CR3      1CH      0x00000000
EIP      20H      0x0000677c
EFLAGS   24H      0x00000616
EAX      28H      0x00000000
ECX      2CH      0x0003e400
EDX      30H      0x0000000003
EBX      34H      0x00000003
ESP      38H      0x0001f248
EBP      3CH      0x0001f248
ESI      40H      0x00000000
EDI      44H      0x00000ffc
ES       48H      0x00000017
CS       4CH      0x0000000f
......
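The offsets in the table can be checked against a C structure. The layout below is a sketch of the hardware-defined part of the 386 TSS only; the name tss_layout is hypothetical, and it leaves out anything an operating system might append after the last field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hardware-defined part of the 32-bit TSS. Selector fields use only
 * their low 16 bits; the upper 16 bits are reserved. */
struct tss_layout {
    uint32_t back_link;                 /* 0x00 */
    uint32_t esp0, ss0;                 /* 0x04, 0x08 */
    uint32_t esp1, ss1;                 /* 0x0c, 0x10 */
    uint32_t esp2, ss2;                 /* 0x14, 0x18 */
    uint32_t cr3;                       /* 0x1c */
    uint32_t eip, eflags;               /* 0x20, 0x24 */
    uint32_t eax, ecx, edx, ebx;        /* 0x28 .. 0x34 */
    uint32_t esp, ebp, esi, edi;        /* 0x38 .. 0x44 */
    uint32_t es, cs, ss, ds, fs, gs;    /* 0x48 .. 0x5c */
    uint32_t ldt;                       /* 0x60: LDT selector of the task */
    uint32_t trace_bitmap;              /* 0x64 */
};

/* 26 fields of 4 bytes each: 104 bytes, the 0x68 stored in the descriptor. */
_Static_assert(sizeof(struct tss_layout) == 0x68, "TSS must be 104 bytes");
```

This also explains the constant 104 written by the movw $104 instruction when the TSS descriptor is set up.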
After the switch, the registers are assigned the corresponding values from the table above. Continuing to debug shows the result of the switch; the command-line output is as follows:
Next at t = 16886887
(0) [0x0000677c] 000f:0000677c (unk. ctxt): test eax, eax ; 85c0
EAX: 0x0
EBX: 0x3
ECX: 0x3e400
EDX: 0x21
EBP: 0x1f248
ESI: 0x0
EDI: 0xffc
ESP: 0x1f248
EFLAGS: 0x616
EIP: 0x677c
CS: s=0xf, dl=0x9f, dh=0x4c0fa00, valid=1
SS: s=0x17, dl=0x9f, dh=0x4c0f300, valid=1
DS: s=0x17, dl=0x9f, dh=0x4c0f300, valid=1
ES: s=0x17, dl=0x9f, dh=0x4c0f300, valid=1
FS: s=0x17, dl=0x9f, dh=0x4c0f300, valid=1
GS: s=0x17, dl=0x9f, dh=0x4c0f300, valid=1
LDTR: s=0x38, dl=0xf2d00068, dh=0x82ff, valid=1
TR: s=0x30, dl=0xf2e80068, dh=0x89ff, valid=1
GDTR: base=0x5cb8, limit=0x7ff
IDTR: base=0x54b8, limit=0x7ff
DR0: 0x0
DR1: 0x0
DR2: 0x0
DR3: 0x0
DR6: 0xffff0FF0
DR7: 0x400
TR3: 0x0
TR4: 0x0
TR5: 0x0
TR6: 0x0
TR7: 0x0
CR0: 0x8000001B
CR1: 0x0
CR2: 0x0
CR3: 0x0
CR4: 0x0
INHIBIT_MASK: 0
DONE
The result of the switch can be read off directly. The TI bit of all six segment selectors is 1, which means that each selector is an index into the local descriptor table. The address of the local descriptor table is supplied by the global descriptor table, namely by the descriptor at address 0x5cb8 + 0x38. The address in EIP is where execution continues after the switch. The values of the general registers come from the TSS of the task being switched to. The value of LDTR is loaded automatically by the CPU and also comes from that TSS.
Postscript
This completes the analysis of the memory segmentation mechanism. Segmentation can be understood as mapping a two-dimensional array onto a one-dimensional array. When the local descriptor table of a process also has to be addressed, it becomes mapping a three-dimensional array onto a one-dimensional array. It is that simple! (If I got this wrong, don't blame me ^-^)