Function Calls, Past and Present, as Seen Through the Kernel Source Code

xiaoxiao2021-04-10  539


Author: Xiaohua Yang

Stack: an orderly pile or heap.

Webster's Dictionary

For every seasoned programmer, the stack is deeply imprinted on the mind. The stack can be used to pass function arguments, to store local variables and return information, and to save register values so that they can be restored later.

On the x86 platform (also known as IA32), programs use the stack to support function (also called procedure) calls, and data is stored and retrieved in last-in, first-out (LIFO) order.

First, stack frame layout

Before explaining function calls in detail, let's clarify several concepts: full stack and empty stack, ascending stack and descending stack.

In a full stack, the stack pointer points to the last data unit written; in an empty stack, it points to the first free unit. A descending stack grows downward in memory (toward lower addresses, starting from the end of the application's space), while an ascending stack grows upward toward higher addresses.
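The following is a minimal sketch of a full descending stack, written only to illustrate the definitions above. The type fd_stack and the functions fd_init, fd_push and fd_pop are hypothetical names introduced for this example; an array index stands in for a memory address.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model: a small array of words used as stack memory. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t sp; /* index of the last word written ("full" stack) */
} fd_stack;

/* An empty stack starts with sp one past the highest index. */
static void fd_init(fd_stack *s)
{
    s->sp = STACK_WORDS;
}

/* Full descending push: move the pointer down first, then store.
 * Returns 0 on success, -1 if the stack is already full. */
static int fd_push(fd_stack *s, uint32_t value)
{
    if (s->sp == 0)
        return -1;
    s->mem[--s->sp] = value;
    return 0;
}

/* Pop: read the word the pointer refers to, then move the pointer up.
 * Returns 0 on success, -1 if the stack is empty. */
static int fd_pop(fd_stack *s, uint32_t *out)
{
    if (s->sp == STACK_WORDS)
        return -1;
    *out = s->mem[s->sp++];
    return 0;
}
```

After two pushes, sp has moved from 16 to 14 and mem[sp] holds the most recently pushed value, which is exactly the "full" and "descending" behaviour described above.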

RISC machines traditionally use a full descending (FD) stack. A compiler that follows the IA32 conventions usually sets the stack pointer to the end of the application space and then also uses a full descending stack. The region of the stack allocated to a single function call is called a stack frame, as shown in Figure 1.

Figure 1 stack frame structure

The design of a stack frame has to take into account both the characteristics of the instruction set architecture and those of the programming language being compiled. However, computer manufacturers often specify a "standard" stack layout for their architecture, to be adopted by the compilers of all programming languages. Such a layout may not be the most convenient one for a particular language or compiler, but thanks to this "standard" layout, functions written in different programming languages can call one another.

When P calls Q, the arguments of Q are in P's frame. In addition, when P calls Q, the address of the next instruction in P is pushed onto the stack, forming the end of P's frame (see Figure 1). This return address is where execution should resume when the program returns from Q. Q's frame starts at the saved frame pointer, followed by the saved values of other registers. Q also uses its stack frame for local variables that cannot be kept in registers. If the function returns an integer or a pointer, register %eax is normally used to hold the return value.

While the program runs, the stack pointer moves, so most information is accessed relative to the frame pointer (%ebp).
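The frame-pointer-relative addressing described above can be summarized with a few constants. This is a sketch under the assumption of a 32-bit IA32 frame that starts with `push ebp; mov ebp, esp` and whose arguments each occupy 4 bytes; the names below are hypothetical and exist only for this example.

```c
/* Byte offsets relative to %ebp in a conventional IA32 frame (assumed). */
enum {
    SAVED_EBP_OFFSET   = 0, /* caller's frame pointer, saved by the callee */
    RETURN_ADDR_OFFSET = 4, /* pushed by the call instruction              */
    FIRST_ARG_OFFSET   = 8  /* first argument, located in the caller's frame */
};

/* Offset from %ebp of the argument with the given 0-based index,
 * assuming every argument occupies one 4-byte stack slot. */
static int arg_offset_from_ebp(int index)
{
    return FIRST_ARG_OFFSET + 4 * index;
}
```

This matches the disassembly later in the article, where the single argument of low_to_up is read from [ebp+8].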

Second, register usage conventions

Suppose function P(...) calls function Q(a1, ..., an). We call P the caller and Q the callee. If a register must be saved and restored by the caller, we call it caller-save; if that is the callee's responsibility, we call it callee-save.

The register set is the only resource shared by all functions. Although only one function is active at any given moment, we must ensure that when one function calls another, the callee does not overwrite register values that the caller will use later.

To this end, every platform defines a set of conventions for all functions to follow, including library functions. In most computer architectures, however, the distinction between caller-save and callee-save registers is not enforced by hardware; it is a convention specified in the machine's reference manual.

For example, on the ARM architecture all function calls must comply with the ARM Procedure Call Standard (APCS). This standard provides a compact mechanism for writing code so that the functions defined can interwork with functions written in other languages; those other functions may be compiled from C or Pascal, or written in assembly language. Similarly, the IA32 platform uses a uniform register usage convention. By convention, registers %eax, %edx and %ecx are caller-save. When function P (the caller) calls Q (the callee), Q may overwrite these registers without destroying any data that P needs. Registers %ebx, %esi, %edi and %ebp, on the other hand, are callee-save, which means that Q must save their values on the stack before overwriting them and restore them before returning.

Third, parameter passing conventions

Around 1960, parameters were passed not on the stack but in statically allocated storage, which prevented the use of recursive functions. From the 1970s on, most calling conventions passed function parameters on the stack. Because accessing a register is much faster than accessing memory, this causes some unnecessary memory traffic. Studies of real programs show that few functions have more than 4 parameters and almost none have more than 6. The parameter passing conventions of modern machines therefore specify that the first k parameters of a function (typically k = 4 or k = 6) are passed in registers and the remaining parameters are passed in memory.

On the ARM platform, the APCS clearly specifies:

1) The first 4 integer arguments (or fewer) are loaded into registers R0 - R3.

2) The first 4 floating-point arguments (or fewer) are loaded into registers F0 - F3.

3) Any other arguments (if present) are stored in memory, in the space the stack pointer points to on entry to the function. In other words, the remaining arguments are pushed onto the top of the stack.
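The three rules above, restricted to integer arguments, can be sketched as a small function. This is only an illustration of the rule as stated in this article, not a full implementation of the APCS; the function name arg_location and the 4-byte slot size are assumptions made for the example.

```c
#define REG_ARGS 4 /* k = 4: the first four integer arguments use R0..R3 */

/* For the integer argument with the given 0-based index, returns the
 * register number (0..3) if it is passed in a register. Otherwise returns
 * -1 and stores in *stack_offset the byte offset of the argument from the
 * stack pointer on function entry, assuming 4-byte stack slots. */
static int arg_location(int index, int *stack_offset)
{
    if (index < REG_ARGS)
        return index;
    *stack_offset = (index - REG_ARGS) * 4;
    return -1;
}
```

For a call with six integer arguments, indices 0 to 3 map to R0 to R3, and indices 4 and 5 map to offsets 0 and 4 from the stack pointer.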

On the IA32 platform, however, parameters are passed not in registers but in the stack frame. Depending on the calling convention, the parameters are laid out differently in the stack frame, as the following table shows:

Calling convention    Order of parameters on the stack        Stack cleaned up by

_cdecl                First parameter at the low address      Caller

_stdcall              First parameter at the low address      Callee

_fastcall             Specified by the compiler               Callee

_pascal               First parameter at the high address     Callee

The Borland and GNU compilers use _cdecl, while Microsoft uses _stdcall. From this you can see that with today's two mainstream conventions the first parameter is at the low address; that is, the first parameter is pushed last. When the called function starts, the value nearest the top of the stack is therefore its first argument.
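The push order described above can be modelled in a few lines. This is a sketch that simulates a descending stack with an array; push_args_right_to_left is a hypothetical helper, and the caller is assumed to leave enough room below the current stack index.

```c
#include <stddef.h>
#include <stdint.h>

#define WORDS 32

/* Pushes n arguments right to left onto a descending stack modelled by
 * mem, where *sp is the index of the current top of the stack.
 * Precondition (assumed, not checked): *sp >= n.
 * On return, mem[*sp] is the first argument, at the lowest index. */
static void push_args_right_to_left(uint32_t *mem, size_t *sp,
                                    const uint32_t *args, size_t n)
{
    for (size_t i = n; i > 0; i--) {
        *sp -= 1;               /* the stack grows toward lower addresses */
        mem[*sp] = args[i - 1]; /* the last argument is pushed first */
    }
}
```

Pushing the arguments {11, 22, 33} this way leaves 11 at the lowest index and 33 at the highest, so the callee finds its first argument closest to the top of the stack.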

Fourth, a study of the Linux kernel source code

Below we use interrupt handling in the Linux kernel to examine the essence of function calls. The kernel version discussed in this article is 2.6.10, and readers need a basic understanding of GCC assembly language.

When an interrupt occurs, the instructions pushl $i-256 and jmp common_interrupt are executed, and control transfers to common_interrupt.

File name: arch/i386/kernel/entry.S (Note: the leading numbers are line numbers)

359 ALIGN

360 common_interrupt:

361 SAVE_ALL

362 movl %esp,%eax

363 call do_IRQ

364 jmp ret_from_intr

The main operation here is the macro SAVE_ALL, which performs the so-called "saving of the context": it pushes the contents of all registers at the moment of the interrupt onto the stack, so that the context can be restored before the interrupt service returns.

File name: arch/i386/kernel/irq.c

48 fastcall unsigned int do_IRQ(struct pt_regs *regs)

49 {

50 /* high bits used in ret_from_ code */

// Get the interrupt vector number

51 int irq = regs->orig_eax & 0xff;

52 #ifdef CONFIG_4KSTACKS

53 union irq_ctx *curctx, *irqctx;

54 u32 *isp;

55 #endif

......

107 }
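The arithmetic behind line 51 can be checked in isolation. The sketch below assumes, as the text above states, that the entry stub pushes the value i-256 for vector i; the helper names are hypothetical and only the masking expression is taken from the listing.

```c
/* Value pushed by the entry stub for vector i (0..255): i - 256,
 * which is always negative. */
static long pushed_orig_eax(int vector)
{
    return (long)vector - 256;
}

/* Same masking as line 51 of do_IRQ: keep only the low 8 bits. */
static int irq_from_orig_eax(long orig_eax)
{
    return (int)(orig_eax & 0xff);
}
```

On a two's complement machine the low 8 bits of i-256 are the same as those of i, so the mask recovers the vector number while the stored value itself stays negative.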

Let's analyze SAVE_ALL and the struct pt_regs structure used by the do_IRQ function.

File name: arch/i386/kernel/entry.S

84 #define SAVE_ALL \

85 cld; \

86 pushl %es; \

87 pushl %ds; \

88 pushl %eax; \

89 pushl %ebp; \

90 pushl %edi; \

91 pushl %esi; \

92 pushl %edx; \

93 pushl %ecx; \

94 pushl %ebx; \

95 movl $(__USER_DS), %edx; \

96 movl %edx, %ds; \

97 movl %edx, %es;

File name: include/asm-i386/ptrace.h

26 struct pt_regs {

27 long ebx;

28 long ecx;

29 long edx;

30 long esi;

31 long edi;

32 long ebp;

33 long eax;

34 int xds;

35 int xes;

36 long orig_eax;

37 long eip;

38 int xcs;

39 long eflags;

40 long esp;

41 int xss;

42 };

As the code above shows, SAVE_ALL pushes the register values onto the stack, and the last push is pushl %ebx. Looking at struct pt_regs, the first member of the structure is long ebx, which corresponds to that last push: the member order mirrors the push order in reverse. The push that corresponds to long orig_eax is pushl $i-256, which is executed before jmp common_interrupt. Careful readers may have another question: why do the last few members of struct pt_regs have no corresponding push in SAVE_ALL? The reason is that these values are pushed onto the stack before the interrupt service routine is entered; this is done by the hardware.
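The correspondence between push order and member order can be made concrete with offsetof. The sketch below is a model, not the kernel's definition: it replaces long and int with int32_t so that the 32-bit layout is reproduced on any host, and the names pt_regs_model and is_pushed_by_hardware are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the i386 struct pt_regs with fixed 4-byte members (assumption:
 * on i386 both long and int are 4 bytes wide). */
struct pt_regs_model {
    /* pushed by SAVE_ALL; ebx is pushed last, so it has the lowest address */
    int32_t ebx, ecx, edx, esi, edi, ebp, eax;
    int32_t xds, xes;
    /* pushed by the interrupt entry stub (pushl $i-256) */
    int32_t orig_eax;
    /* pushed by the hardware before the handler is entered */
    int32_t eip, xcs, eflags, esp, xss;
};

/* Compile-time checks of the layout described in the text. */
_Static_assert(offsetof(struct pt_regs_model, ebx) == 0,
               "ebx, the last value pushed, sits at the top of the stack");
_Static_assert(offsetof(struct pt_regs_model, orig_eax) == 36,
               "orig_eax follows the nine words pushed by SAVE_ALL");
_Static_assert(sizeof(struct pt_regs_model) == 60,
               "fifteen 4-byte words in total");

/* Returns 1 if the member at the given byte offset was pushed by hardware. */
static int is_pushed_by_hardware(size_t offset)
{
    return offset >= offsetof(struct pt_regs_model, eip);
}
```

Because the stack grows downward, the value pushed last ends up at the lowest address, which is why a pointer to the top of the stack can be used directly as a pointer to the structure.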

As can be seen from the above, on the IA32 platform parameters are passed through the stack frame rather than through registers. This is not specific to the Linux operating system; it holds on all IA32 platforms. Whatever the operating system and whatever the compiler, the conventions described earlier have to be followed.

Below, still using interrupts as the example, we look at how registers are protected in the kernel.

48 fastcall unsigned int do_IRQ(struct pt_regs *regs)

49 {

......

73 #ifdef CONFIG_4KSTACKS

......

92 asm volatile(

93 " xchgl %%ebx,%%esp \n"

94 " call __do_IRQ \n"

95 " movl %%ebx,%%esp \n"

96 : "=a" (arg1), "=d" (arg2), "=b" (ebx)

97 : "0" (irq), "1" (regs), "2" (isp)

98 : "memory", "cc", "ecx"

99 );

......

101 #endif

In the code above, the first colon introduces the output operands and the second colon introduces the input operands. The third colon introduces the list of things that the assembly code may destroy and whose values therefore need to be preserved. This code calls the __do_IRQ() function. Because ecx is listed there, the compiler knows that its value may be lost across the call and preserves anything it still needs from ecx. The __do_IRQ() function can then use the ecx register freely.

The example also shows that the output value of a function is normally stored in register eax. The "=a" after the first colon binds the variable arg1 to eax, that is, arg1 = eax.
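The three colon-separated sections can be tried out on a much smaller example. The following is a sketch using GCC extended asm on x86; the function add_via_eax is hypothetical, and the plain C fallback for other compilers or architectures is an addition made so that the example builds everywhere.

```c
/* Adds two integers, forcing the result into eax with the "a" constraint.
 * Falls back to plain C where GCC-style x86 inline assembly is unavailable. */
static int add_via_eax(int x, int y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int result;
    __asm__ ("addl %2, %0"
             : "=a" (result)  /* outputs: result is bound to eax          */
             : "0" (x),       /* inputs: x starts in the same place as %0 */
               "r" (y)        /*         y in any general register        */
             : "cc");         /* clobbers: the condition codes change     */
    return result;
#else
    return x + y;
#endif
}
```

As in the kernel listing, the "=a" output constraint ties the C variable to eax, the "0" input constraint reuses the location of operand 0, and the clobber list tells the compiler what the instruction modifies beyond its declared operands.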

Fifth, case analysis

Not long ago, while browsing the 9CBS forum, I found many people in a heated discussion about the following code.

#include <stdio.h>

low_to_up(char in);

void main()

{

printf("%c\n", low_to_up('d'));

}

low_to_up(char in)

{

char ch;

if (in >= 'a' && in <= 'z')

ch = in - 'a' + 'A';

else

return (ch);

}

We first disassemble this code in VC 6.0:

1: #include <stdio.h>

2: low_to_up(char in);

3:

4: void main()

5: {

00401020 push ebp

00401021 mov ebp,esp

00401023 sub esp,40h

00401026 push ebx

00401027 push esi

00401028 push edi

00401029 lea edi,[ebp-40h]

0040102C mov ecx,10h

00401031 mov eax,0CCCCCCCCh

00401036 rep stos dword ptr [edi]

6: printf("%c\n", low_to_up('d'));

00401038 push 64h # ASCII code of 'd' (mark 1)

0040103A call @ILT+5(low_to_up) (0040100a)

0040103F add esp,4

00401042 push eax # (mark 5)

# At this point the value of EAX is pushed onto the stack and printf is then called, so "d" is printed.

00401043 push offset string "%c\n" (0042001c)

00401048 call printf (004010e0)

0040104D add esp,8

7: }

00401050 pop edi

00401051 pop esi

00401052 pop ebx

00401053 add esp,40h

00401056 cmp ebp,esp

00401058 call __chkesp (00401160)

0040105D mov esp,ebp

0040105F pop ebp

00401060 ret

8:

9: low_to_up(char in)

10: {

00401080 push ebp

Stack contents at this point (higher addresses first):

...... (contents of the previous stack frame)

64h

return address

EBP

(ch)

...... (contents of the current stack frame)

[EBP+8] holds 64h here, and EAX is 64h.

00401081 mov ebp,esp

00401083 sub esp,44h

00401086 push ebx

00401087 push esi

00401088 push edi

00401089 lea edi,[ebp-44h]

0040108C mov ecx,11h

00401091 mov eax,0CCCCCCCCh

00401096 rep stos dword ptr [edi]

11: char ch;

12: if (in >= 'a' && in <= 'z')

00401098 movsx eax,byte ptr [ebp+8] # (mark 2)

0040109C cmp eax,61h

0040109F jl low_to_up+36h (004010b6)

004010A1 movsx ecx,byte ptr [ebp+8]

004010A5 cmp ecx,7Ah

004010A8 jg low_to_up+36h (004010b6)

13: ch = in - 'a' + 'A';

004010AA movsx edx,byte ptr [ebp+8] # (mark 3)

004010AE sub edx,20h

004010B1 mov byte ptr [ebp-4],dl

# At this point EDX holds the converted value, but the value of EAX is still unchanged. The result is stored at [ebp-4], that is, in ch.

14: else

004010B4 jmp low_to_up+3Ah (004010ba)

15: return (ch);

004010B6 movsx eax,byte ptr [ebp-4]

16: }

004010BA pop edi # restore the register values and return (mark 7)

004010BB pop esi

004010BC pop ebx

004010BD mov esp,ebp

004010BF pop ebp

004010C0 ret

The assembly above shows the callee-save registers at work: the called function does not use EBX, ESI or EDI directly, but pushes them on entry and pops them before returning. It also shows that EAX is normally used to hold the return value. Because the return value is taken from EAX, EAX is pushed directly at mark 5, and the program prints "d" under VC 6.0. The reason is that low_to_up() borrows the EAX register during the comparison (mark 2), which leaves 64h in it. In a correct program the if branch would end with a proper return statement, so that the correct value is stored in EAX before the code at mark 7 runs. Here the if branch jumps straight to mark 7 instead, so the wrong value is printed when printf runs. Could this be a bug in the VC compiler?

The GCC compiler seems somewhat more sophisticated, and its output is more satisfactory. Let's analyze the assembly code generated by GCC (reading this part requires an understanding of GCC assembly syntax):

.file "9cbs.c"

.text

Stack contents at this point (higher addresses first):

...... (contents of the previous stack frame)

100

return address

EBP

......

ESP-> (ch)

...... (contents of the current stack frame)

8(%ebp) holds 100 here, and ch is stored at -8(%ebp).

.globl low_to_up

.type low_to_up, @function

low_to_up:

pushl %ebp

movl %esp, %ebp

subl $8, %esp

movl 8(%ebp), %eax # (mark 2)

movb %al, -1(%ebp)

cmpb $96, -1(%ebp)

jle .L2

cmpb $122, -1(%ebp)

jg .L2

movzbl -1(%ebp), %eax

subb $32, %al

# The body of the if statement has run by this point, but the result is not assigned to ch

movb %al, -2(%ebp) # (mark 3)

jmp .L3

.L2:

movsbl -2(%ebp), %eax

movl %eax, -8(%ebp) # (mark 5)

jmp .L1

.L3:

.L1:

# Fetch the value from ch, store it in EAX, then return

movl -8(%ebp), %eax # (mark 4)

leave

ret

.size low_to_up, .-low_to_up

.section .rodata

.LC0:

.string "%c\n"

.text

.globl main

.type main, @function

main:

pushl %ebp

movl %esp, %ebp

subl $8, %esp

andl $-16, %esp

movl $0, %eax

subl %eax, %esp

movl $100, (%esp) # put the value of 'd' on the stack, then call low_to_up() (mark 1)

call low_to_up

movl %eax, 4(%esp) # (mark 6)

movl $.LC0, (%esp)

call printf

movl $0, %eax

leave

ret

.size main, .-main

.section .note.GNU-stack,"",@progbits

.ident "GCC: (GNU) 3.3.5 (Debian 1:3.3.5-13)"

As the code above shows, GCC and VC handle function calls in a similar way. However, GCC places the local variable ch at EBP-8 and uses EBP-1 and EBP-2 as temporary storage. At mark 5, in the else branch, the compiler loads a value into EAX and then stores EAX at EBP-8, that is, in ch. On return, the value of ch is fetched from EBP-8 and assigned to EAX (mark 4), and the function returns. At mark 6 the value of EAX is placed on the stack and printf is called to print it. Because the if branch never stores its result in ch, when low_to_up returns through .L1 whatever happens to be in ch is placed in EAX. Under Linux, therefore, the program prints an unpredictable value.

The author guessed that if the same two instructions were placed at .L3, the correct value would be obtained. The author then added a return statement to the if branch of the source code, and the code generated by GCC confirmed the guess. The code is as follows:

low_to_up:

pushl %ebp

movl %esp, %ebp

subl $8, %esp

movl 8(%ebp), %eax

movb %al, -1(%ebp)

cmpb $96, -1(%ebp)

jle .L2

cmpb $122, -1(%ebp)

jg .L2

movzbl -1(%ebp), %eax

subb $32, %al

movb %al, -2(%ebp)

jmp .L3

.L2:

movsbl -2(%ebp), %eax

movl %eax, -8(%ebp)

jmp .L1

.L3:

movsbl -2(%ebp), %eax

movl %eax, -8(%ebp)

.L1:

movl -8(%ebp), %eax

leave

ret
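For comparison, here is one way the source could be written so that every path returns a defined value. This is a sketch, not the author's exact modification (which is not shown in the article): it initializes ch and uses a single return, and it also declares the return type explicitly.

```c
/* Converts a lowercase ASCII letter to uppercase.
 * Any other character is returned unchanged. */
static char low_to_up(char in)
{
    char ch = in; /* defined on every path, unlike the original */

    if (in >= 'a' && in <= 'z')
        ch = (char)(in - 'a' + 'A');

    return ch; /* the result reaches the return register on both branches */
}
```

With this version the result no longer depends on what a particular compiler happens to leave in EAX, so the same output is expected from both VC and GCC.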

Sixth, summary

In summary, once we understand the rules of function calls and combine them with the assembly code, we can track down some of the puzzling problems that arise in real project development. Avoid non-standard code in a project, because it makes the project harder to port and leaves you asking why a program that works on one platform misbehaves on another.

Seventh, references

[1] Andrew W. Appel, "Modern Compiler Implementation in C", Chinese translation by Zhao Kejia et al., People's Posts and Telecommunications Publishing House, 2006

[2] Randal E. Bryant, "Computer Systems: A Programmer's Perspective", Chinese translation by Gong Yuli et al., China Electric Power Press, 2004

[3] Agner Fog, "How to Optimize for the Pentium Family of Microprocessors", Chinese translation by Yunfeng, http://www.codingnow.com/2000/download/cpendopt.htm

[4] Linus Torvalds, Linux kernel source code (version 2.6.10), http://www.kernel.org
