The Underlying Mechanism of Function Calls

xiaoxiao2021-03-19  223

Keywords: function call, calling convention, C language implementation


Author: apf :: detrox

This is an introductory article on how function calls are implemented in the C language, written for readers who are curious about what C does at the lowest level. If you are a veteran of C, compilers, or low-level techniques and have no interest in this question, this article will only waste your time and you need not read it. Of course, if experienced readers are willing to point out my mistakes, I will be very grateful for the guidance and apologize for taking up your valuable time.

OK, enough chatter! To study this problem, let's first open VC++, preferably version 6.0 :-P. (What, you don't have VC++? Go and get it, quickly!) First, create a Win32 Console Application project in VC++, add a main file named fun.c, and enter the following.

    int fun(int a, int b) {
        a = 0x4455;
        b = 0x6677;
        return a + b;
    }

    int main() {
        fun(0x8899, 0x1100);
        return 0;
    }

The most important step is to turn off optimization in the project settings: set Project -> Settings -> C/C++ -> Optimizations to Disabled. Compiler optimizations get in the way when you are analyzing the low-level implementation. Press F10 to enter single-step debugging mode (Step Over). Do you see the small yellow arrow to the left of your main function? It marks the statement that is about to be executed. Press Alt+8 to open the Disassembly window and look at the assembly. Does it look like this?

    ==> 00401078 push 1100h
        0040107D push 8899h
        00401082 call @ILT+5(fun) (0040100A)
        00401087 add esp,8

Did you see the two push instructions? Look at the numbers after them: aren't those the arguments we want to pass? Strange, though: we clearly wrote 0x8899 first, so why is 1100h pushed first? This behavior is determined by the calling convention. What exactly that is, I will explain in detail later, so don't worry. The call instruction that follows starts the function call. Next, close the Disassembly window and, in the source window, press F11 (Step Into) to enter the function body. When the small yellow arrow points to the function name, open the Disassembly window again (Alt+8). You will see the following code:

    1: int fun(int a, int b) {
    00401000 push ebp
    00401001 mov ebp,esp
    00401003 sub esp,40h
    00401006 push ebx
    00401007 push esi
    00401008 push edi
    00401009 lea edi,[ebp-40h]
    0040100C mov ecx,10h
    00401011 mov eax,0CCCCCCCCh
    00401016 rep stos dword ptr [edi]
    2: a = 0x4455;
    00401018 mov dword ptr [ebp+8],4455h
    3: b = 0x6677;
    0040101F mov dword ptr [ebp+0Ch],6677h
    4: return a + b;
    00401026 mov eax,dword ptr [ebp+8]
    00401029 add eax,dword ptr [ebp+0Ch]
    5: }
    0040102C pop edi
    0040102D pop esi
    0040102E pop ebx
    0040102F mov esp,ebp
    00401031 pop ebp
    00401032 ret

VC++ is kind enough to interleave the C source with the disassembly, which makes the otherwise hard-to-read assembly easier to follow. Still, much of this code does not matter to us, so you only need to pay attention to the key instructions (marked in red in the original), which are discussed below.

Surprised? Weren't the arguments passed with push? Why are they never fetched with pop? Here is what actually happens. When you use call to enter a function, call quietly does one more thing: it pushes the address of the next instruction onto the stack. (Aside: What? Why?) The reason is simple: when the function finishes, it uses ret to return. How does ret know where to return to? Exactly: ret pops the address that call pushed (make sure this relationship is clear) and then jumps to it. call and ret cooperate, one push matched by one pop, so between them the stack stays balanced. Now you can see what would happen if you executed pop eax at the start of the function: EAX would receive the return address that ret needs. In other words, pop eax would take away something ret depends on, with no regard for program flow or moral standards :-P.

So how can the function use its parameters? That is not hard: since the parameters are on the stack, we could access them through ESP (the stack pointer). However, as you have probably realized, ESP changes all the time. Every push or pop inside the function changes it, which makes it hard to locate the parameters in memory. We therefore need a reference point that does not change as the basis for accessing the parameters.
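The cooperation between call and ret can be modeled in a few lines of C. This is only a toy sketch, not how the processor is implemented: the `Machine` struct and the names `push32`, `pop32`, `sim_call`, and `sim_ret` are invented for illustration, and the stack is just a small array addressed by byte offset.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64u   /* size of the toy stack (an assumption) */

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* stack memory as 4-byte slots */
    uint32_t esp;                  /* byte offset of the top of the stack */
    uint32_t eip;                  /* address of the current instruction */
} Machine;

static void machine_init(Machine *m, uint32_t eip) {
    m->esp = STACK_BYTES;  /* empty stack: esp starts at the highest address */
    m->eip = eip;
}

static void push32(Machine *m, uint32_t value) {
    assert(m->esp >= 4);
    m->esp -= 4;                  /* the stack grows toward lower addresses */
    m->mem[m->esp / 4] = value;
}

static uint32_t pop32(Machine *m) {
    assert(m->esp + 4 <= STACK_BYTES);
    uint32_t value = m->mem[m->esp / 4];
    m->esp += 4;
    return value;
}

/* call: push the address of the instruction after the call, then jump. */
static void sim_call(Machine *m, uint32_t target, uint32_t call_length) {
    push32(m, m->eip + call_length);
    m->eip = target;
}

/* ret: pop the saved return address and jump back to it. */
static void sim_ret(Machine *m) {
    m->eip = pop32(m);
}
```

With `eip` set to 0x00401082 and a 5-byte call instruction, `sim_call` saves 0x00401087 on the toy stack, and `sim_ret` brings `eip` back to that address while leaving `esp` where it started.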
Look at the beginning of the function body:

    00401000 push ebp
    00401001 mov ebp,esp

First, push ebp saves the original value of EBP, and then the value of ESP is copied into EBP. So EBP is what serves as the reference point; no wonder it is called EBP (Base Pointer). Naturally, the pop ebp just before ret restores the original value of EBP. It has to be restored, because a function can call other functions and every function uses EBP, so each one must hand it back intact when it is done. Once the function has executed mov ebp, esp, the stack should look like this.

    /-------------------------/  Higher address
    | Parameter 2: 0x1100     |
    |-------------------------|
    | Parameter 1: 0x8899     |
    |-------------------------|
    | Function return address |
    | 0x00401087              |
    |-------------------------|
    | EBP (saved value)       |
    /-------------------------/  Lower address  <== stack pointer and EBP both point here

The int type in VC++ is a 32-bit type, and EBP and the function return address are also 32 bits, so each item occupies 4 bytes. Also note that the stack grows toward lower addresses. From this we can work out that the first parameter is at EBP+08h and the second parameter is at EBP+0Ch. Look at the disassembled code:

    2: a = 0x4455;
    00401018 mov dword ptr [ebp+8],4455h
    3: b = 0x6677;
    0040101F mov dword ptr [ebp+0Ch],6677h

This matches our calculation. Then:

    00401031 pop ebp
    00401032 ret

With the original value of EBP handed back intact, the ret instruction executes: ret pops the return address and then returns to the instruction that follows the call which invoked the function. After ret, the stack should look like this.

    /-------------------------/  Higher address
    | Parameter 2: 0x1100     |
    |-------------------------|
    | Parameter 1: 0x8899     |
    /-------------------------/  Lower address  <== stack pointer
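The whole sequence, from pushing the arguments to the state after ret, can be replayed on a toy stack to check the EBP+8 and EBP+0Ch offsets. The sketch below is a simplified model under stated assumptions: `Cpu`, `slot`, `enter_fun`, `param`, and `leave_fun` are hypothetical names, the local-variable area and the saved EBX/ESI/EDI are left out, and 0x0012FF80 is an arbitrary stand-in for the caller's EBP.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64u   /* size of the toy stack (an assumption) */

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* stack memory as 4-byte slots */
    uint32_t esp;                  /* byte offset of the stack top */
    uint32_t ebp;                  /* frame base, or the caller's saved value */
} Cpu;

static void cpu_init(Cpu *c) {
    c->esp = STACK_BYTES;
    c->ebp = 0x0012FF80u;  /* arbitrary stand-in for the caller's EBP */
}

/* Access the 4-byte slot at a byte offset inside the toy stack. */
static uint32_t *slot(Cpu *c, uint32_t addr) {
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return &c->mem[addr / 4];
}

static void push32(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    *slot(c, c->esp) = v;
}

static uint32_t pop32(Cpu *c) {
    uint32_t v = *slot(c, c->esp);
    c->esp += 4;
    return v;
}

/* Caller side plus prologue: push b, push a, call, push ebp, mov ebp,esp. */
static void enter_fun(Cpu *c, uint32_t a, uint32_t b, uint32_t return_addr) {
    push32(c, b);            /* arguments go right to left */
    push32(c, a);
    push32(c, return_addr);  /* pushed by call */
    push32(c, c->ebp);       /* push ebp */
    c->ebp = c->esp;         /* mov ebp, esp */
}

/* Parameter 0 lives at [ebp+8], parameter 1 at [ebp+0Ch]. */
static uint32_t *param(Cpu *c, uint32_t index) {
    return slot(c, c->ebp + 8u + 4u * index);
}

/* Epilogue: mov esp,ebp; pop ebp; ret. Returns the popped return address. */
static uint32_t leave_fun(Cpu *c) {
    c->esp = c->ebp;
    c->ebp = pop32(c);
    return pop32(c);
}
```

After `leave_fun`, `esp` sits 8 bytes below its starting value: the two arguments are still on the stack, which is exactly the situation shown in the diagram above.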

Haha, a problem has appeared: after the function returns, the stack is unbalanced. What can we do? Well, two pop ecx instructions in a row would rebalance it. Fortunately we have only two parameters; with 20 parameters we would need 20 pop ecx instructions. Never mind how ugly that looks, the program would also be less efficient. So VC++ solves the problem this way:

    00401082 call @ILT+5(fun) (0040100A)
    00401087 add esp,8

Look at the add esp,8 instruction after the call: it adds 8 directly to ESP, so the stack becomes

    /-------------------------/  Higher address  <== stack pointer
    | Parameter 2: 0x1100     |
    |-------------------------|
    | Parameter 1: 0x8899     |
    /-------------------------/  Lower address
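The trade-off between popping each argument and adjusting ESP once can be made concrete with a little arithmetic. The two helpers below are hypothetical and only count instructions in a simplified model where every argument occupies 4 bytes; they do not measure real execution cost.

```c
#include <assert.h>
#include <stdint.h>

/* Cleanup by popping: one pop per 4-byte argument. */
static uint32_t cleanup_with_pops(uint32_t esp, unsigned argc,
                                  unsigned *instructions) {
    assert(instructions != 0);
    for (unsigned i = 0; i < argc; i++) {
        esp += 4u;          /* each pop moves esp up by 4 bytes */
        (*instructions)++;
    }
    return esp;
}

/* Cleanup by a single add: add esp, 4 * argc. */
static uint32_t cleanup_with_add(uint32_t esp, unsigned argc,
                                 unsigned *instructions) {
    assert(instructions != 0);
    esp += 4u * argc;       /* one instruction regardless of argc */
    (*instructions)++;
    return esp;
}
```

Both helpers leave `esp` at the same value; for 20 arguments the first counts 20 instructions and the second counts 1.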

Cleaning up the stack fundamentally comes down to changing ESP. (push and pop themselves also work, in essence, by changing ESP.) By now you understand how a function receives its arguments, how it is called, and how it returns. The next question is how a function passes back its return value. I believe you noticed this long ago:

    4: return a + b;
    00401026 mov eax,dword ptr [ebp+8]
    00401029 add eax,dword ptr [ebp+0Ch]

As you can see, the function's return value is stored in the EAX register. To use the return value, read EAX right after the function returns. As for why it is not EBX, ECX, and so on: although nothing strictly forces the choice, everyone is used to EAX, and Windows programs explicitly require the return value to be placed in EAX.

OK, now let's settle the leftover question of what a calling convention is. If you think about it carefully, you will surely ask: why must arguments be passed on the stack? Couldn't they be passed in registers, which would be faster? The arguments do not have to be passed from back to front either; passing them from front to back would cause no problems. And why wait until after the function has returned to fix the stack balance? Couldn't the stack be balanced before the function returns? All of these proposals are entirely feasible, and their different combinations give rise to different ways of calling a function: stdcall, pascal, fastcall, WINAPI, cdecl, and others that you have often seen or heard of. These different ways of handling a function call are called calling conventions. By default, C uses cdecl, the method described above: arguments are pushed from right to left, and the caller is responsible for restoring the stack balance. If we add __stdcall in front of fun in the program we just wrote and analyze it with the same method, we get:

    8: fun(0x8899, 0x1100);
    00401058 push 1100h          ; <== arguments are still passed from right to left
    0040105D push 8899h
    00401062 call fun (00401000) ; <== no add esp,08h follows
    1: int __stdcall fun(int a, int b) {
    00401000 push ebp
    00401001 mov ebp,esp
    00401003 sub esp,40h
    00401006 push ebx
    00401007 push esi
    00401008 push edi
    00401009 lea edi,[ebp-40h]
    0040100C mov ecx,10h
    00401011 mov eax,0CCCCCCCCh
    00401016 rep stos dword ptr [edi]
    2: a = 0x4455;
    00401018 mov dword ptr [ebp+8],4455h
    3: b = 0x6677;
    0040101F mov dword ptr [ebp+0Ch],6677h
    4: return a + b;
    00401026 mov eax,dword ptr [ebp+8]
    00401029 add eax,dword ptr [ebp+0Ch]
    5: }
    0040102C pop edi
    0040102D pop esi
    0040102E pop ebx
    0040102F mov esp,ebp
    00401031 pop ebp
    00401032 ret 8               ; <== after popping the return address, ret adds 8 to ESP
                                 ; Look! The stack is balanced inside the function
                                 ; The operand of ret exists specifically for balancing the stack inside the function

In conclusion, stdcall is a calling convention in which arguments are passed from right to left and the called function restores the stack. Other keywords that select a calling convention include __pascal, __fastcall, and WINAPI (available after including windows.h). You can now analyze their characteristics yourself with the method described above.
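In source code, choosing between the two conventions is only a matter of where the keyword is written. The sketch below assumes a Microsoft compiler for the keywords to have any effect; the macros `CALL_CDECL` and `CALL_STDCALL` and the helper `esp_after_ret` are hypothetical names introduced here so the same file also builds with other compilers, where the macros expand to nothing. The helper models only the stack-pointer arithmetic of ret and ret n.

```c
#include <assert.h>
#include <stdint.h>

/* __cdecl and __stdcall are Microsoft keywords that matter on 32-bit x86.
   For other compilers this sketch defines them away (an assumption made
   purely so the example stays portable). */
#if defined(_MSC_VER)
#  define CALL_CDECL   __cdecl
#  define CALL_STDCALL __stdcall
#else
#  define CALL_CDECL
#  define CALL_STDCALL
#endif

/* Same body as the article's fun; the caller cleans up the arguments. */
static int CALL_CDECL fun_cdecl(int a, int b) {
    a = 0x4455;
    b = 0x6677;
    return a + b;
}

/* Same body again; the callee cleans up the arguments with ret 8. */
static int CALL_STDCALL fun_stdcall(int a, int b) {
    a = 0x4455;
    b = 0x6677;
    return a + b;
}

/* Model of ret n: pop the 4-byte return address, then add n to esp.
   Plain ret corresponds to n == 0. */
static uint32_t esp_after_ret(uint32_t esp, uint32_t n) {
    return esp + 4u + n;
}
```

Both functions return the same value. In the model, `esp_after_ret(esp, 0) + 8` (ret followed by the caller's add esp,8) equals `esp_after_ret(esp, 8)` (ret 8), so the two conventions end with the same stack pointer and differ only in who performs the cleanup.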

When reposting, please cite the original address: https://www.9cbs.com/read-130202.html
