Keywords: function call, calling convention, C language implementation
The underlying mechanism of function calls: apf :: detrox
This article explains how function calls are implemented in the C language. It is an entry-level article for readers who are curious about the low-level behavior of C. If you are an old hand at C, compilers, or low-level technology and this question does not interest you, this article will only waste your time and you do not need to read it. Of course, if any veterans are willing to point out my mistakes, I will be very grateful for the guidance, and I apologize for taking up your precious time. OK, enough chatter! To study this problem, let's first open VC++, preferably 6.0 :-P. (What, you don't have VC++? Go install it, quick!) First, create a Win32 Console Application project in VC++, add a main file named fun.c, and enter the following.
int fun(int a, int b) {
    a = 0x4455;
    b = 0x6677;
    return a + b;
}
int main() {
    fun(0x8899, 0x1100);
    return 0;
}
The most important step is to turn off optimization in the project settings: set Project -> Settings -> C/C++ -> Optimizations to Disable. Compiler optimizations are unwelcome when you are analyzing the underlying implementation. Press F10 to enter single-step debugging (Step Over). See the small yellow arrow to the left of your main function? It marks the statement that is about to execute. Press Alt+8 to open the disassembly window and look at the assembly statements. Do they look like this?
==> 00401078 push 1100h
    0040107D push 8899h
    00401082 call @ILT+5(fun) (0040100A)
    00401087 add esp,8
Do you see the two PUSH instructions? Look at the numbers after them: aren't they the arguments we are passing? Strange, isn't it? We clearly wrote 0x8899 first, so why is 1100h pushed first? This behavior is determined by the calling convention. I will explain in detail what that is later, so don't worry. The CALL instruction that follows starts the function call. Next, close the disassembly window and press F11 (Step Into) in the source window to enter the function body. When the small yellow arrow points to the function name, open the disassembly window again (Alt+8). You will see the following code:
1: int fun(int a, int b) {
00401000 push ebp
00401001 mov ebp,esp
00401003 sub esp,40h
00401006 push ebx
00401007 push esi
00401008 push edi
00401009 lea edi,[ebp-40h]
0040100C mov ecx,10h
00401011 mov eax,0CCCCCCCCh
00401016 rep stos dword ptr [edi]
2: a = 0x4455;
00401018 mov dword ptr [ebp+8],4455h
3: b = 0x6677;
0040101F mov dword ptr [ebp+0Ch],6677h
4: return a + b;
00401026 mov eax,dword ptr [ebp+8]
00401029 add eax,dword ptr [ebp+0Ch]
5: }
0040102C pop edi
0040102D pop esi
0040102E pop ebx
0040102F mov esp,ebp
00401031 pop ebp
00401032 ret
VC is thoughtful enough to show the C source code before the corresponding assembly statements. However, much of this code is not needed for our purposes, so you only need to care about a few key instructions (marked in red in the original listing). Surprised? Weren't the parameters passed with PUSH? Why are they never taken off with POP? Here is what actually happens. When CALL enters the function, it quietly does one more thing for you: it pushes the address of the next instruction onto the stack. (Aside: What? Why?) The reason is simple: when the function is finished, it uses RET to return. How does RET know where to return to? Right: RET pops the address that CALL pushed and then jumps to it. CALL and RET work as a pair, one push and one pop, so between them the stack stays balanced. Now you can see what would happen if the function started with POP EAX. What would end up in EAX? The return address that RET needs, of course. A POP EAX would take away something that someone else still needs, which is unacceptable in terms of both program flow and moral standards :-P.

But then how does the function use its parameters? It is not hard: since the parameters are on the stack, we can use ESP (the stack pointer) to access them. However, as you have probably realized, ESP is a restless value: any PUSH or POP inside the function changes it, which makes it hard to locate the parameters in memory. So we need a reference point that does not change while the function runs. Look at the beginning of the function body:
00401000 push ebp
00401001 mov ebp,esp
PUSH EBP first saves the original value of EBP, and then the value of ESP is copied into EBP. So EBP is the reference point; no wonder it is called EBP (Base Pointer). Naturally, the POP EBP just before RET restores the original value of EBP. It has to be restored because functions can call other functions: every function uses EBP, so each one must hand it back intact when it is finished. Once execution has passed MOV EBP, ESP, the stack at ESP looks like this.
/-------------------------/ Higher Address
| Parameter 2: 0x1100     |
---------------------------
| Parameter 1: 0x8899     |
---------------------------
| Function Return Address |
| 0x00401087              |
---------------------------
| EBP (original value)    |
/-------------------------/ Lower Address <== ESP and EBP both point here now
Since the int type in VC is 32 bits wide, and EBP and the function return address are also 32 bits, each of these items occupies 4 bytes. Note also that the stack grows toward lower addresses. From this we can work out that the first parameter is at EBP+08h and the second parameter is at EBP+0Ch. Look at the disassembled code:
2: a = 0x4455;
00401018 mov dword ptr [ebp+8],4455h
3: b = 0x6677;
0040101F mov dword ptr [ebp+0Ch],6677h
This matches our calculation. Then:
00401031 pop ebp
00401032 ret
The original value of EBP is handed back intact, and then RET is executed. RET pops the return address and jumps to the instruction that follows the CALL. After RET, the stack looks like this.
/-------------------------/ Higher Address
| Parameter 2: 0x1100     |
---------------------------
| Parameter 1: 0x8899     |
/-------------------------/ Lower Address <== Stack Pointer
Ha, here is the problem: after the function returns, the stack is unbalanced. What to do? Well, two POP ECX instructions in a row would rebalance it. Fortunately we have only two parameters; with 20 parameters we would need 20 POP ECX instructions. Never mind how ugly that looks, it would also be inefficient. So VC solves the problem this way:
00401082 call @ILT+5(fun) (0040100A)
00401087 add esp,8
Look at the ADD ESP, 8 statement (shown in red in the original): it adds 8 directly to ESP, so the stack becomes
/-------------------------/ Higher Address <== Stack Pointer
| Parameter 2: 0x1100     |
---------------------------
| Parameter 1: 0x8899     |
/-------------------------/ Lower Address
Balancing the stack ultimately comes down to changing ESP. (PUSH and POP themselves work by changing ESP.) Now you understand how a function receives its parameters, how it is called, and how it returns. The next question is how a function passes back its return value. I believe you noticed it long ago.
4: return a + b;
00401026 mov eax,dword ptr [ebp+8]
00401029 add eax,dword ptr [ebp+0Ch]
As you can see, the function's return value is kept in the EAX register. To use the return value, the caller must read EAX right after the function returns. As for why EAX rather than EBX, ECX, and so on: even where it is not strictly required, using EAX is the established habit, and Windows programs explicitly require a function's return value to be placed in EAX.

OK, now let's settle what this historical legacy called the calling convention actually is. If you think about it carefully, you will surely ask: why must parameters be passed on the stack? Couldn't they be passed in registers, which would be faster? The parameters do not have to be passed from right to left either; passing them from left to right would work just as well. And why wait until after the function returns to balance the stack? Couldn't the stack be balanced before the function returns? All of these proposals are perfectly feasible, and their different combinations produce the different ways of calling a function: the stdcall, pascal, fastcall, WINAPI, cdecl, and so on that you have often seen or heard of. These different ways of handling a function call are called calling conventions. By default, C uses cdecl, which is what was described above: parameters are passed from right to left, and the caller is responsible for balancing the stack. If we add __stdcall before the function in our program and analyze it with the same method, we get:
8: fun(0x8899, 0x1100);
00401058 push 1100h ; <== parameters are still passed from right to left
0040105D push 8899h
00401062 call fun (00401000)
                    ; <== there is no add esp,8 here
1: int __stdcall fun(int a, int b) {
00401000 push ebp
00401001 mov ebp,esp
00401003 sub esp,40h
00401006 push ebx
00401007 push esi
00401008 push edi
00401009 lea edi,[ebp-40h]
0040100C mov ecx,10h
00401011 mov eax,0CCCCCCCCh
00401016 rep stos dword ptr [edi]
2: a = 0x4455;
00401018 mov dword ptr [ebp+8],4455h
3: b = 0x6677;
0040101F mov dword ptr [ebp+0Ch],6677h
4: return a + b;
00401026 mov eax,dword ptr [ebp+8]
00401029 add eax,dword ptr [ebp+0Ch]
5: }
0040102C pop edi
0040102D pop esi
0040102E pop ebx
0040102F mov esp,ebp
00401031 pop ebp
00401032 ret 8 ; <== RET pops the return address,
               ; then adds 8 to ESP. Look! The stack is balanced inside the function.
               ; This form of the RET instruction exists specifically so that a function
               ; can balance the stack itself.
In conclusion, stdcall is a calling convention in which parameters are passed from right to left and the called function restores the stack. The keywords for several other calling conventions are __pascal, __fastcall, and WINAPI (available once windows.h is included). Now you can analyze their characteristics yourself with the method described above.