In-Depth Understanding of the C Language (1) [original] TANGL_99 2004-04-26
Code generated from C runs more efficiently than code generated from other high-level languages. Let's take a look at what the code produced by a C compiler actually looks like; after reading this article you should understand C one step better. The article studies C through small, real example programs.

Case study one

Tools: Turbo C v2.0, DEBUG, MASM v5.0, NASM

Example C program:

    /* example1.c */
    char ch;
    int e_main()
    {
        e_putchar(ch);
    }

Topic: how C calls a function, and the details involved

The C compiler we use is the 16-bit Turbo C v2.0. It generates 16-bit code, which is relatively simple and therefore convenient to study. We also need DEBUG under DOS to disassemble the result. Because the program in this case is not a complete C program, TLINK from Turbo C will not produce the target program for us, so I use link.exe from MASM instead; exe2bin.com can then convert the .exe file into a .bin file.

This program has no main function; e_main takes its place. That way we avoid the series of extra processing code that C adds around main. Similarly, e_putchar() replaces the usual putchar(). The "e" stands for "example".

Without a main function our C program has no entry point, so before compiling the C code I have to write a few lines of simple assembly to serve as the entry of the program.

    ; start.asm: entry point of the C program
    [BITS 16]
    [GLOBAL start]
    [EXTERN _e_main]
    start:
        call _e_main

By C convention, as followed by Turbo C, every global symbol automatically gets an underscore "_" prefixed to its name. So the e_main function we wrote in C has to be called as _e_main from assembly. This assembly file contains a single instruction, call _e_main, which calls our e_main function in C. I will assemble it with NASM.
Generate start.obj:

    nasmw -f obj -o start.obj start.asm

Next we compile the C code with Turbo C, link the two object files, and convert the result to a binary image:

    tcc -mt -oexample1.obj -c example1.c
    link start.obj example1.obj, example1.exe,,,
    exe2bin example1.exe

This gives us the machine code file compiled from the C code (example1.bin). Now we use DEBUG, the old DOS tool, to disassemble it.

    debug
    -n example1.bin
    -l 0
    -u 0
    xxxx:0000    CALL    0003
    xxxx:0003    MOV     AX,000B
    xxxx:0006    PUSH    AX
    xxxx:0007    CALL    0020
    xxxx:000A    POP     CX

The four instructions from 0003 through 000A are the code of the entire C program. The first instruction, CALL 0003, is the code produced from start.asm by NASM. It is too simple to dwell on: it just calls the e_main function. Our main goal is to study the C code, and the instructions from 0003 onward are our e_main function. From the C source we can see that e_main does only one thing, a single call:

    e_putchar(ch);
Here ch is the parameter passed to e_putchar.

    MOV AX,000B

000B is the address at which our global variable ch is stored; C places all global variables in a separate memory area. The code first brings ch into AX and then pushes AX with PUSH AX, so the value of ch ends up on the stack.

Then comes CALL 0020, where 0020 is the address of the e_putchar code. With this jump the computer moves on to the e_putchar code and executes it. I do not show e_putchar in this case, because this case is only about how C passes parameters to another function, not about how e_putchar retrieves them. The next case studies how the parameters are retrieved.

The CALL instruction needs some explanation here, because otherwise it may confuse you in the next part, where we study how a function retrieves its parameters. CALL xxxx is simply equivalent to the following pair (written as pseudo-instructions, since IP cannot be pushed directly):

    push ip
    jmp  xxxx

CALL first pushes the return address, that is, the IP of the instruction following the CALL, onto the stack and then jumps to the target address. CALL and RET work as a pair: RET is equivalent to POP IP, which restores the execution address saved by the CALL. Because of this, as soon as a near CALL executes, the stack pointer SP automatically decreases by 2.

POP CX is an operation that must follow every function call that pushed an argument. It does no real work here; its only purpose is to balance the PUSH AX before CALL 0020, so that the stack pointer SP returns to its original value.

Good, the simple first case study is over. Although there are only four instructions, they show how C passes a parameter. In summary: "MOV AX, <parameter>" brings the parameter, found through its address, into AX; "PUSH AX" pushes it onto the stack; "CALL <function address>" transfers control to the called function; and finally "POP CX" restores the stack pointer SP.
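The sequence summarized above can be traced with a small C model of a 16-bit stack. This is only a sketch under stated assumptions: the Cpu struct and the helpers push16, pop16 and trace_call are hypothetical names invented for illustration, the memory size is arbitrary, and nothing here emulates a real 8086. Only the four instructions and the offsets 000A and 0020 come from the DEBUG listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers and stack memory involved. */
typedef struct {
    uint16_t sp;          /* stack pointer */
    uint16_t ip;          /* instruction pointer */
    uint16_t ax, cx;
    uint8_t  mem[0x200];  /* small memory area holding the stack */
} Cpu;

/* PUSH: SP drops by 2, then the 16-bit value is stored low byte first. */
void push16(Cpu *c, uint16_t v)
{
    assert(c->sp >= 2 && (size_t)c->sp <= sizeof c->mem);
    c->sp -= 2;
    c->mem[c->sp]     = (uint8_t)(v & 0xFF);
    c->mem[c->sp + 1] = (uint8_t)(v >> 8);
}

/* POP: the 16-bit value is read back, then SP rises by 2. */
uint16_t pop16(Cpu *c)
{
    uint16_t v;
    assert((size_t)c->sp + 2 <= sizeof c->mem);
    v = (uint16_t)(c->mem[c->sp] | (c->mem[c->sp + 1] << 8));
    c->sp += 2;
    return v;
}

/* Replays PUSH AX / CALL 0020 / RET / POP CX for one argument.
 * Stores SP as seen on entry to the callee in *sp_in_callee and
 * returns SP after the caller's cleanup. */
uint16_t trace_call(Cpu *c, uint16_t arg, uint16_t *sp_in_callee)
{
    c->ax = arg;
    push16(c, c->ax);        /* PUSH AX: the argument goes on the stack   */
    push16(c, 0x000A);       /* CALL 0020: push the return address ...    */
    c->ip = 0x0020;          /* ... then jump to the callee               */
    *sp_in_callee = c->sp;
    c->ip = pop16(c);        /* RET: pop the return address back into IP  */
    c->cx = pop16(c);        /* POP CX: discard the argument, restore SP  */
    return c->sp;
}
```

Starting from an arbitrary SP of 0100h, the model shows SP at 00FCh inside the callee (two bytes for the argument, two for the return address) and back at 0100h after POP CX, which is the balancing role described for POP CX above.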
Case study two

Tools: Turbo C v2.0, DEBUG, MASM v5.0, NASM, TASM

Example C program:

    /* example1.c */
    char ch;
    extern void e_putchar(char c);
    int e_main()
    {
        ch = 0x44;
        e_putchar(ch);
    }

Example assembly program:

    ; eio.asm
    _TEXT   segment byte public 'CODE'
    DGROUP  group   _TEXT
            assume  cs:_TEXT, ds:DGROUP, ss:DGROUP
            public  _e_putchar
    _e_putchar proc near
            push    bp
            mov     bp, sp
            mov     ah, 0Eh              ; BIOS teletype output
            mov     bx, 7h               ; BH = page 0, BL = color 7
            mov     al, byte ptr [bp+4]  ; the character argument
            int     10h
            pop     bp
            ret
    _e_putchar endp
    _TEXT   ends
            end

Topic: how a C function retrieves its parameters

In this section we use TASM to write a standard C function in assembly. You may have seen this material in many assembly books, in the chapters on linking C with assembly language. You may wonder why, with MASM and NASM already at hand, we bring in TASM as yet another assembler. I do not know whether MASM cooperates well with Turbo C, but TASM matches Turbo C completely. After all, both are Borland products, and the assembly code that Turbo C generates follows TASM syntax throughout. That alone shows how "intimate" Turbo C and TASM are.

In this case we mainly study not the C code but the C function written in assembly:

    push    bp
    mov     bp, sp
    mov     ah, 0Eh
    mov     bx, 7h
    mov     al, byte ptr [bp+4]
    int     10h
    pop     bp
    ret

Here byte ptr [bp+4] is the parameter value we passed to e_putchar(). From the previous case we already know that C passes a parameter to a function by pushing it onto the stack. So a standard C function reads its parameters by fetching values from the stack. Every standard C function begins with the same two lines:

    push    bp
    mov     bp, sp

The first saves the value of BP, and the second copies the current stack pointer into BP. All our access to the parameters passed to the function then goes through BP. The first parameter value is placed in BP