X86 Assembly Language Learning Notes (2)
Author: Badcoffee
Email: blog.oliver@gmail.com
November 2004
Original article: http://blog.9cbs.net/yayong
Copyright: When reposting, please be sure to indicate the original source of the article with a hyperlink, together with the author information and this statement. The author may correct errors at any time and publish the new version on his own blog site. Strictly speaking, this document focuses more on the C language and the C compiler. For basic assembly language, please refer to the relevant documentation.
Since X86 Assembly Language Learning Notes (1) was published on the author's blog, it has received a lot of affirmation and encouragement from readers. Enthusiastic readers also pointed out mistakes, and the author has corrected them in the version of the document on the blog.
The previous article introduced the following concepts by analyzing the simplest possible C program:
Stack frame and SFP (stack frame pointer)
Stack alignment
Calling convention and ABI (Application Binary Interface)
This article explores these concepts further through more experiments. If you are not yet familiar with them, please refer to X86 Assembly Language Learning Notes (1).
1. Stack allocation of local variables
The previous article analyzed the simplest possible C program. Below we analyze how the C compiler handles the allocation of local variables, using the following program:
# vi test2.c
int main()
{
    int i;
    int j = 2;
    i = 3;
    i = ++i;
    return i + j;
}
Compile the program to produce a binary, then use mdb to observe the state of the stack while it runs:
# gcc test2.c -o test2
# mdb test2
Loading modules: [ libc.so.1 ]
> main::dis
main:           pushl  %ebp
main+1:         movl   %esp,%ebp        ; main to main+1: create the stack frame
main+3:         subl   $8,%esp          ; allocate stack space for the local variables i and j, keeping the stack 16-byte aligned
main+6:         andl   $0xf0,%esp
main+9:         movl   $0,%eax
main+0xe:       subl   %eax,%esp        ; main+6 to main+0xe: keep the stack 16-byte aligned
main+0x10:      movl   $2,-8(%ebp)      ; initialize the local variable j to 2
main+0x17:      movl   $3,-4(%ebp)      ; assign 3 to the local variable i
main+0x1e:      leal   -4(%ebp),%eax    ; load the address of the local variable i into EAX
main+0x21:      incl   (%eax)           ; i++
main+0x23:      movl   -8(%ebp),%eax    ; load the value of j into EAX
main+0x26:      addl   -4(%ebp),%eax    ; i + j, result left in EAX as the return value
main+0x29:      leave                   ; tear down the stack frame
main+0x2a:      ret                     ; main returns
>
> main+0x10:b                           ; set a breakpoint at address main+0x10
> main+0x1e:b                           ; set a breakpoint at address main+0x1e
> main+0x29:b                           ; set a breakpoint at address main+0x29
> main+0x2a:b                           ; set a breakpoint at address main+0x2a
The following four mdb commands are entered on one line, separated by semicolons. The meaning of each command is given in the comments:
> :r ;                                  ; run the program (the :r command)
                                        ; next command on the line: starting at the address in ESP, dump 16 four-byte stack entries in the specified format
                                        ; last two commands on the line: print the values of the EBP and EAX registers
mdb: stop at main+0x10
mdb: target stopped at:
main+0x10:      movl   $2,-8(%ebp)      ; execution stops just before the instruction at main+0x10; the stack space is allocated but not yet initialized
0x8047db0:
0x8047db0:      0xddbebca0              ; variable j, 4 bytes, uninitialized; this is the top of the stack, ESP = 0x8047db0
0x8047db4:      0xddbe137f              ; variable i, 4 bytes, uninitialized
0x8047db8:      0x8047dd8               ; _start's SFP (_start's saved EBP), 4 bytes; main's SFP points here
0x8047dbc:      _start+0x5d             ; address of the instruction after _start's call to main; restored to EIP when main returns
0x8047dc0:      1
0x8047dc4:      0x8047de4
0x8047dc8:      0x8047dec
0x8047dcc:      _start+0x35
0x8047dd0:      _fini
0x8047dd4:      ld.so.1`atexit_fini
0x8047dd8:      0                       ; the word that _start's SFP points to is 0, which shows that _start is the entry point of the program
0x8047ddc:      0
0x8047de0:      1
0x8047de4:      0x8047eb4
0x8047de8:      0
0x8047dec:      0x8047eba
8047db8                                 ; current value of the EBP register in main, i.e. main's SFP
0                                       ; value of EAX, currently 0
> :c ;                                  ; continue running the program; print 16 stack entries and the contents of EBP and EAX
mdb: stop at main+0x1e
mdb: target stopped at:
main+0x1e:      leal   -4(%ebp),%eax    ; stopped at the breakpoint main+0x1e; the assignments to the local variables i and j are complete
0x8047db0:
0x8047db0:      2                       ; variable j, 4 bytes, value 2; this is the top of the stack, ESP = 0x8047db0
0x8047db4:      3                       ; variable i, 4 bytes, value 3
0x8047db8:      0x8047dd8               ; _start's SFP, 4 bytes
0x8047dbc:      _start+0x5d             ; the EIP for returning to _start
0x8047dc0:      1
0x8047dc4:      0x8047de4
0x8047dc8:      0x8047dec
0x8047dcc:      _start+0x35
0x8047dd0:      _fini
0x8047dd4:      ld.so.1`atexit_fini
0x8047dd8:      0
0x8047ddc:      0
0x8047de0:      1
0x8047de4:      0x8047eb4
0x8047de8:      0
0x8047dec:      0x8047eba
8047db8                                 ; current value of the EBP register in main, i.e. main's SFP
0                                       ; value of EAX, currently 0
> :c ;                                  ; continue running the program; print 16 stack entries and the contents of EBP and EAX
mdb: stop at main+0x29
mdb: target stopped at:
main+0x29:      leave                   ; stopped at the breakpoint main+0x29; the computation is complete and the stack frame is about to be torn down
0x8047db0:
0x8047db0:      2                       ; variable j, 4 bytes, value 2; this is the top of the stack, ESP = 0x8047db0
0x8047db4:      4                       ; variable i, 4 bytes; after i++ the value is 4
0x8047db8:      0x8047dd8               ; _start's SFP, 4 bytes
0x8047dbc:      _start+0x5d             ; the EIP for returning to _start
0x8047dc0:      1
0x8047dc4:      0x8047de4
0x8047dc8:      0x8047dec
0x8047dcc:      _start+0x35
0x8047dd0:      _fini
0x8047dd4:      ld.so.1`atexit_fini
0x8047dd8:      0
0x8047ddc:      0
0x8047de0:      1
0x8047de4:      0x8047eb4
0x8047de8:      0
0x8047dec:      0x8047eba
8047db8                                 ; current value of the EBP register in main, i.e. main's SFP
6                                       ; value of EAX, the return value of the function, currently 6
> :c ;                                  ; continue running the program; print 16 stack entries and the contents of EBP and EAX
mdb: stop at main+0x2a
mdb: target stopped at:
main+0x2a:      ret                     ; stopped at the breakpoint main+0x2a; the stack frame has been torn down and main is about to return
0x8047dbc:
0x8047dbc:      _start+0x5d             ; the stack frame is gone; the top of the stack is the EIP for returning to _start, and main's stack space has been released
0x8047dc0:      1
0x8047dc4:      0x8047de4
0x8047dc8:      0x8047dec
0x8047dcc:      _start+0x35
0x8047dd0:      _fini
0x8047dd4:      ld.so.1`atexit_fini
0x8047dd8:      0
0x8047ddc:      0
0x8047de0:      1
0x8047de4:      0x8047eb4
0x8047de8:      0
0x8047dec:      0x8047eba
0x8047df0:      0x8047ed6
0x8047df4:      0x8047edd
0x8047df8:      0x8047ee4
8047dd8                                 ; _start's SFP, previously stored at address 0x8047db8, restored when main's stack frame was torn down
6                                       ; value of EAX, the return value of the function, currently 6
> :s ;                                  ; single-step the next instruction (the :s command); print 16 stack entries and the contents of EBP and EAX
mdb: target stopped at:
_start+0x5d:    addl   $0xc,%esp        ; main has returned; _start+0x5d was previously stored at address 0x8047dbc
0x8047dc0:
0x8047dc0:      1                       ; main has returned and _start+0x5d has been popped
0x8047dc4:      0x8047de4
0x8047dc8:      0x8047dec
0x8047dcc:      _start+0x35
0x8047dd0:      _fini
0x8047dd4:      ld.so.1`atexit_fini
0x8047dd8:      0                       ; the word that _start's SFP points to is 0, which shows that _start is the entry point of the program
0x8047ddc:      0
0x8047de0:      1
0x8047de4:      0x8047eb4
0x8047de8:      0
0x8047dec:      0x8047eba
0x8047df0:      0x8047ed6
0x8047df4:      0x8047edd
0x8047df8:      0x8047ee4
0x8047dfc:      0x8047ef3
8047dd8                                 ; _start's SFP, previously stored at address 0x8047db8, restored when main's stack frame was torn down
6                                       ; value of EAX is 6, still the return value of main
>

By observing and analyzing the registers and the stack at run time with mdb, we can summarize how local variables are allocated, released and accessed on the stack:
1. Local variables are allocated by subtracting the required number of bytes from ESP:
   subl   $8,%esp
2. Local variables are released by the leave instruction:
   leave
3. Local variables are accessed at negative offsets from EBP:
   movl   -8(%ebp),%eax
   addl   -4(%ebp),%eax

Question: How is the stack kept aligned when there are more than two local variables?
The previous article mentioned that the subl $8,%esp instruction, besides allocating stack space, also serves to align the stack. In this example i and j together occupy exactly 8 bytes. With more than two local variables, how are space allocation and stack alignment satisfied at the same time?

2. Stack allocation of more than two local variables
Add the definition of a local variable k to the previous C program:
# vi test3.c
int main()
{
    int i, j = 2, k = 4;
    i = 3;
    i = ++i;
    k = i + j + k;
    return k;
}
After compiling the program, disassembling it with mdb gives the following result:
# gcc test3.c -o test3
# mdb test3
Loading modules: [ libc.so.1 ]
> main::dis
main:           pushl  %ebp
main+1:         movl   %esp,%ebp        ; main to main+1: create the stack frame
main+3:         subl   $0x18,%esp       ; allocate stack space for the local variables i, j and k, keeping the stack 16-byte aligned
main+6:         andl   $0xf0,%esp
main+9:         movl   $0,%eax
main+0xe:       subl   %eax,%esp        ; main+6 to main+0xe: align the stack to 16 bytes again
main+0x10:      movl   $2,-8(%ebp)      ; j = 2
main+0x17:      movl   $4,-0xc(%ebp)    ; k = 4
main+0x1e:      movl   $3,-4(%ebp)      ; i = 3
main+0x25:      leal   -4(%ebp),%eax    ; load the address of i into EAX
main+0x28:      incl   (%eax)           ; i++
main+0x2a:      movl   -8(%ebp),%eax    ; load the value of j into EAX
main+0x2d:      movl   -4(%ebp),%edx    ; load the value of i into EDX
main+0x30:      addl   %eax,%edx        ; j + i, result stored in EDX
main+0x32:      leal   -0xc(%ebp),%eax  ; load the address of k into EAX
main+0x35:      addl   %edx,(%eax)      ; i + j + k, result stored at address EBP-0xc, i.e. in k
main+0x37:      movl   -0xc(%ebp),%eax  ; load the value of k into EAX as the return value
main+0x3a:      leave                   ; tear down the stack frame
main+0x3b:      ret                     ; main returns
>

Question: Why are 0x18 bytes of stack space allocated for 3 variables?
With 2 local variables, the instruction that allocates stack space is:
   subl   $8,%esp
With 3 local variables, it is:
   subl   $0x18,%esp
Three integer variables need only 0xc bytes, so why are 0x18 bytes actually allocated?
The answer is: to keep the stack 16-byte aligned.
As explained in X86 Assembly Language Learning Notes (1), GCC by default keeps the stack aligned to 16 bytes. subl $8,%esp leaves the stack 16-byte aligned, but 8 bytes can hold only 2 local variables. If just 4 more bytes were allocated for the third local variable, the stack address would no longer be 16-byte aligned. The smallest value that both satisfies the space requirement and keeps the stack 16-byte aligned is 0x18. For example, with EBP = 0x8047db8 as in the test2 trace, subtracting 0xc would leave ESP at 0x8047dac, which is not a multiple of 16, while subtracting 0x18 gives 0x8047da0.
If we define one 50-byte and one 100-byte character array, how much stack space is actually allocated?
The answer is 0x8 + 0x40 + 0x70, that is, 184 bytes. Let's verify it:
# vi test4.c
int main()
{
    char str1[50];
    char str2[100];
    return 0;
}
# mdb test4
Loading modules: [ libc.so.1 ]
> main::dis
main:           pushl  %ebp
main+1:         movl   %esp,%ebp
main+3:         subl   $0xb8,%esp       ; allocate stack space for the two character arrays, keeping 16-byte alignment
main+9:         andl   $0xf0,%esp
main+0xc:       movl   $0,%eax
main+0x11:      subl   %eax,%esp
main+0x13:      movl   $0,%eax
main+0x18:      leave
main+0x19:      ret
> 0xb8=D                                ; convert hexadecimal to decimal
                184
> 0x40+0x70+0x8=X                       ; evaluate the expression, with the result printed in hexadecimal
                b8
>

Question: In what order are local variables allocated on the stack when several are defined?
Local variables are allocated on the stack in the order in which they are declared; variables declared on the same line are allocated from left to right. In test3.c the variables are declared as follows:
int i, j = 2, k = 4;
And the disassembly shows:
movl   $2,-8(%ebp)      ; j = 2
movl   $4,-0xc(%ebp)    ; k = 4
movl   $3,-4(%ebp)      ; i = 3
It is easy to see that i, j and k are placed on the stack as shown below:

------------------------------------------ ------> high address
| EIP (return address into _start)       |
------------------------------------------
| EBP (EBP of the _start function)       | <------ EBP of main (i.e. the SFP frame pointer)
------------------------------------------
| i (EBP-4)                              |
------------------------------------------
| j (EBP-8)                              |
------------------------------------------
| k (EBP-0xc)                            |
------------------------------------------ ------> low address
Figure 2-1

3. Summary
Through several test programs, this article looked more closely at how local variables are allocated, released and positioned on the stack, and reviewed the following concepts from the previous article:
SFP (stack frame pointer)
Stack alignment
In addition, the mdb tool provided by Solaris was used to observe directly how the stack changes at run time, including the creation and teardown of the stack frame. Together with the diagrams (Figure 2-1 and Figure 1-1), this gives a clearer understanding of the stack layout on the IA32 architecture.

Related documents:
X86 Assembly Language Learning Notes (1)
Development Environment Installation and Settings on Solaris
Linux AT&T Assembly Language Development Guide
ELF Dynamic Symbol Resolution Process (Revised)
Also of interest: 10 new changes in Solaris 10