Big Master - Stack / Pile
Please indicate the source and the author contact: http://blog.9cbs.net/absurd
Author: Lixian Jing
l
As a basic data structure, I didn't feel surprised, used to implement function calls, which also simply saw practices. I was lamented in its cleverness until I tried to find another way to realize the recursive operation. To achieve recursive operation, no stack is not impossible, but can't find a way than it is more elegant.
Although most compilers are optimized, the commonly used parameters or local variables are placed in the register. However, the temporary variable (local variable and parameters) when managing the function call (local variables and parameters) is a general practice. The former is just auxiliary means, and only in the current function, the next layer function is called, these values will still be stored in the stack. Row.
Typically, the stack is down (low address), and each of the stacks is an element, the top of the stack is extended to the low address, and the top of the stack is an element in the stack. An interested issue: On the X86 platform, the top register is ESP, then the value of the ESP is modified before the PUSH operation, or it is still modified after the PUSH operation? What data is stored in this instruction in the stack? It is said that in the X86 series CPU, in addition to 286, it is first modified ESP and then stack. Since 286 does not have a CPUID instruction, some OS checks 286 models with this method.
The local variable in a function and the parameters of the next level function, the memory space occupied, as a basic unit, called a frame (FRAME). In GDB, the f command is used to view information about the specified frame. There is also other information between the two Frames, such as the boundary address (EBP) of the previous Frame, and the like.
About the basic knowledge of the stack, first introduce so much, let's take a look at some of the skills and applications of the stack: 1. BACKTRACE's basic function of the CallStack debugger, use this feature, you can see the functionality Call relationship. In GDB, this feature is called backtrace, and you can see the CallStack of the current function when you enter the BT command. How much interesting is achieved, we have studied here.
Let's take a look at the basic model of the stack.
Parameter n ↓ High address parameter ... Function parameter Forming the order of the stack and the specific call mode Related parameter 3 Parameter 1 EIP Returns this call, the next instructions save the EBP of the caller, then EBP points at this time Top of the stack. Temporary variable 1
Temporary variable 2
Temporary variable 3
Temporary variables…
Temporary variable 5 ↓ low address
To implement CallStack, we need to know the following information: l The command address (ie the EIP at that time) is called. l The source code code location corresponding to the command address. Regarding the first point, from the above table, we can see that there are values of EIP at all levels in the stack. We will take it out. Use the code below to easily implement:
#include
Int BackTrace (void ** buffer, int size)
{
INT n = 0;
INT * P = & n;
INT i = 0;
INT EBP = P [1];
INT EIP = P [2];
For (i = 0; i { Buffer [i] = (void *) EIP; P = (int *) EBP; EBP = P [0]; EIP = P [1]; } Return size; } #define n 4 STATIC VOID TEST2 () { INT i = 0; Void * buffer [n] = {0}; Backtrace (Buffer, N); For (i = 0; i { Printf ("% p / n", buffer [i]); } Return; } Static void test1 () { TEST2 (); } Static void test () { TEST1 (); } Int main (int Argc, char * argv []) { Test (); Return 0; } Program output: 0x8048460 0x804849c 0x80484a9 0x80484cc About the second point, how to put the command address and the line number, which is also very simple. Can be queried from the MAP file or ELF. Binutil has a gadget with ADDR2LINE to help achieve this. [root @ linux bt] # addr2line 0x804849c -e bt.exe /Root/test/bt/bt.c:42 2. Alloca's implementation Everyone knows that the dynamically allocated memory must be released, otherwise there will be memory leaks. Maybe some people know that the dynamically allocated memory can not be released. Alloca is such a function, the last A represents auto, that is, the meaning of automatic release. Alloca is allocated in the stack. Okey allocation is allocated in the stack, just like other temporary variables allocated in the stack, this memory is automatically released when the current function call is completed. As we have talked earlier, the size of the stack is limited, and the stack of ordinary threads is only 10m size, so when it is assigned, it is necessary to force, and do not assign excessive memory. Alloca may gradually exit the historical stage because the new C / C standard supports the growth array. For example, int Array [n], the old version of the compiler requires N is constant, and the new compiler allows N to be variables. This feature supported by the compiler can fully replace alloca. This is not a standard function, but most platforms such as Linux and Win32 are supported. Even if a few platforms are not supported, it is not difficult to achieve themselves. Here we briefly introduce the implementation of alloca. Let's take a look at a small program, then look at the assembly code it correspond, everything is clear. #include Int main (int Argc, char * argv []) { INT n = 0; INT * P = Alloca (1024); Printf ("& n =% P P =% P / N", & n, p); Return 0; } The assembly code is: Int main (int Argc, char * argv []) { 8048394: 55 push% EBP 8048395: 89 E5 MOV% ESP,% EBP 8048397: 83 EC 18 SUB $ 0x18,% ESP 804839A: 83 E4 f0 and $ 0xffffff0,% ESP 804839D: B8 00 00 00 MOV $ 0x0,% EAX80483A2: 83 C0 0F add $ 0xF,% EAX 80483A5: 83 C0 0F Add $ 0xF,% EAX 80483A8: C1 E8 04 SHR $ 0x4,% EAX 80483AB: C1 E0 04 SHL $ 0X4,% EAX 80483AE: 29 C4 SUB% EAX,% ESP INT n = 0; 80483B0: C7 45 FC 00 00 00 MoVL $ 0x0, 0xfffffffFC (% EBP) INT * P = Alloca (1024); 80483B7: 81 EC 10 04 00 00 SUB $ 0X410,% ESP 80483BD: 8D 44 24 0C LEA 0xc (% ESP),% EAX 80483C1: 83 C0 0F Add $ 0xF,% EAX 80483C4: C1 E8 04 SHR $ 0X4,% EAX 80483C7: C1 E0 04 SHL $ 0x4,% EAX 80483CA: 89 45 F8 MOV% EAX, 0xfffffffff8 (% EBP) Printf ("& n =% P P =% P / N", & n, p); 80483CD: 8B 45 F8 MOV 0xfffffff8 (% EBP),% EAX 80483D0: 89 44 24 08 MOV% EAX, 0x8 (% ESP) 80483D4: 8D 45 FC LEA 0xffffffc (% EBP),% EAX 80483D7: 89 44 24 04 MOV% EAX, 0x4 (% ESP) 80483db: C7 04 24 98 84 04 08 MOVL $ 0x8048498, (% ESP) 80483E2: E8 D1 Fe FF FF CALL 80482B8 Return 0; 80483E7: B8 00 00 00 MOV $ 0x0,% EAX } One of the key instructions is: SUB $ 0X410,% ESP It can be seen that the implementation of alloca is simply subtracting the ESP to the specified size, expand the stack space (the remember stack is downward), which is allocated. 3. Implementation of variable parameters. For novices, the function of variable parameters is also a magical. Still in a small program to illustrate its implementation. #include #include INT Print (const char * fmt, ...) { INT N1 = 0; INT N2 = 0; INT N3 = 0; VA_LIST AP; VA_START (AP, FMT); N1 = VA_ARG (AP, INT); N2 = VA_ARG (AP, INT); N3 = VA_ARG (AP, INT); VA_END (AP); Printf ("N1 =% D N2 =% D N3 =% D / N", N1, N2, N3); Return 0; } Int Main (int Arg, char Argv []) { Print ("% d / n", 1, 2, 3); Return 0; } Let's take a look at the corresponding assembly code: INT Print (const char * fmt, ...) { 8048394: 55 push% EBP 8048395: 89 E5 MOV% ESP,% EBP 8048397: 83 EC 28 SUB $ 0x28,% ESP INT N1 = 0; 804839A: C7 45 FC 00 00 00 00 MOVL $ 0x0, 0xffffffc (% EBP) INT N2 = 0; 80483A1: C7 45 F8 00 00 00 00 MOVL $ 0x0, 0xffffffffff8 (% EBP) INT N3 = 0; 80483A8: C7 45 F4 00 00 00 00 MoVL $ 0x0, 0xfffffff 4 (% EBP) VA_LIST AP; VA_START (AP, FMT); 80483AF: 8D 45 0C LEA 0xc (% EBP),% EAX 80483B2: 89 45 f0 MOV% EAX, 0xfffffff0 (% EBP) N1 = VA_ARG (AP, INT); 80483B5: 8B 55 F0 MOV 0xffffff0 (% EBP),% EDX 80483B8: 8D 45 F0 LEA 0xffffff0 (% EBP),% EAX 80483bb: 83 00 04 Add1 $ 0x4, (% EAX) 80483BE: 8B 02 MOV (% EDX),% EAX 80483C0: 89 45 FC MOV% EAX, 0xfffffffc (% EBP) N2 = VA_ARG (AP, INT); 80483C3: 8B 55 F0 MOV 0xffffff0 (% EBP),% EDX 80483C6: 8D 45 F0 LEA 0xffffff0 (% EBP),% EAX 80483C9: 83 00 04 Add1 $ 0x4, (% EAX) 80483CC: 8B 02 MOV (% EDX),% EAX 80483 CE: 89 45 F8 MOV% EAX, 0xfffffff8 (% EBP) N3 = VA_ARG (AP, INT); 80483D1: 8B 55 F0 MOV 0xFffffff0 (% EBP),% EDX 80483D4: 8D 45 F0 LEA 0xffffff0 (% EBP),% EAX 80483D7: 83 00 04 Add1 $ 0x4, (% EAX) 80483DA: 8B 02 MOV (% EDX),% EAX 80483DC: 89 45 F4 MOV% EAX, 0xfffffff4 (% EBP) VA_END (AP); Printf ("N1 =% D N2 =% D N3 =% D / N", N1, N2, N3); 80483DF: 8B 45 F4 MOV 0xFfffff4 (% EBP),% EAX 80483E2: 89 44 24 0c MOV% EAX, 0xc (% ESP) 80483E6: 8B 45 F8 MOV 0xfffffff8 (% EBP),% EAX 80483E9: 89 44 24 08 MOV% EAX, 0x8 (% ESP) 80483ed: 8B 45 FC MOV 0xfffffffc (% EBP),% EAX 80483F0: 89 44 24 04 MOV% EAX, 0x4 (% ESP) 80483F4: C7 04 24 F8 84 04 08 MOVL $ 0x80484F8, (% ESP) 80483fb: e8 b8 Fe FF FF CALL 80482B8 Return 0; 8048400: B8 00 00 00 MOV $ 0x0,% EAX } Int Main (int Arg, char Argv []) { 8048407: 55 push% EBP 8048408: 89 E5 MOV% ESP,% EBP 804840A: 83 EC 18 SUB $ 0x18,% ESP 804840D: 83 E4 F0 and $ 0xffffffff0,% ESP 8048410: B8 00 00 00 MOV $ 0x0,% EAX 8048415: 83 C0 0F add $ 0xf,% EAX 8048418: 83 C0 0F Add $ 0xF,% EAX 804841b: C1 E8 04 SHR $ 0X4,% EAX 804841e: C1 E0 04 SHL $ 0X4,% EAX 8048421: 29 C4 SUB% EAX,% ESP INT N = Print ("% d / n", 1, 2, 3); 8048423: C7 44 24 0C 03 00 00 MOVL $ 0X3, 0XC (% ESP) 804842a: 00 804842B: C7 44 24 08 02 00 00 MOVL $ 0x2,0x8 (% ESP) 8048432: 00 8048433: C7 44 24 04 01 00 00 MOVL $ 0X1, 0X4 (% ESP) 804843A: 00804843B: C7 04 24 0B 85 04 08 MOVL $ 0x804850B, (% ESP) 8048442: E8 4D FF FF FF CALL 8048394 8048447: 89 45 FC MOV% EAX, 0xffffffc (% EBP) Return 0; 804844A: B8 00 00 00 MOV $ 0x0,% EAX } From the assembly code, we can see that the parameters are in reverse. When taking the parameters, let the AP points to the first parameter, and because the stack is growing down, you can move the pointer upward movement. l Detailed explanation in the memory distribution algorithm section.