If you know nothing at all about this subject, please read the basic knowledge section at the end of the article before going on.

Anyone who has programmed knows that a high-level language can access data in memory through variable names. So how are these variables stored in memory, and how does a program use them? That is what the following discusses. Unless stated otherwise, the C code below is compiled with VC in the Release configuration.

First, we need to understand how C variables are allocated in memory. C has global variables, local variables, static variables and register variables, and each kind is allocated differently. Take a look at the following code:

    #include <stdio.h>
    int g1 = 0, g2 = 0, g3 = 0;

    int main()
    {
        static int s1 = 0, s2 = 0, s3 = 0;
        int v1 = 0, v2 = 0, v3 = 0;

        // print the memory address of each local variable
        printf("0x%08x\n", &v1);
        printf("0x%08x\n", &v2);
        printf("0x%08x\n\n", &v3);

        // print the memory address of each global variable
        printf("0x%08x\n", &g1);
        printf("0x%08x\n", &g2);
        printf("0x%08x\n\n", &g3);

        // print the memory address of each static variable
        printf("0x%08x\n", &s1);
        printf("0x%08x\n", &s2);
        printf("0x%08x\n\n", &s3);
        return 0;
    }

The result of compiling and running it is:

    0x0012FF78
    0x0012FF7C
    0x0012FF80

    0x004068D0
    0x004068D4
    0x004068D8

    0x004068DC
    0x004068E0
    0x004068E4

The output is the memory address of each variable: v1, v2, v3 are local variables, g1, g2, g3 are global variables, and s1, s2, s3 are static variables. Within each group the variables are laid out contiguously, but the addresses of the local variables are worlds apart from those of the global variables, while the global and static variables follow one another without a gap. This is because local variables and global/static variables are allocated in different kinds of memory area.

The memory space of a process can be divided logically into three parts: the code area, the static data area and the dynamic data area. The dynamic data area is what is usually called the "stack and heap". The stack and the heap are two different kinds of dynamic data area: the stack is a linear structure, the heap a linked structure. Every thread of a process has its own private stack, so although the threads run the same code, their local variables do not interfere with one another. A stack can be described by its base address and its top address. Global and static variables are allocated in the static data area; local variables are allocated in the dynamic data area, that is, on the stack. The program accesses a local variable through the stack base address plus an offset.
    ├───────────────────┤ low memory addresses
    │        ...        │
    ├───────────────────┤
    │ dynamic data area │
    ├───────────────────┤
    │        ...        │
    ├───────────────────┤
    │     code area     │
    ├───────────────────┤
    │ static data area  │
    ├───────────────────┤
    │        ...        │
    ├───────────────────┤ high memory addresses

The stack is a first-in, last-out data structure, and the address of the stack top is always less than or equal to the stack's base address. To understand the role the stack plays in a program more deeply, we can first look at how a function call works. Different languages have different calling conventions, which differ in the order in which parameters are pushed and in who balances the stack. The calling convention of the Windows API differs from that of ANSI C functions: with the former the called function adjusts the stack, with the latter the caller does. The two are distinguished by the "__stdcall" and "__cdecl" prefixes. First look at the following code:

    #include <stdio.h>
    void __stdcall func(int param1, int param2, int param3)
    {
        int var1 = param1;
        int var2 = param2;
        int var3 = param3;

        // print the memory address of each variable
        printf("0x%08x\n", &param1);
        printf("0x%08x\n", &param2);
        printf("0x%08x\n\n", &param3);
        printf("0x%08x\n", &var1);
        printf("0x%08x\n", &var2);
        printf("0x%08x\n\n", &var3);
        return;
    }

    int main()
    {
        func(1, 2, 3);
        return 0;
    }

The result after compiling and running:

    0x0012FF78
    0x0012FF7C
    0x0012FF80

    0x0012FF68
    0x0012FF6C
    0x0012FF70

    ├───────────┤ <- stack top (ESP) while the function runs, low memory
    │    ...    │
    ├───────────┤
    │   var1    │
    ├───────────┤
    │   var2    │
    ├───────────┤
    │   var3    │
    ├───────────┤
    │    RET    │
    ├───────────┤ <- stack top (ESP) after a "__cdecl" function returns
    │  param1   │
    ├───────────┤
    │  param2   │
    ├───────────┤
    │  param3   │
    ├───────────┤ <- stack top (ESP) after a "__stdcall" function returns
    │    ...    │
    ├───────────┤ <- stack base (EBP), high memory

The figure above shows the stack during the function call. First, the three parameters are pushed from right to left: param3, then param2, and finally param1. Then the return address (RET) is pushed and execution jumps to the function's address.

A point should be added here. Articles that introduce buffer overflows under UNIX say that after RET is pushed, the current EBP is pushed as well, and EBP is then set to the current ESP. Some articles on function calls under Windows say that this step exists under Windows too, but in my actual debugging I did not find it, presumably because the optimized Release build omits the frame pointer. This can also be seen from the printed addresses: there is only a 4-byte gap, just enough for RET, between var3 (0x0012FF70) and param1 (0x0012FF78).

In the third step, a number is subtracted from the stack top (ESP) to allocate memory for the local variables; in the example above 12 bytes are subtracted (ESP = ESP - 3*4, since each int variable occupies 4 bytes). The memory of the local variables is then initialized.
Since under the "__stdcall" convention the called function adjusts the stack, the stack is restored before the function returns: first the memory occupied by the local variables is released (ESP = ESP + 3*4), then the return address is popped into the EIP register, then the memory occupied by the previously pushed parameters is released (ESP = ESP + 3*4), and execution continues in the caller's code.
See the following assembly code:

    ;-------------------------------------------------------
    :00401000 83EC0C      sub esp, 0000000C           ; create the memory space for the local variables
    :00401003 8B442410    mov eax, dword ptr [esp+10]
    :00401007 8B4C2414    mov ecx, dword ptr [esp+14]
    :0040100B 8B542418    mov edx, dword ptr [esp+18]
    :0040100F 89442400    mov dword ptr [esp], eax
    :00401013 8D442410    lea eax, dword ptr [esp+10]
    :00401017 894C2404    mov dword ptr [esp+04], ecx
    ........................ (omitted)
    :00401075 83C43C      add esp, 0000003C           ; restore the stack, reclaim the local variables' memory
    :00401078 C3          ret 000C                    ; function returns, reclaiming the memory occupied by the parameters
                                                      ; with "__cdecl" this would be a plain "ret" and the caller would restore the stack
    ;------------------- end of function --------------------
    ;---------- main program code that calls func -----------
    :00401080 6A03        push 00000003               ; push param3
    :00401082 6A02        push 00000002               ; push param2
    :00401084 6A01        push 00000001               ; push param1
    :00401086 E875FFFFFF  call 00401000               ; call the func function
                                                      ; with "__cdecl" the stack would be restored here: "add esp, 0000000C"

Astute readers will by now have a fair idea of how a buffer overflow works. Let's look at the following code:

    #include <stdio.h>
    #include <string.h>
    void __stdcall func()
    {
        char lpBuff[8] = "\0";
        // deliberately appends 11 'A's plus '\0' to an 8-byte buffer
        strcat(lpBuff, "AAAAAAAAAAA");
        return;
    }

    int main()
    {
        func();
        return 0;
    }

What happens after compiling and running it? Ha: 'The instruction at "0x00414141" referenced memory at "0x00000000". The memory could not be "read".' An "illegal operation"! "41" is the hexadecimal ASCII code of "A", so the problem obviously lies with strcat. lpBuff is only 8 bytes long; allowing for the terminating '\0', strcat can write at most 7 'A's, but the program actually writes 11 'A's plus one '\0'. Look at the figure above: the 4 extra bytes land exactly on the memory that holds RET, so the function returns to a wrong memory address and executes the wrong instructions.

Suppose this string is constructed carefully in three parts: the first part is just meaningless filler that reaches the overflow point, next comes a value that overwrites RET, and after that a piece of shellcode. As long as the RET address points to the first instruction of the shellcode, the shellcode is executed when the function returns. However, different versions of the software and different runtime environments can change where the shellcode ends up in memory, which makes it very hard to construct this RET value. For that reason a large number of NOP instructions are usually placed between RET and the shellcode, which makes the exploit more generally applicable.

    ├───────────┤ <- low memory
    │    ...    │
    ├───────────┤ <- start of the data written by the exploit
    │           │
    │  buffer   │ <- filled with useless data
    │           │
    ├───────────┤
    │    RET    │ <- points to the shellcode, or into the NOP range
    ├───────────┤
    │    NOP    │
    │    ...    │ <- range filled with NOP instructions, which RET may point into
    │    NOP    │
    ├───────────┤
    │           │
    │ shellcode │
    │           │
    ├───────────┤ <- end of the data written by the exploit
    │    ...    │
    ├───────────┤ <- high memory

Under Windows, dynamic data can be stored on the heap as well as on the stack. Readers who know C++ know that C++ can allocate memory dynamically with the new keyword. Take a look at the C++ code below:
    #include <stdio.h>
    void func()
    {
        char *buffer = new char[128];
        char bufflocal[128];
        static char buffstatic[128];
        printf("0x%08x\n", buffer);      // print the memory address of the variable on the heap
        printf("0x%08x\n", bufflocal);   // print the memory address of the local variable
        printf("0x%08x\n", buffstatic);  // print the memory address of the static variable
    }

    int main()
    {
        func();
        return 0;
    }

The execution result is:

    0x004107D0
    0x0012FF04
    0x004068C0

You can see that the memory allocated with the new keyword is neither on the stack nor in the static data area. The VC compiler implements the dynamic memory allocation of the new keyword through the Windows "heap". Before talking about the heap, let's get to know a few API functions related to it:

    HeapAlloc        allocate memory from a heap
    HeapCreate       create a new heap object
    HeapDestroy      destroy a heap object
    HeapFree         free allocated memory
    HeapWalk         enumerate all memory blocks of a heap object
    GetProcessHeap   get the default heap object of the process
    GetProcessHeaps  get all heap objects of the process
    LocalAlloc
    GlobalAlloc

When a process is initialized, the system automatically creates a default heap for it, which by default occupies 1 MB of memory. Heap objects are managed by the system and exist in memory as a linked structure. The following code allocates memory from a heap:

    HANDLE hHeap = GetProcessHeap();
    char *buff = HeapAlloc(hHeap, 0, 8);

Here hHeap is the handle of a heap object and buff points to the address of the allocated memory. So what is this hHeap, and does its value mean anything? Take a look at the code below:

    #pragma comment(linker, "/entry:main")  // define the program's entry point
    #include <windows.h>
    _CRTIMP int (__cdecl *printf)(const char *, ...);  // declare printf as a pointer to the C runtime function

    /*---------------------------------------------------------------------------
    Having got this far, let's review something mentioned earlier:
    (*Note) printf is a function of the C standard library, and VC's standard
    library is implemented by the msvcrt.dll module.
    As its declaration shows, printf takes a variable number of parameters. The
    function cannot know the number of parameters in advance; it can only learn
    about the parameters that follow by analyzing the format string passed as
    the first parameter. Since the number of parameters is dynamic, the stack
    must be balanced by the caller, so the __cdecl calling convention is used
    here. BTW, the API functions of the Windows system basically all use the
    __stdcall calling convention. Only one API is an exception, wsprintf, which
    uses the __cdecl convention for the same reason as printf: its number of
    parameters is variable.
    ---------------------------------------------------------------------------*/

    void main()
    {
        HANDLE hHeap = GetProcessHeap();
        char *buff = HeapAlloc(hHeap, 0, 0x10);
        char *buff2 = HeapAlloc(hHeap, 0, 0x10);
        HMODULE hMsvcrt = LoadLibrary("msvcrt.dll");
        printf = (void *)GetProcAddress(hMsvcrt, "printf");
        printf("0x%08x\n", hHeap);
        printf("0x%08x\n", buff);
        printf("0x%08x\n\n", buff2);
    }

The execution result is:

    0x00130000
    0x00133100
    0x00133118

Why is the value of hHeap so close to the values of buff and buff2? In fact, the handle hHeap is simply the address of the heap's header. In the user area of a process there is a structure called the PEB (Process Environment Block). It holds some important information about the process: the ProcessHeap field, stored at offset 0x18 from the start of the PEB, is the address of the process's default heap, and offset 0x90 holds a pointer to the list of addresses of all the heaps of the process.
Many Windows APIs use the process's default heap to store dynamic data. Under Windows 2000, for example, all ANSI versions of the API functions allocate memory from the default heap in order to convert ANSI strings to Unicode strings. Access to a heap is serialized: only one thread can access the data in a heap at any one time, and when several threads want access they have to queue up and wait, which lowers the program's execution efficiency.

Finally, a word on data alignment in memory. Data alignment means that the memory address of a datum must be an integer multiple of its length: the starting address of DWORD data is divisible by 4, and the starting address of WORD data is divisible by 2. An x86 CPU can access aligned data directly. When it tries to access unaligned data, it performs a series of adjustments internally. These are transparent to the program but slow it down, so a compiler tries to keep data aligned when it compiles a program. As before, let's look at the execution results of one program compiled with three different compilers, VC, Dev-C++ and lcc:

#include