Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade
By dspman
Source:
http://www.cse.ogi.edu/disc/peojects/immunix
1. Introduction
Over the past decade, buffer overflows have been the most common form of security vulnerability. More seriously, buffer overflow vulnerabilities account for the vast majority of remote network attacks, in which an anonymous Internet user can gain partial or total control of a host. Because such an attack lets anyone take control of a host, it represents a very serious security threat.
Buffer overflow attacks are so common because the vulnerabilities are widespread and easy to exploit. They also dominate remote attacks because a buffer overflow vulnerability gives an attacker everything he wants: the ability to inject attack code and to execute it. The injected attack code runs with the privileges of the vulnerable program, which lets the attacker take control of the attacked host.
For example, of the five remote attacks used in the 1998 Lincoln Laboratory intrusion detection evaluation, three were based on social engineering and two were buffer overflows. Buffer overflows also figured among the 13 CERT advisories issued in 1998, and at least half of the 1999 advisories involved them. In a survey on Bugtraq, 2/3 of the respondents considered buffer overflow vulnerabilities a very serious security issue.
Buffer overflow vulnerabilities and attacks come in many forms, which we describe and classify in Section 2. The defenses differ accordingly; Section 3 describes them, including which defenses are effective against each type of attack. It also introduces the stack protection method, which is effective against buffer overflow vulnerabilities without sacrificing the compatibility or performance of the system. Section 4 discusses how the various defenses can be combined. Section 5 presents our conclusions.
2. Buffer overflow vulnerabilities and attacks
The purpose of a buffer overflow attack is to subvert the function of a program that runs with some privilege, so that the attacker gains control of that program. If the program has sufficient privileges, the attacker thereby controls the entire host. Typically the attacker targets a program running as root and makes it execute code similar to "exec(sh)" to obtain a root shell, but this is not the only possibility. To achieve this, the attacker must accomplish the following two goals:
1. Arrange for suitable code to be available in the program's address space.
2. By suitably initializing registers and memory, get the program to jump to that code.
We classify buffer overflow attacks according to these two goals. Section 2.1 describes how the attack code is placed in the address space of the attacked program (this is where the "buffer" part of the name comes from). Section 2.2 describes how the attacker overflows a program's buffer to transfer control to the attack code (this is the "overflow" part). Section 2.3 describes how the code placement and control transfer techniques of Sections 2.1 and 2.2 are combined.
2.1 Ways to arrange suitable code in the program's address space
There are two ways to arrange for attack code to be in the address space of the attacked program:
Inject it:
The attacker supplies a string as input to the attacked program, and the program stores this string in a buffer. The string contains bytes that are native instructions for the CPU of the attacked platform. Here the attacker uses a buffer of the attacked program to store the attack code. Two points are worth noting:
1. The attacker does not have to overflow any buffer to do this; it is enough to find a buffer with room for the attack code.
2. The buffer can be located anywhere: on the stack (automatic variables), on the heap (dynamically allocated variables), or in the static data area (initialized or uninitialized data).
It is already there:
Often the code the attacker wants is already present in the attacked program; the attacker only needs to supply parameters to that code and then make the program jump to it. For example, if the attack code needs to execute "exec("/bin/sh")" and there is code in the libc library that executes "exec(arg)", where arg is a pointer to a string, then the attacker only needs to make that argument point to "/bin/sh" and then jump to the corresponding instruction sequence in the libc library.
2.2 Ways to make the program jump to the attack code
All of these methods seek to alter the program's control flow so that it jumps to the attack code. The basic approach is to overflow a buffer that has weak or no bounds checking, which disrupts the program's normal execution order. By overflowing a buffer, the attacker can overwrite the adjacent program state with a near brute-force approach and bypass the system's checks directly.
We classify these attacks by the kind of program state that the attacker's overflow seeks to corrupt. In principle, it can be any kind of state. For example, the original Morris Worm used a buffer overflow in the fingerd program to corrupt the name of a file that fingerd would execute. In practice, most buffer overflows seek to corrupt program pointers (pointers to code). The attacks differ in which pointer they target and where in memory it resides.
(figure 1)
Activation records:
Each time a function is called, the caller leaves an activation record on the stack, which includes the address the function should return to. By overflowing automatic variables, the attacker makes this return address point to the attack code, as shown in Figure 1. Because the return address has been changed, when the function call ends the program jumps to the address set by the attacker instead of the original address. Such buffer overflows are called "stack smashing attacks" and are currently the most common form of buffer overflow attack.
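The effect of overflowing an automatic variable can be sketched without relying on undefined behaviour by modelling the activation record as a plain byte array. In the sketch below the layout (an 8-byte buffer followed by an 8-byte return-address slot) and every name (frame_t, frame_init, unchecked_copy) are illustrative assumptions; real frame layouts depend on the compiler and platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   8   /* hypothetical size of the automatic variable */
#define FRAME_SIZE 16  /* buffer plus an 8-byte return-address slot   */

/* Simulated activation record: bytes 0..7 model the local buffer,
 * bytes 8..15 model the saved return address. */
typedef struct {
    unsigned char bytes[FRAME_SIZE];
} frame_t;

/* Clear the frame and store the return address in its slot. */
static void frame_init(frame_t *f, uint64_t return_address)
{
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + BUF_SIZE, &return_address, sizeof return_address);
}

/* Read the return address back out of the simulated frame. */
static uint64_t frame_return_address(const frame_t *f)
{
    uint64_t ra;
    assert(f != NULL);
    memcpy(&ra, f->bytes + BUF_SIZE, sizeof ra);
    return ra;
}

/* Models a copy that never checks BUF_SIZE: input longer than the
 * buffer spills into the return-address slot. The length is clamped
 * to FRAME_SIZE only so that the simulation itself stays in bounds. */
static void unchecked_copy(frame_t *f, const unsigned char *input, size_t len)
{
    assert(f != NULL && input != NULL);
    if (len > FRAME_SIZE)
        len = FRAME_SIZE;
    memcpy(f->bytes, input, len);
}
```

Copying 4 bytes leaves the stored return address untouched, while copying 16 bytes replaces it with whatever the input contained, which is the essence of a stack smashing attack.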
Function pointers:
"void (*foo)()" declares a variable foo that is a pointer to a function returning void. Function pointers can be located in any address space, so the attacker only needs to find an overflowable buffer adjacent to a function pointer in any of these areas and overflow it to change the function pointer. Later, when the program makes a call through the function pointer, control flows where the attacker intends. An example of this attack is the one against the SuperProbe program on Linux.
Longjmp buffers:
The C language includes a simple checkpoint/rollback system called setjmp/longjmp. The idiom is to call "setjmp(buffer)" to set a checkpoint and "longjmp(buffer)" to return to it. However, if an attacker can corrupt the contents of the buffer, "longjmp(buffer)" will actually jump to the attacker's code. Like function pointers, longjmp buffers can be located anywhere, so the attacker only needs to find an adjacent buffer that can be overflowed. A typical example is Perl 5.003: the attacker first overflows a buffer to corrupt the longjmp buffer used for error recovery, and then induces recovery mode, which makes the Perl interpreter jump to the attack code.
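For readers unfamiliar with the mechanism, the legitimate checkpoint/rollback use of setjmp and longjmp can be sketched as follows. The function names parse_positive and run_with_recovery are hypothetical; the point is only that the jmp_buf holds the saved execution context, which is why corrupting it redirects control.

```c
#include <assert.h>
#include <limits.h>
#include <setjmp.h>

static jmp_buf checkpoint;   /* saved execution context (the "longjmp buffer") */

/* Hypothetical worker: doubles its input, but abandons the work and
 * rolls back to the checkpoint when the input is negative. */
static int parse_positive(int value)
{
    if (value < 0)
        longjmp(checkpoint, 1);      /* 1 becomes the return value of setjmp */
    assert(value <= INT_MAX / 2);    /* guard against signed overflow */
    return value * 2;
}

/* Sets the checkpoint, then runs the worker.
 * Returns the doubled value, or -1 if the worker rolled back. */
static int run_with_recovery(int value)
{
    if (setjmp(checkpoint) != 0)
        return -1;                   /* reached only through longjmp */
    return parse_positive(value);
}
```

Note that the standard longjmp takes two arguments, the buffer and the value that setjmp should appear to return.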
2.3 Combining code injection and control flow techniques
We now consider how code injection and control flow corruption are combined.
The simplest and most common buffer overflow attack combines code injection and activation record corruption in a single string. The attacker locates an overflowable automatic variable and feeds the program a large string that both overflows the buffer to change the activation record and contains the injected code. This is the attack template outlined by Levy. Because C programs habitually allocate only a small buffer for user input and parameters, vulnerable code of this kind is far from rare. The code injection and the buffer overflow do not have to happen in one action. An attacker can place code in one buffer without overflowing it, and then overflow a different buffer to corrupt a program pointer. This approach is typically used when the overflowable buffer is too small to hold all of the attack code.
If the attacker tries to use code that is already resident rather than injecting it, the code usually needs to be parameterized. For example, there are code fragments in libc (which almost every C program links against) that execute "exec(something)", where "something" is a parameter. The attacker then uses one buffer overflow to corrupt the argument, and another buffer overflow to point a program pointer at the specific code fragment in libc.
3. Defenses against buffer overflows
There are currently four basic approaches to defending against buffer overflow vulnerabilities and attacks. Section 3.1 covers writing correct code. Section 3.2 describes the operating system approach of making buffers non-executable, which prevents the attacker from executing injected attack code. This approach stops many buffer overflow attacks, but because the attacker does not necessarily have to inject attack code to carry out a buffer overflow attack (see Section 2.1), it still has a weak point. Section 3.3 describes the compiler approach of checking array bounds. This makes buffer overflows impossible and so eliminates the threat entirely, but at a considerable cost. Section 3.4 describes an indirect approach that checks the integrity of program pointers before they are dereferenced. Although this approach does not make every buffer overflow fail, it stops the vast majority of buffer overflow attacks, and attacks that evade it are difficult to construct. Finally, Section 3.5 analyzes the compatibility and performance advantages of this approach compared with array bounds checking.
3.1 Writing correct code
Writing correct code is a worthwhile but time-consuming task, especially when writing in C, whose idioms are error-prone (for example, zero-terminated strings); this style comes from a tradition that pursues performance at the expense of correctness. Although it has long been known how to write secure programs, programs with security vulnerabilities continue to appear. Tools and techniques have therefore been developed to help programmers write secure, correct programs.
The simplest method is to use grep to search the source code for calls to vulnerable library functions such as strcpy and sprintf, neither of which checks the length of its arguments. In fact, every version of the C standard library contains functions with this problem.
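The grep-style search can be approximated in a few lines of C. The following is a minimal sketch, assuming a purely textual scan: it also counts matches inside comments and string literals, and the list of names in risky_calls is an illustrative assumption rather than a complete inventory.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Function names to flag; an illustrative, incomplete list. */
static const char *const risky_calls[] = { "strcpy", "sprintf", "strcat", "gets" };

static int is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/* Count places where a risky name appears as a call: the name must not
 * be the tail of a longer identifier (so fgets and my_strcpy are not
 * flagged) and must be followed by '(' after optional blanks. */
static size_t count_risky_calls(const char *source)
{
    size_t hits = 0;
    assert(source != NULL);
    for (size_t i = 0; i < sizeof risky_calls / sizeof risky_calls[0]; i++) {
        const char *name = risky_calls[i];
        size_t n = strlen(name);
        for (const char *p = strstr(source, name); p != NULL; p = strstr(p + n, name)) {
            int starts_word = (p == source) || !is_ident_char((unsigned char)p[-1]);
            const char *q = p + n;
            while (*q == ' ' || *q == '\t')
                q++;
            if (starts_word && *q == '(')
                hits++;
        }
    }
    return hits;
}
```

A scan like this only points a reviewer at candidate lines; it says nothing about whether a given call can actually overflow.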
To find common vulnerabilities such as buffer overflows and race conditions, code auditing teams have checked large amounts of code. However, some vulnerabilities still slip through the net. Even when strncpy and snprintf are used to prevent buffer overflows, overflows can still occur because of mistakes in how the code is written. The lprm program is the best example: it had passed a code security audit, yet it still contained a buffer overflow vulnerability.
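One well-known way bounded functions are misused: strncpy does not write a terminating NUL when the source is at least as long as the size limit, so later string operations can run past the buffer. A minimal sketch of a copy that always terminates and reports truncation follows; bounded_copy is a hypothetical helper, not a standard library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes.
 * The result is always NUL-terminated.
 * Returns 0 if src fit completely, -1 if it had to be truncated. */
static int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);   /* keep one byte for the terminator */
        dst[dst_size - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);            /* includes the terminator */
    return 0;
}
```

The caller still has to pass the true size of the destination; a wrong size argument reintroduces the overflow, which is the kind of coding mistake the paragraph above refers to.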
To address these problems, more advanced debugging tools have been developed, such as fault injection tools. These tools deliberately and randomly induce buffer overflows in order to expose vulnerable program components. There are also static analysis tools for detecting buffer overflow vulnerabilities.
Although these tools help programmers develop more secure programs, the nature of the C language means they cannot find every buffer overflow vulnerability. Debugging techniques can therefore only reduce the number of buffer overflow vulnerabilities; they cannot eliminate them. Unless programmers can guarantee that their programs are free of such flaws, the protective measures in Sections 3.2 to 3.4 are still needed to keep programs reliable.
3.2 Non-executable buffers
Making the data segment of the attacked program's address space non-executable prevents the attacker from executing code injected into the program's input buffers; this is called the non-executable buffer technique. In fact, many older computer systems were designed this way, but more recent UNIX and MS Windows systems often place code dynamically in the data segment for the sake of performance and functionality. To preserve program compatibility, it is therefore not possible to make the data segments of all programs non-executable.
However, the stack segment can be made non-executable while preserving most program compatibility. Kernel patches for this are available for both Linux and Solaris. Because almost no legitimate program stores code on the stack, this approach causes almost no compatibility problems, except for two special cases in Linux where executable code must be placed on the stack:
Signal delivery:
Linux delivers UNIX signals to a process by placing the delivery code on the process's stack and then raising an interrupt that jumps to that code. The non-executable buffer patch handles this by allowing the stack to be executable while a signal is being delivered.
GCC trampolines:
GCC has been found to place executable code on the stack for trampolines. In practice, however, disabling this feature causes no problems; that part of GCC appears to be rarely used.
A non-executable stack is effective against buffer overflow attacks that inject code into automatic variables, but it has no effect on other forms of attack (see Section 2.1). It can be bypassed by making a program pointer refer to code that is already resident in the program. Other attacks can bypass it by injecting code into buffers in the heap or the static data segment.
3.3 Array bounds checking
Injecting code is one aspect of a buffer overflow attack; disrupting the program's control flow is the other. Unlike non-executable buffers, array bounds checking prevents buffer overflows from occurring at all, and with them the attacks. As long as arrays cannot be overflowed, overflow attacks are out of the question. To implement array bounds checking, every read and write to an array must be checked to ensure that it falls within the array's bounds. The most direct way is to check every array operation, but optimization techniques can reduce the number of checks. Several approaches currently exist:
3.3.1 Compaq C Compiler
The C compiler that Compaq developed for the Alpha CPU (cc on the Tru64 UNIX platform, ccc on the Alpha Linux platform) supports a limited form of bounds checking (enabled with the -check_bounds option). The limitations are:
• Only explicit array references are checked: "a[3]" is checked, but "*(a+3)" is not.
• Because all C arrays are converted to pointers when passed as arguments, accesses to arrays passed into a function are not checked.
• Dangerous library functions such as strcpy are not compiled with bounds checking, so they remain dangerous even when bounds checking is enabled.
These limitations are severe because C programs make heavy use of pointers. Bounds checking of this kind is normally used for debugging and cannot guarantee the absence of buffer overflow vulnerabilities.
3.3.2 Jones & Kelly: Array Bounds Checking for C
Richard Jones and Paul Kelly developed a GCC patch that implements full array bounds checking for C programs. Because the patch does not change the representation of pointers, code compiled with it is compatible with other GCC modules. Instead, it derives a "base" pointer from each pointer expression and checks whether the result of the expression lies within the bounds allowed for that base. The performance cost is substantial: for a pointer-intensive program such as a vector multiplication, the slowdown is a factor of 30.
This compiler is still immature; some complex programs (such as elm) fail to run when compiled with it. An updated version, however, can compile and run the encryption package of the SSH software, with a slowdown of a factor of 12.
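What a per-access bounds check amounts to can be sketched by hand. The example below is not how the Jones and Kelly patch works internally (that patch leaves the pointer representation unchanged); it swaps in an explicit descriptor that carries the base pointer and length, purely to make the check visible. The type int_array_t and both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: base address of an int array plus its length. */
typedef struct {
    int   *base;
    size_t length;
} int_array_t;

/* Read element `index` into *out.
 * Returns 0 on success, -1 if the index is out of bounds. */
static int checked_read(const int_array_t *a, size_t index, int *out)
{
    assert(a != NULL && out != NULL);
    if (index >= a->length)
        return -1;
    *out = a->base[index];
    return 0;
}

/* Write `value` at element `index`.
 * Returns 0 on success, -1 if the index is out of bounds. */
static int checked_write(const int_array_t *a, size_t index, int value)
{
    assert(a != NULL);
    if (index >= a->length)
        return -1;
    a->base[index] = value;
    return 0;
}
```

Every access pays for one comparison, which gives a feel for why checking all array operations in a pointer-heavy program is expensive.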
3.3.3 Purify: Memory Access Checking
Purify is a memory usage debugging tool for C programs rather than a dedicated security tool. Purify uses "object code insertion" to instrument all memory accesses. After linking with the Purify linker, the resulting executable checks every array reference as it runs to ensure that it is legitimate. The resulting performance cost is a slowdown of 3 to 5 times.
3.3.4 Type-Safe Languages
All buffer overflow vulnerabilities stem from the lack of type safety in C. If only type-safe operations are permitted, input to one variable cannot be used to force arbitrary changes to another. For newly written code, a type-safe language such as Java or ML is recommended.
However, the Java virtual machine that executes Java programs is itself a C program, so one way to attack a JVM is to overflow a buffer in the JVM. Applying buffer overflow defenses to the systems that enforce type safety for type-safe languages can therefore bring worthwhile benefits.
3.4 Program Pointer Integrity Check
Program pointer integrity checking differs subtly from bounds checking. Rather than preventing a program pointer from being corrupted, it detects the corruption before the pointer is dereferenced. So even if an attacker succeeds in changing a program pointer, the pointer will never be used, because the system detects the change beforehand.
Compared with array bounds checking, this approach does not solve every buffer overflow problem; overflows that do not target program pointers can evade the check. But it has great advantages in performance and is also highly compatible with existing code.
Program pointer integrity checking has been explored in three directions. Section 3.4.1 describes a custom libc for FreeBSD, developed by Snarskii, that detects buffer overflows by monitoring the CPU stack. Section 3.4.2 describes our own stack protection method, a compiler enhancement that automatically generates integrity checking code for function calls. Finally, Section 3.4.3 describes the pointer protection method now under development, which resembles stack protection but protects the integrity of all program pointers.
3.4.1 Handwritten Stack Monitoring
Snarskii developed a custom libc that detects buffer overflows by monitoring the CPU stack. The implementation is hand-written in assembly language and protects only the activation records of functions inside libc. It meets its design goals and defends well against attacks on libc library functions, but it cannot defend against attacks on any other code.
3.4.2 Stack Protection: Compiler-Generated Activation Record Integrity Checking
Stack protection is a compiler technique that provides program pointer integrity checking by checking the return address in a function's activation record. It is implemented as a small patch to GCC that extends the code emitted to set up and tear down each function. The added setup code places some additional bytes next to the return address on the stack, as shown in Figure 2. When the function returns, it first checks whether these additional bytes have changed. If a buffer overflow attack has occurred, it is therefore easily detected before the function returns.
(figure 2)
However, if the attacker anticipates these additional bytes and forges them in the overflow string, the attack can slip past the stack protection check. We use the following two techniques to defeat such forgery:
Terminator value:
The additional bytes consist of the terminator symbols used by the common C string functions: 0 (null), CR, LF, and -1 (EOF). An attacker cannot embed these symbols in an overflow string, because the string functions stop copying when they reach them.
Random value:
The additional bytes hold a 32-bit random number generated at run time, so the attacker cannot guess their contents. Because a fresh value is generated each time, it cannot be predicted.
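The check on the additional bytes (commonly called a canary) can be sketched with a simulated activation record. Everything below is an illustrative assumption: the frame layout (8-byte buffer, 4 guard bytes, 8-byte return-address slot), the names, and the fact that the check is done by hand rather than by compiler-generated code. The sketch uses the terminator-style value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   8                    /* hypothetical local buffer       */
#define GUARD_OFF  BUF_SIZE             /* guard sits just past the buffer */
#define RET_OFF    (BUF_SIZE + 4)       /* return address follows guard    */
#define FRAME_SIZE (BUF_SIZE + 4 + 8)

/* Terminator-style guard: NUL, CR, LF and 0xFF (-1 as a byte). */
static const unsigned char TERMINATOR_GUARD[4] = { 0x00, 0x0D, 0x0A, 0xFF };

typedef struct {
    unsigned char bytes[FRAME_SIZE];
} guarded_frame_t;

/* Models function entry: store the guard and the return address. */
static void frame_enter(guarded_frame_t *f, uint64_t return_address)
{
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + GUARD_OFF, TERMINATOR_GUARD, sizeof TERMINATOR_GUARD);
    memcpy(f->bytes + RET_OFF, &return_address, sizeof return_address);
}

/* Models a strcpy-like copy into the buffer with no check against
 * BUF_SIZE: it stops at the first NUL in the input, or at the end of
 * the simulated frame so the simulation itself stays in bounds. */
static void string_copy_unchecked(guarded_frame_t *f, const char *input)
{
    size_t i = 0;
    assert(f != NULL && input != NULL);
    while (i < FRAME_SIZE && input[i] != '\0') {
        f->bytes[i] = (unsigned char)input[i];
        i++;
    }
    if (i < FRAME_SIZE)
        f->bytes[i] = '\0';             /* strcpy also writes the terminator */
}

/* Models function exit: returns 1 if the guard is intact, so the
 * return address may be used, and 0 if an overflow was detected. */
static int frame_check(const guarded_frame_t *f)
{
    assert(f != NULL);
    return memcmp(f->bytes + GUARD_OFF, TERMINATOR_GUARD, sizeof TERMINATOR_GUARD) == 0;
}
```

An input long enough to reach the return address must first overwrite the guard, and a string copy cannot reproduce the guard because its first byte is NUL, which ends the copy. In a real implementation a failed check would stop the program instead of returning 0.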
The stack protection approach to checking the integrity of the stack derives from the Synthetix approach, which uses quasi-invariants to ensure the correctness of particular program state. Such state is allowed to change, but only when certain conditions are met; because it changes only occasionally, we call it a quasi-invariant. Synthetix developed a number of tools to guard such state.
The changes an attacker makes through a buffer overflow can be treated as illegal changes. In these cases a quasi-invariant is changed illegally, and it is the job of stack protection to guard against this.
(Table I)
Experimental data show that stack protection defends well against buffer overflow attacks on a variety of programs while preserving compatibility and system performance. Our earlier results are listed in Table I. Subsequently, we rebuilt a complete Linux system (Red Hat Linux 5.1) with stack protection. We then attacked it using exploits against XFree86-3.3.2-5 and lsof, and the system resisted these attacks. These results indicate that stack protection can effectively resist current and future stack-based attacks.
The stack-protected version of Red Hat Linux 5.1 has been running on a variety of systems, ranging from personal laptops to workgroup file servers. The version is available from our web server, and our mailing list has 55 members. With a single exception, the system behaves exactly like the original, which indicates that stack protection has little impact on system compatibility.
We have evaluated the performance of stack protection with a range of benchmarks. Microbenchmarks show that stack protection increases the cost of a function call. In tests of network services (the kind of program that needs stack protection), however, the overall overhead is small.
Our first test subject was SSH, which provides strongly encrypted and authenticated replacements for the Berkeley r* commands. SSH uses software encryption, so any overhead shows up as reduced bandwidth. We measured bandwidth by copying a large file over the network:
scp bigsource localhost:bigdest
The results show that stack protection has almost no effect on the network throughput of SSH.
The second test used the Apache web server. If such a server were vulnerable to a stack-based attack, an attacker could easily take control of the web server, read confidential content, and tamper with the pages it serves. A web server is also a server component with demanding performance and bandwidth requirements.
We used WebStone to test the Apache web server with and without stack protection. The results are listed in Table II.
As with SSH, the performance of the two is almost indistinguishable. With a small number of clients the protected server performs slightly better than the unprotected one, and with a larger number of clients the unprotected server performs slightly better. Even in the worst case, the protected server keeps a slightly lower average latency than the unprotected server. As before, we attribute these differences to noise. Our conclusion is therefore that stack protection has no significant impact on web server performance.
3.4.3 Pointer Protection: Compiler-Generated Program Pointer Integrity Checking
When stack protection was designed, stack smashing was the dominant form of buffer overflow attack. It has been conjectured that this was because a template for constructing such attacks was published in 1996. Since then, many of the simple vulnerabilities have been discovered, exploited, and patched, and attackers have begun to carry out buffer overflow attacks using the more general methods described in Section 2.
Pointer protection is a generalization of stack protection designed for this situation. It places additional bytes next to every code pointer and checks their validity before the pointer is dereferenced. If the check fails, the program raises an alert and exits, just as it does under stack protection. Two aspects of this scheme deserve attention:
Allocating the additional bytes:
Space for the additional bytes is allocated when the protected variable is allocated, and the bytes are initialized when the protected variable is initialized. This is problematic: to maintain compatibility we do not want to change the size of the protected variable, so we cannot simply add the additional bytes to the definition of the variable's data structure. Moreover, the allocation has to be handled differently for each kind of variable.
Checking the additional bytes:
The integrity of the additional bytes must be checked every time a program pointer is read. This is also problematic, because "read from memory" is not a well-defined action in the compiler; the compiler is more concerned with where the pointer is used, and the various optimization algorithms load variables from memory whenever it suits them. Here too, the read has to be handled differently for each kind of variable.
We have developed a prototype of pointer protection (again as a GCC enhancement) that uses additional bytes to protect statically allocated function pointers, but it does not yet cover function pointers inside structures or arrays. The implementation is far from complete. Once it is complete, executable code built with both pointer protection and stack protection should be virtually immune to buffer overflow attacks.
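The idea of guarding a function pointer can be sketched by hand. This is not the GCC prototype described above: the guard is placed explicitly in a struct (which the real scheme avoids, since it changes the variable's size), the guard value is a fixed constant where a real implementation would choose it at run time, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*handler_fn)(int);

/* A function pointer stored next to a guard word. The guard comes
 * first so that an overflow running upward from a lower address
 * reaches it before the pointer. */
typedef struct {
    uint32_t   guard;
    handler_fn target;
} guarded_fn_t;

/* Assumption: fixed here for illustration only. */
static const uint32_t guard_secret = 0x5A17C0DEu;

/* Initialize the pointer and its guard together. */
static void guarded_set(guarded_fn_t *g, handler_fn fn)
{
    assert(g != NULL);
    g->guard = guard_secret;
    g->target = fn;
}

/* Call through the pointer only if the guard is intact.
 * Returns 0 and stores the result in *result on success,
 * -1 if the guard was changed or the pointer is null. */
static int guarded_call(const guarded_fn_t *g, int arg, int *result)
{
    assert(g != NULL && result != NULL);
    if (g->guard != guard_secret || g->target == NULL)
        return -1;
    *result = g->target(arg);
    return 0;
}

/* Example handler used to exercise the sketch. */
static int add_one(int x)
{
    return x + 1;
}
```

A caller would use guarded_set(&g, add_one) once and guarded_call(&g, 41, &r) at each call site; the check runs only when the pointer is dereferenced, which is the source of the performance advantage over per-access bounds checks.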
Even then, the small class of attacks that corrupt non-pointer variables will still escape pointer protection. This can be addressed by having the compiler offer a way to attach additional bytes to an arbitrary variable, so that the programmer can manually add protection where it is needed.
3.5 Compatibility and performance considerations
Compared with bounds checking, program pointer integrity checking cannot prevent every buffer overflow problem. It does, however, have considerable advantages in performance and compatibility:
Performance:
Bounds checking must perform a check every time an array element is accessed. Program pointer integrity checking, by contrast, performs a check only when a program pointer is dereferenced. In both C and C++, this overhead is smaller, because program pointers are dereferenced less often than arrays are accessed.
Implementation effort:
The main difficulty in implementing bounds checking for C is that the language makes it hard to determine the bounds of an array. This is because C mixes the concept of an array with that of a generic pointer. A pointer is an independent object with no associated bounds; it occupies a single machine word, which leaves nowhere to store bounds information for the data it refers to. Special methods are therefore needed to recover this information: an array reference is no longer a simple pointer but a pointer to a descriptor of the buffer.
Compatibility with existing code:
Some bounds checking methods sacrifice performance in order to stay compatible with existing code. Others take different approaches that break traditional C conventions, producing C compilers that can compile only a subset of C: programs that do not use pointers, or that have been modified to suit the compiler.
4. Effective combinations
Here we compare the vulnerabilities and attacks described in Section 2 with the defenses described in Section 3 to determine which combinations can completely eliminate the buffer overflow problem. Table 3 lists the combinations. We omit bounds checking from the table: it prevents all buffer overflows, but its cost is prohibitive. The most common buffer overflow attack corrupts an activation record and injects code into a buffer on the stack. Attacks of this type have been widely documented since 1996. Both the non-executable stack and stack protection defend against this attack. The non-executable stack defends against all attacks that inject code into the stack, and stack protection defends against all attacks that corrupt activation records. The two methods are compatible with each other, and together they defend against a wide range of possible attacks.
Most of the remaining attacks can be stopped by pointer protection, although in some special cases the protection has to be applied manually. Fully automatic pointer protection would need additional bytes for every variable, at which point bounds checking becomes competitive.
Most interesting of all, the first widely known buffer overflow attack, the Morris Worm, used the one kind of attack for which no effective defense exists today, yet few attackers use it, perhaps because the method is too complicated.
5. Conclusion
In this article we have described and analyzed the attack and defense methods associated with buffer overflows. Because this is currently such a common form of attack, the analysis is worthwhile. The results show that stack protection and non-executable buffers together defend effectively against the majority of current attacks, and that pointer protection can defend against most of the remaining ones. Finally, it is worth noting that the kind of attack used by the Morris Worm still has no effective defense.