Expression in practice: starting from the buffer overflow
SpaceSoft [Dark Night Sand]
At present, most of the well-known security vulnerabilities in popular software come from buffer overflows. In 1999, buffer overflows accounted for more than 50% of all major security vulnerabilities reported by CERT/CC. The problem seems to have dragged on for a long time: everyone has thought of all kinds of ways to avoid it, yet it remains impossible to guard against completely. There is no silver bullet for buffer overflows either; solving them takes more care and seriousness.
This article uses this thorny problem as an example to explore some issues in how functions and their parameters express intent.
The principle of a buffer overflow is fairly involved. A typical overflow looks like this: when a function is called, its parameters, its local variables, and its return address are pushed onto the stack. If, at run time, more data is written to a local array than the array can hold, the excess runs past the end of the array, overwrites other variables, and may even overwrite the function's return address. If the original return address is overwritten with a pointer to some other valid code, the flow of the original program is changed. But the original function is already compiled, so how can anyone outside it influence what gets written to its local array? Through its parameters, of course. That makes the design and handling of parameters the first line of defense against buffer overflows, and that is the starting point of our discussion.
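The mechanism described above can be imitated safely in a small sketch. Everything here is an assumption made for illustration: the 16-byte "frame", its layout (8 bytes of local array followed by 8 bytes standing in for the return address), and the function name are hypothetical, and a real stack frame is laid out by the compiler, not by the program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of a simulated stack frame, for illustration only:
   bytes 0..7 play the role of a local char array,
   bytes 8..15 play the role of the saved return address. */
enum { LOCAL_SIZE = 8, RETADDR_SIZE = 8, FRAME_SIZE = LOCAL_SIZE + RETADDR_SIZE };

/* Copies len bytes to the start of the simulated frame without checking
   len against LOCAL_SIZE, the way careless string code trusts its input.
   Returns 1 if the simulated return address was changed, 0 otherwise. */
int unchecked_copy_clobbers_retaddr(const unsigned char *input, size_t len)
{
    unsigned char frame[FRAME_SIZE];
    unsigned char saved[RETADDR_SIZE];

    assert(input != NULL);
    assert(len <= FRAME_SIZE);   /* keeps the simulation itself in bounds */

    memset(frame, 0, LOCAL_SIZE);
    memset(frame + LOCAL_SIZE, 0xAA, RETADDR_SIZE);   /* the "return address" */
    memcpy(saved, frame + LOCAL_SIZE, RETADDR_SIZE);

    memcpy(frame, input, len);   /* the missing check: len may exceed LOCAL_SIZE */

    return memcmp(saved, frame + LOCAL_SIZE, RETADDR_SIZE) != 0;
}
```

Passing 8 bytes leaves the simulated return address untouched; passing 12 bytes of 0x41 overwrites its first four bytes. The only thing the caller controlled was the parameter.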
If you want to understand the principle of buffer overflow in more depth, see the article "Let your software run: substantive issues and destroy attack" on IBM's developerWorks China website.
To make the discussion more concrete, we will work from an example requirement: implement a function that reads the section names from a specified INI file. It is to be written in C and, to keep the example typical, we stipulate that the parameters are plain strings (char *) and that no string class may be used.
OK, let's analyze the problem this requirement poses. The information we are given is the file name, which is a string. We have to output a series of section names, and we will output those as strings as well. Readers familiar with the Windows API will already have thought of it: this is the well-known API GetPrivateProfileSectionNames. Yes, that is the one. To stay focused on the function parameters, we will not let anything else distract us; we only discuss how to express this interface well. The function's input is a character array and its output is also a character array, and its job is the string processing in between. That matches the classic setting for buffer overflows: repeated handling of character arrays.
When discussing memory buffers, the most basic question is who allocates them. Leaving aside run-time resource management such as garbage collection and reference counting, we have only two options: the caller allocates the buffer, or the called function does. A typical example of the former is the GetPrivateProfileSectionNames API itself; this approach suits cases where the caller can know or estimate the buffer size in advance. A typical example of the latter is the API function NetShareGetInfo: the caller does not know how large a buffer is needed, so the called function has to allocate it.
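The two options can be put side by side in a minimal sketch. The function names and the fixed string "section1" are hypothetical and stand in for real work; the sketch uses malloc and free, not the Windows allocation functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Style 1: the caller allocates the buffer and passes it with its size.
   Returns the number of characters copied (excluding the terminator),
   or 0 if the buffer is too small. */
size_t get_name_caller_alloc(char *buf, size_t buflen)
{
    static const char name[] = "section1";

    assert(buf != NULL);             /* precondition: a buffer must be supplied */
    if (buflen < sizeof name)
        return 0;
    memcpy(buf, name, sizeof name);  /* copies the terminator too */
    return sizeof name - 1;
}

/* Style 2: the called function allocates a buffer of exactly the right
   size. The caller must release it with free_name(), never directly.
   Returns NULL if the allocation fails. */
char *get_name_callee_alloc(void)
{
    static const char name[] = "section1";
    char *p = malloc(sizeof name);

    if (p != NULL)
        memcpy(p, name, sizeof name);
    return p;
}

/* The matching release function that style 2 obliges the author to ship. */
void free_name(char *p)
{
    free(p);
}
```

In style 1 the caller owns the buffer throughout; in style 2 the interface has to grow a second function just so the caller can give the memory back.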
The benefit of letting the called function allocate is that the caller does not need to know or estimate how large a buffer is required, and the allocated buffer is always the right size, so it will not overflow (though this is not absolute). The disadvantage is that the caller does not necessarily know how the resource was allocated. Was it LocalAlloc or GlobalAlloc? The caller may not know, and so does not know how to release it. The programmer who writes the called function therefore has to provide a second function for callers to release the resource; NetShareGetInfo, mentioned earlier, is paired with NetApiBufferFree for this purpose. That is why, in everyday programming, we more often see the former approach: the caller allocates the buffer, and the caller decides when to release it. Everyone then has to handle the parameters carefully so that the called function never writes more data into the buffer than the buffer can hold and causes an overflow. Besides, as noted above, letting the called function allocate the buffer does not lead to overflow, and with no overflow there would be nothing left for this article to discuss. So we design our function in the former style, and we get the first prototype of our function:
int MyGetPrivateProfileSectionNames(const char *pszFileName, char *pszMyBuffer);
But wait! We are discussing buffer overflows. How well does this parameter list express itself where buffer overflow is concerned?
Not well; very badly, in fact. From the function's own point of view, does this parameter list tell it how big the buffer is? Some people say you can apply sizeof to the buffer to get its size. Really? Try it. Applied to a char *, sizeof yields the size of the pointer, not the size of the buffer. Remember that sizeof works from compile-time information; it cannot help us determine the length or validity of the buffer a pointer points to.
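A short sketch makes the sizeof point concrete. The function names are hypothetical; the behavior shown is standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Inside the function, buf is only a char *, so sizeof reports the size
   of a pointer, whatever the caller's buffer really was. */
size_t size_seen_by_callee(char *buf)
{
    return sizeof buf;
}

/* In the caller, where the array declaration is visible, sizeof gives
   the full size of the array. */
void demo(void)
{
    char buffer[256];

    assert(sizeof buffer == 256);
    assert(size_seen_by_callee(buffer) == sizeof(char *));
}
```

The size information exists only where the array is declared. Once the array has been passed as a pointer, that information is gone unless the caller passes it along explicitly.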
And from the caller's perspective? Looking at this parameter list, it in fact expresses the opposite of what we want to say: it places no requirement on the length of the buffer, that is, "you need not worry about the length of the buffer". Doesn't that suggest the called function takes care of the buffer? If I saw this code, I would start looking for MyBufferFree() :)
The meaning of an expression in a language is carried by convention: if many people already use a certain form of expression, using that same form makes you easier to understand.
OK, so we add a parameter that gives the size of the buffer being passed in. Because a buffer size cannot be negative, it should be declared with an unsigned type, so that nobody can pass a negative value. Our function now looks like this:
int MyGetPrivateProfileSectionNames(const char *pszFileName, char *pszMyBuffer, DWORD dwBufferLen);
Isn't that just the same as GetPrivateProfileSectionNames? It seems the API really is well designed. But don't forget the point: an expression works best when it is the one everyone already uses. Everyone has seen the buffer pointer plus buffer length pair of parameters many times, so it is easier to accept.
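Once the length travels with the pointer, the function can refuse to write past the end. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, it takes the names from an array instead of parsing an INI file, and it returns 0 when the buffer is too small, which is a simplification and not a description of how the real API behaves. It packs the names in the "name, terminator, name, terminator, final terminator" layout that a list of section names uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: packs count names into buf as "a\0b\0\0" and never
   writes more than buflen bytes. Returns the number of bytes written,
   including the final terminator, or 0 if the buffer is too small. */
size_t pack_section_names(const char *const *names, size_t count,
                          char *buf, size_t buflen)
{
    size_t used = 0;
    size_t i;

    assert(names != NULL);
    assert(buf != NULL);

    for (i = 0; i < count; ++i) {
        size_t n = strlen(names[i]) + 1;   /* name plus its terminator */

        /* Leave room for this name and for the final terminator. */
        if (buflen - used < n + 1)
            return 0;
        memcpy(buf + used, names[i], n);
        used += n;
    }

    if (buflen - used < 1)
        return 0;
    buf[used++] = '\0';                    /* list terminator */
    return used;
}
```

With the names "alpha" and "beta", a 16-byte buffer receives 12 bytes, while an 11-byte buffer is rejected before the second name is copied. Every write is checked against the length the caller supplied.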
OK, now that we have this information, have we prevented the buffer from overflowing? Of course not.
Can you guarantee that every caller uses the function the way you intended? If you do not trust your users and take care that they cannot crash your software, how can you assume that every caller of your function gets the call right? How do you check parameters when you write a program? A very common way is this:
int MyGetPrivateProfileSectionNames(const char *pszFileName, char *pszMyBuffer, DWORD dwBufferLen)
{
    /* Silently reject null pointers. */
    if ((!pszFileName) || (!pszMyBuffer))
        return FALSE;
    /* do something you need */
}
Fine, it is written very robustly. But what we care about is: is it a good expression?
My answer is no. After a failed call, all the caller learns is that the call failed and returned FALSE. Where did it go wrong? The caller has to step into the function to discover, "Oh, I passed a null pointer." If the function also does something that can genuinely fail, such as calling gethostbyname, a return value like this makes it hard to tell the specific causes of failure apart, and leads developers to overlook important cases.
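To see what the caller is missing, compare a version whose return value at least says which thing went wrong. This is only a sketch; the status names and the function are hypothetical and are not part of the interface discussed in this article.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes: one value per distinguishable cause. */
typedef enum {
    MY_OK = 0,
    MY_ERR_NULL_ARGUMENT,
    MY_ERR_BUFFER_TOO_SMALL
} my_status;

/* Copies name into buf and reports why it failed, if it failed. */
my_status copy_name_checked(const char *name, char *buf, size_t buflen)
{
    size_t need;

    if (name == NULL || buf == NULL)
        return MY_ERR_NULL_ARGUMENT;

    need = strlen(name) + 1;        /* name plus its terminator */
    if (need > buflen)
        return MY_ERR_BUFFER_TOO_SMALL;

    memcpy(buf, name, need);
    return MY_OK;
}
```

A caller that receives MY_ERR_NULL_ARGUMENT no longer has to step into the function to find out what happened. Whether a null pointer should be reported through a return value at all is a separate question.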
This is a typical example of overprotective code: the program becomes very robust and hard to crash, but it also easily hides many hard-to-find errors. For example, under some special condition the caller may mistakenly pass a null pointer. Without this defense, the error would most likely announce itself to the developer during testing. With such a strict defense in place, the error is likely to stay hidden until one day it surfaces and drives the user crazy.
Where is the problem? The problem is that the code never states its key requirement clearly: "If you want me to work, you must not pass me a null pointer." Returning an unexplained FALSE does not make that claim. Defensive code should try not to cover up errors caused by logic problems. On the contrary, we should use code to expose possible logic errors as early as the compilation and debugging phases. For example, two assertions can help us discover null parameter pointers during debugging, and they state the meaning clearly: "These two pointers must never be null, otherwise this function cannot run." No documentation is required; the code has expressed its own requirements.
int MyGetPrivateProfileSectionNames(const char *pszFileName, char *pszMyBuffer, DWORD dwBufferLen)
{
    /* Both pointers must be non-null; the assertion fails if one is null. */
    assert(pszFileName != NULL);
    assert(pszMyBuffer != NULL);
    /* do something you need */
}
Assertions help us find "should never happen" errors in debug builds, but they cannot replace normal defensive code, such as the code that handles exceptional conditions like a failed memory allocation, because assertions do not exist in the release build. For when and how to use assertions, and for further discussion, see articles on programming by "contract", such as Myan's "What is the view of Contract - Eiffel".
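The division of labor between assertions and ordinary defensive code can be sketched as follows. The function is hypothetical and deliberately simple: it counts the bytes in a file. The assertions state what the caller must guarantee; the check on fopen handles something that can fail even when the caller did everything right.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical function: stores the size of the file in *pCount.
   Returns 0 on success, -1 if the file cannot be opened. */
int count_file_bytes(const char *pszFileName, long *pCount)
{
    FILE *fp;
    long n = 0;

    /* Logic errors: a null pointer here is the caller's bug. These
       checks are compiled out when NDEBUG is defined. */
    assert(pszFileName != NULL);
    assert(pCount != NULL);

    /* Run-time failure: the file may simply not exist. This check has
       to stay in every build. */
    fp = fopen(pszFileName, "rb");
    if (fp == NULL)
        return -1;

    while (fgetc(fp) != EOF)
        ++n;
    fclose(fp);

    *pCount = n;
    return 0;
}
```

A missing file yields -1 in debug and release builds alike, and *pCount is left untouched. A null argument stops a debug build at the assertion, pointing at the caller's mistake.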
Of course, we can express our requirements further: "Yes, you gave me a non-null pointer, but I will not accept just anything that looks like a pointer. I have standards; I am not a careless function." Then we can check the validity of the pointer, for example with the API IsBadWritePtr(pszMyBuffer, dwBufferLen). Note that, whatever the reason, passing a dwBufferLen much larger than the buffer behind pszMyBuffer, or a pointer that is non-null but does not point to a buffer at all, is a logic error. It should be eliminated completely in the test phase and should not place any extra burden on the release build. So we put this check inside an assertion as well: assert(!IsBadWritePtr(pszMyBuffer, dwBufferLen));
IsBadWritePtr checks whether the calling process has write access to every byte of the specified memory range, so it can be used to check the validity of the buffer.
So, starting from the buffer overflow, we have talked about how overflows work, about the design of function parameters, and about how to express design ideas and the conditions a function requires. In the end, our focus has been on how to express design ideas and function restrictions. Compared with delving into specific technologies, we tend to pay relatively little attention to expressing design ideas. Perhaps this is also why we depend more and more on documentation, and rack our brains to maintain large numbers of comments.
Either way, I hold to this view: in essence, programming is expression. Paying attention to how code expresses itself improves its readability and makes it easier to share our ideas, and to work, with others.
Welcome to the author's personal homepage: http://www.mrspace.net/