Excerpt from 9cbs: http://dev.9cbs.net/develop/Article/22/Article/22/Article/22/Article/22/Article/22/22766.shtm
Talking about Memory Leakage (1)
For a C/C++ programmer, memory leaks are a common and troublesome problem. Many techniques have been developed to deal with it, such as smart pointers and garbage collection. Smart pointer technology is relatively mature, and the STL already contains a smart pointer class, but it does not seem to be widely used, and it does not solve every problem. Garbage collection is mature in Java, but its development in the C/C++ world has not been smooth, although people have long considered adding GC support to C++. The real world is like this: for a C/C++ programmer, memory leaks are a constant worry. Fortunately, there are now many tools that help us verify whether memory leaks exist and find the code responsible.
Definition of memory leakage
Generally, when we speak of a memory leak we mean a leak of heap memory. Heap memory is memory that the program allocates from the heap; its size is arbitrary (the size of the block can be decided at run time), and it must be explicitly released after use. Applications generally allocate memory from the heap with functions such as malloc, realloc, and new. After use, the program is responsible for calling the matching free or delete to release the block; otherwise the memory cannot be used again, and we say that it has leaked. The following small program demonstrates a heap memory leak:
void MyFunction(int nSize)
{
    char* p = new char[nSize];
    if (!GetStringFrom(p, nSize)) {
        MessageBox("Error");
        return; // p is not released on this path
    }
    ... // using the string pointed by p;
    delete[] p;
}
Example one
When GetStringFrom() returns zero, the memory pointed to by p is not released. This is a common way for memory leaks to occur. The program allocates memory at the function's entry and releases it at the exit, but a C function can exit anywhere, so as soon as one exit fails to release the memory that should be released, a leak occurs.
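One common way to avoid the leak in the example above is to let an object own the buffer, so that the memory is released on every exit path. The following is only a minimal sketch using std::vector from standard C++. The GetStringFrom() defined here is a hypothetical stand-in for the function in the example, and MyFunctionSafe() is a name introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for GetStringFrom() in the example above:
// fills the buffer and returns false when the buffer is too small.
bool GetStringFrom(char* buffer, std::size_t size)
{
    assert(buffer != NULL);
    const char text[] = "hello";
    if (size < sizeof(text)) {
        return false;
    }
    std::memcpy(buffer, text, sizeof(text));
    return true;
}

// Same shape as MyFunction(), but the buffer is owned by a std::vector,
// so it is freed on every return path, including the early one.
std::string MyFunctionSafe(int nSize)
{
    if (nSize <= 0) {
        return "error";
    }
    std::vector<char> buffer(static_cast<std::size_t>(nSize));
    if (!GetStringFrom(buffer.data(), buffer.size())) {
        return "error";                 // early exit: the vector frees the memory
    }
    return std::string(buffer.data());  // normal exit: the memory is freed here too
}
```

Calling MyFunctionSafe(2) takes the early exit and MyFunctionSafe(16) takes the normal one; in both cases the vector's destructor releases the buffer, so no explicit delete is needed.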
In a broader sense, memory leaks include not only heap memory leaks but also leaks of system resources, such as kernel handles (HANDLE), GDI objects, sockets, and interfaces. Fundamentally, these objects allocated by the operating system also consume memory, so leaking them ultimately leaks memory as well. Moreover, some objects consume kernel-mode memory, and leaking them badly can make the whole operating system unstable. A system resource leak is therefore more serious than a heap memory leak. Leaking a GDI object is a common resource leak:
void CMyView::OnPaint(CDC* pDC)
{
    CBitmap bmp;
    CBitmap* pOldBmp;
    bmp.LoadBitmap(IDB_MYBMP);
    pOldBmp = pDC->SelectObject(&bmp);
    ...
    if (Something()) {
        return;
    }
    pDC->SelectObject(pOldBmp);
    return;
}
Example two
When Something() returns non-zero, the program exits without selecting pOldBmp back into pDC, which leaks the HBITMAP object that pOldBmp points to. If the program runs for a long time, this may affect the entire system. The problem is easier to expose under Win9x because the Win9x GDI heap is much smaller than that of Win2K or NT.
How memory leaks occur:
Classified by how they occur, memory leaks fall into four categories:
1. Frequent memory leaks. The leaking code is executed many times, and each execution leaks a block of memory. For example, if the Something() function in the GDI example above always returns TRUE, the HBITMAP object that pOldBmp points to always leaks.
2. Occasional memory leaks. The leaking code runs only in certain environments or under certain operations. For example, if the Something() function returns TRUE only in a particular environment, the HBITMAP object that pOldBmp points to does not always leak. Frequent and occasional are relative: in a particular environment, an occasional leak may become a frequent one. The test environment and test method are therefore critical for detecting memory leaks.
3. One-time memory leaks. The leaking code is executed only once, or, because of a flaw in the algorithm, exactly one block of memory is ever leaked. For example, a class allocates memory in its constructor but does not release it in its destructor; because the class is a singleton, the leak occurs only once. Another example:
char* g_lpszFileName = NULL;
void SetFileName(const char* lpcszFileName)
{
    if (g_lpszFileName) {
        free(g_lpszFileName);
    }
    g_lpszFileName = strdup(lpcszFileName);
}
Example three
If the program does not release the string that g_lpszFileName points to when it ends, then no matter how many times SetFileName() is called, exactly one block of memory leaks.
4. Implicit memory leaks. The program keeps allocating memory while it runs but does not release it until it ends. Strictly speaking, there is no leak, because the program eventually releases all the memory it requested. But a server program must run for days, weeks, or even months, and failing to release memory in time may eventually exhaust all the memory in the system. We therefore call this kind of leak an implicit memory leak. For example:
class Connection
{
public:
    Connection(SOCKET s);
    ~Connection();
    ...
private:
    SOCKET _socket;
    ...
};
class ConnectionManager
{
public:
    ConnectionManager() {
    }
    ~ConnectionManager() {
        list<Connection*>::iterator it;
        for (it = _connlist.begin(); it != _connlist.end(); ++it) {
            delete (*it);
        }
        _connlist.clear();
    }
    void OnClientConnected(SOCKET s) {
        Connection* p = new Connection(s);
        _connlist.push_back(p);
    }
    void OnClientDisconnected(Connection* pconn) {
        _connlist.remove(pconn);
        delete pconn;
    }
private:
    list<Connection*> _connlist;
};
Example four
Suppose that after a client disconnects from the server, the server does not call OnClientDisconnected(). The Connection object representing that connection is then not deleted in time (when the server program exits, all Connection objects are deleted in the ConnectionManager destructor). As clients keep connecting and disconnecting, an implicit memory leak occurs.
From the point of view of someone using the program, a memory leak by itself does no harm; an ordinary user does not notice that it exists at all. What is really harmful is the accumulation of leaks, which eventually exhausts all the memory in the system. From this perspective, one-time leaks are harmless because they do not accumulate, whereas implicit leaks are very harmful because they are harder to detect than frequent and occasional leaks.
Talking about Memory Leak (2)
Detecting memory leaks:
The key to detecting memory leaks is to intercept the calls to the functions that allocate and release memory. By intercepting these two kinds of functions, we can track the lifetime of every block of memory: each time a block is successfully allocated, its pointer is added to a global list; each time a block is released, its pointer is removed from the list. When the program ends, the pointers remaining in the list point to the blocks that were never released. This is only a simple description of the basic principle; the detailed algorithm can be found in Steve Maguire's "Writing Solid Code".
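The bookkeeping described above can be sketched in a few lines of portable C++. This is only an illustration of the principle, not the algorithm from "Writing Solid Code" and not what any particular tool does: TrackedMalloc(), TrackedFree(), and the global map are hypothetical names, and the program has to call the wrappers explicitly instead of having malloc and free intercepted for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Hypothetical bookkeeping: maps each live block to its size.
static std::map<void*, std::size_t> g_liveBlocks;

// Wrapper around malloc() that records every successful allocation.
void* TrackedMalloc(std::size_t size)
{
    void* p = std::malloc(size);
    if (p != NULL) {
        g_liveBlocks[p] = size;
    }
    return p;
}

// Wrapper around free() that removes the block from the record.
void TrackedFree(void* p)
{
    if (p == NULL) {
        return;
    }
    // Releasing a block that was never recorded is a bug in the caller.
    assert(g_liveBlocks.count(p) == 1);
    g_liveBlocks.erase(p);
    std::free(p);
}

// Number of blocks that were allocated but not yet released.
std::size_t LiveBlockCount()
{
    return g_liveBlocks.size();
}

// Total size in bytes of the blocks that are still outstanding.
std::size_t LiveByteCount()
{
    std::size_t total = 0;
    std::map<void*, std::size_t>::const_iterator it;
    for (it = g_liveBlocks.begin(); it != g_liveBlocks.end(); ++it) {
        total += it->second;
    }
    return total;
}
```

If LiveBlockCount() is not zero when the program ends, the entries still in the map are the leaked blocks. A real detector would also record where each block was allocated, which is the subject of the following sections.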
To detect heap memory leaks, it is enough to intercept malloc/realloc/free and new/delete (in fact, new/delete ultimately use malloc/free in the MS C-Runtime Library, so intercepting either group is sufficient). For other kinds of leaks, a similar approach can be used: intercept the corresponding allocation and release functions. For example, to detect BSTR leaks, intercept SysAllocString/SysFreeString; to detect HMENU leaks, intercept CreateMenu/DestroyMenu. (Some resources have several allocation functions but only one release function; for example, SysAllocStringLen can also allocate a BSTR, so in that case several allocation functions must be intercepted.)
On the Windows platform, there are three kinds of commonly used tools for detecting memory leaks: the detection facility built into the MS C-Runtime Library; plug-in detection tools such as Purify and BoundsChecker; and the Performance Monitor that comes with Windows NT. Each has advantages and disadvantages. The MS C-Runtime Library is functionally weaker than the plug-in tools, but it is free. Performance Monitor cannot point to the offending code, but it can detect the existence of implicit memory leaks, which the other two kinds of tools cannot do.
Below we discuss these three kinds of detection tools in detail.
Detecting memory leaks under VC
When an application developed with MFC is compiled in Debug mode, leak detection code is added automatically. When the program ends, if any memory has leaked, information about every leaked block is shown in the Debug window. The following two lines show the information for one leaked memory block:
E:/TESTMEMLEAK/testdlg.cpp(70) : {59} normal block at 0x00881710, 200 bytes long.
Data:
The first line shows that the memory block was allocated at line 70 of TestDlg.cpp, at address 0x00881710, with a size of 200 bytes; {59} is the request order number of the call to the memory allocation function. For details, see the MSDN help for _CrtSetBreakAlloc(). The second line shows the contents of the first 16 bytes of the block: inside the angle brackets they are displayed as ASCII, followed by the same bytes in hexadecimal.
It is commonly but mistakenly believed that this leak detection is provided by MFC. It is not. MFC merely wraps and uses the debug functions of the MS C-Runtime Library. Non-MFC programs can also use these debug functions to add leak detection. The MS C-Runtime Library has leak detection built into its implementations of functions such as malloc/free and strdup.
If you look at a project generated by the MFC Application Wizard, you will find this macro definition at the top of every CPP file:
#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif
With this definition, when the Debug version is compiled, every new in the CPP file is replaced with DEBUG_NEW. So what is DEBUG_NEW? DEBUG_NEW is also a macro; the following is taken from line 1632 of afx.h:
#define DEBUG_NEW new(THIS_FILE, __LINE__)
So if there is a line of code like this:
char* p = new char[200];
After macro replacement it becomes:
char* p = new(THIS_FILE, __LINE__) char[200];
According to the C++ standard, for this use of new the compiler looks for an operator new defined like this:
void* operator new(size_t, LPCSTR, int)
We find an implementation of such an operator new at line 63 of afxmem.cpp:
void* AFX_CDECL operator new(size_t nSize, LPCSTR lpszFileName, int nLine)
{
    return ::operator new(nSize, _NORMAL_BLOCK, lpszFileName, nLine);
}
void* __cdecl operator new(size_t nSize, int nType, LPCSTR lpszFileName, int nLine)
{
    ...
    pResult = _malloc_dbg(nSize, nType, lpszFileName, nLine);
    if (pResult != NULL)
        return pResult;
    ...
}
The second operator new function is fairly long, so for simplicity I have quoted only part of it. Clearly the memory is ultimately allocated by the _malloc_dbg function, which belongs to the debug functions of the MS C-Runtime Library. Besides the size of the memory, this function also takes the file name and the line number as parameters. They record which line of code made the allocation; if the block has not been released when the program ends, this information is printed to the Debug window. THIS_FILE, __FILE__, and __LINE__ deserve a few words here. __FILE__ and __LINE__ are both macros defined by the compiler. When the compiler encounters __FILE__, it replaces it with a string containing the path name of the file being compiled. When it encounters __LINE__, it replaces it with a number, the line number of the current line of code. The definition of DEBUG_NEW uses THIS_FILE rather than __FILE__ directly in order to reduce the size of the object file. Suppose a CPP file uses new in 100 places: if __FILE__ were used directly, the compiler would generate 100 constant strings, all of them the path name of this CPP file, which is clearly redundant. With THIS_FILE, the compiler generates only one constant string, and all 100 uses of new pass a pointer to it.
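The mechanism can be tried outside MFC with a small portable sketch. The code below is not the MFC source: it defines its own overload of operator new[] that accepts a file name and line number, forwards to the ordinary allocator instead of _malloc_dbg, and remembers only the most recent allocation. AllocationRecord, g_lastAllocation, and SKETCH_DEBUG_NEW are hypothetical names, and the macro is deliberately not called new so that ordinary uses of new are left alone.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical record of the most recent tracked allocation.
struct AllocationRecord {
    const char* file;
    int line;
    std::size_t size;
};
static AllocationRecord g_lastAllocation = { 0, 0, 0 };

// Overload of operator new[] that also receives the file name and the
// line number, in the same spirit as the MFC overload quoted above.
void* operator new[](std::size_t size, const char* file, int line)
{
    assert(file != 0);
    void* p = ::operator new[](size);   // forward to the ordinary allocator
    g_lastAllocation.file = file;
    g_lastAllocation.line = line;
    g_lastAllocation.size = size;
    return p;
}

// Matching form of operator delete[]; the compiler calls it only when a
// constructor throws during a new-expression that used the overload above.
void operator delete[](void* p, const char*, int)
{
    ::operator delete[](p);
}

// One string per source file, shared by every allocation in that file.
static const char THIS_FILE[] = __FILE__;

// Hypothetical macro playing the role of DEBUG_NEW.
#define SKETCH_DEBUG_NEW new (THIS_FILE, __LINE__)
```

After `char* p = SKETCH_DEBUG_NEW char[200];`, g_lastAllocation holds the file, the line, and the requested size, and the block is released with the usual `delete[] p;`. A real implementation would store one such record per live block rather than only the last one.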
Looking again at the project generated by the MFC Application Wizard, we find that only new is remapped in the CPP files. If you call malloc directly to allocate memory, the file name and line number of the call are not recorded. If such a block leaks, the MS C-Runtime Library can still detect it, but the report for the block does not include the file name and line number where it was allocated.
Turning on leak detection in a non-MFC program is very easy. You only need to add the following lines of code at the program's entry point:
int tmpFlag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
tmpFlag |= _CRTDBG_LEAK_CHECK_DF;
_CrtSetDbgFlag(tmpFlag);
This way, when the program ends, that is, after WinMain, main, or DllMain returns, information about any memory blocks that have still not been released is printed to the Debug window.
If you create a non-MFC application, add the code above at its entry point, and deliberately leave some memory blocks unreleased, you will see the following in the Debug window:
{47} normal block at 0x00C91C90, 200 bytes long.
Data: <> 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
The leak has indeed been detected, but compared with the MFC example above, the file name and line number are missing. In a relatively large program, locating the problem without this information becomes very difficult.
To find out where a leaked block was allocated, you need to implement a mapping similar to MFC's, mapping new, malloc, and other functions to _malloc_dbg. I will not go into it here; you can refer to the MFC source code.
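For readers who do not want to read the MFC source, the sketch below shows one way such a mapping is commonly set up with the debug C-Runtime Library. It assumes Microsoft Visual C++ and a Debug build; on any other compiler the guarded parts compile away and it falls back to plain new. DBG_NEW, EnableCrtLeakCheck(), and DemoAllocation() are hypothetical names chosen for this illustration, and this is not how MFC itself is written.

```cpp
// Assumption: Microsoft Visual C++ with the debug C-Runtime Library.
#if defined(_MSC_VER) && defined(_DEBUG)
#define _CRTDBG_MAP_ALLOC   // maps malloc/free to their _dbg versions with file and line
#include <stdlib.h>
#include <crtdbg.h>
// Passes the block type, file name and line number to the debug operator new.
#define DBG_NEW new (_NORMAL_BLOCK, __FILE__, __LINE__)
#else
#include <stdlib.h>
#define DBG_NEW new
#endif
#include <cassert>

// Turns on the leak report at program exit; returns true when it is active.
bool EnableCrtLeakCheck()
{
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
    _CrtSetDbgFlag(flags | _CRTDBG_LEAK_CHECK_DF);
    return true;
#else
    return false;   // no debug CRT available in this build
#endif
}

// Allocates through DBG_NEW so that, in a Debug build, a leak of this
// block would be reported with this file name and line number.
int DemoAllocation()
{
    int* value = DBG_NEW int(7);
    assert(value != 0);
    int result = *value;
    delete value;
    return result;
}
```

Unlike the MFC wizard code, this sketch does not redefine the keyword new itself; every allocation that should be traceable has to be written with DBG_NEW.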
Because the debug functions are implemented in the MS C-Runtime Library, they can only detect heap memory leaks, and only for memory allocated by malloc, realloc, or strdup. Leaks of system resources such as handles and GDI objects, or of memory not allocated through the C-Runtime Library, such as VARIANT and BSTR, cannot be detected. This is a major limitation of this method. In addition, recording where a block was allocated requires the cooperation of the source code, which is troublesome when debugging old programs, since modifying source code is never a carefree job. This is another limitation of the method.
For developing a large program, the detection facility provided by the MS C-Runtime Library is far from enough. Next we look at the plug-in detection tools. The one I use most is BoundsChecker, partly because its features are more comprehensive, but more importantly because of its stability. If a tool of this kind is unstable, it gets in the way instead of helping. It comes, after all, from the well-known NuMega, and I have had basically no major problems with it.
Talking about Memory Leak (3)
Using BoundsChecker to detect memory leaks:
BoundsChecker uses a technique called Code Injection to intercept calls to the functions that allocate and release memory. Simply put, when your program starts running, BoundsChecker's DLL is automatically loaded into the process's address space (this can be done with a system-level hook). It then modifies the calls to the memory allocation and release functions in the process so that these calls go to its code first and then execute the original code. BoundsChecker does this without requiring any change to the source code or project configuration files of the program being debugged, which makes it simple and straightforward to use.
Here we take the malloc function as an example; other functions are intercepted in a similar way.
The function to be intercepted may be in a DLL or in the program's own code. For example, if the C-Runtime Library is linked statically, the code of malloc is linked into the program. To intercept calls to such a function, BoundsChecker dynamically modifies the function's instructions.
Below are two pieces of assembly code, one without BoundsChecker and one with BoundsChecker's intervention:
126:  _CRTIMP void * __cdecl malloc (
127:          size_t nSize
128:          )
129:  {
00403C10   push        ebp
00403C11   mov         ebp,esp
130:          return _nh_malloc_dbg(nSize, _newmode, _NORMAL_BLOCK, NULL, 0);
00403C13   push        0
00403C15   push        0
00403C17   push        1
00403C19   mov         eax,[__newmode (0042376C)]
00403C1E   push        eax
00403C1F   mov         ecx,dword ptr [nSize]
00403C22   push        ecx
00403C23   call        _nh_malloc_dbg (00403C80)
00403C28   add         esp,14h
131:  }
The following code has BoundsChecker's intervention:
126:  _CRTIMP void * __cdecl malloc (
127:          size_t nSize
128:          )
129:  {
00403C10   jmp         01F41EC8
00403C15   push        0
00403C17   push        1
00403C19   mov         eax,[__newmode (0042376C)]
00403C1E   push        eax
00403C1F   mov         ecx,dword ptr [nSize]
00403C22   push        ecx
00403C23   call        _nh_malloc_dbg (00403C80)
00403C28   add         esp,14h
131:  }
When BoundsChecker intervenes, the first three assembly instructions of malloc are replaced with a JMP instruction, and the original three instructions are moved to address 01F41EC8. When the program enters malloc, it first jumps to 01F41EC8 and executes the original three instructions; from there it is in BoundsChecker's territory. Roughly, BoundsChecker first records the function's return address (the return address is on the stack, so it is easy to modify), then points the return address at code belonging to BoundsChecker, and then jumps back to the original instructions of malloc, that is, to 00403C15. When malloc finishes, because the return address has been modified, it returns into BoundsChecker's code, where BoundsChecker records the pointer to the memory that malloc allocated and then jumps to the original return address.
If the memory allocation/release functions are in a DLL, BoundsChecker uses another method to intercept calls to them: it modifies the program's DLL Import Table so that the function addresses in the table point to its own code. For how to intercept Windows system functions, see the article "API Hook Revealed (Part 2)" in "Programmer" magazine, 2002, which describes how to modify the import address table. I will not repeat it here.
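The idea of redirecting calls through a table can be illustrated without touching a real import table. The sketch below is a simplified model, not BoundsChecker's implementation and not real import address table patching: a small struct of function pointers stands in for the import table, and every name in it (ImportTable, InstallHooks, AppAllocate, and so on) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef void* (*AllocFn)(std::size_t);
typedef void (*FreeFn)(void*);

// Stand-in for an import table: the application reaches the allocator
// only through these pointers, never by calling malloc/free directly.
struct ImportTable {
    AllocFn alloc;
    FreeFn release;
};
static ImportTable g_imports = { &std::malloc, &std::free };

static AllocFn g_originalAlloc = 0;
static FreeFn g_originalFree = 0;
static int g_outstanding = 0;   // blocks allocated but not yet released

// Replacement entries: do the bookkeeping, then call the original function.
void* HookedAlloc(std::size_t size)
{
    void* p = g_originalAlloc(size);
    if (p != 0) {
        ++g_outstanding;
    }
    return p;
}

void HookedFree(void* p)
{
    assert(g_originalFree != 0);
    if (p != 0) {
        --g_outstanding;
    }
    g_originalFree(p);
}

// Saves the original table entries and points the table at the hooks.
void InstallHooks()
{
    if (g_originalAlloc != 0) {
        return;   // already installed
    }
    g_originalAlloc = g_imports.alloc;
    g_originalFree = g_imports.release;
    g_imports.alloc = &HookedAlloc;
    g_imports.release = &HookedFree;
}

// Application code: unchanged by the hooking, it just goes through the table.
void* AppAllocate(std::size_t size) { return g_imports.alloc(size); }
void AppRelease(void* p) { g_imports.release(p); }
```

After InstallHooks() has run, every AppAllocate() and AppRelease() call passes through the hooks, so g_outstanding reflects the number of live blocks even though the application code itself was not modified.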
By intercepting these allocation and release functions, BoundsChecker can record the lifetime of every allocated block of memory or resource. The next question is how to relate this to the source code: when BoundsChecker detects a leak, how does it report which line of code allocated the block? The answer is debug information. When we build a Debug version, the compiler records the correspondence between source code and binary code in a separate file (.pdb) or directly in the target program. With this information, a debugger can provide breakpoints, single-stepping, variable inspection, and so on. BoundsChecker supports several debug information formats; by reading the debug information directly, it can find out which source file and which line allocated a given block of memory. Using Code Injection together with debug information, BoundsChecker can record not only the source location of the call to the allocation function but also the call stack at the time of allocation and the source locations of the functions on that call stack. This is very useful when using class libraries such as MFC. Here is an example:
void ShowXItemMenu()
{
    ...
    CMenu menu;
    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    ...
}
void ShowYItemMenu()
{
    ...
    CMenu menu;
    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    menu.Detach(); // this will cause HMENU leak
    ...
}
BOOL CMenu::CreatePopupMenu()
{
    ...
    hMenu = CreatePopupMenu();
    ...
}
By calling ShowYItemMenu(), we deliberately cause an HMENU leak. For BoundsChecker, however, the leaked HMENU was allocated in CMenu::CreatePopupMenu(). Suppose your program uses CMenu's CreatePopupMenu() in many places: if you are only told that the leak came from CMenu::CreatePopupMenu(), you still cannot tell where the root of the problem is. Is it in ShowXItemMenu(), in ShowYItemMenu(), or somewhere else that also uses CreatePopupMenu()? With call stack information, the problem becomes easy. BoundsChecker reports the leaked HMENU as follows:
Function                  File                                        Line
CMenu::CreatePopupMenu    E:/8168/VC98/mfc/mfc/include/AFXWIN1.INL    1009
ShowYItemMenu             E:/TESTMEMLEAK/MYTEST.CPP                   100
(other function calls omitted here)
In this way, we can easily find the function that caused the problem, ShowYItemMenu(). When programming with class libraries such as MFC, most API calls are encapsulated in the class library; with call stack information we can easily trace back to the code that really caused the leak.
Recording call stack information makes the program run very slowly, so BoundsChecker does not record it by default. You can turn on the option for recording call stack information as follows:
1. Open the menu: BoundsChecker | Setting...
2. On the Error Detection page, select Custom in the Error Detection Scheme list
3. Select Pointer and Leak Error Check in the Category combo box
4. Check the Report Call Stack checkbox
5. Click OK
Based on Code Injection, BoundsChecker also provides API parameter validation and memory overrun detection. These features are very useful in program development, but since they are outside the subject of this article, they are not described in detail here.
Powerful as BoundsChecker is, it is still helpless against implicit memory leaks. So let us look at how to detect memory leaks with Performance Monitor.
Using Performance Monitor to detect memory leaks
The NT kernel was designed with system monitoring built in: CPU usage, memory usage, I/O activity, and so on are each exposed as a counter. By reading these counters, an application can learn how the whole system or a particular process is running. Performance Monitor is such an application.
To detect memory leaks, we generally monitor the Handle Count, Virtual Bytes, and Working Set counters of the Process object. Handle Count records the number of handles the process currently has open; monitoring it helps us find out whether the program leaks handles. Virtual Bytes records the size of the virtual memory the process currently uses in its virtual address space. NT allocates memory in two steps: first it reserves a range in the virtual address space, at which point the operating system does not allocate physical memory but only reserves a range of addresses; then it commits the range, and only then does the operating system allocate physical memory. Virtual Bytes is therefore generally larger than the program's Working Set. Monitoring Virtual Bytes helps us find some lower-level problems. Working Set records the total amount of memory the operating system has committed for the process; this value is closely related to the total amount of memory the program has requested. If the program leaks memory, this value keeps increasing steadily, whereas Virtual Bytes increases in jumps. Monitoring these counters shows how the process uses memory; if a leak occurs, even an implicit one, these counters keep increasing. However, we then know only that there is a problem, not where it is, so we usually use Performance Monitor to verify that a leak exists and BoundsChecker to find and fix it.
When Performance Monitor shows a memory leak but BoundsChecker cannot detect it, there are two possibilities. First, an occasional memory leak has occurred; in this case, make sure the program's running environment and the operations performed are the same under Performance Monitor and under BoundsChecker. Second, an implicit memory leak has occurred; in this case, re-examine the program's design, then carefully study how the counter values recorded in Performance Monitor change, analyze how those changes relate to the program's logic, and work out possible causes. This is a painful process, full of hypotheses, guesses, verification, and failure, but it is also a great opportunity to gain experience.
Summary
Memory leaks are a large and complicated problem. Even in environments with a garbage collection mechanism, leaks are still possible, for example implicit memory leaks. Limited by space and ability, this article can only scratch the surface of the topic. Other problems, such as leak detection across multiple modules and how to analyze memory usage while the program is running, are topics worth studying. If you have ideas or suggestions, or find errors, you are welcome to get in touch with me.
(All rights reserved; please credit the source when reprinting.)
Author Blog:
http://blog.9cbs.net/johnnyxia/