A cross-platform C++ memory leak detector (reproduced)

Wu Yongwei (adah@netstd.com)

March 2004

Memory leaks are a perennial topic for C/C++ programmers. On Windows, a very useful feature of MFC is that it reports at program exit whether memory leaks have occurred. On Linux nothing is quite as convenient: existing tools such as mpatrol leave something to be desired in ease of use, additional overhead, and performance. This article implements a C++ memory leak detector that is very easy to use and cross-platform, and discusses the related technical issues.

Basics

Consider the following simple program, test.cpp:

int main()
{
    int* p1 = new int;
    char* p2 = new char[10];
    return 0;
}

Our basic requirement is, of course, that the two memory leaks in this program be reported. Achieving this is very simple: just compile and link debug_new.cpp along with it. Under Linux, we use:

g++ test.cpp debug_new.cpp -o test

The output is as follows:

Leaked object at 0x805e438 (size 10, :0)
Leaked object at 0x805e410 (size 4, :0)

If we need a clearer report, that is also very simple: add the following line at the beginning of test.cpp:

#include "debug_new.h"

The output is then as follows:

Leaked object at 0x805e438 (size 10, test.cpp:5)
Leaked object at 0x805e410 (size 4, test.cpp:4)

Very simple!

Background knowledge

For new and delete expressions, C++ generates calls to operator new and operator delete on the user's behalf. The user cannot change this. The prototypes of operator new and operator delete are as follows:

void* operator new(size_t) throw(std::bad_alloc);
void* operator new[](size_t) throw(std::bad_alloc);
void operator delete(void*) throw();
void operator delete[](void*) throw();

For "new int", the compiler generates a call "operator new(sizeof(int))", and for "new char[10]" it generates "operator new[](sizeof(char) * 10)" (if new is followed by a class name, the constructor of that class is of course called as well). Similarly, for "delete ptr" and "delete[] ptr", the compiler generates calls to "operator delete(ptr)" and "operator delete[](ptr)" (if ptr points to an object of class type, the object's destructor is called before operator delete). When the user does not provide these operators, the implementation supplies its own definitions; when the user does provide them, they replace the versions supplied by the implementation, which makes it possible to track and control dynamic memory allocation precisely. We can also use placement new to adjust the behaviour of operator new. Placement new here means a new operator that takes additional parameters. For example, when we provide an operator with the prototype

void* operator new(size_t size, const char* file, int line);

we can write "new ("Hello", 123) int" to generate a call "operator new(sizeof(int), "Hello", 123)". This can be quite flexible. As another example, one placement new operator that the C++ standard requires implementations to provide is

void* operator new(size_t size, const std::nothrow_t&);

Here nothrow_t is usually an empty structure (defined as "struct nothrow_t {};") whose sole purpose is to provide a type that lets the compiler pick this particular call through overload resolution. Users generally write "new (std::nothrow) type" (nothrow is a constant of type nothrow_t) to call this placement new operator. It differs from standard new in that new throws an exception when memory allocation fails, whereas "new (std::nothrow)" returns a null pointer when allocation fails.

Note that there is no corresponding "delete (std::nothrow) ptr" syntax; a related issue is, however, discussed later in this article.
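The two forms of placement new described above can be tried out with a short sketch. It assumes a C++11 or later compiler (hence noexcept and nullptr rather than throw()); the variables last_tag and last_number and the two helper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical bookkeeping for this sketch: remembers the extra arguments
// passed to the most recent call of the placement operator new below.
static const char* last_tag = "";
static int last_number = 0;

// Placement operator new with two extra parameters, as in the prototype above.
void* operator new(std::size_t size, const char* file, int line)
{
    last_tag = file;
    last_number = line;
    return ::operator new(size);   // forward to the ordinary allocator
}

// Matching placement delete; the compiler calls it only if a constructor
// throws after the operator new above has succeeded.
void operator delete(void* ptr, const char*, int) noexcept
{
    ::operator delete(ptr);
}

// Returns true when the extra arguments reached our operator new.
bool placement_args_are_forwarded()
{
    int* p = new ("Hello", 123) int(42);
    bool ok = std::strcmp(last_tag, "Hello") == 0
              && last_number == 123 && *p == 42;
    delete p;                      // ordinary delete: memory came from ::operator new
    return ok;
}

// Returns true when new (std::nothrow) yields a non-null pointer for a small
// request; on failure it would return a null pointer instead of throwing.
bool nothrow_new_returns_pointer()
{
    char* buf = new (std::nothrow) char[10];
    bool ok = (buf != nullptr);
    delete[] buf;
    return ok;
}
```

Both helpers release their memory with an ordinary delete expression, which illustrates the point that the extra arguments exist only on the new side.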

To learn more about these C++ language features, see [Stroustrup1997], especially sections 6.2.6, 10.4.11, 15.6, 19.4.5 and B.3.3.4. These language features are the key to understanding this implementation.

Detection principle

Like some other memory leak detectors, debug_new provides overloads of operator new and uses macros to substitute them into the user's program. The relevant parts of debug_new.h are as follows:

void* operator new(size_t size, const char* file, int line);
void* operator new[](size_t size, const char* file, int line);
#define new DEBUG_NEW
#define DEBUG_NEW new(__FILE__, __LINE__)

Take test.cpp above: once debug_new.h is included, "new char[10]" becomes "new("test.cpp", 5) char[10]" after preprocessing, and the compiler accordingly generates a call "operator new[](sizeof(char) * 10, "test.cpp", 5)". By customizing "operator new(size_t, const char*, int)" and "operator delete(void*)" in debug_new.cpp (and "operator new[]..." and "operator delete[]..."; to avoid verbosity, unless noted otherwise, references to operator new and operator delete from here on include the array versions as well), I can track all memory allocation calls and, at a chosen checkpoint, report every new that has no matching delete. The implementation can be quite simple: use a map to record all allocated pointers. On new, insert the pointer and its associated information into the map; on delete, remove the pointer and its information; if the pointer being deleted is not in the map, report an invalid delete; if pointers that were never deleted remain in the map when the program exits, memory has been leaked.
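The map-based bookkeeping can be sketched as follows. This is only an illustration of the idea, not the code in debug_new.cpp: it uses ordinary functions (the hypothetical tracked_alloc, tracked_free and leaked_block_count) instead of overloaded operators, so that the map's own allocations stay out of the picture, and it assumes C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Information recorded for each live allocation.
struct AllocInfo {
    const char* file;
    int         line;
    std::size_t size;
};

// Hypothetical registry for this sketch: pointer -> allocation info.
static std::map<void*, AllocInfo> live_allocations;

// Allocate a block and record it (stands in for the tracked operator new).
void* tracked_alloc(std::size_t size, const char* file, int line)
{
    void* ptr = std::malloc(size);
    if (ptr != nullptr) {
        live_allocations[ptr] = AllocInfo{file, line, size};
    }
    return ptr;
}

// Release a block; returns false for a pointer that was never recorded or
// was already released, which is the "invalid delete" case.
bool tracked_free(void* ptr)
{
    std::map<void*, AllocInfo>::iterator it = live_allocations.find(ptr);
    if (it == live_allocations.end()) {
        return false;
    }
    live_allocations.erase(it);
    std::free(ptr);
    return true;
}

// Number of blocks still recorded; anything left at exit is a leak.
std::size_t leaked_block_count()
{
    return live_allocations.size();
}
```

Freeing the same pointer twice makes the second tracked_free return false, which corresponds to the invalid-delete report described above.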

However, this method is useless if debug_new.h is not included. Worse, it does not work either when some files include debug_new.h and others do not. Although we then use two different operator new functions, "operator new(size_t, const char*, int)" and "operator new(size_t)", only one "operator delete" is available! With our custom "operator delete", when we delete a pointer allocated by "operator new(size_t)", the program concludes that an invalid pointer is being deleted! We face a dilemma: either raise a false alarm in this situation, or raise no alarm when the same pointer is deleted twice. Neither is acceptable.

It seems that customizing the global "operator new(size_t)" is unavoidable. In debug_new, I did this:

void* operator new(size_t size)
{
    return operator new(size, "", 0);
}

However, a memory leak detector implemented as described above works normally on some C++ implementations (such as GCC 2.95.3 with its SGI STL) but crashes on others. The reason is not complicated: SGI STL uses a memory pool and allocates a large block of memory at a time, which is what makes using a map possible. Other implementations may not do this: adding data to the map calls operator new, and operator new in turn adds data to the map, forming an endless loop that exhausts memory and crashes the application immediately. We therefore have to give up the convenient STL templates and use a hand-built data structure:

struct new_ptr_list_t
{
    new_ptr_list_t* next;
    const char*     file;
    int             line;
    size_t          size;
};

My initial implementation worked as follows. Each time memory is allocated with new, call malloc to allocate sizeof(new_ptr_list_t) extra bytes, string the allocated blocks into a linked list (using the next field), store the file name, line number, and object size in the file, line, and size fields, and return (the pointer returned by malloc + sizeof(new_ptr_list_t)). On delete, search the linked list: if a match is found ((char*)list pointer + sizeof(new_ptr_list_t) == the pointer to be released), adjust the list and release the memory; if no match is found, report the deletion of an invalid pointer and abort.
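The header-in-front-of-the-block scheme can be sketched as below. The structure fields follow the article; the function names list_alloc, list_free and list_live_count are hypothetical, and where the article's version aborts on an invalid pointer this sketch simply returns false so that it can be exercised safely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Header placed in front of every user block (fields as in the article).
struct new_ptr_list_t {
    new_ptr_list_t* next;
    const char*     file;
    int             line;
    std::size_t     size;
};

static new_ptr_list_t* list_head = nullptr;   // one list for all live blocks

// Allocate size bytes plus a header, link the header in, return the user area.
void* list_alloc(std::size_t size, const char* file, int line)
{
    void* raw = std::malloc(size + sizeof(new_ptr_list_t));
    if (raw == nullptr) {
        return nullptr;
    }
    new_ptr_list_t* node = static_cast<new_ptr_list_t*>(raw);
    node->next = list_head;
    node->file = file;
    node->line = line;
    node->size = size;
    list_head = node;
    return static_cast<char*>(raw) + sizeof(new_ptr_list_t);
}

// Search the list for the header that precedes user_ptr; unlink and free it.
// Returns false if user_ptr is not a live block.
bool list_free(void* user_ptr)
{
    new_ptr_list_t** link = &list_head;
    for (new_ptr_list_t* node = list_head; node != nullptr; node = node->next) {
        if (reinterpret_cast<char*>(node) + sizeof(new_ptr_list_t) == user_ptr) {
            *link = node->next;
            std::free(node);
            return true;
        }
        link = &node->next;
    }
    return false;
}

// Count the blocks that are still linked in, i.e. not yet freed.
std::size_t list_live_count()
{
    std::size_t n = 0;
    for (new_ptr_list_t* node = list_head; node != nullptr; node = node->next) {
        ++n;
    }
    return n;
}
```

Because the bookkeeping lives inside the malloc'ed block itself, no container allocation is needed, which avoids the recursion problem that the map-based approach runs into.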

As for automatic detection of memory leaks, my approach is to create a static global object (by the C++ object lifetime rules, its constructor is called during program initialization and its destructor is called at program exit) and to call the leak-detection function in its destructor. The user can of course also call the leak-detection function manually.
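A minimal sketch of the check-at-exit idea follows. All names here (live_blocks, checks_run, autocheck_enabled, check_leaks_sketch, leak_check_on_exit) are hypothetical stand-ins chosen for illustration; only the mechanism, a static object whose destructor runs the check, is taken from the text.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-ins for this sketch.
static int  live_blocks = 0;           // would be maintained by new/delete
static int  checks_run = 0;            // counts how often the check has run
static bool autocheck_enabled = true;  // an on/off switch for the automatic check

// Report outstanding blocks; returns true if anything is still allocated.
bool check_leaks_sketch()
{
    ++checks_run;
    if (live_blocks > 0) {
        std::fprintf(stderr, "%d block(s) still allocated\n", live_blocks);
        return true;
    }
    return false;
}

// An object whose only job is to run the check when it is destroyed.
class leak_check_on_exit {
public:
    ~leak_check_on_exit()
    {
        if (autocheck_enabled) {
            check_leaks_sketch();
        }
    }
};

// Static storage duration: destroyed after main() returns, so the check
// runs automatically at normal program exit.
static leak_check_on_exit leak_check_instance;
```

The same function can also be called directly at any point in the program, which corresponds to the manual check mentioned above.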

That is roughly the basic implementation.

Usability improvements

The above scheme worked very well at first, until I started creating large numbers of objects. Because every delete has to search the linked list, with an average of (list length / 2) comparisons, the program slowed to a crawl. Even though the tool is only used for debugging, running that slowly is unacceptable. So I made a small change: the single list head new_ptr_list became an array of list heads, and the list in which an object's pointer is placed is determined by its hash value. The user can change the definitions of the macros DEBUG_NEW_HASH and DEBUG_NEW_HASHTABLESIZE to adjust the behaviour of debug_new. Their current values are ones that I have tested and found to work well.
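The change from one list to an array of lists can be sketched like this. The table size and the hash expression are assumptions made for this example, not the values used in debug_new.cpp, and block_node, bucket_of, bucket_insert and bucket_remove are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table size for this sketch.
const std::size_t SKETCH_TABLE_SIZE = 1024;

struct block_node {
    block_node* next;
    void*       ptr;
};

// One list head per bucket instead of a single head.
static block_node* buckets[SKETCH_TABLE_SIZE] = {};

// Map a pointer to a bucket. The low bits are dropped because allocators tend
// to return aligned addresses, so those bits carry little information.
std::size_t bucket_of(const void* ptr)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) >> 4) % SKETCH_TABLE_SIZE;
}

// Link a caller-owned node into the bucket chosen by its pointer.
void bucket_insert(block_node* node)
{
    std::size_t i = bucket_of(node->ptr);
    node->next = buckets[i];
    buckets[i] = node;
}

// Unlink and return the node recording ptr, or nullptr if it is unknown.
// Only one bucket's list is searched, not every live block.
block_node* bucket_remove(const void* ptr)
{
    block_node** link = &buckets[bucket_of(ptr)];
    while (*link != nullptr) {
        block_node* node = *link;
        if (node->ptr == ptr) {
            *link = node->next;
            return node;
        }
        link = &node->next;
    }
    return nullptr;
}
```

With N live blocks spread evenly over the table, a lookup walks about N / SKETCH_TABLE_SIZE nodes instead of N / 2.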

In use, we found that in some special cases (see the comments on DEBUG_NEW_FILENAME_LEN in the source) the file name pointer can become invalid. Therefore, debug_new now by default copies the leading part of the file name rather than just storing a pointer to it. Note also that new_ptr_list_t was originally 16 bytes long and is now 32 bytes, which ensures that memory is properly aligned in the common case.

In addition, so that programs using new (std::nothrow) work, I also overloaded operator new(size_t, const std::nothrow_t&) throw(); otherwise debug_new would regard the delete that matches a new (nothrow) as deleting an invalid pointer. Since debug_new does not throw exceptions (when memory runs out, the program reports the error and exits directly), this overload simply calls operator new(size_t). Nothing more needs to be said about it.

As mentioned earlier, to get an accurate memory leak report, include "debug_new.h" in your files. My usual practice may serve as a reference:

#ifdef _DEBUG
#include "debug_new.h"
#endif

The inclusion should come as early as possible, unless it conflicts with system header files (typically the STL headers). In some cases you may not want debug_new to redefine new. In that case, define DEBUG_NEW_NO_NEW_REDEFINITION before including debug_new.h, and use DEBUG_NEW instead of new in the user's code (incidentally, DEBUG_NEW can also be used in place of new when DEBUG_NEW_NO_NEW_REDEFINITION is not defined). In the source file I would then write:

#ifdef _DEBUG
#define DEBUG_NEW_NO_NEW_REDEFINITION
#include "debug_new.h"
#else
#define DEBUG_NEW new
#endif

and use DEBUG_NEW wherever memory allocation needs to be tracked (consider a global search-and-replace).
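To see what writing DEBUG_NEW explicitly amounts to, here is a self-contained sketch that does not depend on debug_new.h. The operator new below is only a stand-in that records the call site; last_alloc_file, last_alloc_line and debug_new_records_call_site are hypothetical names, and C++11 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical record of where the last tracked allocation came from.
static const char* last_alloc_file = "";
static int last_alloc_line = 0;

// Stand-in for a tracking operator new: it only remembers the call site.
void* operator new(std::size_t size, const char* file, int line)
{
    last_alloc_file = file;
    last_alloc_line = line;
    return ::operator new(size);
}

// Matching placement delete, used only if a constructor throws.
void operator delete(void* ptr, const char*, int) noexcept
{
    ::operator delete(ptr);
}

// Explicit macro; plain "new" is left untouched in this translation unit.
#define DEBUG_NEW new (__FILE__, __LINE__)

// Allocates through DEBUG_NEW and checks that this file and the line of the
// allocation expression were recorded.
bool debug_new_records_call_site()
{
    int expected_line = __LINE__ + 1;
    int* p = DEBUG_NEW int(7);
    bool ok = last_alloc_line == expected_line
              && std::strcmp(last_alloc_file, __FILE__) == 0
              && *p == 7;
    delete p;
    return ok;
}
```

Keeping the keyword new unmodified and opting in through DEBUG_NEW avoids clashes with headers that declare or use operator new themselves.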

Users can define DEBUG_NEW_EMULATE_MALLOC, in which case debug_new.h uses debug_new's new and delete to emulate malloc and free, so that malloc and free calls in the user's program are tracked as well. With some compilers (such as Digital Mars C++ Compiler 8.29 and Borland C++ Compiler 5.5.1), users must define NO_PLACEMENT_DELETE, otherwise compilation fails. Users can also adjust the behaviour of debug_new through two global Boolean variables: new_verbose_flag, which defaults to false and, when set to true, prints tracing information to standard error on every new and delete; and new_autocheck_flag, which defaults to true, meaning that check_leaks is called automatically at program exit to check for memory leaks. If it is set to false, the user must call check_leaks manually.

Note that because the automatic call to check_leaks is made in the destructor of a static object in debug_new.cpp, there is no guarantee that the destructors of the user's global objects run before check_leaks is called. For MSVC on Windows, I use "#pragma init_seg(lib)" to adjust the order in which objects are constructed and destroyed, but unfortunately I do not know how to do this with other compilers (in particular, I have not managed to solve this problem with GCC). To reduce false alarms, my approach is to set new_verbose_flag to true automatically once check_leaks has been invoked automatically; that way, even if a leak has been falsely reported, the subsequent delete operations are still printed. As long as the contents of the leak report match those of the delete report, we can still conclude that no memory was leaked.

debug_new can also detect the error of deleting the same pointer twice (or deleting an invalid pointer). The program displays the offending pointer value and forcibly exits via abort.

Another issue is exception handling, which deserves a section of its own.

Exceptions in constructors

Let's take a look at the following simple program:

#include <stdexcept>
#include <stdio.h>

void* operator new(size_t size, int line)
{
    printf("Allocate %u bytes on line %d\n", (unsigned)size, line);
    return operator new(size);
}

class Obj {
public:
    Obj(int n);
private:
    int _n;
};

Obj::Obj(int n) : _n(n)
{
    if (n == 0) {
        throw std::runtime_error("0 not allowed");
    }
}

int main()
{
    try {
        Obj* p = new(__LINE__) Obj(0);
        delete p;
    } catch (const std::runtime_error& e) {
        printf("Exception: %s\n", e.what());
    }
}

Can you see a problem in this code? In fact, if we compile it with MSVC, the compiler's warning message already tells us what is going on:

test.cpp(27) : warning C4291: 'void *__cdecl operator new(unsigned int,int)' : no matching operator delete found; memory will not be freed if initialization throws an exception

Now link in debug_new.cpp. The result of running the program is as follows:

Allocate 4 bytes on line 27
Exception: 0 not allowed
Leaked object at 00342BE8 (size 4, :0)

Sure enough, memory has leaked!

Of course, this situation is not very common. However, as objects become more and more complex, who can guarantee that neither an object's constructor, nor the constructors of its subobjects, nor any function called from those constructors will ever throw an exception? Moreover, solving this problem is not complicated; it only requires a compiler with good support for the C++ standard, one that allows the user to define placement delete operators ([C++1998], section 5.3.4; drafts of the standard from 1996 can be found online, for example at http://www.comnets.rwth-aachen.de/doc/c++std/expr.html#expr.new). Among the compilers I tested, GCC (2.95.3 or later, Linux/Windows) and MSVC (6.0 or later) have no problem, whereas Borland C++ Compiler 5.5.1 and Digital Mars C++ Compiler (all versions up to v8.38) do not support this feature. In the example above, if the compiler supports it, we need to declare and implement operator delete(void*, int) to reclaim the memory allocated by new. If the compiler does not support it, macros have to be used to make the compiler ignore the relevant declarations and implementations. To compile debug_new with Borland C++ Compiler 5.5.1 or Digital Mars C++ Compiler, the user must define the macro NO_PLACEMENT_DELETE; the user then of course has to watch out for exceptions thrown in constructors.
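The effect of a matching placement delete can be checked with a small sketch. It follows the example program above, with two hypothetical counters added so that the calls can be observed, and assumes a C++11 compiler that supports placement delete.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical counters for this sketch.
static int placement_new_calls = 0;
static int placement_delete_calls = 0;

void* operator new(std::size_t size, int /*line*/)
{
    ++placement_new_calls;
    return ::operator new(size);
}

// Placement delete matching operator new(size_t, int). The compiler calls it
// automatically when the constructor throws during "new (line) T(...)".
void operator delete(void* ptr, int /*line*/) noexcept
{
    ++placement_delete_calls;
    ::operator delete(ptr);
}

class Obj {
public:
    explicit Obj(int n) : _n(n)
    {
        if (n == 0) {
            throw std::runtime_error("0 not allowed");
        }
    }
    int value() const { return _n; }
private:
    int _n;
};

// Returns true if the exception propagated out of the new-expression.
bool construct_with_zero()
{
    try {
        Obj* p = new (__LINE__) Obj(0);
        delete p;                       // not reached: the constructor throws
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

After one call to construct_with_zero, each counter is 1: the memory obtained by the placement operator new was handed back through the placement operator delete before the exception reached the catch block.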

Comparison with another approach

IBM developerWorks has published a memory leak detection method for Linux designed by Mr. Hong Hao ([Hong Hao 2003]). Compared with it, my scheme differs mainly as follows:

Advantages:

Cross-platform: only standard functions are used, and the code has been tested with several compilers, including GCC 2.95.3/3.2 (Linux/Windows), MSVC 6, Digital Mars C++ 8.29, and Borland C++ 5.5.1. (Although Linux is my main development platform, I find that it is sometimes still very convenient to be able to compile and debug code under Windows.)

Easy to use: because I overload operator new(size_t) (Mr. Hong Hao only overloads operator new(size_t, const char*, int)), memory leaks can be detected even without including my header file; leaks can be checked for automatically when the program exits; and leaks caused by malloc/free in the user's program (though not in system or library files) can be detected.

Flexibility: there are several configurable options that can be selected at compile time through macro definitions.

Reentrancy: no global variables are used, and there is no nested-delete problem.

Exception safety: given compiler support, an exception thrown in a constructor is handled without leaking memory.

Disadvantages:

Single-threaded model: a cross-platform multithreaded implementation is rather troublesome; given the actual needs of my projects and the wish to keep the code clear and simple, my scheme is not thread-safe. In other words, if several threads perform new or delete operations at the same time, the consequences are undefined.

No mechanism for reporting memory leaks while the program is running: I have not come across this need. It would not be difficult to implement by calling the check_leaks function manually, although cross-platform portability would then become something of a problem.

Cannot detect mismatched use of the [] and non-[] operators: this is mainly a matter of need (it would not be difficult to modify the implementation to do so).

Cannot display the file name and line number when an erroneous delete is called: this should not be a big problem. Because I overload operator new(size_t), a reported delete error is guaranteed to be a real problem, so I do not just print a warning but force the program to abort; the problem can then be located by examining the call stack at the point of the abort in a debugger.

In addition, there are now many commercial and open-source memory leak detectors, and I do not intend to compare them one by one. debug_new is still weak in features compared with them, but its ease of use, portability, and low overhead are considerable strengths.

Summary and discussion

The preceding sections have explained the main features of debug_new. Here is a brief summary.

Overloaded operators:

operator new(size_t, const char*, int)
operator new[](size_t, const char*, int)
operator new(size_t)
operator new[](size_t)
operator new(size_t, const std::nothrow_t&)
operator new[](size_t, const std::nothrow_t&)
operator delete(void*)
operator delete[](void*)
operator delete(void*, const char*, int)
operator delete[](void*, const char*, int)
operator delete(void*, const std::nothrow_t&)
operator delete[](void*, const std::nothrow_t&)

Functions provided:

check_leaks()
    Checks whether memory leaks have occurred

Global variables provided:

new_verbose_flag
    Whether to display information on every new and delete

new_autocheck_flag
    Whether to check for memory leaks automatically at program exit

Redefinable macros:

NO_PLACEMENT_DELETE
    Assume that the compiler does not support placement delete (takes effect globally)

DEBUG_NEW_NO_NEW_REDEFINITION
    Do not redefine new; it is assumed that the user will use DEBUG_NEW explicitly (takes effect when debug_new.h is included)

DEBUG_NEW_EMULATE_MALLOC
    Redefine malloc/free and emulate them with new/delete (takes effect when debug_new.h is included)

DEBUG_NEW_HASH
    Changes the algorithm that computes the hash value for the memory block lists (takes effect when debug_new.cpp is compiled)

DEBUG_NEW_HASHTABLESIZE
    Changes the size of the hash table of memory block lists (takes effect when debug_new.cpp is compiled)

DEBUG_NEW_FILENAME_LEN
    The length of the file name that is kept if file names are copied when memory is allocated; meaningful only when DEBUG_NEW_NO_FILENAME_COPY is not defined (takes effect when debug_new.cpp is compiled; see the comments in the file)

DEBUG_NEW_NO_FILENAME_COPY
    Do not copy the file name when memory is allocated, but only save a pointer to it; this is more efficient (takes effect when debug_new.cpp is compiled; see the comments in the file)

I consider the main defect of debug_new at present to be its lack of multithreading support. For any particular platform, adding multithreading support is not difficult; the difficulty lies in doing it generically (conditional compilation is of course one way, though not an elegant one). This problem may be solved more satisfactorily once a threading model is included in the C++ standard. Another way would be to use the thread wrappers in a library such as Boost; however, that adds a dependency on another library, since Boost is after all not part of the C++ standard. If the project itself does not use Boost, pulling in another library for this single purpose hardly seems worthwhile. So for the time being I have not made this further improvement.

Another possible modification is to retain the exception behaviour of the standard operator new, that is, to throw an exception when memory is insufficient (normal case) or return NULL (nothrow case), rather than terminating the program as it does now (see the source code of debug_new.cpp). The difficulty lies mainly in the latter: I have not thought of a way to keep the new (nothrow) syntax while still reporting the file name and line number and still allowing plain new to be used. However, if non-standard syntax is acceptable, so that DEBUG_NEW and DEBUG_NEW_NOTHROW are used instead, this is very easy to implement.

If you have suggestions for improvement or other ideas, you are welcome to write to me to discuss them.

The source code of debug_new can currently be downloaded as dbg_new.zip.

After this article was written, I finally implemented a thread-safe version. It uses a lightweight cross-platform mutex class, fast_mutex (currently supporting Win32 and POSIX threads; with GCC (Linux/MinGW) and MSVC, the thread type is detected automatically from the command-line options). If you are interested, it can be downloaded from http://mywebpage.netscape.com/yongweiwu/dbg_new.tgz.

References

[C++1998] ISO/IEC 14882. Programming Languages - C++, 1st Edition. International Standardization Organization, International Electrotechnical Commission, American National Standards Institute, and Information Technology Industry Council, 1998

[Stroustrup1997] Bjarne Stroustrup. The C++ Programming Language, 3rd Edition. Addison-Wesley, 1997

[Hong Hao 2003] Hong Hao. "How to detect memory leak under Linux", IBM developerWorks China website.

About the author

Wu Yongwei is currently engaged in the research and development of high-performance intrusion detection systems on Linux. He has a strong interest in developing cross-platform, high-performance, reusable C++ code.

He can be reached at adah@sh163.net.
