CString Working Principle (Reposted)

xiaoxiao2021-03-06  66

I have read many people's programs, including some of my own code, and found that a large share of the bugs comes from incorrect use of the CString class in MFC. These errors mostly stem from not understanding how CString is implemented.

CString is a wrapper around the string type of standard C. After years of programming, we found that many bugs are string-related, typically buffer overflows and memory leaks, and that such bugs can be fatal and bring the whole system down. C++ therefore provides a dedicated class to maintain the string pointer. The standard C++ string class is string; the Microsoft MFC class library uses the CString class. A string class largely avoids the problems that raw string pointers cause in C.

Here we take a brief look at how CString is implemented in Microsoft MFC. The best way to grasp the principle is, of course, to analyze the source code directly. Most of the CString implementation in MFC is in strcore.cpp.

CString wraps a buffer that stores the string together with the operations applied to it. That is, a CString needs a buffer for the string and a pointer to that buffer; the pointer is LPTSTR m_pchData. Some string operations lengthen or shorten the string, so to avoid frequently allocating and freeing memory, CString allocates a somewhat larger block up front. When the string grows, no allocation is needed as long as the total length does not exceed the block allocated in advance. When the new length exceeds it, CString allocates a larger block, copies the string over, and releases the original block. Similarly, when the string shrinks, the spare memory is not released immediately; it is released in one go only once enough has accumulated.
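The allocation policy just described can be sketched in portable C++. This is only an illustration under stated assumptions, not MFC code: the class GrowBuffer and all of its members are hypothetical names, and it works on plain char rather than TCHAR. It keeps a capacity separate from the length, reallocates only when an append would overflow the block, and keeps the block when the text shrinks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the buffer policy described above; not MFC code.
class GrowBuffer {
public:
    GrowBuffer() : data_(new char[1]), length_(0), capacity_(0), allocations_(1) {
        data_[0] = '\0';
    }
    ~GrowBuffer() { delete[] data_; }
    GrowBuffer(const GrowBuffer&) = delete;
    GrowBuffer& operator=(const GrowBuffer&) = delete;

    // Make room for at least minCapacity characters (plus the terminator).
    void reserve(std::size_t minCapacity) {
        if (minCapacity <= capacity_) return;        // block is already big enough
        char* bigger = new char[minCapacity + 1];
        std::memcpy(bigger, data_, length_ + 1);     // copy the text and its '\0'
        delete[] data_;                              // release the old block
        data_ = bigger;
        capacity_ = minCapacity;
        ++allocations_;
    }

    void append(const char* text) {
        assert(text != nullptr);
        std::size_t extra = std::strlen(text);
        if (length_ + extra > capacity_)
            reserve(length_ + extra);                // reallocate only on overflow
        std::memcpy(data_ + length_, text, extra + 1);
        length_ += extra;
    }

    // Shrinking the text keeps the block; nothing is freed here.
    void truncate(std::size_t newLength) {
        if (newLength < length_) {
            length_ = newLength;
            data_[length_] = '\0';
        }
    }

    const char* c_str() const { return data_; }
    std::size_t length() const { return length_; }
    std::size_t capacity() const { return capacity_; }
    int allocations() const { return allocations_; }  // blocks allocated so far

private:
    char* data_;
    std::size_t length_;
    std::size_t capacity_;
    int allocations_;
};
```

After reserve(32), appending "hello" and ", world" leaves allocations() unchanged, and truncating to 5 characters leaves capacity() at 32; only an append that pushes the text past 32 characters triggers a new allocation.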
Also, when one CString object A is used to initialize another CString object B, B does not allocate space of its own, in order to save memory; it simply points its own pointer at A's memory. Only when the string in A or B needs to be modified is a separate block allocated for the object being modified. This is called the copy-on-write technique.

Consequently, a pointer alone cannot fully describe this memory block, and more information is needed. First, a variable must record the total size of the current block. Second, a variable must record how much of the block is in use, that is, the length of the current string. In addition, a variable must record how many CString objects reference the block: each time another object references it, this value is increased by one.

CString defines a structure specifically for this information:

    struct CStringData
    {
        long nRefs;          // reference count
        int nDataLength;     // length of data (including terminator)
        int nAllocLength;    // length of allocation
        // TCHAR data[nAllocLength]

        TCHAR* data()        // TCHAR* to managed data
            { return (TCHAR*)(this + 1); }
    };

The structure itself has a fixed size, but the memory block it describes does not. CString places this structure at the head of the memory block; the real storage for the string starts sizeof(CStringData) bytes after the head of the block.
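The "header in front of the characters" layout can be tried out in portable C++. The sketch below is an illustration only: StringHeader, allocBlock, headerOf and freeBlock are hypothetical names modelled on the fields above, and it uses char instead of TCHAR. It shows that the character pointer and the header can each be recovered from the other by a fixed offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Simplified, hypothetical header modelled on the CStringData fields above.
struct StringHeader {
    long refs;         // reference count
    int dataLength;    // characters currently in use
    int allocLength;   // characters the block can hold (terminator excluded)

    // The characters start immediately after the header.
    char* data() { return reinterpret_cast<char*>(this + 1); }
};

// Allocate one block: the header followed by (len + 1) characters.
inline StringHeader* allocBlock(int len) {
    assert(len >= 0);
    std::size_t bytes = sizeof(StringHeader) + static_cast<std::size_t>(len) + 1;
    void* raw = ::operator new(bytes);
    StringHeader* header = new (raw) StringHeader{1, 0, len};
    header->data()[0] = '\0';
    return header;
}

// Recover the header from the character pointer by stepping back one header.
inline StringHeader* headerOf(char* chars) {
    return reinterpret_cast<StringHeader*>(chars) - 1;
}

inline void freeBlock(StringHeader* header) {
    header->~StringHeader();
    ::operator delete(header);
}
```

For a block created with allocBlock(256), header->data() points exactly sizeof(StringHeader) bytes past the start of the block, and headerOf(header->data()) returns the original header, which is the same pointer arithmetic the text attributes to m_pchData.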

Memory for CStringData and its buffer is allocated as follows:

    pData = (CStringData*) new BYTE[sizeof(CStringData) + (nLen + 1) * sizeof(TCHAR)];
    pData->nAllocLength = nLen;

where nLen is the size of the memory to be allocated in one go. As the code shows, to allocate a block that stores a 256-TCHAR string, the actual size requested is sizeof(CStringData) bytes + (nLen + 1) TCHARs. The leading sizeof(CStringData) bytes store the CStringData information. The following nLen + 1 TCHARs actually store the string, the extra one being for the terminating '\0'.

All operations in CString work on this buffer. Take LPTSTR CString::GetBuffer(int nMinBufLength) as an example. It first obtains a pointer to the CStringData object through CString::GetData(); this pointer is computed by offsetting the string pointer m_pchData back by sizeof(CStringData), which gives the address of the CStringData. If the block is shared or too small, a new CStringData object is allocated according to the parameter nMinBufLength so that its string buffer can hold nMinBufLength characters, the descriptive values in the new CStringData are reset, and finally the string buffer of the new CStringData object is returned to the caller. In C++ code:

    if (GetData()->nRefs > 1 || nMinBufLength > GetData()->nAllocLength)
    {
        // we have to grow the buffer
        CStringData* pOldData = GetData();
        int nOldLen = GetData()->nDataLength;   // AllocBuffer will tromp it
        if (nMinBufLength < nOldLen)
            nMinBufLength = nOldLen;
        AllocBuffer(nMinBufLength);
        memcpy(m_pchData, pOldData->data(), (nOldLen + 1) * sizeof(TCHAR));
        GetData()->nDataLength = nOldLen;
        CString::Release(pOldData);
    }
    ASSERT(GetData()->nRefs <= 1);

    // return a pointer to the character storage for this string
    ASSERT(m_pchData != NULL);
    return m_pchData;

We often copy large strings and the like, and for this CString uses the CopyBeforeWrite technique.
With this technique, when another object B is instantiated from a CString object A, the values of the two objects are exactly the same. Simply allocating new memory for B costs little for a string of a few or a few dozen bytes, but for a string of several KB or even several MB it is a big waste. CString therefore simply points the string address m_pchData of the new object B directly at the string address m_pchData of object A.

The only extra work is to add one to CStringData::nRefs of object A:

    CString::CString(const CString& stringSrc)
    {
        m_pchData = stringSrc.m_pchData;
        InterlockedIncrement(&GetData()->nRefs);
    }

When the string content of object A or object B is modified, the value of CStringData::nRefs is checked first. If it is greater than one (a value of one means only this object uses the memory), the object either references another object's memory or has its own memory referenced by others. The object first decrements the reference count, leaves the memory to the other objects, allocates a new block, and copies the contents of the original memory into it. In simplified form:

    void CString::CopyBeforeWrite()
    {
        if (GetData()->nRefs > 1)
        {
            CStringData* pData = GetData();
            Release();
            AllocBuffer(pData->nDataLength);
            memcpy(m_pchData, pData->data(), (pData->nDataLength + 1) * sizeof(TCHAR));
        }
    }

where Release checks how the memory is referenced:

    void CString::Release()
    {
        if (GetData() != _afxDataNil)
        {
            if (InterlockedDecrement(&GetData()->nRefs) <= 0)
                FreeData(GetData());
        }
    }

When several objects share the same memory, the memory belongs to all of them, not just to the object that originally allocated it. Each object, when it lets go, first decrements the reference count and then checks it: if the count is less than or equal to zero the memory is freed; otherwise the memory is left to the other objects that still reference it.

With this data structure, CString saves a great deal of allocation and deallocation for large strings, which helps system performance. The analysis above gives a rough picture of the internal mechanism of CString. On the whole, CString in MFC is a success.
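The copy-on-write behaviour described above can be reproduced in a few dozen lines of portable C++. The following is a minimal sketch under stated assumptions, not the MFC implementation: CowString, Rep and every member name are hypothetical, and a plain long counter is used where MFC uses InterlockedIncrement and InterlockedDecrement, so the sketch is single-threaded only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical copy-on-write string; a sketch of the idea, not the MFC source.
class CowString {
    struct Rep {
        long refs;            // how many CowString objects use this block
        std::size_t length;   // characters in use
        char* chars;          // the shared characters
    };

public:
    explicit CowString(const char* text) : rep_(makeRep(text, std::strlen(text))) {}

    // Copying shares the block and only bumps the reference count.
    CowString(const CowString& other) : rep_(other.rep_) { ++rep_->refs; }

    CowString& operator=(const CowString& other) {
        if (rep_ != other.rep_) {
            ++other.rep_->refs;
            release();
            rep_ = other.rep_;
        }
        return *this;
    }

    ~CowString() { release(); }

    // A write first makes sure this object owns a private block.
    void setAt(std::size_t index, char ch) {
        if (index >= rep_->length) return;   // out of range: ignored in this sketch
        copyBeforeWrite();
        rep_->chars[index] = ch;
    }

    const char* c_str() const { return rep_->chars; }
    long refCount() const { return rep_->refs; }
    bool sharesBufferWith(const CowString& other) const { return rep_ == other.rep_; }

private:
    static Rep* makeRep(const char* text, std::size_t length) {
        char* chars = new char[length + 1];
        std::memcpy(chars, text, length);
        chars[length] = '\0';
        return new Rep{1, length, chars};
    }

    // Drop this object's reference; free the block when nobody uses it.
    void release() {
        if (--rep_->refs <= 0) {
            delete[] rep_->chars;
            delete rep_;
        }
        rep_ = nullptr;
    }

    // Give this object a private copy if the block is shared.
    void copyBeforeWrite() {
        if (rep_->refs > 1) {
            Rep* shared = rep_;
            rep_ = makeRep(shared->chars, shared->length);
            --shared->refs;                  // the other owners keep the old block
        }
        assert(rep_->refs == 1);
    }

    Rep* rep_;
};
```

Constructing b from a leaves both objects on the same block with a count of two; the first b.setAt(...) gives b its own copy, after which a is unchanged and each block has a count of one.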
However, because the data structure is fairly complex (it relies on CStringData), many problems arise in use. The most typical is that the attribute values describing the memory block get out of step with its actual contents. The cause is that CString, for convenience, provides some operations that directly return the address of the string inside the memory block. The user can modify the content at that address, but if the corresponding operation is not called afterwards, the values in CStringData are not kept consistent. For example, the user can obtain the string address through such an operation and then append characters to the string so that it becomes longer; but because the change was made directly through the pointer, nDataLength in CStringData still holds the old length, so GetLength inevitably returns the wrong value. The operations that give rise to these problems are described below.
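The mismatch between the stored length and the actual contents is easy to reproduce outside MFC. The sketch below is an illustration only: CachedLengthString and its members getLength, getBuffer and releaseBuffer are hypothetical names chosen to echo the CString operations, and the class works on plain char. It caches the length the way nDataLength does, hands out its raw buffer, and resynchronises the cached length only when asked to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical miniature of a string that caches its length, as CStringData does.
class CachedLengthString {
public:
    CachedLengthString(const char* text, std::size_t capacity)
        : capacity_(capacity < std::strlen(text) ? std::strlen(text) : capacity),
          length_(std::strlen(text)),
          chars_(new char[capacity_ + 1]) {
        std::memcpy(chars_, text, length_ + 1);
    }
    ~CachedLengthString() { delete[] chars_; }
    CachedLengthString(const CachedLengthString&) = delete;
    CachedLengthString& operator=(const CachedLengthString&) = delete;

    // Returns the cached value; it never rescans the characters.
    std::size_t getLength() const { return length_; }

    // Hands out the raw storage; the caller may write up to capacity() characters.
    char* getBuffer() { return chars_; }

    // Resynchronise the cached length (newLength == -1 means "measure the text").
    void releaseBuffer(long newLength = -1) {
        std::size_t measured = (newLength < 0) ? std::strlen(chars_)
                                               : static_cast<std::size_t>(newLength);
        assert(measured <= capacity_);
        length_ = measured;
        chars_[length_] = '\0';
    }

    std::size_t capacity() const { return capacity_; }

private:
    std::size_t capacity_;   // characters the block can hold (terminator excluded)
    std::size_t length_;     // cached length, analogous to nDataLength
    char* chars_;
};
```

With CachedLengthString s("abc", 16), appending "def" through s.getBuffer() changes the characters but getLength() keeps returning the cached 3; it returns 6 only after s.releaseBuffer() has been called.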

1. GetBuffer

The most typical one is CString::GetBuffer(). MSDN describes this operation as: Returns a pointer to the internal character buffer for the CString object. The returned LPTSTR is not const and thus allows direct modification of CString contents. This states clearly that we can modify the contents through the string pointer this operation returns:

    CString str1("this is the string 1");     // 1
    int nOldLen = str1.GetLength();           // 2
    char* pstr1 = str1.GetBuffer(nOldLen);    // 3
    strcpy(pstr1, "modified");                // 4
    int nNewLen = str1.GetLength();           // 5

Setting breakpoints and stepping through this code, we see that at line 3 the value of str1 is "this is the string 1" and nOldLen is 20. At line 5, the value of str1 has become "modified". That is, we passed the string pointer returned by GetBuffer to strcpy to modify what it points to, the write succeeded, and the value of the CString object str1 changed to "modified" accordingly. But when we then call str1.GetLength(), we unexpectedly find that it still returns 20, although the string in str1 is now "modified" and the value returned should be the length of "modified", which is 8, not 20. CString is no longer working properly. What is going on?

Evidently str1 stops working properly after the string copy through the pointer returned by GetBuffer. Reading the MSDN description of this operation again, we find this sentence: If you use the pointer returned by GetBuffer to change the string contents, you must call ReleaseBuffer before using any other CString member functions. So ReleaseBuffer must be called once the pointer is no longer in use, so that the other CString operations can be used.
In the code above, we add one line between 4 and 5: str1.ReleaseBuffer(). Observing nNewLen again, we find that it now has the expected value 8. The CString mechanism explains this: GetBuffer returns the address of the string buffer in the CStringData object. Writing through this address changes only the contents of the string buffer; the other CStringData attributes that describe the buffer are no longer correct. For example, CStringData::nDataLength is still 20, although the string is now only 8 characters long. In other words, the other values in CStringData must be updated as well.

This is the reason for calling ReleaseBuffer(). As expected, the ReleaseBuffer source code does exactly what we guessed:

    CopyBeforeWrite();  // just in case GetBuffer was not called
    if (nNewLength == -1)
        nNewLength = lstrlen(m_pchData);  // zero terminated
    ASSERT(nNewLength <= GetData()->nAllocLength);
    GetData()->nDataLength = nNewLength;
    m_pchData[nNewLength] = '\0';

CopyBeforeWrite implements copy-on-write and can be ignored here. The remaining code resets the attribute in the CStringData object that describes the string length: it first obtains the length of the current string, then gets the CStringData pointer and updates its nDataLength member.

The question, however, is this: now that we know the cause of the error, and know that after modifying the contents through the pointer returned by GetBuffer we must call ReleaseBuffer before using other CString operations, can we avoid the mistake? The answer is no. It is like the rule that memory allocated with new must be freed with delete: everyone with some programming knowledge knows it, and the reasoning is simple, yet in practice memory still leaks because delete was forgotten. In real work, people often modify the contents through the pointer returned by GetBuffer and then forget to call ReleaseBuffer. Moreover, this error is not as widely known and watched for as new and delete, and no checking mechanism looks for it specifically, so errors caused by a forgotten ReleaseBuffer end up in the released product.

There are many ways to avoid this mistake, but the simplest and most effective is to avoid this usage altogether. Very often we do not need it and can achieve the same thing by other, safer means.

For example, the code above can be written as:

    CString str1("this is the string 1");
    int nOldLen = str1.GetLength();
    str1 = "modified";
    int nNewLen = str1.GetLength();

But sometimes it really is necessary. Suppose we need to convert the string in a CString object by calling a function Translate in a DLL, and for whatever reason this function takes char*:

    DWORD Translate(char* pSrc, char* pDest, int nSrcLen, int nDestLen);

Then we may need this approach:

    CString strDest;
    int nDestLen = 100;
    DWORD dwRet = Translate(_strSrc.GetBuffer(_strSrc.GetLength()),
                            strDest.GetBuffer(nDestLen),
                            _strSrc.GetLength(), nDestLen);
    _strSrc.ReleaseBuffer();
    strDest.ReleaseBuffer();
    if (SuccessCall(dwRet)) { }
    if (FailedCall(dwRet)) { }

Even so, I still recommend avoiding this usage as far as possible. If you really must use it, do not keep the value returned by GetBuffer in a dedicated pointer variable, because that often makes us forget to call ReleaseBuffer. As in the code above, call ReleaseBuffer right after the call that uses GetBuffer so that the CString object is adjusted immediately.

2. LPCTSTR

Errors involving LPCTSTR often happen to beginners. For example, when calling the function

    DWORD Translate(char* pSrc, char* pDest, int nSrcLen, int nDestLen);

a beginner often writes:

    int nLen = _strSrc.GetLength();
    DWORD dwRet = Translate((char*)(LPCTSTR)_strSrc, (char*)(LPCTSTR)_strSrc, nLen, nLen);
    if (SuccessCall(dwRet)) { }
    if (FailedCall(dwRet)) { }

The intention is to put the converted string back into _strSrc. But after Translate returns, _strSrc turns out not to work properly, and checking the code does not reveal where the problem is.

In fact this problem is the same as the first one. The CString class overloads LPCTSTR; LPCTSTR is actually an operation (a conversion operator) of CString. Calling it is similar to GetBuffer: it directly returns the address of the string buffer in the CStringData object.
Its C++ implementation is:

    _AFX_INLINE CString::operator LPCTSTR() const
        { return m_pchData; }

Therefore ReleaseBuffer() is also required after use. But who would notice that?

The essential cause of this problem lies in the type conversion. LPCTSTR yields a const char*, so calling Translate with this pointer does not compile. A beginner, or even someone with long programming experience, will then convert const char* to char* with a forced cast.

When reposting, please cite the original address: https://www.9cbs.com/read-89492.html
