CString structure

xiaoxiao2021-03-06 51

CString? If you have ever touched VC/MFC, huh, you must have seen this name. So take a good look at the following example.

----------------------------------
A simple piece of code:
----------------------------------

    void C...Dlg::OnOK()
    {
        CString str;

        strcpy((LPSTR)(LPCTSTR)str, "Hello!");
        AfxMessageBox(str);

        CString strl = "";
        int n = strl.GetLength();
        AfxMessageBox(strl);

        CDialog::OnOK();
    }

Note: the odd (LPSTR)(LPCTSTR) cast is intentional; its purpose is to imitate a real problem. In the real case the strcpy call was made inside a DLL. Calls into a DLL function get no parameter type checking, so no such cast was needed there; the cast here exists only to imitate that strcpy call.
----------------------------------

Step through the code in the debugger. First, str receives "Hello!"; no problem there. The strange part comes next: a second CString object is defined and initialized with an empty string, yet the debugger shows that strl is also "Hello!", and its pointer address is the same. Stranger still, GetLength() returns 0. The length is 0, yet the object has a value and can be used normally, for example displayed with AfxMessageBox. If I assign to it again, the problem goes away.

The problem is actually quite simple, but it baffled me when I first ran into it. So I am writing this down for programmers who have not paid attention to the structure of CString. As for the experts, huh, I hope you can give me a pointer or two; I would be very grateful!

To solve the problem above, first a term: copy-on-write. When a CString object A is used to initialize another CString object B, then to save space the new object B is not given memory of its own; it only points its pointer at the memory of object A. Memory is allocated for B only when the string in A or B needs to be modified. This technique is Microsoft's idea for improving efficiency, and now we have learned it too.

Next, let us analyze the CString structure. CString can be roughly understood as the following structure:

    +--------+------+
    | Header | Data |
    +--------+------+

That is, a CString actually consists of a header and a data area. CString is a wrapper around the string type of standard C.
Over long programming experience, we have found that many program bugs are related to strings, typically buffer overflows, memory leaks, and so on. These bugs are fatal and can bring the whole system down. For this reason C++ provides a class dedicated to maintaining the string pointer. The standard C++ string class is string, while the Microsoft MFC class library uses the CString class. Using a string class avoids most of the string-pointer problems of C.

Here we take a brief look at how CString is implemented in Microsoft MFC. Of course, it is best to take the source code and analyze it directly. Most of the implementation of the CString class in MFC is in strcore.cpp. CString wraps a buffer for storing a string together with the operations applied to that string. That is, a CString needs a buffer to hold the string and a pointer to that buffer; the pointer is LPTSTR m_pchData. Some string operations lengthen or shorten the string, so to avoid allocating and freeing memory frequently, CString allocates a larger block up front. When the string grows, no allocation is needed as long as the total length stays within the pre-allocated block. When the string outgrows the pre-allocated block, CString replaces the original block with a newly allocated, larger one. Similarly, when the string shrinks, the spare memory is not released right away; it is released in one go only after enough has accumulated.
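The pre-allocation idea can be sketched in a few lines of standard C++. This is only an illustration, not the MFC code: the class name GrowBuffer, the "double the needed size" policy, and the reallocation counter are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical buffer that over-allocates so small appends need no new memory.
class GrowBuffer {
public:
    GrowBuffer() : data_(new char[1]), length_(0), capacity_(0), reallocations_(0) {
        data_[0] = '\0';
    }
    ~GrowBuffer() { delete[] data_; }
    GrowBuffer(const GrowBuffer&) = delete;
    GrowBuffer& operator=(const GrowBuffer&) = delete;

    void append(const char* text) {
        const std::size_t extra = std::strlen(text);
        if (length_ + extra > capacity_) {
            // Not enough room: get a larger block, copy, then free the old one.
            // Doubling the needed size is an assumed policy, not MFC's.
            const std::size_t newCapacity = 2 * (length_ + extra);
            char* bigger = new char[newCapacity + 1];
            std::memcpy(bigger, data_, length_ + 1);
            delete[] data_;
            data_ = bigger;
            capacity_ = newCapacity;
            ++reallocations_;
        }
        std::memcpy(data_ + length_, text, extra + 1);  // copies the terminator too
        length_ += extra;
    }

    const char* c_str() const { return data_; }
    std::size_t length() const { return length_; }
    std::size_t capacity() const { return capacity_; }
    int reallocations() const { return reallocations_; }

private:
    char* data_;
    std::size_t length_;     // characters in use
    std::size_t capacity_;   // characters available, excluding the terminator
    int reallocations_;      // how many times a new block was needed
};
```

Appending "Hello" to an empty GrowBuffer forces one allocation; appending ", MFC" afterwards fits in the spare room, so the reallocation counter does not move.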

Also, when a CString object A is used to initialize another CString object B, then to save space the new object B is not given memory of its own; it only points its pointer at the memory of object A. Memory is allocated for B only when the string in A or B needs to be modified. This is called copy-on-write (CopyBeforeWrite).

As a result, a pointer alone cannot fully describe the state of this memory block; more information is needed.

First, a variable is needed to describe the total size of the current memory block. Second, a variable is needed to describe how much of the block is in use, that is, the length of the current string. In addition, a variable is needed to describe how many CString objects reference the block: each time another object references the block, this value is incremented by one.

CString defines a structure specifically to hold this descriptive information:

    struct CStringData
    {
        long nRefs;          // reference count
        int nDataLength;     // length of data (including terminator)
        int nAllocLength;    // length of allocation
        // TCHAR data[nAllocLength]

        TCHAR* data()        // TCHAR* to managed data
            { return (TCHAR*)(this + 1); }
    };

In actual use, the size of the memory block occupied by this structure is not fixed. The structure sits at the head of CString's internal memory block, and the memory that really stores the string begins sizeof(CStringData) bytes past the start of the block. The structure is allocated like this:

    pData = (CStringData*) new BYTE[sizeof(CStringData) + (nLen + 1) * sizeof(TCHAR)];
    pData->nAllocLength = nLen;

where nLen is the size of the memory space to be requested in one go.

As the code shows, to get a block that can store a string of 256 TCHARs, the size actually requested is: sizeof(CStringData) bytes + (nLen + 1) TCHARs
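The single-block layout described above can be tried out with a simplified stand-in. The sketch below uses plain char instead of TCHAR, and the names StringHeader, blockSize, allocBlock, headerOf and freeBlock are hypothetical; only the field names and the "this + 1" trick come from the structure shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Simplified stand-in for CStringData, using char instead of TCHAR.
struct StringHeader {
    long nRefs;          // reference count
    int nDataLength;     // length of the string in use
    int nAllocLength;    // characters available after the header
    char* data() { return reinterpret_cast<char*>(this + 1); }
};

// Size of the one allocation: the header bytes plus (nLen + 1) characters.
std::size_t blockSize(int nLen) {
    return sizeof(StringHeader) + (static_cast<std::size_t>(nLen) + 1) * sizeof(char);
}

// Allocate header and character storage as a single block.
StringHeader* allocBlock(int nLen) {
    void* raw = ::operator new(blockSize(nLen));
    StringHeader* header = new (raw) StringHeader;
    header->nRefs = 1;
    header->nDataLength = 0;
    header->nAllocLength = nLen;
    header->data()[0] = '\0';
    return header;
}

// Step back from the string pointer to the header that precedes it.
StringHeader* headerOf(char* pch) {
    return reinterpret_cast<StringHeader*>(pch) - 1;
}

void freeBlock(StringHeader* header) {
    ::operator delete(header);
}
```

With nLen = 256 the block is sizeof(StringHeader) + 257 bytes, and headerOf(header->data()) leads back to the same header, which is how a string pointer alone is enough to reach the bookkeeping fields.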

The leading sizeof(CStringData) bytes store the CStringData information. The following nLen + 1 TCHARs really store the string; the extra one holds the terminating '\0'. All CString operations work on this buffer. Take LPTSTR CString::GetBuffer(int nMinBufLength) as an example. It first obtains the pointer to the CStringData object through CString::GetData(), which steps the string pointer m_pchData back by sizeof(CStringData) to arrive at the address of the CStringData. When the current block is shared or too small, it then allocates a new CStringData object according to the nMinBufLength parameter, so that the string buffer in the new object can hold at least nMinBufLength characters. Next it resets some of the descriptive values in the new CStringData. Finally it returns the string buffer of the new CStringData object to the caller.

In C++ code, the process looks like this:

    if (GetData()->nRefs > 1 || nMinBufLength > GetData()->nAllocLength)
    {
        // we have to grow the buffer
        CStringData* pOldData = GetData();
        int nOldLen = GetData()->nDataLength;   // AllocBuffer will tromp it
        if (nMinBufLength < nOldLen)
            nMinBufLength = nOldLen;
        AllocBuffer(nMinBufLength);
        memcpy(m_pchData, pOldData->data(), (nOldLen + 1) * sizeof(TCHAR));
        GetData()->nDataLength = nOldLen;
        CString::Release(pOldData);
    }
    ASSERT(GetData()->nRefs <= 1);

    // return a pointer to the character storage for this string
    ASSERT(m_pchData != NULL);
    return m_pchData;

We often copy large numbers of strings, which is why CString uses the CopyBeforeWrite technique. When another object B is instantiated from a CString object A, the two objects have exactly the same value. Simply allocating memory for B would not matter for a string of a few or a few dozen bytes, but for a few KB or even a few MB it is a big waste. So CString simply points the string address m_pchData of the new object B directly at the string address m_pchData of object A. The only extra work is to add one to CStringData::nRefs of object A.

    CString::CString(const CString& stringSrc)
    {
        m_pchData = stringSrc.m_pchData;
        InterlockedIncrement(&GetData()->nRefs);
    }
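The sharing step can be observed in standard C++ with a small stand-in. Here std::shared_ptr plays the role of nRefs and InterlockedIncrement; that substitution, and the class name SharedText, are assumptions of this sketch rather than anything MFC does.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical handle: copies share one buffer, counted by std::shared_ptr.
class SharedText {
public:
    explicit SharedText(const std::string& text)
        : buffer_(std::make_shared<std::string>(text)) {}

    // The compiler-generated copy constructor copies only the shared_ptr,
    // which bumps the reference count instead of duplicating the characters.

    const char* address() const { return buffer_->c_str(); }
    long refs() const { return buffer_.use_count(); }

private:
    std::shared_ptr<std::string> buffer_;
};
```

After SharedText b(a), both objects report the same character address and a reference count of 2, which mirrors what the debugger shows for two CString objects built from one another.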

When the string content of object A or object B is modified, the value of CStringData::nRefs is checked first. If it is greater than one (a value of one means that only this object uses the memory), the object is either referencing another object's memory or having its own memory referenced by others. The object first decrements the reference count, leaves the memory to the other objects to manage, allocates new memory for itself, and copies the contents of the original memory. A simplified version of the implementation:

    void CString::CopyBeforeWrite()
    {
        if (GetData()->nRefs > 1)
        {
            CStringData* pData = GetData();
            Release();
            AllocBuffer(pData->nDataLength);
            memcpy(m_pchData, pData->data(), (pData->nDataLength + 1) * sizeof(TCHAR));
        }
    }

where Release checks the reference state of the memory:

    void CString::Release()
    {
        if (GetData() != _afxDataNil)
        {
            if (InterlockedDecrement(&GetData()->nRefs) <= 0)
                FreeData(GetData());
        }
    }

When several objects share the same memory, that memory belongs to all of them, not only to the object that originally allocated it. Each object first decrements the reference count and then checks it: if the count is less than or equal to zero the memory is freed; otherwise the memory is left under the control of the other objects that still reference it.
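A self-contained sketch of the same detach-on-write logic follows. The class CowString and its members are hypothetical, the reference count is a plain long (so unlike the Interlocked calls above it is not thread-safe), and the sketch copies first and drops its reference afterwards, which is a small reordering of the steps shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical copy-on-write string with a hand-rolled reference count.
class CowString {
    struct Block {
        long refs;
        std::size_t length;
        char* chars;
    };
    Block* block_;

    static Block* makeBlock(const char* text, std::size_t length) {
        Block* b = new Block;
        b->refs = 1;
        b->length = length;
        b->chars = new char[length + 1];
        std::memcpy(b->chars, text, length);
        b->chars[length] = '\0';
        return b;
    }

    // Drop this object's reference; free the block when nobody is left.
    void release() {
        if (--block_->refs <= 0) {
            delete[] block_->chars;
            delete block_;
        }
    }

    // Give this object a private block if the current one is shared.
    void copyBeforeWrite() {
        if (block_->refs > 1) {
            Block* old = block_;
            block_ = makeBlock(old->chars, old->length);
            --old->refs;  // the old block stays alive for its other owners
        }
    }

public:
    explicit CowString(const char* text)
        : block_(makeBlock(text, std::strlen(text))) {}
    CowString(const CowString& other) : block_(other.block_) { ++block_->refs; }
    CowString& operator=(const CowString&) = delete;
    ~CowString() { release(); }

    void setAt(std::size_t index, char c) {
        assert(index < block_->length);
        copyBeforeWrite();
        block_->chars[index] = c;
    }

    const char* c_str() const { return block_->chars; }
    long refs() const { return block_->refs; }
};
```

Copying a CowString shares the block; the first setAt on either copy detaches it, so the other copy keeps its original text and both end up with a reference count of 1.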

By using this data structure, CString avoids a great deal of repeated allocating and freeing of memory for large strings, which helps system performance.

Through the analysis above we have a rough understanding of the internal mechanism of CString. In general, CString in MFC is a success. However, because the data structure is relatively complex (it uses CStringData), quite a few problems arise in use. The most typical one is that the attribute values describing the memory block get out of sync with the actual contents. The reason is that CString, for convenience, provides some operations that directly return the address of the string inside the memory block. The user can modify the contents at that address, but if the corresponding operation is not called afterwards, the values in CStringData are not brought up to date. For example, the user can obtain the string address through such an operation and then append some new characters to the string, so that the string becomes longer. Because the modification was made directly through the pointer, the length recorded in CStringData, nDataLength, is still the original length, so the value returned by GetLength is inevitably wrong.

The operations involved in these problems are described below.

1. GetBuffer

A very typical misuse involves CString::GetBuffer(). MSDN describes this operation as follows: Returns a pointer to the internal character buffer for the CString object. The returned LPTSTR is not const and thus allows direct modification of CString contents. That is stated clearly: we can directly modify the value that the returned string pointer points to:

    CString str1("This is the string 1");   // 1
    int nOldLen = str1.GetLength();         // 2
    char* pstr1 = str1.GetBuffer(nOldLen);  // 3
    strcpy(pstr1, "Modified");              // 4
    int nNewLen = str1.GetLength();         // 5

Set breakpoints and step through this code. When execution reaches line 3, the value of str1 is "This is the string 1" and nOldLen is 20. When execution reaches line 5, the value of str1 has become "Modified". In other words, we passed the string pointer returned by GetBuffer to strcpy in order to modify what it points to, the modification succeeded, and the value of the CString object str1 changed to "Modified" accordingly. But when we then call str1.GetLength(), we unexpectedly find that it still returns 20, although the string in str1 is now "Modified". The value returned should be the length of "Modified", which is 8, not 20. CString is no longer working properly! What is going on?

Obviously, str1 stopped working properly after the string copy made through the pointer returned by GetBuffer.

Look at the MSDN description of this operation and you will find this sentence: If you use the pointer returned by GetBuffer to change the string contents, you must call ReleaseBuffer before using any other CString member functions.

So after using the pointer returned by GetBuffer, ReleaseBuffer must be called before any other CString operation is used. Insert one line between lines 4 and 5 of the code above, str1.ReleaseBuffer(), and then inspect nNewLen: this time it has the expected value of 8.

The CString mechanism explains this as well. GetBuffer returns the address of the string buffer in the CStringData object. When we modify the contents at that address, only the string buffer in CStringData changes; the other attributes in CStringData that describe the string buffer are no longer correct. For example, CStringData::nDataLength is obviously still 20, although the length of the string is now 8. In other words, the other values in CStringData must be updated too, and that is why ReleaseBuffer() has to be called.

As expected, the ReleaseBuffer source code does exactly what we guessed:

    CopyBeforeWrite();  // just in case GetBuffer was not called

    if (nNewLength == -1)
        nNewLength = lstrlen(m_pchData);  // zero terminated

    ASSERT(nNewLength <= GetData()->nAllocLength);
    GetData()->nDataLength = nNewLength;
    m_pchData[nNewLength] = '\0';

Here CopyBeforeWrite implements the copy-on-write technique described earlier.

The code after it resets the value that describes the string length in the CStringData object: it first obtains the length of the current string, then gets the pointer to the CStringData object and updates its nDataLength member.
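The stale-length effect and its repair can be reproduced without MFC. The class below is a hypothetical stand-in with a fixed 63-character capacity; the names CachedLengthString, getBuffer and releaseBuffer only imitate the CString operations discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity string that caches its length, like nDataLength.
class CachedLengthString {
public:
    explicit CachedLengthString(const char* text) {
        const std::size_t n = std::strlen(text);
        assert(n <= static_cast<std::size_t>(kCapacity));
        std::memcpy(chars_, text, n + 1);
        length_ = static_cast<int>(n);
    }

    int getLength() const { return length_; }       // returns the cached value
    const char* c_str() const { return chars_; }
    char* getBuffer() { return chars_; }             // writable access, no bookkeeping

    // Bring the cached length back in sync with the characters.
    void releaseBuffer(int newLength = -1) {
        if (newLength == -1)
            newLength = static_cast<int>(std::strlen(chars_));  // zero terminated
        assert(newLength <= kCapacity);
        length_ = newLength;
        chars_[newLength] = '\0';
    }

private:
    enum { kCapacity = 63 };
    char chars_[kCapacity + 1];
    int length_;
};
```

Writing "Modified" through getBuffer() changes the characters but leaves getLength() at 20; only after releaseBuffer() does it report 8, the same sequence observed with str1.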

However, now that we know the cause, and know that after modifying the value pointed to by the pointer from GetBuffer we must call ReleaseBuffer before using other CString operations, can we avoid this mistake? The answer is no. It is like someone with some programming knowledge who knows that memory obtained with new must be released with delete: the principle is simple, yet in practice memory still leaks because a call to delete was forgotten. In real work, the value returned by GetBuffer is often modified and the call to ReleaseBuffer is then forgotten. Moreover, because this error is not as widely known and watched for as new and delete, and there is no mechanism that checks for it, errors caused by a forgotten ReleaseBuffer end up shipping in the release version.

There are many ways to avoid this mistake, but the simplest and most effective is to avoid this usage altogether. Very often we do not need it and can achieve the same thing by safer means. For example, the code above can be written like this:

    CString str1("This is the string 1");
    int nOldLen = str1.GetLength();
    str1 = "Modified";
    int nNewLen = str1.GetLength();

But sometimes it is needed. For example, we need to convert the string in a CString object, and the conversion is done by calling a function Translate in a DLL. For whatever reason, this function takes char* parameters:

    DWORD Translate(char* pSrc, char* pDest, int nSrcLen, int nDestLen);

In this case we may need to write:

    CString strDest;
    int nDestLen = 100;
    DWORD dwRet = Translate(_strSrc.GetBuffer(_strSrc.GetLength()),
                            strDest.GetBuffer(nDestLen),
                            _strSrc.GetLength(), nDestLen);
    _strSrc.ReleaseBuffer();
    strDest.ReleaseBuffer();
    if (SuccessCall(dwRet)) { }
    if (FailedCall(dwRet)) { }
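The same "reserve, let the C-style function fill, then fix up the length" pattern can be shown with std::string, which makes it runnable without MFC. Everything named here is hypothetical: translateUpper merely stands in for the DLL function (it upper-cases its input), and translateToString is an illustrative wrapper.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the DLL function: upper-cases pSrc into pDest.
// Returns 0 on success, 1 if the destination buffer is too small.
unsigned long translateUpper(const char* pSrc, char* pDest, int nSrcLen, int nDestLen) {
    if (nSrcLen + 1 > nDestLen)
        return 1;
    for (int i = 0; i < nSrcLen; ++i)
        pDest[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(pSrc[i])));
    pDest[nSrcLen] = '\0';
    return 0;
}

// Wrapper that never lets the raw pointer outlive the call.
std::string translateToString(const std::string& src, int nDestLen) {
    // Reserve nDestLen writable characters (the role of GetBuffer(nDestLen)).
    std::string dest(static_cast<std::size_t>(nDestLen), '\0');
    const unsigned long ret = translateUpper(src.c_str(), &dest[0],
                                             static_cast<int>(src.size()), nDestLen);
    if (ret != 0)
        return std::string();
    // Shrink to the characters actually written (the role of ReleaseBuffer()).
    dest.resize(std::strlen(dest.c_str()));
    return dest;
}
```

Because the writable pointer exists only inside the call expression and the length fix-up sits on the very next line, there is no window in which the string can be used with a stale length.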

Indeed, such situations do exist, but I still recommend avoiding this usage where possible. If you really need it, do not keep the value returned by GetBuffer in a dedicated pointer variable, because that often makes us forget to call ReleaseBuffer. As in the code above, we can call ReleaseBuffer right after the call that uses GetBuffer, so that the CString object is brought back in sync immediately.

2. LPCTSTR

Errors involving LPCTSTR often happen to beginners. For example, when calling the function

    DWORD Translate(char* pSrc, char* pDest, int nSrcLen, int nDestLen);

a beginner often writes:

    int nLen = _strSrc.GetLength();
    DWORD dwRet = Translate((char*)(LPCTSTR)_strSrc, (char*)(LPCTSTR)_strSrc, nLen, nLen);
    if (SuccessCall(dwRet)) { }
    if (FailedCall(dwRet)) { }

The intent is to store the converted string back in _strSrc, but when _strSrc is used after the call to Translate, it turns out to behave abnormally. Checking the code does not reveal where the problem lies.

In fact this is the same problem as the first one. The CString class overloads the LPCTSTR conversion operator, so a cast to LPCTSTR is really a CString operation. It works much like GetBuffer and directly returns the address of the string buffer in the CStringData object. Its C++ implementation is:

    _AFX_INLINE CString::operator LPCTSTR() const
        { return m_pchData; }

So ReleaseBuffer() must also be called after use here. But who would ever notice that?

The root cause of this problem is the type conversion. The LPCTSTR operator returns a const char* (in a non-Unicode build), so passing that pointer to Translate does not compile. A beginner, or even someone with long programming experience, will then force the const char* to a char* with a cast. In the end CString works abnormally, and this can easily cause a buffer overflow as well.
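The compiler check that the cast silences, and one safer alternative, can be sketched in standard C++. The function legacyReverse is a hypothetical legacy routine that writes through its char* parameter, and copying into a std::vector<char> is a suggestion of this sketch, not something the MFC source prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// The compiler refuses to turn a pointer-to-const into a writable pointer
// implicitly; only an explicit cast can override that protection.
static_assert(!std::is_convertible<const char*, char*>::value,
              "const char* must not convert to char* implicitly");
static_assert(std::is_convertible<char*, const char*>::value,
              "adding const is always allowed");

// Hypothetical legacy function that writes through its char* parameter.
void legacyReverse(char* text, int length) {
    for (int i = 0, j = length - 1; i < j; ++i, --j) {
        const char tmp = text[i];
        text[i] = text[j];
        text[j] = tmp;
    }
}

// Copy the read-only characters into a writable scratch buffer before the call,
// so the original storage and its bookkeeping are never touched.
std::vector<char> callOnCopy(const char* readOnly) {
    const std::size_t n = std::strlen(readOnly);
    std::vector<char> scratch(readOnly, readOnly + n + 1);  // includes the terminator
    legacyReverse(scratch.data(), static_cast<int>(n));
    return scratch;
}
```

The result can then be assigned back to the string object through its normal assignment, which keeps the length and the reference count consistent.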

With this description of the CString mechanism and of some easy-to-make mistakes, we can use CString better. Back to the opening example: on every ordinary CString assignment, the buffer is reallocated if it is not large enough (apart from default initialization, each allocation keeps some spare room beyond what is needed). But strcpy((LPSTR)(LPCTSTR)str, "Hello!") bypasses that reallocation step; put plainly, it is nothing more than a raw memory copy! Is that clear now?
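One plausible way to reproduce the opening symptom (same pointer, visible "Hello!", length 0) is to assume that every empty string object points at one shared empty block, as the _afxDataNil check in Release suggests. The sketch below models only that assumption; SharedEmptyBlock, TinyString and the 16 bytes of padding are hypothetical, and a real shared empty block would normally have no such spare room, so the write would also overrun it.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical model: every empty string points at one shared block.
struct SharedEmptyBlock {
    int length;        // stays 0, because nothing ever updates it
    char chars[16];    // padding so the demonstration write stays in bounds
};

static SharedEmptyBlock g_emptyBlock = {0, {'\0'}};

class TinyString {
public:
    TinyString() : block_(&g_emptyBlock) {}

    int getLength() const { return block_->length; }
    const char* c_str() const { return block_->chars; }

    // Plays the role of the (LPSTR)(LPCTSTR) cast: raw, writable access.
    char* rawPointer() { return block_->chars; }

private:
    SharedEmptyBlock* block_;
};
```

Writing "Hello!" through the raw pointer of one empty TinyString makes every other empty TinyString show "Hello!" at the same address while still reporting a length of 0, which matches what the debugger showed for str and strl.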

When reposting, please cite the original address: https://www.9cbs.com/read-76134.html
