CString Management (all actions concerning the CString)
Author: FreeEIM

CStrings are a useful data type. They greatly simplify a lot of operations in MFC, making it much more convenient to do string manipulation. However, there are some special techniques to using CStrings that are particularly hard for people coming from a pure-C background to learn. This essay discusses some of these techniques.

Contents:
    CString concatenation
    Formatting (including integer-to-CString)
    Converting CStrings to integers
    Converting between char * and CString
        char * to CString
        CString to char * I: Casting to LPCTSTR
        CString to char * II: Using GetBuffer
        CString to char * III: Interfacing to a control
    CString to BSTR
    BSTR to CString (30-Jan-01)
    VARIANT to CString (24-Feb-01)
    Loading STRINGTABLE resources (22-Feb-01)
    CStrings and temporary objects
    CString efficiency

String Concatenation

One of the very convenient features of CString is the ability to concatenate two strings. For example, if we have

    CString gray("Gray");
    CString cat("Cat");
    CString graycat = gray + cat;

that is a lot nicer than having to do something like:

    char gray[] = "Gray";
    char cat[] = "Cat";
    char * graycat = (char *)malloc(strlen(gray) + strlen(cat) + 1);
    strcpy(graycat, gray);
    strcat(graycat, cat);

Formatting (including integer-to-CString)

Rather than using sprintf or wsprintf, you can do formatting for a CString by using the Format method:

    CString s;
    s.Format(_T("Total is %d"), total);

The advantage here is that you don't have to worry about whether or not the buffer is large enough to hold the formatted data; this is handled for you by the formatting routines.

Use of formatting is the most common way of converting from non-string data types to a CString, for example, converting an integer to a CString:

    CString s;
    s.Format(_T("%d"), total);

I always use the _T() macro because I design my programs to be at least Unicode-aware, but that's a topic for some other essay. The purpose of _T() is to compile a string for an 8-bit-character application as:

    #define _T(x) x // non-Unicode version

whereas for a Unicode application it is defined as

    #define _T(x) L##x // Unicode version

so in Unicode the effect is as if I had written

    s.Format(L"%d", total);

If you ever think you might ever possibly use Unicode, start coding in a Unicode-aware fashion. For example, never, ever use sizeof() to get the size of a character buffer, because it will be off by a factor of 2 in a Unicode application. We cover Unicode in some detail in Win32 Programming. When I need a size, I have a macro called DIM, which is defined in a file dim.h that I include everywhere:

    #define DIM(x) ( sizeof((x)) / sizeof((x)[0]) )

This is not only useful for dealing with Unicode buffers whose size is fixed at compile time, but for any compile-time defined table.
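
A small portable sketch of the difference between a byte count and an element count may help here. It is not MFC code: wchar_t stands in for a Unicode TCHAR so that it builds without the Windows headers, and the Entry table and the three helper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Same definition as the DIM macro above: element count of a fixed-size array.
#define DIM(x) (sizeof((x)) / sizeof((x)[0]))

// Hypothetical compile-time table, only for illustration.
struct Entry { int code; const char *name; };
const Entry table[] = {
    { 1, "one" },
    { 2, "two" },
    { 3, "three" },
};

// A 20-character wide buffer; wchar_t stands in for TCHAR in a Unicode build.
wchar_t wide[20];

// Number of entries in the table, independent of sizeof(Entry).
std::size_t tableEntries() { return DIM(table); }

// Capacity of the buffer in characters (what lstrcpyn-style calls want).
std::size_t wideChars() { return DIM(wide); }

// Size of the buffer in bytes (what WriteFile-style calls want).
std::size_t wideBytes() { return sizeof(wide); }
```

With these definitions, tableEntries() is 3 and wideChars() is 20, while wideBytes() is 20 * sizeof(wchar_t), which is the number that sizeof alone would have handed to a routine expecting a character count.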

    class Whatever { ... };
    Whatever data[] = {
        { ... },
        ...
        { ... },
    };

    for(int i = 0; i < DIM(data); i++)

Be careful with API calls that want a byte count rather than a character count:

    TCHAR data[20];
    lstrcpyn(data, longstring, sizeof(data) - 1); // WRONG!
    lstrcpyn(data, longstring, DIM(data) - 1);    // RIGHT
    WriteFile(f, data, DIM(data), &bytesWritten, NULL);    // WRONG!
    WriteFile(f, data, sizeof(data), &bytesWritten, NULL); // RIGHT

This is because lstrcpyn wants a character count, but WriteFile wants a byte count. Also note that this always writes out the entire contents of data. If you only want to write out the actual length of the data, you would think you might do

    WriteFile(f, data, lstrlen(data), &bytesWritten, NULL); // WRONG

but that will not work in a Unicode application. Instead, you must do

    WriteFile(f, data, lstrlen(data) * sizeof(TCHAR), &bytesWritten, NULL); // RIGHT

because WriteFile wants a byte count. (For those of you who might be tempted to say "but that means I'll always be multiplying by 1 for ordinary applications, and that is inefficient", you need to understand what compilers actually do. No real C or C++ compiler would actually compile a multiply instruction inline; the multiply-by-one is simply discarded by the compiler as being a silly thing to do. And if you think that in Unicode you will pay the cost of multiplying by 2, remember that this is just a bit-shift left by 1 bit, which the compiler is also happy to do instead of the multiplication.)

Using _T does not create a Unicode application. It creates a Unicode-aware application. When you compile in the default 8-bit mode, you get a "normal" 8-bit program; when you compile in Unicode mode, you get a Unicode (16-bit-character) application. Note that a CString in a Unicode application is a string that holds 16-bit characters.

Converting a CString to an integer

The simplest way to convert a CString to an integer value is to use one of the standard string-to-integer conversion routines. While you might suspect that atoi is a good choice, it is rarely the right choice.
If you plan to be Unicode-ready, you should call the function _ttoi, which compiles into atoi in ANSI code and _wtoi in Unicode code. You can also consider using _tcstoul (for unsigned conversion to any radix, such as 2, 8, 10 or 16) or _tcstol (for signed conversion to any radix). Here are some examples:

    CString hex = _T("FAB");
    CString decimal = _T("4011");
    ASSERT(_tcstoul(hex, 0, 16) == _ttoi(decimal));

Converting between char * and CString

This is the most common set of questions beginners have on the CString data type. Due largely to serious C++ magic, you can largely ignore many of the problems. Things just "work right". The problems come about when you don't understand the basic mechanisms and then don't understand why something that seems obvious doesn't work.

For example, having noticed the above example you might wonder why you can't write

    CString graycat = "Gray" + "Cat";

or

    CString graycat("Gray" + "Cat");

In fact the compiler will complain bitterly about these attempts. Why? Because the + operator is defined as an overloaded operator on various combinations of the CString and LPCTSTR data types, but not between two LPCTSTR data types, which are underlying data types. You can't overload C++ operators on base types like int and char, or char *. What will work is

    CString graycat = CString("Gray") + CString("Cat");

or even

    CString graycat = CString("Gray") + "Cat";

If you study these, you will see that the + always applies to at least one CString and one LPCSTR.

char * to CString

So you have a char *, or a string. How do you create a CString? Here are some examples:

    char * p = "This is a test";

or, in Unicode-aware applications

    TCHAR * p = _T("This is a test");

or

    LPTSTR p = _T("This is a test");

You can write any of the following:

    CString s = "This is a test";     // 8-bit only
    CString s = _T("This is a test"); // Unicode-aware
    CString s("This is a test");      // 8-bit only
    CString s(_T("This is a test"));  // Unicode-aware
    CString s = p;
    CString s(p);
Any of these readily convert the constant string or the pointer to a CString value. Note that the characters assigned are always copied into the CString, so that you can do something like

    TCHAR * p = _T("Gray");
    CString s(p);
    p = _T("Cat");
    s += p;

and be sure that the resulting string is "GrayCat".

CString to char * I: Casting to LPCTSTR

This is a slightly harder transition to find out about, and there is lots of confusion about the "right" way to do it. There are quite a few right ways, and probably an equal number of wrong ways.

The first thing you have to understand about a CString is that it is a special C++ object which contains three values: a pointer to a buffer, a count of the valid characters in the buffer, and a buffer length. The count of the number of characters can be any size from 0 up to the maximum length of the buffer minus one (for the NUL byte). The character count and buffer length are cleverly hidden.

Unless you do some special things, you know nothing about the size of the buffer that is associated with the CString. Therefore, even if you can get the address of the buffer, you cannot change its contents. You cannot shorten the contents, and you absolutely must not lengthen the contents. This leads to some at-first-glance odd workarounds.

The operator LPCTSTR (or more specifically, the operator const TCHAR *) is overloaded for CString. The definition of the operator is to return the address of the buffer. Thus, if you need a string pointer to the CString you can do something like

    CString s("GrayCat");
    LPCTSTR p = s;

and it works correctly. This is because of the rules about how casting is done in C++; when a cast is required, C++ rules allow the cast to be selected.
For example, you could define (float) as a cast on a Complex number (a pair of floats) and define it to return only the first float (called the "real part") of the complex number, so you could say

    Complex c(1.2f, 4.8f);
    float realpart = c;

and expect to see, if the (float) operator is defined properly, that the value of realpart is now 1.2.

This works for you in all kinds of places. For example, any function that takes an LPCTSTR parameter will force this kind of coercion, so that you can have a function (perhaps in a DLL you bought):

    BOOL DoSomethingCool(LPCTSTR s);

and call it as follows

    CString file("c:\\myfiles\\coolstuff");
    BOOL result = DoSomethingCool(file);

This works correctly because the DoSomethingCool function has specified that it wants an LPCTSTR, and therefore the LPCTSTR conversion operator is applied to the argument.

But what if you want to format it?

    CString graycat("GrayCat");
    CString s;
    s.Format("Mew! I love %s", graycat);

Note that because the value appears in the variable-argument list (the list designated by "..." in the specification of the function), there is no implicit coercion operator. What are you going to get?

Well, surprise, you actually get the string

    "Mew! I love GrayCat"

because the MFC implementers carefully designed the CString data type so that an expression of type CString evaluates to the pointer to the string, so in the absence of any casting, such as in a Format or sprintf, you will still get the correct behavior. The additional data that describes a CString actually lives in the addresses below the nominal CString address.

What you can't do is modify the string. For example, you might try to replace the "." by a "," (don't do it this way; you should use the National Language Support features for decimal conversions if you care about internationalization, but this makes a simple example):

    CString v("1.00"); // currency amount, 2 decimal places
    LPCTSTR p = v;
    p[lstrlen(p) - 3] = ',';

If you try to do this, the compiler will complain that you are assigning to a constant string. This is the correct message.
It would also complain if you tried

    strcat(p, "each");

because strcat wants an LPTSTR as its first argument and you gave it an LPCTSTR.

Don't try to defeat these error messages. You will get yourself into trouble!

The reason is that the buffer has a count, which is inaccessible to you (it's in that hidden area that sits below the CString address), and if you change the string, you won't see the change reflected in the character count for the buffer. Furthermore, if the string happens to be just about as long as the buffer's physical limit (more on this later), an attempt to extend the string will overwrite whatever is beyond the buffer, which is memory you have no right to write (right?), and you'll damage memory you don't own. Sure recipe for a dead application.

CString to char * II: Using GetBuffer

A special method is available for a CString if you need to modify it. This is the operation GetBuffer. What this does is return to you a pointer to the buffer which is considered writeable. If you are only going to change characters or shorten the string, you are now free to do so:

    CString s(_T("File.ext"));
    LPTSTR p = s.GetBuffer();
    LPTSTR dot = _tcschr(p, _T('.')); // OK, should have used s.Find...
    if(dot != NULL)
        *dot = _T('\0');
    s.ReleaseBuffer();

This is the first and simplest use of GetBuffer. You don't supply an argument, so the default of 0 is used, which means "give me a pointer to the string; I promise to not extend the string". When you call ReleaseBuffer, the actual length of the string is recomputed and stored in the CString. Within the scope of a GetBuffer/ReleaseBuffer sequence, and I emphasize this: you must not, ever, use any method of CString on the CString whose buffer you have! The reason for this is that the integrity of the CString object is not guaranteed until the ReleaseBuffer is called. Study the code below:

    CString s(...);
    LPTSTR p = s.GetBuffer();
    // ... lots of things happen via the pointer p
    int n = s.GetLength(); // BAD!!!!! Probably will give wrong answer!!!
    s.TrimRight();         // BAD!!!!! No guarantee it will work!!!!
    s.ReleaseBuffer();     // Things are now OK
    int m = s.GetLength(); // This is guaranteed to be correct
    s.TrimRight();         // Will work correctly

Suppose you want to actually extend the string. In this case you must know how large the string will get. This is just like declaring

    char buffer[1024];

knowing that 1024 is more than enough space for anything you are going to do. The equivalent in the CString world is

    LPTSTR p = s.GetBuffer(1024);

This call gives you not only a pointer to the buffer, but guarantees that the buffer will be (at least) 1024 characters in length.

Also, note that if you have a pointer to a const string, the string value itself is stored in read-only memory; even if you've done GetBuffer, you have a pointer to read-only memory, so an attempt to store into the string will fail with an access error. I haven't verified this for CString, but I've seen ordinary C programmers make this error frequently.

A common "bad idiom" left over from C programmers is to allocate a buffer of fixed size, do a sprintf into it, and assign it to a CString:

    char buffer[256];
    sprintf(buffer, "%......", args, ...); // ... means "lots of stuff here"
    CString s = buffer;

while the better form is to do

    CString s;
    s.Format(_T("%...."), args, ...);

Note that this always works; if your string happens to end up longer than 256 bytes you don't clobber the stack!

Another common error is to be clever and realize that a fixed size won't work, so the programmer allocates the bytes dynamically. This is even sillier:

    int len = lstrlen(parm1) + 13 + lstrlen(parm2) + 10 + 100;
    char * buffer = new char[len];
    sprintf(buffer, "%s is equal to %s, valid data", parm1, parm2);
    CString s = buffer;
    ....
    delete [] buffer;

where it can be easily written as

    CString s;
    s.Format(_T("%s is equal to %s, valid data"), parm1, parm2);

Note that the sprintf examples are not Unicode-ready (although you could use _stprintf and put _T() around the formatting string, the basic idea is still that you are doing far more work than necessary, and it is error-prone).

CString to char * III: Interfacing to a control

A very common operation is to pass a CString value in to a control, for example, a CTreeCtrl. While MFC provides a number of convenient overloads for the operation, in the most general situation you use the "raw" form of the update, and therefore you need to store a pointer to a string in the TVITEM which is included within the TVINSERTSTRUCT:

    TVINSERTSTRUCT tvi;
    CString s;
    // ... assign something to s
    tvi.item.pszText = s; // Compiler yells at you here
    // ... other stuff
    HTREEITEM ti = c_MyTree.InsertItem(&tvi);

Now why did the compiler complain? It looks like a perfectly good assignment! But in fact if you look at the structure, you will see that the pszText member is declared in the TVITEM structure as an LPTSTR, not an LPCTSTR, so the compiler has no conversion it can apply to the right-hand side.

OK, you say, I can deal with that, and you write

    tvi.item.pszText = (LPCTSTR)s; // compiler still complains!

What the compiler is now complaining about is that you are attempting to assign an LPCTSTR to an LPTSTR, an operation which is forbidden by the rules of C and C++. You may not use this technique to accidentally alias a constant pointer to a non-constant alias so you can violate the assumptions of constancy. If you could, you could potentially confuse the optimizer, which trusts what you tell it when deciding how to optimize your program. For example, if you do

    const int i = ...;
    // ... do lots of stuff
    ... = a[i]; // usage 1
    // ... lots more stuff
    ... = a[i]; // usage 2

then the compiler can trust that, because you said const, the value of i at "usage 1" and "usage 2" is the same value, and it can even precompute the address of a[i] at usage 1 and keep the value around for later use at usage 2, rather than computing it each time. If you were able to write

    const int i = ...;
    int * p = &i;
    // ... do lots of stuff
    ... = a[i]; // usage 1
    // ... lots more stuff
    (*p)++;     // mess over compiler's assumption
    // ... and other stuff
    ... = a[i]; // usage 2

then the compiler would believe in the constancy of i, and consequently the constancy of the location of a[i], and the place where the indirection is done destroys that assumption. Thus, the program would exhibit one behavior when compiled in debug mode (no optimizations) and another behavior when compiled in release mode (full optimization). This is not good. Therefore, the attempt to assign the pointer to i to a modifiable reference is diagnosed by the compiler as being bogus. This is why the (LPCTSTR) cast won't really help.

Why not just declare the member as an LPCTSTR? Because the structure is used both for reading and writing to the control. When you are writing to the control, the text pointer is actually treated as an LPCTSTR, but when you are reading from the control you need a writeable string. The structure cannot distinguish its use for input from its use for output.

Therefore, you will often find in my code something that looks like

    tvi.item.pszText = (LPTSTR)(LPCTSTR)s;

This casts the CString to an LPCTSTR, thus giving me the address of the string, which I then force to be an LPTSTR so I can assign it.

Note that this is valid only if you are using the value as data to a Set or Insert style method! You cannot do this when you are trying to retrieve data!

You need a slightly different method when you are trying to retrieve data, such as the value stored in a control. For example, for a CTreeCtrl, I want to get the text of the item. I know that the text is no more than MY_LIMIT in size.
Therefore, I can write something like

    TVITEM tvi;
    // ... assorted initialization of other fields of tvi
    tvi.pszText = s.GetBuffer(MY_LIMIT);
    tvi.cchTextMax = MY_LIMIT;
    c_MyTree.GetItem(&tvi);
    s.ReleaseBuffer();

Note that the code above works for any type of Set method also, but is not needed because for a Set-type method (including Insert) you are not writing the string. But when you are writing the CString you need to make sure the buffer is writeable. That's what the GetBuffer does.

CString to BSTR

When programming with ActiveX, you will sometimes need a value represented as a type BSTR. A BSTR is a counted string, a wide-character (Unicode) string on Intel platforms, and can contain embedded NUL characters.

You can convert a CString to a BSTR by calling the CString method AllocSysString:

    CString s;
    s = ...; // whatever
    BSTR b = s.AllocSysString();

The pointer b points to a newly-allocated BSTR object which is a copy of the CString, including the terminal NUL character. This may now be passed to whatever interface you are calling that requires a BSTR. Normally, a BSTR is disposed of by the component receiving it. If you should need to dispose of a BSTR, you must use the call

    ::SysFreeString(b);

to free the string.

The story is that the decision of how to represent strings sent to ActiveX controls resulted in some serious turf wars within Microsoft. The Visual Basic people won, and the string type BSTR (acronym for "Basic String") was the result.

BSTR to CString

Since a BSTR is a counted Unicode string, you can use standard conversions to make an 8-bit CString. Actually, this is built-in; there are special constructors for converting ANSI strings to Unicode and vice-versa. You can also get BSTRs as results in a VARIANT type, which is a type returned by various COM and Automation calls.
For example, if you do, in an ANSI application,

    BSTR b;
    b = ...; // whatever
    CString s(b == NULL ? L"" : b);

this works just fine for a single-string BSTR, because there is a special constructor that takes an LPCWSTR (which is what a BSTR is) and converts it to an ANSI string. The special test is required because a BSTR could be NULL, and the constructors don't play well with NULL inputs (thanks to Brian Ross for pointing this out!). This also only works for a BSTR that contains only a single string terminated with a NUL; you have to do more work to convert strings that contain multiple NUL characters. Note that embedded NUL characters generally don't work well in CStrings and generally should be avoided.

Remember, according to the rules of C/C++, if you have an LPWSTR it will match a parameter type of LPCWSTR (it doesn't work the other way!).

In Unicode mode, this is just the constructor

    CString::CString(LPCTSTR);

As indicated above, in ANSI mode there is a special constructor

    CString::CString(LPCWSTR);

This calls an internal function to convert the Unicode string to an ANSI string. (In Unicode mode there is a special constructor that takes an LPCSTR, a pointer to an 8-bit ANSI string, and widens it to a Unicode string!) Again, note the limitation imposed by the need to test for a BSTR value which is NULL.

There is an additional problem, as pointed out above: BSTRs can contain embedded NUL characters; CString constructors can only handle single NUL characters in a string. This means that CStrings will compute the wrong length for a string which contains embedded NUL bytes. You need to handle this yourself. If you look at the constructors in strcore.cpp, you will see that they all do an lstrlen or equivalent to compute the length.

Note that the conversion from Unicode to ANSI uses the ::WideCharToMultiByte conversion with specific arguments that you may not like. If you want a different conversion than the default, you have to write your own.
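
The length problem with embedded NUL characters described above can be seen without any COM headers. The sketch below is portable C++, not MFC: a std::wstring stands in for the counted BSTR, and wcslen stands in for the lstrlen call made by the constructors. Both substitutions, and the function names, are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// A counted wide string with an embedded NUL: "ab", NUL, "cd" (5 characters).
// std::wstring plays the role of the BSTR here; it stores its own length.
std::wstring makeCounted()
{
    const wchar_t raw[] = L"ab\0cd";   // array holds 5 characters plus the final NUL
    return std::wstring(raw, 5);       // keep all 5 counted characters
}

// Length as the counted string itself reports it.
std::size_t countedLength(const std::wstring &w)
{
    return w.size();
}

// Length as a NUL-scanning routine would compute it: the scan stops at the
// first NUL, so everything after the embedded NUL is silently lost.
std::size_t scannedLength(const std::wstring &w)
{
    return std::wcslen(w.c_str());
}
```

For the string built by makeCounted(), the counted length is 5 but the scanned length is only 2, which is the mismatch you would have to handle yourself when a BSTR carries embedded NUL characters.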
If you are compiling as Unicode, then it is a simple assignment:

    CString convert(BSTR b)
    {
        if(b == NULL)
            return CString(_T(""));
        CString s(b); // in Unicode mode
        return s;
    }

If you are in ANSI mode, you need to convert the string in a more complex fashion. This will accomplish it. Note that this code uses the same argument values to ::WideCharToMultiByte that the implicit constructor for CString uses, so you would use this technique only if you wanted to change these parameters to do the conversion in some other fashion, for example, specifying a different default character, a different set of flags, etc.

    CString convert(BSTR b)
    {
        CString s;
        if(b == NULL)
            return s; // empty for NULL BSTR
    #ifdef UNICODE
        s = b;
    #else
        LPSTR p = s.GetBuffer(SysStringLen(b) + 1);
        ::WideCharToMultiByte(CP_ACP,              // ANSI code page
                              0,                   // no flags
                              b,                   // source widechar string
                              -1,                  // assume NUL-terminated
                              p,                   // target buffer
                              SysStringLen(b) + 1, // target buffer length
                              NULL,                // use system default char
                              NULL);               // don't care if default used
        s.ReleaseBuffer();
    #endif
        return s;
    }

Note that I do not worry about what happens if the BSTR contains Unicode characters that do not map to the 8-bit character set, because I specify NULL as the last two parameters. This is the sort of thing you might want to change.

VARIANT to CString

Actually, I've never done this; I don't work in COM/OLE/ActiveX where this is an issue. But I saw a posting by Robert Quirk on the microsoft.public.vc.mfc newsgroup on how to do this, and it seemed silly not to include it in this essay, so here it is, with a bit more explanation and elaboration. Any errors relative to what he wrote are mine.

A VARIANT is a generic parameter/return type in COM programming.
You can write methods that return a type VARIANT, and which type the function returns may (and often does) depend on the input parameters to your method. For example, in Automation, depending on which method you call, IDispatch::Invoke may return (via one of its parameters) a VARIANT which holds a BYTE, a WORD, a float, a double, a date, a BSTR, and about three dozen other types (see the specifications of the VARIANT structure in the MSDN). In the example below, it is assumed that the type is known to be a variant of type BSTR, which means that the value is found in the string referenced by bstrVal. This takes advantage of the fact that there is a constructor which, in an ANSI application, will convert a value referenced by an LPCWSTR to a CString (see BSTR-to-CString). In Unicode mode, this turns out to be the normal CString constructor. See the caveats about the default ::WideCharToMultiByte conversion and whether or not you find these acceptable (mostly, you will).

    VARIANT vaData;
    vaData = m_com.YourMethodHere();
    ASSERT(vaData.vt == VT_BSTR);
    CString strData(vaData.bstrVal);

Note that you could also make a more generic conversion routine that looked at the vt field. In this case, you might consider something like:

    CString VariantToString(VARIANT * va)
    {
        CString s;
        switch(va->vt)
        { /* vt */
        case VT_BSTR:
            return CString(va->bstrVal);
        case VT_BSTR | VT_BYREF:
            return CString(*va->pbstrVal);
        case VT_I4:
            s.Format(_T("%d"), va->lVal);
            return s;
        case VT_I4 | VT_BYREF:
            s.Format(_T("%d"), *va->plVal);
            return s;
        case VT_R8:
            s.Format(_T("%f"), va->dblVal);
            return s;
        ... remaining cases left as an exercise for the reader
        default:
            ASSERT(FALSE); // unknown VARIANT type (this ASSERT is optional)
            return CString(_T(""));
        } /* vt */
    }

Loading STRINGTABLE values

If you want to create a program that is easily ported to other languages, you must not include native-language strings in your source code. (For these examples, I'll use English, since that is my native language (aber ich kann ein bisschen Deutsch sprechen).) So it is very bad practice to write

    CString s = "There is an error";

Instead, you should put all your language-specific strings (except, perhaps, debug strings, which are never in a product deliverable) in a resource. This means that it is fine to write

    s.Format(_T("%d - %s"), code, text);

in your program; that literal string is not language-sensitive. However, you must be very careful to not use strings like

    // fmt is "Error in %s file %s"
    // readorwrite is "reading" or "writing"
    s.Format(fmt, readorwrite, filename);

I speak of this from experience. In my first internationalized application I made this error, and in spite of the fact that I know German, and that German word order places the verb at the end of a sentence, I had done this. Our German distributor complained bitterly that he had to come up with truly weird error messages in German to get the format codes to do the right thing. It is much better (and what I do now) to have two strings, one for reading and one for writing, and load the appropriate one, making them string parameter-insensitive; that is, instead of loading the strings "reading" or "writing", load the whole format:

    // fmt is "Error in reading file %s"
    //        "Error in writing file %s"
    s.Format(fmt, filename);

Note that if you have more than one substitution, you should make sure that the result does not depend on the word order of the substitutions (for example, subject-object, subject-verb, or verb-object in English).

For now, I won't talk about FormatMessage, which actually is better than sprintf/Format, but is poorly integrated into the CString class. It solves this by naming the parameters by their position in the parameter list and allows you to rearrange them in the output string.

So how do we accomplish all this? By storing the string values in the resource known as the STRINGTABLE in the resource segment. To do this, you must first create the string, using the Visual Studio resource editor.
A string is given a string ID, typically starting with IDS_. So if you have a message, you create the string and call it IDS_READING_FILE, and another called IDS_WRITING_FILE. They appear in your .rc file as

    STRINGTABLE
    BEGIN
        IDS_READING_FILE "Reading file %s"
        IDS_WRITING_FILE "Writing file %s"
    END

Note: these resources are always stored as Unicode strings, no matter what your program is compiled as. They are even Unicode strings on Win9x platforms, which otherwise have no real grasp of Unicode (but they do for resources!). Then you go to where you had stored the strings

    // previous code
    CString fmt;
    if(...)
        fmt = "Reading file %s";
    else
        fmt = "Writing file %s";
    ...
    // much later
    CString s;
    s.Format(fmt, filename);

and instead do

    // revised code
    CString fmt;
    if(...)
        fmt.LoadString(IDS_READING_FILE);
    else
        fmt.LoadString(IDS_WRITING_FILE);
    ...
    // much later
    CString s;
    s.Format(fmt, filename);

Now your code can be moved to any language. The LoadString method takes a string ID, retrieves the STRINGTABLE value it represents, and assigns that value to the CString.

There is a clever feature of the CString constructor that simplifies the use of STRINGTABLE entries. It is not explicitly documented in the CString::CString specification, but is obscurely shown in the example usage of the constructor! (Why this couldn't be part of the formal documentation and has to be shown in an example escapes me!) The feature is that if you cast a STRINGTABLE ID to an LPCTSTR, it will implicitly do a LoadString. Thus the following two examples of creating a string value produce the same effect, and the ASSERT will not trigger in debug mode compilations:

    CString s;
    s.LoadString(IDS_WHATEVER);
    CString t( (LPCTSTR)IDS_WHATEVER );
    ASSERT(s == t);

Now, you may say, how can this possibly work? How can it tell a valid pointer from a STRINGTABLE ID? Simple: all string IDs are in the range 1..65535. This means that the high-order bits of the pointer will be 0. Sounds good, but what if I have valid data in a low address?
Well, the answer is, you can't. The lower 64K of your address space will never, ever, exist. Any attempt to access a value in the address range 0x00000000 through 0x0000FFFF (0..65535) will always and forever give an access fault. These addresses are never, ever valid addresses. Thus a value in that range (other than 0) must necessarily represent a STRINGTABLE ID.

I tend to use the MAKEINTRESOURCE macro to do the casting. I think it makes the code clearer regarding what is going on. It is a standard macro which doesn't have much applicability otherwise in MFC. You may have noted that many methods take either a UINT or an LPCTSTR as parameters, using C++ overloading. This gets us around the ugliness of pure C where the "overloaded" methods (which aren't really overloaded in C) required explicit casts. This is also useful in assigning resource names to various other structures.

    CString s;
    s.LoadString(IDS_WHATEVER);
    CString t( MAKEINTRESOURCE(IDS_WHATEVER) );
    ASSERT(s == t);

Just to give you an idea: I practice what I preach here. You will rarely if ever find a literal string in my programs, other than the occasional debug output message, and, of course, any language-independent string.

CStrings and temporary objects

Here's a little problem that came up on the microsoft.public.vc.mfc newsgroup a while ago. I'll simplify it a bit. The basic problem was that the programmer wanted to write a string to the Registry. So he wrote:

    I am trying to set a registry value using RegSetValueEx() and it is the value that I am having trouble with. If I declare a variable of char[] it works fine. However, I am trying to convert from a CString and I get garbage. "ÝÝ...ÝÝÝÝÝ" to be exact. I have tried GetBuffer, typecasting to char *, LPCSTR. The return of GetBuffer (from debug) is the correct string but when I assign it to a char * (or LPCSTR) it is garbage. Following is a piece of my code:

        char * szName = GetName().GetBuffer(20);
        RegSetValueEx(hKey, "Name", 0, REG_SZ,
                      (CONST BYTE *)szName,
                      strlen(szName + 1));

    The Name string is less than 20 chars long, so I don't think the GetBuffer parameter is to blame. It is very frustrating and any help is appreciated.

Dear Frustrated,

You have been done in by a fairly subtle error, caused by trying to be a bit too clever. What happened was that you fell victim to knowing too much. The correct code is shown below:

    CString Name = GetName();
    RegSetValueEx(hKey, _T("Name"), 0, REG_SZ,
                  (CONST BYTE *)(LPCTSTR)Name,
                  (Name.GetLength() + 1) * sizeof(TCHAR));

Here's why my code works and yours didn't. When your function GetName returned a CString, it returned a "temporary object". See the C++ Reference Manual §12.2.

    In some circumstances it may be necessary or convenient for the compiler to generate a temporary object. Such introduction of temporaries is implementation dependent. When a compiler introduces a temporary object of a class that has a constructor it must ensure that a constructor is called for the temporary object. Similarly, the destructor must be called for a temporary object of a class where a destructor is declared.

    The compiler must ensure that a temporary object is destroyed. The exact point of destruction is implementation dependent.... This destruction must take place before exit from the scope in which the temporary is created.

Most compilers implement the implicit destructor for a temporary at the next program sequencing point following its creation, that is, for all practical purposes, the next semicolon. Hence the CString existed when the GetBuffer call was made, but was destroyed following the semicolon. (As an aside, there was no reason to provide an argument to GetBuffer, and the code as written is incorrect since there is no ReleaseBuffer performed.) So what GetBuffer returned was a pointer to storage for the text of the CString. When the destructor was called at the semicolon, the basic CString object was freed, along with the storage that had been allocated to it.
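
The lifetime rule can be observed in portable C++ without MFC. In the sketch below, the Name class and the getName function are hypothetical stand-ins for the CString returned by GetName; the class exists only to record when its destructor runs, and the pointer taken from the temporary is deliberately never dereferenced.

```cpp
#include <cassert>
#include <string>

// Counts how many Name objects have been destroyed so far.
static int g_destroyed = 0;

// Hypothetical stand-in for a string class that owns its buffer.
class Name {
public:
    explicit Name(const char *text) : m_text(text) {}
    ~Name() { ++g_destroyed; }
    const char *buffer() const { return m_text.c_str(); }
private:
    std::string m_text;
};

// Returns a temporary, as GetName() does in the newsgroup question.
Name getName() { return Name("GrayCat"); }

// Takes a pointer into a temporary and reports whether the temporary had
// already been destroyed once the statement finished. The pointer dangles
// at that point, so it is not used.
bool temporaryDiesBeforeNextStatement()
{
    int before = g_destroyed;
    const char *p = getName().buffer(); // temporary is destroyed at the semicolon
    (void)p;
    return g_destroyed > before;
}

// Keeps the result in a named variable, so the buffer stays valid while
// the copy is made.
std::string copyFromNamedObject()
{
    Name name = getName();
    return std::string(name.buffer());  // safe: name is still alive here
}
```

temporaryDiesBeforeNextStatement() returns true, which is the situation in the original code; copyFromNamedObject() mirrors the corrected code, where the named object keeps the text alive for as long as the pointer is used.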
The MFC debug storage allocator then rewrites this freed storage with 0xDD, which is the symbol "Ý". By the time you do the write to the Registry, the string contents have been destroyed.

There is no particular reason to need to cast the result to a char * immediately. Storing it as a CString means that a copy of the result is made, so after the temporary is destroyed the string still exists in the named variable.

In addition, my code is Unicode-ready. The Registry call wants a byte count. Note also that the call lstrlen(Name + 1) returns a value that is too small by 2 for an ANSI string, since it doesn't start until the second character of the string. What you meant to write was lstrlen(Name) + 1 (OK, I admit it, I've made the same error!). However, in Unicode, where all characters are two bytes long, we need to cope with this. The Microsoft documentation is surprisingly silent on this point: is the value given for REG_SZ values a byte count or a character count? I'm assuming that their specification of "byte count" means exactly that, and you have to compensate.

CString Efficiency

One problem of CString is that it hides certain inefficiencies from you. On the other hand, it also means that it can implement certain efficiencies. You may be tempted to say of the following code

    CString s = SomeCString1;
    s += SomeCString2;
    s += SomeCString3;
    s += ",";
    s += SomeCString4;

that it is horribly inefficient compared to, say,

    char s[1024];
    lstrcpy(s, SomeString1);
    lstrcat(s, SomeString2);
    lstrcat(s, SomeString3);
    lstrcat(s, ",");
    lstrcat(s, SomeString4);

After all, you might think, first it allocates a buffer to hold SomeCString1, then copies SomeCString1 to it, then detects it is doing a concatenate, allocates a new buffer large enough to hold the current string plus SomeCString2, copies the contents to the buffer and concatenates SomeCString2 to it, and so on for each of the remaining strings, being horribly inefficient with all those copies.
The truth is, it probably never copies the source strings (the left side of the +=) for most cases.

In VC++ 6.0, in Release mode, all CString buffers are allocated in predefined quanta. These are defined as 64, 128, 256, and 512 bytes. This means that unless the strings are very long, the creation of the concatenated string is an optimized version of a strcat operation (since it knows the location of the end of the string it does not have to search for it, as strcat would; it just does a memcpy to the correct place) plus a recomputation of the length of the string. So it is about as efficient as the clumsier pure-C code, and one whole lot easier to write. And maintain. And understand.

Those of you who are not sure this is what is really happening, look in the source code for CString, strcore.cpp, in the mfc\src subdirectory of your vc98 installation. Look for the method ConcatInPlace, which is called from all the += operators.

Aha! So CString isn't really "efficient!" For example, if I create

    CString cat("Mew!");

then I don't get a nice, tidy little buffer 5 bytes long (4 data bytes plus the terminal NUL). Instead the system wastes all that space by giving me 64 bytes and wasting 59 of them.

If this is how you think, be prepared to reeducate yourself. Somewhere in your career somebody taught you that you always had to use as little space as possible, and that this was a Good Thing.

This is incorrect. It ignores some seriously important aspects of reality.

If you are used to programming embedded applications with 16K EPROMs, you have a particular mindset for doing such allocation. For that application domain, this is healthy. But for writing Windows applications on 500MHz, 256MB machines, it actually works against you, and creates programs that perform far worse than what you would think of as "less efficient" code.

For example, size of strings is thought to be a first-order effect. It is Good to make this small, and Bad to make it large. Nonsense.
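The quantum arithmetic described earlier can be made concrete with a small sketch. This is not the MFC source: roundToQuantum and countAllocations are hypothetical helpers, the sizes are taken at face value as bytes including the terminal NUL, any header bookkeeping is ignored, and the treatment of requests above 512 bytes (exact size) is an assumption made only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Quanta named in the text for VC++ 6.0 Release builds.
static const std::size_t kQuanta[] = {64, 128, 256, 512};

// Smallest quantum that holds `needed` bytes; larger requests are assumed exact.
std::size_t roundToQuantum(std::size_t needed) {
    for (std::size_t q : kQuanta) {
        if (needed <= q) return q;
    }
    return needed;
}

// Counts how many buffer allocations a run of appends would need.
// `lengths` holds the character count of each piece appended in order.
// With useQuanta the buffer grows to the next quantum; otherwise it is
// sized exactly, so every append that adds characters forces a new buffer.
std::size_t countAllocations(const std::size_t* lengths, std::size_t n, bool useQuanta) {
    std::size_t used = 1;      // the terminal NUL
    std::size_t capacity = 0;  // nothing allocated yet
    std::size_t allocations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        used += lengths[i];
        if (used > capacity) {
            capacity = useQuanta ? roundToQuantum(used) : used;
            ++allocations;
        }
        // Otherwise the piece fits and is appended in place.
    }
    return allocations;
}
```

For "Mew!" the request is 5 bytes, which rounds up to 64 and leaves 59 unused, the figure quoted above. For five short pieces appended in sequence, the quantum strategy in this sketch allocates once, while exact sizing allocates on every append.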
The effect of precise allocation is that after a few hours of the program running, the heap is cluttered up with little tiny pieces of storage which are useless for anything, but they increase the storage footprint of your application, increase paging traffic, can actually slow down the storage allocator to unacceptable performance levels, and eventually allow your application to grow to consume all of available memory. Storage fragmentation, a second-order or third-order effect, actually dominates system performance. Eventually, the result is completely unacceptable. Note this.

Assume your application is going to run for months at a time. For example, I bring up VC++, Word, PowerPoint, FrontPage, Outlook Express, Forté Agent, Internet Explorer, and a few other applications, and essentially never close them. I've edited using PowerPoint for days on end (on the other hand, if you've had the misfortune to have to use something like Adobe FrameMaker, you begin to appreciate reliability; I've rarely been able to use this application without it crashing four to six times a day! And always because it has run out of space, usually by filling up my entire massive swap space!). Precise allocation is one of the misfeatures that will compromise reliability and lead to application crashes.

By making CStrings be multiples of some quantum, the memory allocator will end up cluttered with chunks of memory which are almost always immediately reusable for another CString, so the fragmentation is minimized, allocator performance is enhanced, application footprint remains almost as small as possible, and you can run for weeks or months without problem.

Aside: Many years ago, at CMU, we were writing an interactive system. Some studies of the storage allocator showed that it had a tendency to fragment memory badly. Jim Mitchell, now at Sun Microsystems, created a storage allocator that maintained running statistics about allocation size, such as the mean and standard deviation of all allocations.
If splitting a chunk of storage would leave a piece smaller than the mean minus one standard deviation (σ) of the prevailing allocation sizes, he did not split it at all, thus avoiding cluttering up the allocator with pieces too small to be usable. He actually used floating point inside an allocator! His observation was that the long-term saving in instructions by not having to ignore unusable small storage chunks far and away exceeded the additional cost of doing a few floating point operations on an allocation operation. He was right.

Never, ever think about "optimization" in terms of small-and-fast analyzed on a per-line-of-code basis. Optimization should mean small-and-fast analyzed at the complete application level (if you like New Age buzzwords, think of this as the holistic approach to program optimization, a whole lot better than the per-line basis we teach new programmers). At the complete application level, minimum-chunk string allocation is about the worst method you could possibly use.

If you think optimization is something you do at the code-line level, think again. Optimization at this level rarely matters. Read my essay on Optimization: Your Worst Enemy for some thought-provoking ideas on this topic.

Note that the += operator is special-cased; if you were to write:

    CString s = SomeCString1 + SomeCString2 + SomeCString3 + "," + SomeCString4;

then each application of the + operator causes a new string to be created and a copy to be done (although it is an optimized version, since the length of the string is known and the inefficiencies of strcat do not come into play).

Summary

These are just some of the techniques for using CString. I use these every day in my programming. CString is not a terribly difficult class to deal with, but generally the MFC materials do not make all of this apparent, leaving you to figure it out on your own.
Acknowledgements

Special thanks to Lynn Wallace for pointing out a syntax error in one of the examples, Brian Ross for his comments on BSTR conversions, and Robert Quirk for his example of VARIANT-to-BSTR conversion.