String Conversion in Visual C++ .NET
I. BSTR, LPSTR and LPWSTR
In Visual C++ .NET programming, we often use basic string types such as BSTR, LPSTR, and LPWSTR. These data types exist because of data exchange between different programming languages and the need to support ANSI, Unicode, and multibyte character sets (MBCS).
So what are BSTR, LPSTR, and LPWSTR?
BSTR (Basic STRing) is a Unicode string of type OLECHAR *. It is described as an Automation-compatible type. Because the operating system provides API functions (such as SysAllocString) to manage it, along with default marshaling code, BSTR is effectively the COM string type, but it is also widely used outside Automation. Figure 1 depicts the structure of a BSTR: the DWORD value holds the number of bytes the string actually occupies, which is twice the number of Unicode characters in the string.
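The length-prefixed layout described above can be illustrated without any Windows headers. The following is a minimal, portable sketch, not the real BSTR allocator: makeLengthPrefixed, storedByteCount, and storedCharCount are hypothetical helper names, char16_t stands in for OLECHAR on the assumption of 16-bit characters, and the two-byte terminator is appended after the character data. In real code, SysAllocString and SysStringLen do this work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the BSTR layout:
// [4-byte byte count][16-bit characters][2-byte null terminator].
std::vector<unsigned char> makeLengthPrefixed(const std::u16string& text) {
    const std::uint32_t byteCount =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    // Zero-filled block, so the trailing terminator is already in place.
    std::vector<unsigned char> block(sizeof(byteCount) + byteCount + sizeof(char16_t), 0);
    std::memcpy(block.data(), &byteCount, sizeof(byteCount));
    if (byteCount != 0) {
        std::memcpy(block.data() + sizeof(byteCount), text.data(), byteCount);
    }
    return block;
}

// Reads the byte count stored in the prefix.
std::uint32_t storedByteCount(const std::vector<unsigned char>& block) {
    assert(block.size() >= sizeof(std::uint32_t));
    std::uint32_t count = 0;
    std::memcpy(&count, block.data(), sizeof(count));
    return count;
}

// The character count is the byte count divided by the character size.
std::size_t storedCharCount(const std::vector<unsigned char>& block) {
    return storedByteCount(block) / sizeof(char16_t);
}
```

For a four-character string such as u"test", the prefix holds 8, which is twice the character count.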
LPSTR and LPWSTR are string data types used by Win32 and VC++. LPSTR is defined as a pointer to a null-terminated ('\0') array of 8-bit ANSI characters, and LPWSTR is a pointer to a null-terminated array of 16-bit wide characters. VC++ has similar string types, such as LPTSTR and LPCTSTR; their meanings are shown in Figure 2.
For example, LPCTSTR stands for "long pointer to a constant generic string". In an ANSI build it maps to the C/C++ type const char *, and LPTSTR maps to char *.
Generally, the following type definitions also apply:
#ifdef UNICODE
typedef LPWSTR LPTSTR;
typedef LPCWSTR LPCTSTR;
#else
typedef LPSTR LPTSTR;
typedef LPCSTR LPCTSTR;
#endif
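To see how one source line can compile against either character type, here is a small portable sketch of the same mechanism. DEMO_UNICODE, DemoChar, DEMO_T, DemoLPCTSTR, and genericLength are hypothetical names invented for this illustration; they only mimic what UNICODE, TCHAR, _T, and LPCTSTR do in the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Define DEMO_UNICODE before this point to switch the whole sketch to wide characters.
#ifdef DEMO_UNICODE
typedef wchar_t DemoChar;
#define DEMO_T(x) L##x
inline std::size_t demoLength(const DemoChar* s) { return std::wcslen(s); }
#else
typedef char DemoChar;
#define DEMO_T(x) x
inline std::size_t demoLength(const DemoChar* s) { return std::strlen(s); }
#endif

// Generic pointer types built on the selected character type.
typedef DemoChar* DemoLPTSTR;
typedef const DemoChar* DemoLPCTSTR;

// Written once, compiled for whichever character type is selected.
std::size_t genericLength(DemoLPCTSTR text) {
    assert(text != nullptr);
    return demoLength(text);
}
```

A call such as genericLength(DEMO_T("test")) compiles unchanged in both configurations, which is the point of the generic types.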
II. CString, CStringA, and CStringW
Visual C++ .NET uses CStringT as a "generic" string class shared by ATL and MFC. It has three forms, CString, CStringA, and CStringW, which operate on different character types: TCHAR, char, and wchar_t, respectively. TCHAR is equivalent to WCHAR (a 16-bit Unicode character) in Unicode builds and to char in ANSI builds. wchar_t is usually defined as unsigned short. Since CString is used so often in MFC applications, it is not described further here.
III. VARIANT, COleVariant, and _variant_t
In OLE, ActiveX, and COM, the VARIANT data type provides a very effective mechanism: because it carries both the data and the type of the data, it can transfer all kinds of Automation data. Here is a simplified version of the VARIANT definition from the OAIDL.H file:
struct tagVARIANT {
    VARTYPE vt;
    union {
        short iVal; // VT_I2
        long lVal; // VT_I4
        float fltVal; // VT_R4
        double dblVal; // VT_R8
        DATE date; // VT_DATE
        BSTR bstrVal; // VT_BSTR
        ...
        short *piVal; // VT_BYREF | VT_I2
        long *plVal; // VT_BYREF | VT_I4
        float *pfltVal; // VT_BYREF | VT_R4
        double *pdblVal; // VT_BYREF | VT_R8
        DATE *pdate; // VT_BYREF | VT_DATE
        BSTR *pbstrVal; // VT_BYREF | VT_BSTR
    };
};
Clearly, the VARIANT type is a C structure containing a type member vt, some reserved bytes, and a large union. For example, if vt is VT_I2, we can read the value of the VARIANT from iVal. Likewise, when assigning to a VARIANT variable, we must first specify its type. For example:
VARIANT va;
::VariantInit(&va); // initialize
int a = 2002;
va.vt = VT_I4; // indicate the long data type
va.lVal = a; // assign the value
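The "set the type tag, then the matching union member" discipline can be tried out with a portable stand-in. This is only a sketch under stated assumptions: DemoVariant, DemoType, and the demo* functions are hypothetical names that imitate a tiny subset of VARIANT, and they are not the real OLE types or API.

```cpp
#include <cassert>

// Hypothetical type tags standing in for VT_EMPTY, VT_I2, VT_I4, VT_R8.
enum DemoType { DEMO_EMPTY, DEMO_I2, DEMO_I4, DEMO_R8 };

// A type tag plus a union: the tag says which union member is valid.
struct DemoVariant {
    DemoType vt;
    union {
        short iVal;    // valid when vt == DEMO_I2
        long lVal;     // valid when vt == DEMO_I4
        double dblVal; // valid when vt == DEMO_R8
    };
};

// Plays the role of VariantInit: mark the value as empty.
void demoInit(DemoVariant& v) {
    v.vt = DEMO_EMPTY;
    v.dblVal = 0.0;
}

// Set the tag first, then the member that the tag selects.
void demoSetLong(DemoVariant& v, long value) {
    v.vt = DEMO_I4;
    v.lVal = value;
}

// Reading is only meaningful through the member that matches the tag.
long demoGetLong(const DemoVariant& v) {
    assert(v.vt == DEMO_I4);
    return v.lVal;
}
```

Checking the tag before every read is what keeps a union like this safe to use; the real VARIANT relies on the same convention.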
To make VARIANT variables easier to handle, Windows also provides some very useful functions:
VariantInit - initializes the variable to VT_EMPTY;
VariantClear - clears and reinitializes the VARIANT;
VariantChangeType - changes the type of the VARIANT;
VariantCopy - frees the memory attached to the destination VARIANT and copies the source VARIANT.
The COleVariant class wraps the VARIANT structure. Its constructors are quite powerful: when an object is constructed, VariantInit is called first, then the constructor matching the standard type of the argument is invoked, and VariantCopy is used for the conversion and assignment. When the object goes out of scope, its destructor is called automatically; because the destructor calls VariantClear, the associated memory is released automatically. In addition, COleVariant's assignment operators make conversions to and from the VARIANT type very convenient. For example:
COleVariant v1("This is a test"); // direct construction
COleVariant v2 = "This is a test";
// The result is of type VT_BSTR, with the value "This is a test"
COleVariant v3((long)2002);
COleVariant v4 = (long)2002;
// The result is of type VT_I4, with the value 2002
_variant_t is a VARIANT wrapper class for COM, and it is used much like COleVariant. However, in a Visual C++ .NET MFC application, the following two lines must be added at the top of the source file:
#include "comutil.h"
#pragma comment(lib, "comsupp.lib")
IV. CComBSTR and _bstr_t
CComBSTR is ATL's wrapper class for the BSTR data type and is quite convenient to work with. For example:
CComBSTR bstr1;
bstr1 = "Bye"; // direct assignment
OLECHAR* str = OLESTR("ta ta"); // 5 wide characters long
CComBSTR bstr2(wcslen(str)); // allocate a length of 5
wcscpy(bstr2.m_str, str); // copy the wide string into the BSTR
CComBSTR bstr3(5, OLESTR("Hello World"));
CComBSTR bstr4(5, "Hello World");
CComBSTR bstr5(OLESTR("Hey there"));
CComBSTR bstr6("Hey there");
CComBSTR bstr7(bstr6);
// copy construction; the content is "Hey there"
_bstr_t is a C++ wrapper for BSTR. Its constructors and destructor call the SysAllocString and SysFreeString functions, respectively, and its other operations rely on the BSTR API functions. As with _variant_t, include comutil.h and link comsupp.lib when using it.
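The allocate-in-constructor, free-in-destructor pattern that these wrappers follow can be sketched portably. Everything here is an assumption made for illustration: demoAlloc and demoFree are hypothetical stand-ins for SysAllocString and SysFreeString, DemoBstr is not the real _bstr_t or CComBSTR, and the allocation counter exists only so the cleanup can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Counts outstanding allocations so the automatic cleanup is visible.
static int g_liveAllocations = 0;

// Hypothetical stand-in for SysAllocString: returns an owned copy of the text.
wchar_t* demoAlloc(const wchar_t* text) {
    assert(text != nullptr);
    const std::size_t length = std::wcslen(text);
    wchar_t* copy = new wchar_t[length + 1];
    std::wmemcpy(copy, text, length + 1); // includes the terminator
    ++g_liveAllocations;
    return copy;
}

// Hypothetical stand-in for SysFreeString.
void demoFree(wchar_t* text) {
    if (text != nullptr) {
        delete[] text;
        --g_liveAllocations;
    }
}

// The wrapper owns the string: allocate on construction, free on destruction.
class DemoBstr {
public:
    explicit DemoBstr(const wchar_t* text) : m_str(demoAlloc(text)) {}
    ~DemoBstr() { demoFree(m_str); }
    DemoBstr(const DemoBstr&) = delete;            // copying omitted to keep the sketch short
    DemoBstr& operator=(const DemoBstr&) = delete;
    std::size_t length() const { return std::wcslen(m_str); }
    const wchar_t* get() const { return m_str; }
private:
    wchar_t* m_str;
};

int liveAllocations() { return g_liveAllocations; }
```

Because the destructor releases the string, a raw pointer obtained from such a wrapper is only valid while the wrapper object itself is alive.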
V. Converting between BSTR, char *, and CString
(1) Converting char * to CString
To convert a char * to a CString, you can use CString::Format in addition to direct assignment. For example:
char chArray[] = "This is a test";
char* p = "This is a test";
or
LPSTR p = "This is a test";
or, in a program that defines UNICODE,
TCHAR* p = _T("This is a test");
or
LPTSTR p = _T("This is a test");
CString theString = chArray;
theString.Format(_T("%s"), chArray);
theString = p;
(2) Converting CString to char *
To convert a CString to the char * (LPSTR) type, the following three methods are commonly used:
Method 1: use a cast. For example:
CString theString("This is a test");
LPTSTR lpsz = (LPTSTR)(LPCTSTR)theString;
Method 2: use strcpy. For example:
CString theString("This is a test");
LPTSTR lpsz = new TCHAR[theString.GetLength() + 1];
_tcscpy(lpsz, theString);
Note that the second parameter of strcpy (or of _tcscpy, its Unicode/MBCS-portable counterpart) is const char * (ANSI) or const wchar_t * (Unicode); the compiler converts the CString automatically.
Method 3: use CString::GetBuffer. For example:
CString s(_T("This is a test"));
LPTSTR p = s.GetBuffer();
// add code that uses p here
if (p != NULL) *p = _T('\0');
s.ReleaseBuffer();
// release the buffer after use so that other CString member functions can be called
(3) Converting BSTR to char *
Method 1: use ConvertBSTRToString. For example:
#include "comutil.h"
#pragma comment(lib, "comsupp.lib")
int _tmain(int argc, _TCHAR* argv[])
{
    BSTR bstrText = ::SysAllocString(L"Test");
    char* lpszText2 = _com_util::ConvertBSTRToString(bstrText);
    SysFreeString(bstrText); // release when done
    delete[] lpszText2;
    return 0;
}
Method 2: use the overloaded assignment operator of _bstr_t. For example:
_bstr_t b = bstrText;
char* lpszText2 = b;
(4) Converting char * to BSTR
Method 1: use API functions such as SysAllocString. For example:
BSTR bstrText = ::SysAllocString(L"Test");
BSTR bstrText = ::SysAllocStringLen(L"Test", 4);
BSTR bstrText = ::SysAllocStringByteLen("Test", 4);
Method 2: use COleVariant or _variant_t. For example:
// COleVariant strVar("This is a test");
_variant_t strVar("This is a test");
BSTR bstrText = strVar.bstrVal;
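Method 3: use _bstr_t, which is the easiest way. For example:
BSTR bstrText = _bstr_t("This is a test").copy(); // copy() returns a BSTR the caller must free; the temporary's own BSTR is released at the end of the statement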
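Method 4: use CComBSTR. For example:
BSTR bstrText = CComBSTR("This is a test").Detach(); // Detach() hands ownership to the caller, who must free it
or
CComBSTR bstr("This is a test");
BSTR bstrText = bstr.m_str; // valid only while bstr is alive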
Method 5: use ConvertStringToBSTR. For example:
char* lpszText = "Test";
BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
(5) Converting CString to BSTR
This is usually done with CStringT::AllocSysString. For example:
CString str("This is a test");
BSTR bstrText = str.AllocSysString();
...
SysFreeString(bstrText); // release when done
(6) Converting BSTR to CString
This is generally done as follows:
BSTR bstrText = ::SysAllocString(L"Test");
CStringA str;
str.Empty();
str = bstrText;
or
CStringA str(bstrText);
(7) Converting between ANSI, Unicode, and wide characters
Method 1: use MultiByteToWideChar to convert ANSI characters to Unicode characters, and WideCharToMultiByte to convert Unicode characters to ANSI characters.
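These Win32 functions are typically called twice: once to ask how large the output buffer must be, and once to perform the conversion. The sketch below shows the same two-pass pattern with the C++ standard library's mbsrtowcs and wcsrtombs as a portable stand-in, so it is not the Win32 API itself. The helper names widen and narrow are hypothetical, and the conversion follows the current C locale rather than an explicit code page argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Multibyte -> wide, using the current C locale.
std::wstring widen(const std::string& text) {
    std::mbstate_t state = std::mbstate_t();
    const char* src = text.c_str();
    // Pass 1: a null destination only reports the required length.
    const std::size_t needed = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1)) return std::wstring(); // invalid input
    std::vector<wchar_t> buffer(needed + 1);
    state = std::mbstate_t();
    src = text.c_str();
    // Pass 2: convert into the correctly sized buffer.
    std::mbsrtowcs(buffer.data(), &src, buffer.size(), &state);
    return std::wstring(buffer.data(), needed);
}

// Wide -> multibyte, using the current C locale.
std::string narrow(const std::wstring& text) {
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = text.c_str();
    // Pass 1: query the required number of bytes.
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1)) return std::string(); // unrepresentable character
    std::vector<char> buffer(needed + 1);
    state = std::mbstate_t();
    src = text.c_str();
    // Pass 2: convert into the correctly sized buffer.
    std::wcsrtombs(buffer.data(), &src, buffer.size(), &state);
    return std::string(buffer.data(), needed);
}
```

Sizing the buffer from a first query avoids guessing a fixed length, which matters because one character can occupy more than one byte in a multibyte encoding.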
Method 2: use "_T" to turn an ANSI string literal into a "generic" string, use "L" to turn it into a Unicode string, and, in a managed C++ environment, use "S" to turn an ANSI string literal into a String * object. For example:
TCHAR tstr[] = _T("This is a test");
wchar_t wszStr[] = L"This is a test";
String* str = S"This is a test";
Method 3: use the ATL 7.0 conversion macros and classes. ATL 7.0 improves on the original 3.0 string conversion macros, adds many new ones, and provides corresponding classes, which share the uniform naming form shown in Figure 3:
Here the first C stands for "class", to distinguish the name from the ATL 3.0 macros; the second C stands for constant; 2 means "to"; and EX means that a buffer of a given size is allocated. SourceType and DestinationType can be A, T, W, or OLE, meaning ANSI, "generic", Unicode, and OLE strings, respectively. For example, CA2CT converts an ANSI string to a constant generic string. Some sample code follows:
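CA2TEX<16> tstrConv("This is a test"); // keep the conversion object alive while its pointer is in use
LPTSTR tstr = tstrConv;
CA2CT tcstrConv("This is a test");
LPCTSTR tcstr = tcstrConv;
wchar_t wszStr[] = L"This is a test";
CW2A chstrConv(wszStr);
char* chstr = chstrConv;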
VI. Conclusion
Almost every program uses strings, and because Visual C++ .NET is so powerful and so widely applied, conversions between string types come up even more often. This article covers nearly all of the current conversion methods. Of course, in the .NET Framework, the Convert class and the classes in the System.Text namespace can also be used to convert between strings, characters, and character encodings.