Unicode: Wide-Byte Character Set
1. How do I get the number of characters in a string that contains both single-byte and double-byte characters?
The Microsoft Visual C++ run-time library includes the function _mbslen, which operates on multibyte strings (strings that contain both single-byte and double-byte characters).
Calling strlen does not tell you how many characters are in the string; it only tells you how many bytes come before the terminating zero.
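The difference between counting bytes and counting characters can be sketched in portable C. This is only an illustration, not the run-time library's implementation: the names is_lead_byte_sjis and dbcs_char_count are hypothetical, and the lead-byte ranges assume Shift-JIS (code page 932), whereas _mbslen consults the multibyte code page actually in effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lead-byte test; the ranges assume Shift-JIS (code page 932). */
int is_lead_byte_sjis(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Counts characters (not bytes) in a zero-terminated Shift-JIS string. */
size_t dbcs_char_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    while (*s != '\0') {
        /* A lead byte plus the byte that follows it form one character. */
        if (is_lead_byte_sjis((unsigned char)*s) && s[1] != '\0')
            s += 2;
        else
            s += 1;
        count++;
    }
    return count;
}
```

For the four-byte string "A\x82\xA0" "B" (the letter A, one double-byte character, the letter B), strlen reports 4 bytes while dbcs_char_count reports 3 characters.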
2. How do I manipulate DBCS (double-byte character set) strings?
Function                             Description
PTSTR CharNext(LPCTSTR);             Returns the address of the next character in the string
PTSTR CharPrev(LPCTSTR, LPCTSTR);    Returns the address of the previous character in the string
BOOL IsDBCSLeadByte(BYTE);           Returns a nonzero value if the byte is the first byte of a DBCS character
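The functions in the table can be pictured with a portable stand-in. The sketch below is an assumption-laden illustration, not the Windows implementation: sjis_is_lead, dbcs_next, and dbcs_prev are hypothetical names, and the lead-byte test again assumes Shift-JIS. It also shows why stepping backward needs two pointers: a byte on its own does not say whether it is a trail byte, so the scan has to restart from the beginning of the string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IsDBCSLeadByte, assuming Shift-JIS. */
int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Stand-in for CharNext: address of the next character, or p at the end. */
const char *dbcs_next(const char *p)
{
    assert(p != NULL);
    if (*p == '\0')
        return p;
    if (sjis_is_lead((unsigned char)*p) && p[1] != '\0')
        return p + 2;
    return p + 1;
}

/* Stand-in for CharPrev: walks forward from start and remembers the
   character that begins just before current. */
const char *dbcs_prev(const char *start, const char *current)
{
    const char *p = start;
    const char *prev = start;

    assert(start != NULL && current != NULL && start <= current);
    while (p < current && *p != '\0') {
        prev = p;
        p = dbcs_next(p);
    }
    return prev;
}
```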
3. Why use Unicode?
(1) Makes it easy to exchange data between different languages.
(2) Lets you distribute a single binary .exe or DLL file that supports all languages.
(3) Improves the efficiency of your application.
Windows 2000 was developed using Unicode. If you call any Windows function and pass it an ANSI string, the system first converts the string to Unicode and then passes the Unicode string to the operating system. If you expect the function to return an ANSI string, the system first converts the Unicode string to an ANSI string and then returns the result to your application. These conversions cost the system time and memory. By developing your application with Unicode from the start, you can make it run more efficiently.
Windows CE is itself a Unicode operating system and does not support the ANSI Windows functions at all.
Windows 98 supports only ANSI, so you can develop only ANSI applications for it.
When Microsoft ported COM from 16-bit Windows to Win32, the company decided that all COM interface methods that require strings would accept only Unicode strings.
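The conversion overhead described for Windows 2000 can be pictured with a small portable sketch. Everything here is hypothetical: demo_text_length_w stands for a function that works natively on wide strings, demo_text_length_a stands for its ANSI entry point, and the standard C function mbstowcs is used in place of whatever conversion Windows performs internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical "Unicode version": works directly on wide characters. */
size_t demo_text_length_w(const wchar_t *text)
{
    assert(text != NULL);
    return wcslen(text);
}

/* Hypothetical "ANSI version": allocates a temporary buffer, converts,
   then forwards to the wide version. This is the extra time and memory. */
size_t demo_text_length_a(const char *text)
{
    size_t capacity, converted, result;
    wchar_t *wide;

    assert(text != NULL);
    /* A multibyte string never yields more wide characters than bytes. */
    capacity = strlen(text) + 1;
    wide = malloc(capacity * sizeof(wchar_t));
    if (wide == NULL)
        return 0;
    converted = mbstowcs(wide, text, capacity);
    if (converted == (size_t)-1) {      /* invalid multibyte sequence */
        free(wide);
        return 0;
    }
    result = demo_text_length_w(wide);
    free(wide);
    return result;
}
```

A caller that already holds a wide string skips the allocation and the conversion entirely, which is the efficiency argument made above.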
4. How do I write Unicode source code?
Microsoft designed the Windows API for Unicode in a way that minimizes the impact on your code. In fact, you can write a single source file that compiles with or without Unicode. You only need to define two macros (UNICODE and _UNICODE) and then recompile the source file.
The _UNICODE macro is used by the C run-time header files, while the UNICODE macro is used by the Windows header files. When compiling a source module, you usually need to define both macros.
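How one source file can compile both ways may be easier to see in a reduced form. The sketch below does not use the real headers: DEMO_UNICODE, demo_tchar, DEMO_TEXT, and demo_tcslen are hypothetical stand-ins for UNICODE/_UNICODE, TCHAR, TEXT, and _tcslen, chosen so that the example builds with any C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical switch standing in for UNICODE and _UNICODE. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;          /* stand-in for TCHAR       */
#define DEMO_TEXT(s) L##s            /* stand-in for TEXT / _TEXT */
#define demo_tcslen  wcslen          /* stand-in for _tcslen      */
#else
typedef char demo_tchar;
#define DEMO_TEXT(s) s
#define demo_tcslen  strlen
#endif

/* The same source line works with either character type. */
size_t demo_title_length(void)
{
    const demo_tchar *title = DEMO_TEXT("Unicode");
    return demo_tcslen(title);
}
```

Compiling with -DDEMO_UNICODE (or the equivalent compiler option) switches every character type, literal, and string function at once, without touching the body of demo_title_length.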
5. What Unicode data types does Windows define?
Data type    Description
WCHAR        A Unicode character
PWSTR        A pointer to a Unicode string
PCWSTR       A pointer to a constant Unicode string
The corresponding ANSI data types are CHAR, LPSTR, and LPCSTR.
The generic ANSI/Unicode data types are TCHAR, PTSTR, and LPCTSTR.
6. How do I manipulate Unicode strings?
Character set    Function name prefix    Example
ANSI             str                     strcpy
Unicode          wcs                     wcscpy
MBCS             _mbs                    _mbscpy
ANSI/Unicode     _tcs                    _tcscpy (C run-time library)
In Windows 2000, each Windows function that takes a string also has two versions, ANSI and Unicode. The name of the ANSI version ends in A; the name of the Unicode version ends in W. The Windows header files define the generic name as follows:
#ifdef UNICODE
#define CreateWindowEx CreateWindowExW
#else
#define CreateWindowEx CreateWindowExA
#endif // !UNICODE
7. How do I write Unicode string constants?
Character set    Example
ANSI             "string"
Unicode          L"string"
ANSI/Unicode     _T("string") or _TEXT("string"), for example: if (szError[0] == _TEXT('J')) { }
8. Why should I use the operating system's string functions whenever possible?
Doing so slightly improves your application's performance, because the operating system's string functions are used heavily by large applications such as the operating system's shell process, Explorer.exe. Because these functions are used so often, they are probably already loaded in RAM while your application is running.
Examples include StrCat, StrChr, StrCmp, and StrCpy.
9. How do I write applications that work with both ANSI and Unicode?
(1) Think of text strings as arrays of characters, not as arrays of chars or arrays of bytes.
(2) Use the generic data types (such as TCHAR and PTSTR) for text characters and strings.
(3) Use the explicit data types (such as BYTE and PBYTE) for bytes, byte pointers, and data buffers.
(4) Use the TEXT macro for literal characters and strings.
(5) Perform global replacements (for example, replace PSTR with PTSTR).
(6) Fix string arithmetic. For example, functions typically expect a buffer size in characters, not bytes. This means you should pass sizeof(szBuffer) / sizeof(TCHAR) rather than sizeof(szBuffer). In addition, if you need to allocate a memory block for a string and you know the number of characters in the string, remember that memory is allocated in bytes. In other words, you should call malloc(nCharacters * sizeof(TCHAR)) instead of malloc(nCharacters).
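The arithmetic in item (6) can be sketched in portable C. To keep the example compilable anywhere, wchar_t is used in place of TCHAR; DEMO_COUNTOF, demo_copy, and demo_dup are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Number of elements (characters) in a fixed-size array, not bytes. */
#define DEMO_COUNTOF(a) (sizeof(a) / sizeof((a)[0]))

/* Copies src into dst; dst_chars is the buffer size in CHARACTERS. */
void demo_copy(wchar_t *dst, size_t dst_chars, const wchar_t *src)
{
    assert(dst != NULL && src != NULL && dst_chars > 0);
    wcsncpy(dst, src, dst_chars - 1);
    dst[dst_chars - 1] = L'\0';     /* always terminate */
}

/* Duplicates a string; the allocation size is in BYTES, so the
   character count is multiplied by the size of one character. */
wchar_t *demo_dup(const wchar_t *src)
{
    size_t n_chars;
    wchar_t *copy;

    assert(src != NULL);
    n_chars = wcslen(src) + 1;      /* +1 for the terminating zero */
    copy = malloc(n_chars * sizeof(wchar_t));
    if (copy != NULL)
        wmemcpy(copy, src, n_chars);
    return copy;
}
```

A call such as demo_copy(buf, DEMO_COUNTOF(buf), text) passes the character count; passing sizeof(buf) instead would overstate the buffer by a factor of sizeof(wchar_t) and invite an overrun.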
10. How do I compare strings with specific options?
Call CompareString.
Flag                   Meaning
NORM_IGNORECASE        Ignores letter case
NORM_IGNOREKANATYPE    Does not distinguish between hiragana and katakana
NORM_IGNORENONSPACE    Ignores nonspacing characters
NORM_IGNORESYMBOLS     Ignores symbols
NORM_IGNOREWIDTH       Does not distinguish between a single-byte character and the same character as a double-byte character
SORT_STRINGSORT        Treats punctuation the same as symbols
11. How do I determine whether a text file is ANSI or Unicode?
If the first two bytes of the text file are 0xFF and 0xFE (the UTF-16 little-endian byte-order mark), the file is Unicode; otherwise it is ANSI.
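The check on the first two bytes can be written as a small helper. This is a minimal sketch: demo_has_unicode_bom is a hypothetical name, and the caller is assumed to have already read the start of the file into memory.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if the buffer starts with the bytes 0xFF 0xFE. */
int demo_has_unicode_bom(const unsigned char *data, size_t size)
{
    assert(data != NULL || size == 0);
    return size >= 2 && data[0] == 0xFF && data[1] == 0xFE;
}
```

A file shorter than two bytes, or one that starts with any other bytes, is treated as ANSI under the rule above.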
12. How do I determine whether a string is ANSI or Unicode?
Use IsTextUnicode. IsTextUnicode uses a series of statistical and deterministic methods to guess at the contents of the buffer. Because this is not an exact science, IsTextUnicode may return an incorrect result.
13. How do I convert a string between Unicode and ANSI?
The Windows function MultiByteToWideChar converts a multibyte string to a wide-character string; the function WideCharToMultiByte converts a wide-character string to its equivalent multibyte string.
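The wide-to-multibyte direction can be sketched in portable C. This is not the Windows API: the standard C function wcstombs stands in for WideCharToMultiByte, it converts according to the current C locale rather than a code page argument, and demo_wide_to_multibyte is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Converts a wide string to a newly allocated multibyte string.
   Returns NULL on allocation failure or if a character cannot be
   represented in the current locale. The caller frees the result. */
char *demo_wide_to_multibyte(const wchar_t *wide)
{
    size_t capacity, written;
    char *narrow;

    assert(wide != NULL);
    /* Worst case: every wide character needs MB_CUR_MAX bytes,
       plus one byte for the terminating zero. */
    capacity = wcslen(wide) * MB_CUR_MAX + 1;
    narrow = malloc(capacity);
    if (narrow == NULL)
        return NULL;
    written = wcstombs(narrow, wide, capacity);
    if (written == (size_t)-1) {
        free(narrow);
        return NULL;
    }
    return narrow;
}
```

The sizing step matters in both directions: the destination buffer is measured in bytes for the multibyte result and in characters for the wide result, so the two counts are generally not interchangeable.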