"Windows Core Programming" Reading Notes (3)

xiaoxiao 2021-03-06

Chapter 2 Unicode

Unicode, the wide-byte character set, is a technical standard created to solve the problem of software localization (multi-language versions). In a Unicode string as Windows stores it, every character is 16 bits (two bytes) wide, so the programmer only needs to increment or decrement a pointer to traverse the individual characters in the string. There is no need to check, as there is with multibyte character sets, whether the next byte is part of the same character or the start of a new one.

Using Unicode brings several benefits: you can easily exchange data between different languages, you can distribute a single binary .exe or DLL file that supports all languages, and your application runs more efficiently.

Unicode support differs between the Windows operating systems. Windows 2000 supports both Unicode and ANSI, so you can develop applications for either one. Windows 98 supports only ANSI, so you can develop applications only for ANSI. Windows CE supports only Unicode, so you can develop applications only for Unicode.

COM is usually used to let different components communicate with each other, and Unicode is the best means of passing strings between them. For this reason, all COM interface methods that take strings accept only Unicode strings.
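The pointer traversal described above can be sketched in standard C as follows. This is a minimal illustration, not code from the book: the helper names count_wide_chars and last_wide_char are hypothetical. Note also that wchar_t is 16 bits wide on Windows but may be wider on other platforms; the pointer arithmetic is the same either way.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Counts the characters in a wide string by stepping a pointer forward
   one wchar_t at a time. No lead-byte check is needed between steps.
   (count_wide_chars is a hypothetical helper name.) */
size_t count_wide_chars(const wchar_t *s)
{
    assert(s != NULL);
    const wchar_t *p = s;
    while (*p != L'\0') {
        ++p;                      /* move to the next character */
    }
    return (size_t)(p - s);
}

/* Returns a pointer to the last character of the string, found by
   stepping back one wchar_t from the terminating zero, or NULL if the
   string is empty. (last_wide_char is a hypothetical helper name.) */
const wchar_t *last_wide_char(const wchar_t *s)
{
    assert(s != NULL);
    const wchar_t *end = s + count_wide_chars(s);
    if (end == s) {
        return NULL;              /* empty string: no last character */
    }
    return end - 1;               /* move back exactly one character */
}
```

Stepping backwards is where fixed-width characters help most: with a multibyte string, moving back one character would require rescanning from the start of the string.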

2.8 How to Write Unicode Source Code

C supports Unicode through the data type wchar_t, which represents a Unicode character. For example, to create a buffer that holds a Unicode string of up to 99 characters plus a terminating zero character, you can use the following statement:

wchar_t szBuffer[100];

The standard C string functions such as strcpy, strchr, and strcat operate only on ANSI strings and cannot process Unicode strings correctly. ANSI C therefore added a matching set of functions:

ANSI: char * strchr(const char *, int);
Unicode: wchar_t * wcschr(const wchar_t *, wchar_t);

ANSI: int strcmp(const char *, const char *);
Unicode: int wcscmp(const wchar_t *, const wchar_t *);

ANSI: char * strcpy(char *, const char *);
Unicode: wchar_t * wcscpy(wchar_t *, const wchar_t *);

ANSI: size_t strlen(const char *);
Unicode: size_t wcslen(const wchar_t *);

Note that all the Unicode functions start with wcs, which is short for "wide character string". To call the Unicode function, simply replace the str prefix of the ANSI string function with the wcs prefix.

Source code that calls the str functions or the wcs functions explicitly cannot easily be compiled for both ANSI and Unicode, because the compiler reports errors as soon as the character type changes. If you need a single source file that builds either way, include the header file tchar.h. The only purpose of tchar.h is to help you create generic ANSI/Unicode source code files. It works through a set of macros that decide whether a str function or a wcs function is called. For example, tchar.h contains a macro named _tcscpy. If _UNICODE is not defined when the header file is included, _tcscpy expands to the ANSI function strcpy; if _UNICODE is defined, _tcscpy expands to wcscpy.

One more point is worth noting when you use the macros in tchar.h. To generate a Unicode string instead of an ANSI string, you must put an uppercase L before the string literal, for example:

TCHAR *szError = L"Error";

The uppercase L tells the compiler that the string should be compiled as a Unicode string. The problem is that this line compiles only when _UNICODE is defined, so a generic ANSI/Unicode source file needs a macro that adds the uppercase L only when it is required. That macro is _TEXT:

TCHAR *szError = _TEXT("Error");

Written this way, the line compiles correctly whether or not _UNICODE is defined for the source file. The _TEXT macro also works with character literals, for example to test the first character of a string:

if (szError[0] == _TEXT('J')) {
    // The first character is 'J'
} else {
    // The first character is not 'J'
}

November 9, 2004
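As a small illustration of the wcs functions listed above, the following sketch builds a string in a wchar_t buffer. It uses only standard C functions from wchar.h; the helper name join_words and its size check are assumptions added for the example, not something taken from the book.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Writes "<first> <second>" into dest using the wide-character
   counterparts of strlen, strcpy, and strcat. dest_count is the size
   of dest in characters, not bytes. Returns the length of the result,
   or 0 if the buffer is too small.
   (join_words is a hypothetical helper name.) */
size_t join_words(wchar_t *dest, size_t dest_count,
                  const wchar_t *first, const wchar_t *second)
{
    assert(dest != NULL && first != NULL && second != NULL);

    /* Both words, one space, and the terminating zero must fit. */
    size_t needed = wcslen(first) + 1 + wcslen(second) + 1;
    if (needed > dest_count) {
        return 0;
    }

    wcscpy(dest, first);    /* wide version of strcpy */
    wcscat(dest, L" ");     /* wide version of strcat */
    wcscat(dest, second);
    return wcslen(dest);    /* wide version of strlen */
}
```

A caller would pass a buffer such as wchar_t szBuffer[100] together with its element count, 100. Buffer sizes for wide strings are counted in characters, so using sizeof(szBuffer) here would overstate the capacity.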
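The macro mechanism behind tchar.h can be sketched in a few lines. The code below is not the real header: MY_UNICODE, my_tchar, MY_TEXT, my_tcscpy, and my_tcslen are hypothetical stand-ins for _UNICODE, TCHAR, _TEXT, _tcscpy, and _tcslen, chosen so that the sketch compiles with any standard C compiler. The real header may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for what tchar.h provides. Define MY_UNICODE
   before this point (for example with -DMY_UNICODE) to build the
   wide-character version; leave it undefined for the ANSI version. */
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_TEXT(quote) L##quote   /* paste an L in front of the literal */
#define my_tcscpy wcscpy
#define my_tcslen wcslen
#else
typedef char my_tchar;
#define MY_TEXT(quote) quote      /* leave the literal unchanged */
#define my_tcscpy strcpy
#define my_tcslen strlen
#endif

/* Generic code: the same source compiles for either character type.
   Copies the text "Error" into dest and returns its length, or 0 if
   dest (dest_count characters) is too small. */
size_t copy_error_text(my_tchar *dest, size_t dest_count)
{
    const my_tchar *szError = MY_TEXT("Error");
    assert(dest != NULL);
    if (my_tcslen(szError) + 1 > dest_count) {
        return 0;
    }
    my_tcscpy(dest, szError);
    return my_tcslen(dest);
}

/* The text macro also works on character literals. */
int starts_with_J(const my_tchar *s)
{
    assert(s != NULL);
    return s[0] == MY_TEXT('J');
}
```

Nothing in copy_error_text or starts_with_J mentions char or wchar_t directly; the choice is made once, at the top of the file, by whether the symbol is defined.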

When reposting, please cite the original address: https://www.9cbs.com/read-93888.html
