Chinese character encoding national standards and GB18030

xiaoxiao, 2021-03-06

Chinese character encoding national standards and their current status

1. Name:

GB 2312-1980 (Code of Chinese Graphic Character Set for Information Interchange, Primary Set)

GBK-1995 (Chinese Internal Code Extension Specification)

GB13000.1-1993 (Information Technology: Universal Multiple-Octet Coded Character Set (UCS), Part 1: Architecture and Basic Multilingual Plane (idt ISO/IEC 10646.1-1993))

GB 18030-2000 (Information Technology: Chinese Ideograms Coded Character Set for Information Interchange, Extension for the Basic Set)

2. Relationship:

GB 18030 is fully backward compatible with GBK and GB 2312. It replaces both of them and will be the only national character set standard in the future.
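The compatibility described above can be checked with a minimal sketch, assuming Python 3 and its standard `gb2312`, `gbk`, and `gb18030` codecs. The sample string is only an illustration chosen because all of its characters are in GB 2312.

```python
# Sample text whose characters all belong to GB 2312 (illustrative choice).
text = "汉字编码"

# Encode the same text under each of the three standards.
encoded = {name: text.encode(name) for name in ("gb2312", "gbk", "gb18030")}
for name, data in encoded.items():
    print(name, data.hex())

# Bytes produced under the oldest standard decode unchanged as GB 18030.
roundtrip = encoded["gb2312"].decode("gb18030")
print(roundtrip == text)
```

If the three hex strings match, data stored as GB 2312 or GBK can be read by a GB 18030 decoder without conversion.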

The relationship between GB 18030 and GB13000.1: the two code tables are not compatible at the code level. For example, both standards contain the character 啊 ("ah"), but the code assigned to it differs between the two standards.
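As a small illustration of this difference, the following sketch (assuming Python 3, whose strings use UCS/Unicode code points) prints the code of the same character under both systems.

```python
ch = "啊"

# Code point in the UCS code table (GB13000.1 / ISO 10646).
ucs_code = ord(ch)

# Byte sequence assigned to the same character by GB 18030.
gb_code = ch.encode("gb18030")

print(f"UCS:      U+{ucs_code:04X}")           # U+554A
print(f"GB 18030: 0x{gb_code.hex().upper()}")  # 0xB0A1
```

The character is the same in both standards, but the numeric values are unrelated, so text must be converted through a mapping table when moving between them.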

GB13000.1 and ISO 10646 are the same system and are compatible with the industry standard Unicode 3.1.

3. GB18030:

This standard is divided into two parts: the double-byte part and the four-byte part.

The double-byte part is essentially identical to GBK.

With the four-byte part, the standard adds 6,582 Chinese characters beyond GBK (27,484 - 20,902 = 6,582), in the code range 0x8139EE39 ~ 0x82358738. The corresponding range in GB13000.1 is 0x3400 ~ 0x4DB5.
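The arithmetic and the end points of the range can be reproduced with a short sketch, assuming Python 3 and its built-in `gb18030` codec. The variable names are illustrative only.

```python
# End points of the GB13000.1 range quoted in the text.
first, last = 0x3400, 0x4DB5

# Number of code positions in the range, both ends included.
count = last - first + 1
print(count, 27484 - 20902)

# GB 18030 byte sequences for the first and last code positions.
first_bytes = chr(first).encode("gb18030")
last_bytes = chr(last).encode("gb18030")
print(first_bytes.hex().upper(), last_bytes.hex().upper())
```

Both end points come out as four-byte sequences, which can be compared with the GB 18030 range given above.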

4. Operating systems that currently support this standard:

Windows 2000 with the patches released after September 1, 2001;

Windows XP;

And some Linux and UNIX systems.

5. GB18030 problems:

Because GB18030 includes Chinese characters encoded in four bytes, it differs considerably from the two-byte GBK and from Unicode 3.1, both of which are commonly used on the Windows platform, so many problems arise in practice. For example, Microsoft's Windows XP actually supports only Unicode 3.1 encoding and does not support the four-byte codes of GB18030.
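To make the two-byte versus four-byte issue concrete, here is a minimal sketch in Python 3. The function name `gb18030_lengths` is hypothetical, and the rule it applies is an assumption based on the usual byte layout: a second byte in 0x30 to 0x39 marks a four-byte code, any other second byte marks a two-byte code. It does not validate the remaining bytes.

```python
def gb18030_lengths(data: bytes) -> list:
    """Return the byte length of each character in a GB 18030 stream."""
    lengths = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            n = 1                      # single-byte (ASCII) code
        elif 0x81 <= lead <= 0xFE and i + 1 < len(data):
            second = data[i + 1]
            # A digit-range second byte signals a four-byte code.
            n = 4 if 0x30 <= second <= 0x39 else 2
        else:
            raise ValueError(f"invalid or truncated sequence at offset {i}")
        if i + n > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        lengths.append(n)
        i += n
    return lengths


text = "a啊\u3400"                     # ASCII, two-byte, and four-byte characters
sample = text.encode("gb18030")
print(gb18030_lengths(sample))         # [1, 2, 4]

# Software that assumes GBK's two-byte layout cannot recover the last character.
naive = sample.decode("gbk", errors="replace")
print(naive == text)                   # False
```

Code that only understands the two-byte layout has to be changed to look at the second byte before it can step through GB 18030 text correctly.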
