About the ISO/IEC 10646 Coding Standard

zhaozj  2021-02-16

What is the ISO 10646 International Coding Standard

To provide a common technical basis for handling electronic data in different languages, the International Organization for Standardization (ISO) has developed an international coding standard named ISO 10646. This standard covers the characters of the world's major languages, including Traditional and Simplified Chinese characters, and assigns them unified internal codes.

Different languages have different characters. To handle each region's own characters in computers and electronic devices, different coding standards are used around the world. For example, Hong Kong and Taiwan use Traditional Chinese characters and usually adopt the Big5 coding standard, while mainland China uses Simplified Chinese characters and usually adopts the national standard (GB) coding standard. These coding standards are incompatible with one another: the same code may represent different characters in different standards. As a result, electronic information issued in one region may appear garbled, or some of its characters may not display correctly, when it is transferred to a computer system in another region. Even when code conversion software is used to display material in a different encoding, the problem is not completely solved.
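The incompatibility described above can be illustrated with a short Python sketch. It assumes that Python's built-in `big5` and `gb2312` codecs are acceptable stand-ins for the Big5 and national standard (GB) encodings. The variable names are illustrative only and do not come from the original article.

```python
# The same two bytes are read as different characters under
# different regional coding standards.
raw = "中".encode("big5")  # bytes as a Big5 system would store them

as_big5 = raw.decode("big5")                    # read back as Big5
as_gb = raw.decode("gb2312", errors="replace")  # misread as GB

# Show the raw bytes next to both interpretations.
print(raw.hex(), as_big5, as_gb)
```

The two decoded strings differ even though the stored bytes are identical, which is the kind of garbling the paragraph above describes.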

The ISO 10646 International Coding Standard was developed to solve these problems. It covers the characters of the world's major languages, including Traditional and Simplified Chinese characters, and provides a unified character coding standard that simplifies electronic communication and data exchange. With this standard, computer systems around the world can store, process, transmit and display electronic information in various languages more accurately, which improves the flow of electronic information between regions and promotes electronic transactions.

Development of the ISO 10646 International Coding Standard

The International Organization for Standardization published the first version of the ISO 10646 International Coding Standard in 1993, under the full name "ISO/IEC 10646-1:1993", and published "ISO/IEC 10646-1:2000" in 2000. "ISO/IEC 10646-1:2000" is the updated version of "ISO/IEC 10646-1:1993". It contains the 20,902 ideographs of "ISO/IEC 10646-1:1993" plus the 6,582 ideographs newly added in Extension A, for a total of 27,484 ideographs. In November 2001, ISO published "ISO/IEC 10646-2:2001" as a supplement to "ISO/IEC 10646-1:2000". "ISO/IEC 10646-2:2001" added 42,711 ideographs in Extension B, bringing the number of ideographs in the ISO 10646 International Coding Standard to more than 70,000, including all the Chinese characters in the "Kangxi Dictionary", the "Hanyu Da Zidian" and the "Hanyu Da Cidian". In April 2004, ISO published "ISO/IEC 10646:2003", which merges "ISO/IEC 10646-1:2000" and its supplement "ISO/IEC 10646-2:2001" into a single publication. "ISO/IEC 10646:2003" therefore contains the same ideographs as "ISO/IEC 10646-1:2000" and "ISO/IEC 10646-2:2001" taken together.

Ideographs are characters that express meaning, such as Chinese characters. Work on the ideographs of the ISO 10646 International Coding Standard is carried out in three phases: Extension A, Extension B and Extension C. Extension A and Extension B have been published with "ISO/IEC 10646-1:2000" and "ISO/IEC 10646-2:2001" respectively. The work plan for Extension C will be decided by ISO at a later date.

ISO and the Unicode Consortium maintain the ISO 10646 International Coding Standard and Unicode in synchronized development. For information on the ISO 10646 International Coding Standard and Unicode, see the Unicode Consortium's web page at http://www.unicode.org/CHARTS/.

ISO 10646 Ideographs

Ideographs are characters whose form is associated with meaning. ISO 10646 is an international coding standard developed by the International Organization for Standardization (ISO). The standard refers to all Chinese characters, and the Chinese characters used in other languages (such as Japanese Kanji and Korean Hanja), as ideographs. The ideographs in the ISO 10646 International Coding Standard are divided into three main code blocks: CJK Unified Ideographs, CJK Unified Ideographs Extension A and CJK Unified Ideographs Extension B. The ideographs in the CJK Unified Ideographs block and Extension A were published with "ISO/IEC 10646-1:2000". The ideographs in Extension B were published in November 2001 with "ISO/IEC 10646-2:2001".
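The three code blocks can be sketched as a small lookup in Python. The code point ranges below are not given in the article. They are assumed to be the assigned ranges of each block as of "ISO/IEC 10646-1:2000" and "ISO/IEC 10646-2:2001", and the function and variable names are hypothetical.

```python
# Assumed assigned ranges (inclusive) of the three ideograph blocks
# as of ISO/IEC 10646-1:2000 and ISO/IEC 10646-2:2001.
IDEOGRAPH_BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FA5),
    "CJK Unified Ideographs Extension A": (0x3400, 0x4DB5),
    "CJK Unified Ideographs Extension B": (0x20000, 0x2A6D6),
}


def ideograph_block(ch):
    """Return the name of the ideograph block containing ch, or None."""
    cp = ord(ch)
    for name, (first, last) in IDEOGRAPH_BLOCKS.items():
        if first <= cp <= last:
            return name
    return None


# Number of code points in each assumed range, and their sum.
sizes = {name: last - first + 1
         for name, (first, last) in IDEOGRAPH_BLOCKS.items()}
total = sum(sizes.values())

print(ideograph_block("中"), sizes, total)
```

Under these assumed ranges the block sizes come to 20,902, 6,582 and 42,711, which agrees with the ideograph counts quoted in this article.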

Benefits of ISO 10646 Extension B

Like the CJK Unified Ideographs block and Extension A, Extension B takes in the Chinese characters in general use that were collected from the standards of each region, so these characters are now also included in ISO 10646.

With the addition of Extension B, the total number of ideographs in the ISO 10646 International Coding Standard has increased to over 70,000, covering all the Chinese characters in the "Kangxi Dictionary", the "Hanyu Da Zidian" and the "Hanyu Da Cidian". With ISO 10646 Extension B, the public can use a wider range of Chinese characters in daily electronic communication, more effectively and accurately.

Structure of ISO 10646 Extension B

Structurally, the ideographs in the CJK Unified Ideographs block and in CJK Unified Ideographs Extension A can be represented by a 16-bit code (for example, the hexadecimal value 4E00), but the ideographs of CJK Unified Ideographs Extension B in "ISO/IEC 10646-2:2001" must be represented by a 32-bit code (for example, the hexadecimal value 00020000, usually shortened to 20000).
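The difference in code width can be seen in a short Python sketch. The article speaks only of 16-bit and 32-bit codes. Using the UTF-16 and UTF-32 encoding forms here is an assumption made for illustration, as are the variable names.

```python
# One ideograph from the CJK Unified Ideographs block and one from
# Extension B, taken from the hexadecimal values quoted above.
bmp_char = "\u4e00"        # hexadecimal 4E00
ext_b_char = "\U00020000"  # hexadecimal 00020000

for ch in (bmp_char, ext_b_char):
    utf16 = ch.encode("utf-16-be")  # 16-bit code units
    utf32 = ch.encode("utf-32-be")  # one 32-bit code unit
    print(f"U+{ord(ch):04X}: UTF-16 {utf16.hex()} "
          f"({len(utf16)} bytes), UTF-32 {utf32.hex()}")
```

In UTF-16 the Extension B character no longer fits in one 16-bit unit and is stored as the pair D840 DC00, while its UTF-32 form is the 32-bit value 00020000 mentioned in the text.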

ISO 10646 Extension B web page

"ISO/IEC 10646-2:2001" contains 42,711 ideographs, of which 1,640 are new characters from the "Hong Kong Supplementary Character Set - 2001".

The ISO 10646 Extension B web page contains more detailed information, covering system requirements, reference fonts and input method software, and how to view the Extension B characters and the "Hong Kong Supplementary Character Set - 2001" characters of "ISO/IEC 10646-2:2001" or "ISO/IEC 10646:2003".

Ideographic Rapporteur Group

The Ideographic Rapporteur Group (IRG) is a working unit under ISO/IEC JTC1/SC2/WG2 (see below) within the International Organization for Standardization. It is responsible for work on the ideographs in the ISO 10646 International Coding Standard, that is, the ideographs used in China, Japan, Korea and elsewhere in Asia. The task of the IRG is to submit ideograph repertoires to ISO for inclusion in the ISO 10646 International Coding Standard. The IRG has developed the CJK Unified Ideographs block, the CJK Unified Ideographs Extension A block and the CJK Unified Ideographs Extension B block, and is developing Extension C so that the ISO 10646 International Coding Standard can accommodate as many of the world's ideographs as possible.

Members of the Ideographic Rapporteur Group

The members of the Ideographic Rapporteur Group come from China, Hong Kong, Macau, the Taipei Computer Association, Singapore, Japan, South Korea, North Korea, Vietnam and the United States. In addition, the Unicode Consortium sends a representative to the meetings to coordinate the synchronized development of the ISO 10646 International Coding Standard and Unicode.

Documents of the Ideographic Rapporteur Group

Various documents, including meeting agendas, meeting reports, resolutions reached by ISO and resolutions reached at the meetings, can be found on the following website: http://www.cse.cuhk.edu.hk/~irg/.

ISO/IEC JTC1/SC2/WG2

The ISO 10646 International Coding Standard is developed by a working group of the International Organization for Standardization named ISO/IEC JTC1/SC2/WG2. The Joint Technical Committee on Information Technology (ISO/IEC JTC1) was set up by agreement between the International Electrotechnical Commission (IEC) and ISO, and its scope covers the work of both bodies on information technology. ISO/IEC JTC1 has a subcommittee named ISO/IEC JTC1/SC2, which is responsible for standardizing the coding of character sets for various languages. This subcommittee has a working group named ISO/IEC JTC1/SC2/WG2, which is responsible for developing the ISO 10646 International Coding Standard.

Unicode

Whenever unified coding standards are mentioned, many people will have heard the name Unicode, and most will want to know whether Unicode and the ISO 10646 International Coding Standard are compatible with each other.

Unicode is a character encoding system developed by an organization named the Unicode Consortium to support the interchange, processing and display of written text in the world's major languages. Most members of the Unicode Consortium are suppliers of computer hardware and software.

In 1991, the International Organization for Standardization and the Unicode Consortium decided to jointly develop a universal coding standard for multilingual text. Since then, the two organizations have worked closely together to develop the ISO 10646 International Coding Standard and Unicode. ISO provides the characters and coding information in the ISO 10646 International Coding Standard, and the Unicode Consortium provides the implementation methods and semantic data for those characters and codes. The codes used by the ISO 10646 International Coding Standard and Unicode are the same. Unicode can be seen as a practical implementation of the ISO 10646 International Coding Standard, so products that support Unicode also support the ISO 10646 International Coding Standard.
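As a small illustration of the semantic data that accompanies each code, the sketch below queries Python's standard `unicodedata` module, which ships a copy of the Unicode character database. Choosing this module, and the two sample code points, is an assumption made for illustration. The article itself does not mention any particular software interface.

```python
import unicodedata

# One ideograph from the CJK Unified Ideographs block and one from
# Extension B; both carry a name and a general category.
for ch in ("\u4e00", "\U00020000"):
    name = unicodedata.name(ch)          # character name
    category = unicodedata.category(ch)  # general category, e.g. "Lo"
    print(f"U+{ord(ch):04X}", name, category)
```

The code point printed on each line is the shared code that the paragraph above refers to, and the name and category are examples of the additional data attached to it.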

Unicode 3.0, developed by the Unicode Consortium, was officially launched in February 2000. This version encodes 49,194 characters from different languages around the world, including 27,484 East Asian ideographs (Chinese characters). Unicode 3.0 is the version corresponding to "ISO/IEC 10646-1:2000".

Unicode 3.1 was released in March 2001. The main feature of this version is the addition of 44,946 new characters, of which 42,711 are ideographs. Together with the characters already in Unicode 3.0, Unicode 3.1 contains a total of 94,140 characters, including more than 70,000 ideographs.

Unicode 3.2 was launched in March 2002. Although this version adds 1,016 new characters, it contains the same ideographs as Unicode 3.1.

The latest version of Unicode is version 4.0, released in April 2003. Although this version adds 1,226 new characters, it contains the same ideographs as Unicode 3.1. Unicode 4.0 is the version corresponding to "ISO/IEC 10646:2003".

For information on the ISO 10646 International Coding Standard and Unicode, see the Unicode Consortium's web page at http://www.unicode.org/CHARTS/.
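The character counts quoted for each version can be cross-checked with a few lines of Python. This is a minimal sketch that assumes each "new characters" figure is a net addition to the previous version. The variable names are illustrative.

```python
# (version, characters added) as quoted in the text above.
# The entry for 3.0 is that version's full character count.
releases = [("3.0", 49_194), ("3.1", 44_946), ("3.2", 1_016), ("4.0", 1_226)]

totals = {}
running = 0
for version, added in releases:
    running += added
    totals[version] = running  # cumulative characters up to this version

# Ideographs in Unicode 3.0 plus those added by Extension B in 3.1.
ideographs = 27_484 + 42_711

print(totals, ideographs)
```

Under that assumption the running total for Unicode 3.1 is 94,140 characters and the ideograph count is 70,195, consistent with the figures quoted above.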

Comparison of coding standards

Big5

Big5 is a coding standard established more than ten years ago by Taiwan's major Chinese software developers, containing approximately 13,000 Traditional Chinese characters. Big5 is also the Chinese coding standard commonly used in Hong Kong.

National Standard Code (GB)

The national standard code is the national coding standard of the Chinese government. Its latest version is GB 18030-2000, published in 2000, which contains approximately 27,000 Chinese characters.

ISO 10646 International Coding Standard

The ISO 10646 International Coding Standard is a coding standard developed by the International Organization for Standardization that contains the characters of the world's major languages. For Chinese characters, it unifies the Chinese character coding standards of China, Taiwan, Japan and South Korea into a single collection of about 70,000 Chinese characters. The ISO 10646 International Coding Standard can be regarded as equivalent to Unicode.

| Coding standard | ISO 10646 International Coding Standard | National Standard Code GB 18030-2000 | Big5 |
| --- | --- | --- | --- |
| Characteristics | Unifies the Chinese character standards of China, Taiwan, Japan and South Korea, and includes all the characters of Big5 and the national standard code | Internal code arrangement differs from that of the ISO 10646 International Coding Standard | Traditional Chinese characters only |
| Character support | Traditional and Simplified Chinese characters can be displayed in the same interface | Includes the characters of "ISO/IEC 10646-1:2000" | Only Traditional Chinese characters can be displayed in one interface |
| Region of use | Worldwide | China | Hong Kong, Taiwan |
| Developed by | International Organization for Standardization | Chinese government | Taiwan's Chinese software developers |
| Number of Chinese characters | About 70,000 | About 27,000 | About 13,000 |
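The difference in character coverage can be probed with Python's built-in codecs. This is a sketch under stated assumptions: the `big5`, `gb18030` and `utf-8` codecs stand in for Big5, the national standard code and ISO 10646/Unicode respectively, and the helper `can_encode` and the two sample characters are hypothetical choices, not taken from the article.

```python
def can_encode(text, codec):
    """Return True if every character in text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# The same word written in Traditional and in Simplified Chinese.
samples = {"traditional": "漢", "simplified": "汉"}
codecs_to_try = ["big5", "gb18030", "utf-8"]

coverage = {
    (label, codec): can_encode(ch, codec)
    for label, ch in samples.items()
    for codec in codecs_to_try
}

for (label, codec), ok in coverage.items():
    print(f"{label:12} {codec:8} {'yes' if ok else 'no'}")
```

Only the Big5 codec rejects the Simplified character, which matches the "Traditional Chinese characters only" entry for Big5 in the comparison.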

System Requirements

To display the characters of "ISO/IEC 10646:2003" correctly, make sure your computer runs a platform that supports "ISO/IEC 10646:2003" (for example, Chinese or English Windows XP) and application software that supports it (for example, Mozilla 1.5, Internet Explorer 6.0 and Microsoft Office XP).

The system requirements and installation guidelines can be downloaded from http://www.info.gov.hk/digital21/chi/hkscs/terms/terms35.html.

The reference fonts and input method software can be downloaded from http://www.info.gov.hk/digital21/chi/hkscs/terms/terms36.html.

When reposting, please credit the original address: https://www.9cbs.com/read-10949.html