Chinese character codes: representation and display

xiaoxiao2021-03-06  42

First, Chinese character codes

In May 1981, China's national standards administration issued the "Chinese Character Coded Character Set for Information Interchange: Basic Set" (GB2312-80), known as the national standard Chinese character set; its code is also called the national code. The national code contains 7,445 standard characters and symbols. Among them are 3,755 level-1 Chinese characters and 3,008 level-2 Chinese characters, 6,763 Chinese characters in total; the rest are non-Chinese graphic symbols. Because there are so many Chinese characters, one byte is not enough to represent all the common ones. In addition, so that they are not confused with Western ASCII, each Chinese character or symbol in the national code is encoded with 2 bytes (16 bits). Western characters are represented by one byte, the ASCII code, which normally uses only seven bits for 128 characters; the highest bit is either used for parity or left unused.

About the national code: The GB2312-80 code table has 94 rows vertically and 94 columns horizontally. The row and the column are each represented by a seven-bit binary code b7b6b5b4b3b2b1; the first byte gives the row and the second byte gives the column. Each takes a value from 0100001 to 1111110 (hexadecimal 21 to 7E), which is the code range of the printable ASCII characters (excluding the space). A national code is written as the first byte followed by the second byte. Since binary is too long, it is generally written in hexadecimal.

About the zone code (location code): Besides the double seven-bit binary form, a GB2312-80 code can also be written as a zone code. In the code table, the row number is called the zone number and the column number is called the position number; there are 94 zones, each with 94 positions. The zone number and the position number are each written as two decimal digits, padded with a leading 0 when needed, so every Chinese character or symbol is represented by 4 decimal digits. The zone code can therefore be used as an input code. It is one of the basic coding methods for Chinese characters.

About the internal code: When double-byte Chinese characters are used and processed inside the computer, the bytes of a Chinese character code must not be confused with single-byte ASCII codes. Therefore the highest bit of each of the two bytes of a Chinese character code is set to 1. This double-byte code with both high bits set to 1 is the Chinese internal code commonly used in China, called the internal code (or machine code) for short. It is the code used to store and process Chinese characters inside the computer.
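The high-bit rule is what lets a program walk a mixed byte stream and tell ASCII from Chinese characters. The following is a minimal sketch of that idea, not code from any particular system; the function name `split_mixed` is made up for illustration, and the input is assumed to contain only ASCII and two-byte internal codes.

```python
def split_mixed(data: bytes) -> list[bytes]:
    """Split a byte string into 1-byte ASCII units and 2-byte Chinese units."""
    units = []
    i = 0
    while i < len(data):
        if data[i] & 0x80:
            # Highest bit is 1: this is the first byte of a two-byte internal code.
            if i + 1 >= len(data):
                raise ValueError("truncated two-byte character at end of input")
            units.append(data[i:i + 2])
            i += 2
        else:
            # Highest bit is 0: a single-byte ASCII character.
            units.append(data[i:i + 1])
            i += 1
    return units


# 'A', then the internal code B0A1, then 'B'
units = split_mixed(b"A\xb0\xa1B")
print(units)  # [b'A', b'\xb0\xa1', b'B']
```

Because every byte of a Chinese character has its high bit set, a scanner like this never mistakes half of a Chinese character for an ASCII character.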

The relationship between the internal code, the national code, and the zone code is:

High byte of internal code = high byte of national code + 80H = zone number + 20H + 80H = zone number + 0A0H = zone number + 160

Low byte of internal code = low byte of national code + 80H = position number + 20H + 80H = position number + 0A0H = position number + 160
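These relationships can be written as three small conversion functions. This is only an illustrative sketch; the function names are hypothetical, and the final line relies on Python's built-in "gb2312" codec to show which character the internal code stands for.

```python
def zone_to_national(zone: int, pos: int) -> tuple[int, int]:
    """Zone code -> national code: add 20H to each byte."""
    return zone + 0x20, pos + 0x20


def national_to_internal(hi: int, lo: int) -> tuple[int, int]:
    """National code -> internal code: add 80H (set the highest bit) on each byte."""
    return hi + 0x80, lo + 0x80


def internal_to_zone(hi: int, lo: int) -> tuple[int, int]:
    """Internal code -> zone code: subtract A0H (160) from each byte."""
    return hi - 0xA0, lo - 0xA0


zone, pos = 16, 1                                   # zone code 1601
national = zone_to_national(zone, pos)              # (0x30, 0x21)
internal = national_to_internal(*national)          # (0xB0, 0xA1)

print(f"{zone:02d}{pos:02d}")                       # 1601
print(f"{national[0]:02X}{national[1]:02X}")        # 3021
print(f"{internal[0]:02X}{internal[1]:02X}")        # B0A1
print(bytes(internal).decode("gb2312"))             # 啊
```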

Traditional Chinese characters are still in use in some regions and fields. The country has defined a corresponding traditional Chinese character set; its national standard is GB12345-90, "Chinese Character Coded Character Set for Information Interchange: Auxiliary Set", which contains 717 graphic symbols and 6,866 traditional Chinese characters. BIG5 is the Chinese character coded character set used in computer systems in Taiwan; it contains 420 graphic symbols and 13,070 traditional Chinese characters (no simplified characters).

Second, the font library (glyph bitmaps) of Chinese characters

Outputting Chinese characters mainly means outputting their shapes, chiefly by displaying and printing them. When a Chinese character is output, it is represented by a dot matrix. Each point of the matrix has only two states: a dot is present or it is absent. In binary, a value of 1 means a dot is present and a value of 0 means it is absent. The output principle for Chinese characters is the same as for Western characters. The difference is that Chinese characters have many strokes, so representing one legibly requires at least a 16 × 16 dot matrix. For a more faithful shape, the number of points must be increased, for example to 24 × 24, 32 × 32, or 48 × 48. The glyph data of Chinese characters is therefore much larger than that of Western characters and needs a large amount of storage.
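The storage cost of the larger matrices can be checked with a little arithmetic. The sketch below assumes that each row of dots is packed into whole bytes and that the library stores one bitmap for every cell of the 94 × 94 code table; the function name `glyph_bytes` is made up for illustration.

```python
def glyph_bytes(size: int) -> int:
    """Bytes needed for one size x size bitmap, with each row padded to whole bytes."""
    bytes_per_row = (size + 7) // 8
    return bytes_per_row * size


CELLS = 94 * 94  # every cell of the code table, used or not

for size in (16, 24, 32, 48):
    per_glyph = glyph_bytes(size)
    print(f"{size}x{size}: {per_glyph} bytes per character, "
          f"{per_glyph * CELLS} bytes for the whole table")
# The first line printed is:
# 16x16: 32 bytes per character, 282752 bytes for the whole table
```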

The binary code string that describes the dot-matrix information of one Chinese character is called that character's glyph bitmap (the "word model"). The dot-matrix information of all Chinese characters and symbols together makes up the Chinese font library. The glyph data is ordered from left to right first, then from top to bottom: the left 8 points of the first row, then the right 8 points of the first row, then the left 8 points of the second row, then its right 8 points, and so on.

Third, how Chinese characters are displayed

1. A Chinese character entered from the keyboard passes through the keyboard management module, which converts it to the internal code.

2. The glyph lookup program then finds, in the font library, the address of the dot-matrix information that corresponds to that internal code.

3. The character's dot-matrix information is read from the font library.

4. The display driver sends this information to the display buffer of the display card.

5. The display controller reads the dot-matrix information in sequence and maps each binary bit to one point on the screen, so that the Chinese character appears on the screen.
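The last step, mapping bits to points, can be imitated in text. The sketch below takes the 32 bytes of a 16 × 16 bitmap in the order described earlier (left 8 points, then right 8 points, row by row) and prints `#` for a 1 bit and `.` for a 0 bit. Two things are assumptions rather than statements from this article: that the most significant bit of each byte is the leftmost point, and the hand-made box-shaped bitmap used as test data. The name `render_glyph` is hypothetical.

```python
def render_glyph(glyph: bytes, size: int = 16) -> list[str]:
    """Turn a packed dot-matrix bitmap into rows of '#' (dot) and '.' (no dot)."""
    row_bytes = size // 8
    if len(glyph) != row_bytes * size:
        raise ValueError(f"expected {row_bytes * size} bytes, got {len(glyph)}")
    rows = []
    for r in range(size):
        row = glyph[r * row_bytes:(r + 1) * row_bytes]
        # Assumption: bit 7 of each byte is the leftmost of its 8 points.
        bits = "".join(f"{b:08b}" for b in row)
        rows.append(bits.replace("1", "#").replace("0", "."))
    return rows


# Made-up test bitmap: a hollow 16 x 16 box (not a real character).
box = bytes([0xFF, 0xFF]) + bytes([0x80, 0x01]) * 14 + bytes([0xFF, 0xFF])

for line in render_glyph(box):
    print(line)
```

A real display does the same thing in hardware: each bit in the display buffer decides whether one screen point is lit.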

Fourth, using the zone code to get the dot-matrix information of a Chinese character

Take a 16 × 16 dot-matrix font library as an example. One Chinese character has 256 points in total and therefore occupies 32 bytes. Chinese characters are divided into 94 zones, with 94 characters per zone. The internal code consists of two bytes. The first byte holds the zone number (QH); to distinguish it from ASCII (values below 80H are ASCII characters), it starts from hexadecimal A1H, which corresponds to zone 1. The second byte holds the position number (WH); it also starts from A1H, which corresponds to position 1 within a zone. Subtracting A0A0H from a character's internal code therefore gives its zone code.

From this, the character's location in the font library can be calculated:

Location = (94 * (QH-1) + (WH-1)) * bytes per character

Here QH and WH are the zone number and the position number. For a 16 × 16 font library the formula is (94 * (QH-1) + (WH-1)) * 32. For example, the internal code of "房" ("room") is hexadecimal B7BF. Its zone code is B7BFH - A0A0H = 171FH; converting each byte to decimal gives 2331, that is, zone 23, position 31. Its location in the font library is 32 * [94 * (23-1) + (31-1)] = 67136, so the 32 bytes starting at byte 67136 are the display dot matrix of "房".
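The offset formula translates directly into code. The following sketch computes the offset from a two-byte internal code and reads 32 bytes at that position. The function names are hypothetical, and because no font file is supplied with this article, the example uses an in-memory stand-in filled with made-up data; a real library file would be opened in binary mode instead.

```python
import io

GLYPH_BYTES = 32  # one 16 x 16 character


def glyph_offset(internal: bytes) -> int:
    """Byte offset of a character's bitmap, given its two-byte internal code."""
    qh = internal[0] - 0xA0  # zone number, 1..94
    wh = internal[1] - 0xA0  # position number, 1..94
    if not (1 <= qh <= 94 and 1 <= wh <= 94):
        raise ValueError("not a valid two-byte internal code")
    return (94 * (qh - 1) + (wh - 1)) * GLYPH_BYTES


def read_glyph(font, internal: bytes) -> bytes:
    """Read one bitmap from a seekable binary file-like object."""
    font.seek(glyph_offset(internal))
    data = font.read(GLYPH_BYTES)
    if len(data) != GLYPH_BYTES:
        raise ValueError("font data ends before this character")
    return data


print(glyph_offset(b"\xb7\xbf"))  # 67136

# Stand-in font (made-up data): the bitmap at index k is 32 copies of k % 256.
fake_font = io.BytesIO(
    b"".join(bytes([k % 256]) * GLYPH_BYTES for k in range(94 * 94))
)
glyph = read_glyph(fake_font, b"\xb7\xbf")
print(len(glyph), glyph[0])  # 32 50
```

The character B7BF sits at index 94 * 22 + 30 = 2098 in the table, which is why the stand-in data returns the fill value 2098 % 256 = 50.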

