BIG5 to GB conversion technology

zhaozj 2021-02-17

Because there are so many Chinese characters, they cannot be represented with a single byte the way English is with ASCII; each character is represented by two bytes. From these two bytes we can calculate the position of the character in the Chinese font library, and reading a number of bytes at that position yields the dot-matrix information for the character. With this information the character can be displayed in DOS or in Windows. In fact, a text file stores only the two-byte code of each Chinese character; the display problem is handled automatically by the Chinese operating system.
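To make "calculating the position from the two bytes" concrete, here is a minimal sketch. It assumes a 16x16 dot-matrix font file laid out like the common HZK16 file: glyphs in GB row/cell order, 94 cells per row, 32 bytes per glyph. That layout and the helper names `hzk16_offset` and `hzk16_read_glyph` are assumptions for illustration; the text above does not name a font file.

```c
#include <stdio.h>

/* Assumed layout (HZK16-style, not taken from the article):
   glyphs stored in GB row/cell order, 94 cells per row,
   16x16 pixels = 32 bytes per glyph. */
#define CELLS_PER_ROW   94
#define BYTES_PER_GLYPH 32

/* Returns the byte offset of the glyph for a two-byte GB code,
   or -1 if either byte is outside 0xA1-0xFE. */
long hzk16_offset(unsigned char b1, unsigned char b2)
{
    if (b1 < 0xA1 || b1 > 0xFE || b2 < 0xA1 || b2 > 0xFE)
        return -1L;
    return ((long)(b1 - 0xA1) * CELLS_PER_ROW + (b2 - 0xA1)) * BYTES_PER_GLYPH;
}

/* Reads the 32-byte bitmap into glyph[]; returns 1 on success, 0 on failure. */
int hzk16_read_glyph(FILE *font, unsigned char b1, unsigned char b2,
                     unsigned char glyph[BYTES_PER_GLYPH])
{
    long off = hzk16_offset(b1, b2);

    if (off < 0 || fseek(font, off, SEEK_SET) != 0)
        return 0;
    return fread(glyph, 1, BYTES_PER_GLYPH, font) == BYTES_PER_GLYPH;
}
```

Each of the 16 rows of the bitmap occupies two bytes, so a display routine would walk the 32 bytes two at a time and plot one pixel per set bit.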

Chinese character encoding is not uniform: we use the GB code, while Taiwan uses the BIG5 code. A BIG5 file stores the BIG5 code of each character, and a GB file stores the corresponding GB code, so a file read under the wrong encoding shows up as garbage (this is the origin of the "garbled text" phenomenon). The key to the conversion work is therefore a code table file that records the GB code corresponding to each BIG5 code.

Step 1: Make the code table file

The BIG5 coding rule is as follows. Each Chinese character consists of two bytes. The first byte ranges over 0x81-0xFE, 126 values in total. The second byte ranges over 0x40-0x7E and 0xA1-0xFE, 157 values in total. These two bytes can therefore define 126 * 157 = 19782 characters. Some of these characters are in everyday use, such as 一 (one) and 丁 (ding); we call them commonly used characters. Their BIG5 codes run from 0xA440 to 0xC67E, 5401 in total. Characters used less often we call less commonly used characters; their range is 0xC940-0xF9D5, 7652 in total. The rest are special characters.
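The byte ranges above can be checked with a few lines of C. This is a small sketch; the function names are made up for the example and are not part of the original programs.

```c
/* Byte ranges as stated in the text: first byte 0x81-0xFE,
   second byte 0x40-0x7E or 0xA1-0xFE. */
int big5_is_lead(int b)
{
    return b >= 0x81 && b <= 0xFE;
}

int big5_is_trail(int b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0xA1 && b <= 0xFE);
}

/* Position of a second byte among the 157 valid values (0..156),
   or -1 if the byte is not a valid second byte. */
int big5_trail_index(int b)
{
    if (b >= 0x40 && b <= 0x7E)
        return b - 0x40;                        /* 0..62 */
    if (b >= 0xA1 && b <= 0xFE)
        return b - 0xA1 + (0x7E - 0x40 + 1);    /* 63..156 */
    return -1;
}

/* Counts every valid (first byte, second byte) pair by brute force. */
long big5_count_all(void)
{
    long count = 0;
    int lead, trail;

    for (lead = 0x00; lead <= 0xFF; lead++)
        for (trail = 0x00; trail <= 0xFF; trail++)
            if (big5_is_lead(lead) && big5_is_trail(trail))
                count++;
    return count;
}
```

`big5_count_all()` arrives at the same 126 * 157 = 19782 total by enumeration, and `big5_trail_index` shows how the two second-byte ranges join into one run of 157 positions.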

The principle of making the code table file is this: first write all the BIG5 codes into a file, then use software that can convert BIG5 code to GB code, such as Earth Village, Orient Express, or RichWin, to convert that file into a GB code file. The result is the code table file.

The following source program writes every possible BIG5 code (0xA100-0xFEFF) to the file "table.txt".

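// Turbo C 3.0
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *codefile;
    int i, j;

    codefile = fopen("table.txt", "wb");
    if (codefile == NULL) {
        fprintf(stderr, "cannot create table.txt\n");
        return EXIT_FAILURE;
    }
    // first byte 0xA1-0xFE, second byte 0x00-0xFF: 94 * 256 two-byte entries
    for (i = 0xa1; i <= 0xfe; i++) {
        for (j = 0x00; j <= 0xff; j++) {
            fputc(i, codefile);
            fputc(j, codefile);
        }
    }
    fclose(codefile);
    return 0;
}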

Run Earth Village, Orient Express, or RichWin to convert "table.txt" from BIG5 code to GB code; the result is the code table file.
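A table lookup by position only works if the conversion software leaves every entry where it was, that is, if "table.txt" still holds exactly two bytes per entry after conversion. A minimal check is sketched below. It assumes the layout written by the generator program above (94 first bytes by 256 second bytes); `file_size` and `table_size_ok` are hypothetical helper names.

```c
#include <stdio.h>

/* Layout assumed from the generator above: first bytes 0xA1-0xFE (94 values),
   second bytes 0x00-0xFF (256 values), two bytes per entry. */
#define TABLE_ENTRIES (94L * 256L)
#define TABLE_BYTES   (TABLE_ENTRIES * 2L)

/* Returns the size of the file in bytes, or -1 if it cannot be read. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;

    if (f == NULL)
        return -1L;
    if (fseek(f, 0L, SEEK_END) != 0) {
        fclose(f);
        return -1L;
    }
    size = ftell(f);
    fclose(f);
    return size;
}

/* Returns 1 if the table still has exactly one 2-byte entry per code. */
int table_size_ok(const char *path)
{
    return file_size(path) == TABLE_BYTES;
}
```

If `table_size_ok("table.txt")` returns 0 after conversion, the software has added or dropped bytes and the entries no longer line up with their BIG5 codes.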

Step 2: Conversion

The following source program converts the BIG5 code file to the GB code file.

// Turbo C 3.0
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int que, wei;   // first and second byte of the current character
    FILE *sourcefile;
    FILE *tabfile;
    FILE *destfile;

    sourcefile = fopen("big.txt", "rb");   // BIG5 code file
    tabfile = fopen("table.txt", "rb");    // code table file
    destfile = fopen("gb.txt", "wb");      // GB code file produced by the conversion
    if (sourcefile == NULL || tabfile == NULL || destfile == NULL) {
        fprintf(stderr, "cannot open big.txt, table.txt or gb.txt\n");
        return EXIT_FAILURE;
    }

    while ((que = fgetc(sourcefile)) != EOF) {
        // Is this byte the first byte of a Chinese character (BIG5 code)?
        if (que >= 0xa1 && que <= 0xfe) {
            wei = fgetc(sourcefile);
            if (wei == EOF)
                break;   // truncated character at end of file
            // table.txt holds one 2-byte entry for every pair
            // (first byte 0xA1-0xFE, second byte 0x00-0xFF), in that order,
            // so the entry index is (first byte - 0xA1) * 256 + second byte.
            fseek(tabfile, 2L * ((long)(que - 0xa1) * 256 + wei), SEEK_SET);
            que = fgetc(tabfile);
            wei = fgetc(tabfile);
            fputc(que, destfile);
            fputc(wei, destfile);
        } else {
            fputc(que, destfile);   // single-byte character: copy unchanged
        }
    }

    fclose(sourcefile);
    fclose(tabfile);
    fclose(destfile);
    return 0;
}
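Seeking in the table file for every character is simple but slow on large inputs. As an alternative sketch, the same lookup can run against a table already loaded into memory. This assumes the table layout written in step 1 (256 two-byte entries for each first byte 0xA1-0xFE); the function name `big5_buf_to_gb` is hypothetical and not part of the original programs.

```c
#include <stddef.h>

/* Size of a table laid out as in step 1: 94 first bytes x 256 second bytes,
   two bytes per entry (an assumption stated in the text above). */
#define TABLE_BYTES (94L * 256L * 2L)

/* Converts n bytes of BIG5 text in `in` to GB text in `out`, using a code
   table of TABLE_BYTES bytes held in memory. `out` must have room for at
   least n bytes. Returns the number of bytes written. */
size_t big5_buf_to_gb(const unsigned char *table,
                      const unsigned char *in, size_t n,
                      unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        unsigned char que = in[i++];

        if (que >= 0xA1 && que <= 0xFE && i < n) {
            unsigned char wei = in[i++];
            size_t entry = 2 * ((size_t)(que - 0xA1) * 256 + wei);

            out[o++] = table[entry];
            out[o++] = table[entry + 1];
        } else {
            /* single-byte character, or a dangling first byte at the end */
            out[o++] = que;
        }
    }
    return o;
}
```

The caller would read "table.txt" into a buffer of `TABLE_BYTES` bytes once and then pass each block of input through `big5_buf_to_gb`. A block boundary may split a two-byte character, so a real program would need to carry a trailing first byte over to the next block.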

When reposting, please cite the original address: https://www.9cbs.com/read-29171.html
