C++: BIG5 to GB

zhaozj 2021-02-08

C: BIG5 to GB (11/03/1999)

Because there are so many Chinese characters, they cannot be encoded the way English text is in single-byte ASCII; each Chinese character is represented by two bytes. From these two bytes we can calculate the position of the character in the Chinese font library, and reading several bytes at that position gives the dot matrix that describes the character. With this information the character can be displayed under DOS or Windows.

In fact, what a text file stores is the two-byte code of each Chinese character; displaying it is handled automatically by the Chinese operating system.

Chinese character encoding is not uniform: we use GB code, while Taiwan uses BIG5 code. A BIG5 file stores the BIG5 code of each character and a GB file stores the GB code of each character (this is the origin of the "garbled text" phenomenon). The key to conversion is therefore a code table file that records the GB code corresponding to each BIG5 code.

First step: making the code table file

The BIG5 encoding rule is as follows. Each Chinese character consists of two bytes. The first byte ranges over 0x81-0xFE, 126 values in total. The second byte ranges over 0x40-0x7E and 0xA1-0xFE, 157 values in total. These two bytes can therefore define 126 * 157 = 19782 characters.

Some of these characters are in frequent use, such as "one" and "Dan"; these are called common characters. Their BIG5 codes run from 0xA440 to 0xC67E, 5401 characters in total. Less frequently used characters, such as "abuse" and "adjust", are called less-common characters; their codes run from 0xC940 to 0xF9D5, 7652 characters in total. The rest are special characters.

The principle of making the code table file is this: first write all BIG5 codes into a file, then use software that can convert BIG5 code to GB code, such as Earth Village, Orient Express, or RichWin, to convert that file into a GB code file. The result is the code table file.
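The byte ranges and the 126 * 157 count given above can be checked mechanically. The following is a minimal sketch and not part of the original programs; the helper names is_big5_lead, is_big5_trail and count_big5_codes are made up for illustration, and the ranges are simply the ones stated in the text.

```c
/* Hypothetical helper: lead byte range stated in the text (0x81-0xFE). */
static int is_big5_lead(int b)
{
    return b >= 0x81 && b <= 0xFE;
}

/* Hypothetical helper: trail byte ranges stated in the text
   (0x40-0x7E and 0xA1-0xFE). */
static int is_big5_trail(int b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0xA1 && b <= 0xFE);
}

/* Count every (lead, trail) byte pair that passes both checks. */
static long count_big5_codes(void)
{
    long n = 0;
    int lead, trail;

    for (lead = 0; lead <= 0xFF; lead++)
        for (trail = 0; trail <= 0xFF; trail++)
            if (is_big5_lead(lead) && is_big5_trail(trail))
                n++;
    return n;
}
```

Under these assumptions, count_big5_codes() should agree with the 19782 figure above, and a byte such as 0x7F is rejected as a trail byte because it falls in the gap between the two ranges.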
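The following source program writes every BIG5 code with a first byte of 0xA1-0xFE and a valid second byte (0x40-0x7E or 0xA1-0xFE) into the file "Table.txt". Each first byte contributes 157 codes, which is the layout the conversion program in the second step assumes when it seeks into the table.

    // Turbo C 3.0
    #include <stdio.h>

    int main(void) {
        FILE *codefile;
        int i, j;

        codefile = fopen("Table.txt", "wb");
        if (codefile == NULL)
            return 1;

        for (i = 0xA1; i <= 0xFE; i++) {        // first byte
            for (j = 0x40; j <= 0xFE; j++) {    // second byte
                if (j > 0x7E && j < 0xA1)
                    continue;                   // skip the invalid gap 0x7F-0xA0
                fputc(i, codefile);
                fputc(j, codefile);
            }
        }

        fclose(codefile);
        return 0;
    }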

Run Earth Village, Orient Express, or RichWin and convert "Table.txt" from BIG5 code to GB code; the result is the code table file.

Second step: conversion

The following source program converts the BIG5 code file to the GB code file.

    // Turbo C 3.0
    #include <stdio.h>

    int main(void) {
        int que, wei;       // first byte and second byte of a character
        FILE *sourcefile;
        FILE *tabfile;
        FILE *destfile;

        sourcefile = fopen("BIG.TXT", "rb");    // BIG5 code file
        tabfile = fopen("Table.txt", "rb");     // code table file
        destfile = fopen("gb.txt", "wb");       // generated GB code file
        if (sourcefile == NULL || tabfile == NULL || destfile == NULL)
            return 1;

        while ((que = fgetc(sourcefile)) != EOF) {
            if (que >= 0xA1 && que <= 0xFE) {
                // the byte just read starts a Chinese character (BIG5 code)
                wei = fgetc(sourcefile);
                if (wei == EOF)
                    break;
                // map the second byte to a column 0..156 within its row
                if (wei < 0xA1)
                    wei = wei - 0x40;
                else
                    wei = wei - 0xA1 + 0x7E - 0x40 + 1;
                // each table entry is 2 bytes; each row holds 157 entries
                fseek(tabfile, 2L * ((que - 0xA1) * (0xFE - 0xA1 + 1 + 0x7E - 0x40 + 1) + wei), SEEK_SET);
                que = fgetc(tabfile);
                wei = fgetc(tabfile);
                fputc(que, destfile);
                fputc(wei, destfile);
            } else {
                fputc(que, destfile);   // single-byte character: copy unchanged
            }
        }

        fclose(sourcefile);
        fclose(tabfile);
        fclose(destfile);
        return 0;
    }
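The seek arithmetic in the conversion program can be isolated into a small function, which makes it easier to check by hand. This is only a sketch: the name big5_table_offset and the -1 return value for out-of-range bytes are assumptions added here, and it assumes the code table holds 157 two-byte entries per first byte, starting at first byte 0xA1.

```c
/* Valid second bytes per first byte: 63 (0x40-0x7E) + 94 (0xA1-0xFE). */
#define BIG5_TRAILS_PER_ROW 157

/* Hypothetical helper: byte offset of a BIG5 code inside the code table,
   or -1 if the byte pair lies outside the ranges the table covers. */
static long big5_table_offset(int lead, int trail)
{
    int col;

    if (lead < 0xA1 || lead > 0xFE)
        return -1L;
    if (trail >= 0x40 && trail <= 0x7E)
        col = trail - 0x40;             /* columns 0..62 */
    else if (trail >= 0xA1 && trail <= 0xFE)
        col = trail - 0xA1 + 63;        /* columns 63..156 */
    else
        return -1L;
    return 2L * ((long)(lead - 0xA1) * BIG5_TRAILS_PER_ROW + col);
}
```

For example, the code 0xA440 lies three rows below 0xA1 in column 0, so its entry starts at byte 2 * (3 * 157) = 942 of the table. Returning -1 for an invalid second byte is a design choice of this sketch; the program above performs no such check.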

The programs above were tested under Win95/97 with TC 3.0. With slight modifications they can also be used in VC or VB programs. The same method can be used to convert GB code to BIG5 code.
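One possible way to approach the reverse direction is to invert the existing code table in memory instead of generating a second table file. The sketch below is an illustration under several assumptions rather than the article's method: the function name build_gb_to_big5 is made up, the forward table is assumed to hold 157 entries per first byte starting at 0xA1, and GB codes are assumed to use first bytes 0xA1-0xF7 and second bytes 0xA1-0xFE (the GB2312 layout). When several BIG5 codes map to the same GB code, the first one encountered is kept.

```c
#include <string.h>

#define BIG5_COLS 157   /* BIG5 second bytes per first byte */
#define GB_COLS   94    /* assumed GB second bytes: 0xA1-0xFE */
#define GB_ROWS   87    /* assumed GB first bytes: 0xA1-0xF7 */

/* Hypothetical helper: fwd holds `entries` two-byte GB codes in BIG5 order.
   rev must have room for GB_ROWS * GB_COLS * 2 bytes and receives the
   BIG5 code for each GB code (00 00 where nothing maps). */
static void build_gb_to_big5(const unsigned char *fwd, long entries,
                             unsigned char *rev)
{
    long i, gi;
    int gb_lead, gb_trail, col;

    memset(rev, 0, (size_t)GB_ROWS * GB_COLS * 2);
    for (i = 0; i < entries; i++) {
        gb_lead = fwd[2 * i];
        gb_trail = fwd[2 * i + 1];
        if (gb_lead < 0xA1 || gb_lead > 0xF7 ||
            gb_trail < 0xA1 || gb_trail > 0xFE)
            continue;                   /* entry is not a GB character */
        gi = (long)(gb_lead - 0xA1) * GB_COLS + (gb_trail - 0xA1);
        if (rev[2 * gi] != 0)
            continue;                   /* keep the first BIG5 code seen */
        /* rebuild the BIG5 code from its position in the forward table */
        col = (int)(i % BIG5_COLS);
        rev[2 * gi] = (unsigned char)(0xA1 + i / BIG5_COLS);
        rev[2 * gi + 1] = (unsigned char)(col < 63 ? 0x40 + col
                                                   : 0xA1 + (col - 63));
    }
}
```

A GB-to-BIG5 converter could then look up each GB character at index (first byte - 0xA1) * 94 + (second byte - 0xA1) in rev, in the same way the program above looks up BIG5 characters in the table file.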

This article comes from the Chinese programmer website.