Notes on Chinese Encoding Conversion

xiaoxiao, 2021-03-06

Principle

Environment: pages: JSP, HTML; server: Tomcat; database: Informix

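In a JSP page, add <%@ page contentType="text/html; charset=GB2312" %> so that GBK-encoded Chinese text displays correctly. In an HTML file, add the equivalent charset declaration, such as <meta http-equiv="Content-Type" content="text/html; charset=GB2312">, for the same purpose.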

General encoding process:

1. A string (Chinese or otherwise) is entered in the JSP page and submitted.

2. The string is converted into a Unicode string in the ISO character set. Every character in Java is 16 bits, that is, 2 bytes. An English character has the form 0x00**, with a high byte of 00000000; a Chinese character has a nonzero high byte (given here as 0x01**, high byte 00000001). With i as an index into the string, this yields a simple test: String.charAt(i) > 511 for each character of a Chinese string, and String.charAt(i) < 512 for each character of an English string.

3. The string is stored in the database. If the string contains Chinese characters in ISO encoding, JDBC first converts them from ISO to GBK, so they appear correctly in the database. If the Chinese characters are already in GBK encoding (that is, they have already been converted), JDBC still treats them as ISO-encoded and converts them to GBK again, so the value stored in the database is garbled. English characters are ISO-encoded to begin with and are unaffected.

4. The string is read back from the database. If it contains Chinese characters stored as GBK, JDBC first converts them from GBK encoding to ISO encoding. Unless they are then converted back to GBK encoding, the JSP page will not display them correctly. If the data was already garbled in the database, it cannot be recovered.
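The two conversions that steps 3 and 4 rely on can be reproduced by hand with String.getBytes(). The following is only a minimal sketch and does not model what any particular JDBC driver does internally. The class and method names (EncodingRoundTrip, gbkToIso, isoToGbk) are hypothetical, and the code assumes the running JDK provides the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingRoundTrip {
    // Assumption: the JDK in use ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // GBK form -> ISO form: encode as GBK bytes, then read each byte as one ISO-8859-1 char.
    static String gbkToIso(String s) {
        return new String(s.getBytes(GBK), StandardCharsets.ISO_8859_1);
    }

    // ISO form -> GBK form: recover the raw bytes, then decode them as GBK.
    static String isoToGbk(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), GBK);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587 abc";      // two Chinese characters, a space, "abc"
        String isoForm = gbkToIso(original);       // one char per GBK byte
        String restored = isoToGbk(isoForm);

        System.out.println(isoForm.length());          // 8 (2 + 2 + 1 + 3 bytes)
        System.out.println(restored.equals(original)); // true
    }
}
```

In the ISO form each Chinese character occupies two chars, one per GBK byte, which is why the intermediate string is longer than the original.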

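So, when data is read from the database, first check whether the string is still a Unicode string in the ISO character set by applying the String.charAt(i) > 511 test to each character, and convert it to GBK encoding only if it has not been converted already. For the conversion itself, see the getBytes() method of the String object.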
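A guarded conversion built on the charAt() test might look like the sketch below. It rests on one stated assumption: a string still in ISO form contains only chars below 256 (one per byte), so any char above 511 is taken to mean the string is already in decoded GBK form and must be left alone. GuardedConversion, hasCharAbove511, and toGbkOnce are hypothetical names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GuardedConversion {
    // Assumption: the JDK in use ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // True if any char exceeds 511, the threshold used in the text.
    static boolean hasCharAbove511(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 511) {
                return true;
            }
        }
        return false;
    }

    // Convert ISO form to GBK form only when no char is above 511.
    // Assumption: a char above 511 means the string was already decoded.
    static String toGbkOnce(String s) {
        if (s == null || hasCharAbove511(s)) {
            return s;
        }
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), GBK);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        String isoForm = new String(chinese.getBytes(GBK), StandardCharsets.ISO_8859_1);

        System.out.println(toGbkOnce(isoForm).equals(chinese));            // true
        System.out.println(toGbkOnce(toGbkOnce(isoForm)).equals(chinese)); // true: second call is a no-op
    }
}
```

Because the check runs before every conversion, calling toGbkOnce repeatedly is harmless for this input. The threshold is a heuristic: a few non-ideographic characters that GBK can encode have code points below 512 and would not be detected by it.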

Note two points:

1. English characters can be converted any number of times without being affected.

2. Chinese characters can be converted from GBK encoding to ISO encoding and back again, but converting them twice or more in the same direction corrupts them: they become garbled and cannot be restored. For example, a string converted to GBK encoding twice in a row displays as garbage, and the original text cannot be recovered. Therefore, always check whether a string has already been converted before converting it.
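Both points can be observed directly. The sketch below (DoubleConversionDemo and isoToGbk are hypothetical names, and the GBK charset is assumed to be available in the JDK) applies the ISO-to-GBK conversion twice. The second pass fails because String.getBytes() replaces characters that ISO-8859-1 cannot represent with '?', so the original bytes are lost.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DoubleConversionDemo {
    // Assumption: the JDK in use ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // ISO form -> GBK form, with no check for whether it was already applied.
    static String isoToGbk(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), GBK);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        String isoForm = new String(chinese.getBytes(GBK), StandardCharsets.ISO_8859_1);

        String once = isoToGbk(isoForm);  // correct Chinese text
        String twice = isoToGbk(once);    // each Chinese char was replaced by '?'

        System.out.println(once.equals(chinese));  // true
        System.out.println(twice);                 // ??
        System.out.println(isoToGbk(isoToGbk("abc"))); // abc: English is unaffected
    }
}
```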

Method

In our programming environment, Chinese displays correctly as long as one principle is followed: when storing data, use the GBKToUnicode method to convert from GBK encoding to ISO encoding; when retrieving data, use the unicodeToGBK method to convert from ISO encoding to GBK encoding. The following files were adjusted accordingly:

1. *Schema.java files
   a. In the get<FieldName>() methods, add the StrTool.unicodeToGBK() conversion. Purpose: convert the encoding of the field value being returned.
   b. In the encode() method, add the StrTool.unicodeToGBK() conversion. Purpose: convert the encoding of the string.
   c. In the decode() method, add the StrTool.GBKToUnicode() conversion. Purpose: convert the encoding of the incoming string.
   d. In the getV() method, add the StrTool.GBKToUnicode() conversion. Purpose: convert the encoding of the incoming string.
   e. Functions that operate on the database directly need no encoding conversion.

2. *DB.java files
   a. In the set<FieldName>() methods, add the StrTool.GBKToUnicode() conversion. Purpose: convert the encoding of the incoming query string.

3. ExeSQL.java file
   a. In the getOneValue() method, add the StrTool.GBKToUnicode() conversion. Purpose: convert the encoding of the incoming query string.
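To illustrate where these conversions sit, here is a rough sketch of a setter on the DB side and a getter on the Schema side. The real StrTool, Schema, and DB classes are not shown in this article, so everything below is an assumed shape: StrToolSketch, PersonDBSketch, PersonSchemaSketch, and their members are hypothetical, and the two conversion bodies are one plausible way to implement them with getBytes(), assuming the JDK provides the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for StrTool; the real class is not shown in this article.
class StrToolSketch {
    static final Charset GBK = Charset.forName("GBK");

    // ISO form -> GBK form (applied when data is retrieved).
    static String unicodeToGBK(String s) {
        return s == null ? null : new String(s.getBytes(StandardCharsets.ISO_8859_1), GBK);
    }

    // GBK form -> ISO form (applied when data is stored or used in a query).
    static String GBKToUnicode(String s) {
        return s == null ? null : new String(s.getBytes(GBK), StandardCharsets.ISO_8859_1);
    }
}

// Hypothetical DB-side class: the setter converts the incoming value to ISO form.
class PersonDBSketch {
    private String name;

    void setName(String value) {
        name = StrToolSketch.GBKToUnicode(value);
    }

    String storedName() {
        return name;
    }
}

// Hypothetical Schema class: the getter converts the stored value back to GBK form.
class PersonSchemaSketch {
    private final String name; // held in ISO form, as read from the database

    PersonSchemaSketch(String isoName) {
        name = isoName;
    }

    String getName() {
        return StrToolSketch.unicodeToGBK(name);
    }
}

class SchemaDemo {
    public static void main(String[] args) {
        PersonDBSketch db = new PersonDBSketch();
        db.setName("\u5f20\u4e09"); // a two-character Chinese name

        PersonSchemaSketch schema = new PersonSchemaSketch(db.storedName());
        System.out.println(schema.getName().equals("\u5f20\u4e09")); // true
    }
}
```

Each value crosses the boundary exactly once in each direction, which is the property the principle above depends on.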

When reprinting, please cite the original address: https://www.9cbs.com/read-60990.html
