Analysis of Java multilingual encoding problems (1)

zhaozj 2021-02-16

1. The Java compiler converts a source file to Unicode before compiling it. For this reason, we must correctly "tell" the compiler which encoding the source file uses when compiling.

For example, suppose the source file is saved as UTF-8 but the compiler is told (or assumes) that it is GBK. The compiler will then convert the file using the GBK -> Unicode conversion and compile the result, which will of course be wrong. The compiler should instead convert the source file using the UTF-8 -> Unicode conversion.
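The effect of that mismatch can be reproduced without the compiler. The following is only a minimal sketch: the class name SourceDecodingDemo and the helper decode are made up for illustration, and it assumes the JDK in use ships the GBK charset. It decodes the same UTF-8 bytes once with the right charset and once with the wrong one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceDecodingDemo {
    // Decode raw file bytes using the charset the reader *assumes* the file uses.
    static String decode(byte[] fileBytes, Charset assumed) {
        return new String(fileBytes, assumed);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this demo independent of its own source encoding.
        String original = "\u4e2d\u6587";
        // Pretend this is the content of a source file saved as UTF-8.
        byte[] savedAsUtf8 = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(savedAsUtf8, StandardCharsets.UTF_8);
        // Assumption: the GBK charset is available in this JDK.
        String wrong = decode(savedAsUtf8, Charset.forName("GBK"));

        System.out.println(right.equals(original)); // true
        System.out.println(wrong.equals(original)); // false: garbled text
    }
}
```

The bytes never change; only the charset used to interpret them does, which is the same thing that happens when the compiler is given the wrong source encoding.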

a. For console programs, the compiler treats the source file as being in the system default encoding (which depends on the regional settings in the Control Panel; on Chinese Win2k it is usually GBK), unless the encoding is set with the -encoding parameter, as in javac -encoding UTF-8, in which case the compiler treats the source file as UTF-8. (This only tells the compiler the encoding of the source file; it does not transcode the source file.) On a platform in any language, as long as the encoding specified at compile time matches the encoding the source file was actually saved in, there is no internationalization problem.

b. For JSP, the compiler determines which encoding to use from the character set declared in the page and then converts the page to Unicode; if the JSP does not specify one, the compiler treats the JSP file as saved in the system default encoding. JSP 2.0 added the <%@ page pageEncoding="" %> directive to tell the compiler which encoding the source file uses.

2. When processing input and output, make sure the encoding declared for the input stream and the output stream matches the encoding actually used when the user types input and when the output is displayed.

The JRE converts content whenever it handles input or output. Input is converted to Unicode before being handed to the program, so the encoding of the actual input must be known and the JRE must be told that encoding. Output is converted from Unicode to another encoding before being sent, so the chosen encoding must match what the output device expects, and the JRE must be told that encoding.

For example, the program sets up its input stream as new InputStreamReader(System.in, "GB2312"). If, once the program is running, the user types input that is actually encoded in Big5, the JRE will run the Big5 content through the GB2312 -> Unicode conversion, and the result is obviously not what the user meant to enter.
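The scenario above can be simulated without a real console. This is a sketch under assumptions: a ByteArrayInputStream stands in for System.in, the class name InputDecodingDemo and the helper readLine are hypothetical, and the Big5 and GB2312 charsets are assumed to be available in the JDK.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class InputDecodingDemo {
    // Read one line from the stream, decoding bytes with the declared charset.
    static String readLine(InputStream in, Charset declared) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, declared))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: both charsets are available in this JDK.
        Charset big5 = Charset.forName("Big5");
        Charset gb2312 = Charset.forName("GB2312");

        String typed = "\u4e2d\u6587";
        // Bytes as they would arrive from a terminal that uses Big5.
        byte[] keyboardBytes = typed.getBytes(big5);

        String wrong = readLine(new ByteArrayInputStream(keyboardBytes), gb2312);
        String right = readLine(new ByteArrayInputStream(keyboardBytes), big5);

        System.out.println(wrong.equals(typed)); // false: Big5 bytes read as GB2312
        System.out.println(right.equals(typed)); // true
    }
}
```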

By default, the JRE treats input and output as being in the system default encoding.
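To see what that default is on a given machine, and how the output side depends on the encoding chosen, a small sketch along the following lines may help. The class name OutputEncodingDemo and the helper encode are illustrative only, and GBK is assumed to be available in the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class OutputEncodingDemo {
    // Write Unicode text through a writer that encodes it with the target charset.
    static byte[] encode(String text, Charset target) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, target)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // The charset the JRE falls back to when none is given explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());

        String text = "\u4e2d\u6587";
        // The same two characters produce different bytes per target encoding.
        System.out.println(encode(text, StandardCharsets.UTF_8).length);  // 6
        System.out.println(encode(text, Charset.forName("GBK")).length);  // 4
    }
}
```

Passing the charset explicitly, rather than relying on the default, keeps the program's behavior the same across machines with different regional settings.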

3. In a servlet, in addition to correctly "telling" the compiler which encoding the source file uses, pay attention to whether the actual encoding of the URL data and form data matches the encoding declared in the request.

When the client browser submits data through a form or a URL, the container and the JVM treat the data in the request as encoded with the encoding declared in the request, convert it to Unicode, and then pass it to the servlet. (In fact, the container first turns the data in the request into an intermediate encoding, which depends on the container's configuration, and the JVM then converts it to Unicode; this intermediate encoding is usually ISO-8859-1.) The Unicode data output by the servlet is converted by the container according to the encoding declared in the response and then sent to the client browser.
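The intermediate step can be illustrated in plain Java. The sketch below assumes the intermediate encoding is ISO-8859-1 and that the browser actually sent UTF-8; the class name IntermediateEncodingDemo and the helper redecode are hypothetical. Re-decoding the string this way is a commonly seen workaround, not something the container does for you.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class IntermediateEncodingDemo {
    // Recover text that was decoded as ISO-8859-1 although the bytes were
    // really in another charset. ISO-8859-1 maps every byte to exactly one
    // char, so getBytes(ISO_8859_1) returns the original bytes unchanged.
    static String redecode(String misdecoded, Charset actual) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    public static void main(String[] args) {
        String sent = "\u4e2d\u6587";
        // Bytes as submitted by a browser that encodes the form as UTF-8.
        byte[] wireBytes = sent.getBytes(StandardCharsets.UTF_8);

        // What the servlet sees if the bytes were decoded as ISO-8859-1.
        String seenByServlet = new String(wireBytes, StandardCharsets.ISO_8859_1);
        System.out.println(seenByServlet.equals(sent)); // false

        // Undo the wrong decoding and apply the right one.
        String recovered = redecode(seenByServlet, StandardCharsets.UTF_8);
        System.out.println(recovered.equals(sent)); // true
    }
}
```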

When receiving client input, declare the encoding of the data in the request with request.setCharacterEncoding().
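What declaring the request encoding changes can be approximated outside a container with java.net.URLEncoder and java.net.URLDecoder. This is a rough model only, not the container's actual code path: the class name RequestDecodingDemo and its helpers are made up for illustration, and GBK is assumed to be available in the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class RequestDecodingDemo {
    // Percent-encode a form value the way a browser using this charset would.
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decode a percent-encoded value using the charset declared for the request.
    static String decode(String encoded, String declaredCharset) {
        try {
            return URLDecoder.decode(encoded, declaredCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String value = "\u4e2d\u6587";
        String onTheWire = encode(value, "UTF-8");
        System.out.println(onTheWire); // %E4%B8%AD%E6%96%87

        // Declared encoding matches the actual one: the value survives.
        System.out.println(decode(onTheWire, "UTF-8").equals(value)); // true
        // Declared encoding differs from the actual one: garbled text.
        System.out.println(decode(onTheWire, "GBK").equals(value));   // false
    }
}
```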

When outputting to the client, use response.setContentType("text/html; charset=") to declare the encoding of the data in the response and to tell the browser which encoding to use to display it.

4. In JSP, since a JSP is compiled into a servlet by the JSP compiler, the situation is the same as for a servlet.

<%@ page contentType="text/html; charset=" %> and <% request.setCharacterEncoding(""); %> These two JSP statements declare the encoding for the response and the request, respectively.

Just make sure that the encoding of the data in URL parameters or forms matches the declared encoding, and tell the JSP compiler which character set this JSP file and the files it includes use. That solves the character encoding problems of JSP.

When reposting, please cite the original address: https://www.9cbs.com/read-21353.html
