The mixed multi-language display problem

zhaozj, 2021-02-17

Java programs can run into a bug where text in different languages cannot be mixed, so multi-language content fails to display. This weekend I studied multi-language display in servlets and JSP, which is really the multi-character-set problem in servlets. My grasp of character sets is not very firm, so what I write here is not necessarily accurate. This is how I understand character sets in Java: at run time, every String object stores its text in Unicode internal code. (I think every language has a corresponding internal code, because strings inside the computer are always represented by an internal code; the difference is that in most languages the internal code of a string is platform-dependent, while Java uses platform-independent Unicode.)

When Java reads a string from a byte stream, it converts the platform-dependent bytes into a platform-independent Unicode string. On output, Java converts the Unicode string back into a platform-dependent byte stream; if a Unicode character does not exist on the platform, a '?' is output instead. For example: on Chinese Windows, when Java reads a "GB2312" encoded file (it can be any stream) and constructs String objects in memory, it converts the GB2312-encoded text into a Unicode string. If this string is then output, the Unicode string is converted back into a GB2312 byte stream: "中文测试" -----> "\u4E2D\u6587\u6D4B\u8BD5" -----> "中文测试".

Consider the following routine:

    // GBK-encoded "中文测试" ("Chinese test")
    byte[] bytes = new byte[] {
        (byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4,
        (byte) 0xB2, (byte) 0xE2, (byte) 0xCA, (byte) 0xD4
    };
    java.io.ByteArrayInputStream bin = new java.io.ByteArrayInputStream(bytes);
    // Decode the bytes as GBK while reading
    java.io.BufferedReader reader = new java.io.BufferedReader(
        new java.io.InputStreamReader(bin, "GBK"));
    String msg = reader.readLine();
    System.out.println(msg);

If this program runs on a system whose character set contains the four characters "中文测试" (a Chinese system, for example), they are printed correctly. The msg string holds the correct Unicode encoding of "中文测试": "\u4E2D\u6587\u6D4B\u8BD5". When it is printed, it is converted to the operating system's default character set, so whether it displays correctly depends on the character set of the operating system. Only on a system that supports the corresponding character set is our information output correctly; otherwise the output is garbage.
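The '?' substitution on output is easy to reproduce. The following is a minimal sketch and not part of the original program; the class name CharsetRoundTrip and its two helper methods are invented for illustration. It encodes the same four characters to bytes and decodes them again, once with UTF-8, which can represent them, and once with ISO-8859-1, which cannot.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens when a target charset
// cannot represent the characters held in a Java (Unicode) string.
final class CharsetRoundTrip {

    // Encode to UTF-8 bytes and decode again; no information is lost.
    static String toUtf8AndBack(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode to ISO-8859-1 bytes and decode again; characters outside
    // Latin-1 are replaced by the charset's replacement byte, '?'.
    static String toLatin1AndBack(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its own encoding.
        String text = "\u4E2D\u6587\u6D4B\u8BD5"; // 中文测试
        System.out.println(toUtf8AndBack(text).equals(text)); // true
        System.out.println(toLatin1AndBack(text));            // ????
    }
}
```

The lossy case is what a user sees when the platform character set does not cover the text: the string in memory is still correct, only the conversion to bytes discards it.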

With the above understood, let's look at the multi-language problem in servlets and JSP. Our goal is that a client in any country can send information to the server through a form, the server stores it in the database, and the client still sees the correct information it sent when it retrieves it later. In practice, we must ensure three things. The SQL statement the server finally issues must contain the correct Unicode encoding of what the client sent. The encoding JDBC uses to communicate with the database must be able to represent the text the client sent; in fact, it is best to let JDBC talk to the database in Unicode/UTF-8, which ensures no information is lost. And what the server sends back to the client must also use an encoding that loses no information, which can again be Unicode/UTF-8.

If the form's ENCTYPE attribute is not specified, the form URL-encodes the input according to the character set of the current page, and the server receives a URL-encoded string. The resulting string depends on the page's encoding. For example, "中文测试" on a GB2312-encoded page yields "%D6%D0%CE%C4%B2%E2%CA%D4", where each "%" is followed by a two-digit hexadecimal string, while on a UTF-8 page it yields "%E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95", because a Chinese character takes 16 bits in GB2312 but 24 bits in UTF-8.

The Chinese, Japanese and Korean versions of IE 4 and later all support UTF-8, and UTF-8 certainly covers these three languages, so if our HTML pages use UTF-8 we support at least these three languages.
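The dependence of the URL-encoded form data on the page's character set can be checked with the standard java.net.URLEncoder class. This is only a sketch: the class name FormEncodingDemo and the helper formEncode are made up here, and the GB2312 line is guarded because that charset may be missing from a reduced Java runtime.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical demo class: URL-encodes the same text under two charsets,
// mimicking what a browser does for pages in different encodings.
final class FormEncodingDemo {

    // Wraps URLEncoder.encode so callers need not handle the checked exception.
    static String formEncode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4E2D\u6587\u6D4B\u8BD5"; // 中文测试

        // Three bytes per character in UTF-8.
        System.out.println(formEncode(text, "UTF-8"));
        // prints %E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95

        // Two bytes per character in GB2312, if the runtime provides it.
        if (Charset.isSupported("GB2312")) {
            System.out.println(formEncode(text, "GB2312"));
            // prints %D6%D0%CE%C4%B2%E2%CA%D4
        }
    }
}
```

The two outputs describe the same four characters, which is why the server must know which charset the page used before it can decode the request.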

However, even if our HTML/JSP pages use UTF-8, the application server may not know it. If the information sent by the browser does not include charset information, the most the server learns is the Accept-Language request header, and we know that this header alone does not reveal which encoding the browser used. So the application server cannot correctly parse the submitted content. Why? Because all strings in Java are 16-bit Unicode, the job of HttpServletRequest.getParameter(String) is to convert the URL-encoded information from the client into a Unicode string. Some servers can only assume that the client's encoding is the same as the server platform's and simply call URLDecoder.decode(String) to decode directly. If the client's encoding happens to match the server's, the result is the correct string; otherwise, if the submitted string contains local characters, the result is garbage.

Since the solution I propose fixes the encoding as UTF-8, we can avoid this problem by writing our own decode method:

    public static String decode(String s, String encoding) throws Exception {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i
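The listing above breaks off in the middle of its loop, so the author's exact body is not available. As a rough illustration of the idea only, here is a self-contained sketch of a decoder that takes the charset as a parameter. The class name PercentDecoder is invented, and the approach is a substitute: it collects the raw bytes in a buffer and decodes them once at the end, whereas the surviving tail of the author's code converts through the "8859_1" charset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical sketch of a charset-aware decoder for
// application/x-www-form-urlencoded text. Not the author's code.
final class PercentDecoder {

    static String decode(String s, String encoding) {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '+') {
                raw.write(' ');                       // '+' stands for a space
            } else if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad hex digits at index " + i);
                }
                raw.write((hi << 4) | lo);            // one raw byte per %XX
                i += 2;
            } else {
                // Assumes unescaped characters are plain ASCII, as they are
                // in well-formed form data; only the low 8 bits are kept.
                raw.write(c);
            }
        }
        // Interpret the collected bytes in the charset the page was written in.
        return new String(raw.toByteArray(), Charset.forName(encoding));
    }

    public static void main(String[] args) {
        String decoded = decode("%E4%B8%AD%E6%96%87+test", "UTF-8");
        System.out.println(decoded.equals("\u4E2D\u6587 test")); // true
    }
}
```

Decoding the whole byte sequence at once matters for multi-byte charsets: a single character such as U+4E2D arrives as three separate %XX escapes in UTF-8, and they only make sense together.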

Here is what I do: I create a servlet base class that overrides the service method and, before calling the parent class's service method, reads and parses the content submitted by the form. See the source code below:

    package com.hto.servlet;

    import javax.servlet.http.HttpServletRequest;
    import java.util.*;

    /**
     * Reads and decodes the parameters submitted with a request.
     * Creation date: (2001-2-4 15:43:46)
     * @author: Qian Wei Chun
     */
    public class UTF8ParameterReader {
        // Maps each parameter name to an ArrayList of its values
        Hashtable pairs = new Hashtable();

        /** Parses the query string and the request body with the default encoding. */
        public UTF8ParameterReader(HttpServletRequest request) throws java.io.IOException {
            super();
            parse(request.getQueryString());
            parse(request.getReader().readLine());
        }

        /** Parses the query string and the request body with the given encoding. */
        public UTF8ParameterReader(HttpServletRequest request, String encoding) throws java.io.IOException {
            super();
            parse(request.getQueryString(), encoding);
            parse(request.getReader().readLine(), encoding);
        }

        public static String decode(String s) throws Exception {
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i

                        break;
                    default:
                        sb.append(c);
                        break;
                }
            }
            // undo conversion
            String result = sb.toString();
            byte[] inputBytes = result.getBytes("8859_1");
            return new String(inputBytes, encoding);
        }

        /**
         * Returns the first value of the named parameter, or null if it is absent.
         * Creation date: (2001-2-4 17:30:59)
         */
        public String getParameter(String name) {
            if (pairs == null || !pairs.containsKey(name))
                return null;
            return (String) (((ArrayList) pairs.get(name)).get(0));
        }

        /**
         * Returns the names of all parameters.
         * Creation date: (2001-2-4 17:28:17)
         */
        public Enumeration getParameterNames() {
            if (pairs == null)
                return null;
            return pairs.keys();
        }

        /**
         * Returns all values of the named parameter, or null if it is absent.
         * Creation date: (2001-2-4 17:33:40)
         */
        public String[] getParameterValues(String name) {
            if (pairs == null || !pairs.containsKey(name))
                return null;
            ArrayList al = (ArrayList) pairs.get(name);
            String[] values = new String[al.size()];
            for (int i = 0; i
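The constructors of this class call a parse method whose body does not appear in the listing. Purely as an illustration of what such a method might do, the sketch below splits a query string into name/value pairs and decodes each part with an explicit charset. Everything here is an assumption: the class name QueryParser, the use of java.net.URLDecoder instead of the author's own decode, and the Map of lists in place of the Hashtable of ArrayList objects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the missing parse step: turns
// "a=1&b=%E4%B8%AD&a=2" into {a=[1, 2], b=[中]}.
final class QueryParser {

    static Map<String, List<String>> parse(String query, String encoding) {
        Map<String, List<String>> pairs = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return pairs;                              // nothing submitted
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;                              // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // A name may occur several times, so each name maps to a list.
            pairs.computeIfAbsent(decode(name, encoding), k -> new ArrayList<>())
                 .add(decode(value, encoding));
        }
        return pairs;
    }

    // Decodes one URL-encoded token using the given charset name.
    private static String decode(String token, String encoding) {
        try {
            return URLDecoder.decode(token, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> p = parse("a=1&b=%E4%B8%AD&a=2", "UTF-8");
        System.out.println(p.get("a"));                           // [1, 2]
        System.out.println(p.get("b").get(0).equals("\u4E2D"));   // true
    }
}
```

Keeping a list per name is what makes both getParameter (first value) and getParameterValues (all values) cheap to implement on top of the parsed result.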

    package com.hto.servlet;

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    /**
     * Servlet base class that parses submitted form data before dispatching.
     * Creation date: (2001-2-5 8:28:20)
     * @author: Qian Wei Chun
     */
    public class UtfBaseServlet extends HttpServlet {
        public static final String PARAMS_ATTR_NAME = "PARAMS_ATTR_NAME";

        /** Process incoming HTTP GET requests. */
        public void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            performTask(request, response);
        }

        /** Process incoming HTTP POST requests. */
        public void doPost(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            performTask(request, response);
        }

        /** Returns the named parameter parsed as a java.sql.Date. */
        public static java.sql.Date getDateParameter(HttpServletRequest request, String name,
                boolean required, java.sql.Date defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return java.sql.Date.valueOf(value);
        }

        /** Returns the named parameter parsed as a double. */
        public static double getDoubleParameter(HttpServletRequest request, String name,
                boolean required, double defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return Double.parseDouble(value);
        }

        /** Returns the named parameter parsed as a float. */
        public static float getFloatParameter(HttpServletRequest request, String name,
                boolean required, float defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return Float.parseFloat(value);
        }

        /** Returns the named parameter parsed as an int. */
        public static int getIntParameter(HttpServletRequest request, String name,
                boolean required, int defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return Integer.parseInt(value);
        }

        /**
         * Returns the named parameter, preferring the UTF8ParameterReader stored
         * on the request and falling back to the standard getParameter.
         */
        public static String getParameter(HttpServletRequest request, String name,
                boolean required, String defValue) throws ServletException {
            if (request.getAttribute(UtfBaseServlet.PARAMS_ATTR_NAME) != null) {
                UTF8ParameterReader params =
                    (UTF8ParameterReader) request.getAttribute(UtfBaseServlet.PARAMS_ATTR_NAME);
                if (params.getParameter(name) != null)
                    return params.getParameter(name);
                if (required)
                    throw new ServletException("The Parameter " + name + " required but not provided!");
                else
                    return defValue;
            } else {
                if (request.getParameter(name) != null)
                    return request.getParameter(name);
                if (required)
                    throw new ServletException("The Parameter " + name + " required but not provided!");
                else
                    return defValue;
            }
        }

        /** Returns the servlet info string. */
        public String getServletInfo() {
            return super.getServletInfo();
        }

        /** Returns the named parameter parsed as a java.sql.Timestamp. */
        public static java.sql.Timestamp getTimestampParameter(HttpServletRequest request, String name,
                boolean required, java.sql.Timestamp defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return java.sql.Timestamp.valueOf(value);
        }

        /** Initializes the servlet. */
        public void init() {
            // insert code to initialize the servlet here
        }

        /** Handles the request; both doGet and doPost are routed here. */
        public void performTask(HttpServletRequest request, HttpServletResponse response) {
            try {
                // Insert user code from here.
            } catch (Throwable theException) {
                // uncomment the following line when unexpected exceptions
                // are occurring to aid in debugging the problem.
                // theException.printStackTrace();
            }
        }

        /**
         * Parses URL-encoded form data before delegating to the parent class.
         * Creation date: (2001-2-5 8:31:54)
         */
        public void service(ServletRequest request, ServletResponse response)
                throws javax.servlet.ServletException, java.io.IOException {
            String content = request.getContentType();
            if (content == null
                    || content.toLowerCase().startsWith("application/x-www-form-urlencoded"))
                request.setAttribute(PARAMS_ATTR_NAME,
                    new UTF8ParameterReader((HttpServletRequest) request));
            super.service(request, response);
        }
    }

This is the servlet base class. It overrides the service method of the parent class: before calling the parent's service, it creates a UTF8ParameterReader object, which holds the information submitted in the form, and saves that object as an attribute on the request object. It then calls the parent class's service method as usual.

For servlets that inherit from this class, note that the "standard" getParameter can no longer read the POST data, because this class has already read the data from the ServletInputStream. So you should use the getParameter method provided by this class.

What remains is the output problem: we need to convert the output into a UTF-8 byte stream. As long as we set Content-Type with the charset specified as UTF-8 and then write through the PrintWriter, the conversion is performed automatically. In a servlet, set it like this: response.setContentType("text/html; charset=UTF-8");

In a JSP, set it like this:

<%@ page contentType="text/html; charset=UTF-8" %>

This ensures that the output is a UTF-8 stream; whether the client can display it then depends on the client.
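What the automatic conversion amounts to can be imitated outside a servlet container. The sketch below is an assumption about the general mechanism, not a description of any particular server: a PrintWriter layered over an OutputStreamWriter with an explicit UTF-8 charset, writing into an in-memory buffer that stands in for the response body. The class name Utf8Output and the method render are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: characters written to the PrintWriter come out
// of the underlying stream as UTF-8 bytes.
final class Utf8Output {

    static byte[] render(String text) {
        ByteArrayOutputStream body = new ByteArrayOutputStream(); // stands in for the response body
        PrintWriter writer = new PrintWriter(
            new OutputStreamWriter(body, StandardCharsets.UTF_8));
        writer.print(text);
        writer.flush();                // push buffered characters through the encoder
        return body.toByteArray();
    }

    public static void main(String[] args) {
        // U+4E2D is encoded as the three bytes E4 B8 AD.
        for (byte b : render("\u4E2D")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();          // prints: E4 B8 AD
    }
}
```

The bytes match the "%E4%B8%AD" escapes a UTF-8 page submits for the same character, so input and output stay in one consistent encoding.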

For multipart/form-data content I also provide a class to handle it. Its constructor lets you specify the charset used by the page; the default is again UTF-8. Space does not permit posting the source here; if you are interested, you can mail vividq@china.com to discuss it with me.

When reposting, please cite the original address: https://www.9cbs.com/read-30191.html
