A multi-language solution for Java Servlet / JSP (vividq)
Because I never really understood character handling in Java, I kept running into bugs where several
languages could not be mixed. This weekend I studied the multi-language display problem in Servlet / JSP,
that is, the servlet multi-character-set problem. My own grasp of character sets is not very firm, so
what I write here is not necessarily accurate. This is how I understand character sets in Java: at run
time, every String object stores its text as Unicode internal code (I think every programming language
has some such internal code, because strings inside a computer are always represented by an internal
code; it is just that in ordinary languages the string encoding is platform-dependent, while Java uses
platform-independent Unicode).
When Java reads a string from a byte stream, it converts the platform-dependent bytes into a
platform-independent Unicode string. When Java writes a string out, it converts the Unicode string into a
platform-dependent byte stream. If a Unicode character does not exist on the platform, a '?' is output
in its place. For example, on Chinese Windows, when Java reads a GB2312-encoded file (it can be any
stream) and constructs String objects in memory, it converts the GB2312-encoded text into Unicode
strings; when a string is output, the Unicode string is converted back into a GB2312 byte stream or byte
array: "中文测试" -----> "\u4E2D\u6587\u6D4B\u8BD5" -----> "中文测试".
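As a small sketch of the round trip just described (it is not part of the original routine), the program below encodes the four characters to GBK bytes, decodes them back, and then encodes them to ISO-8859-1, which contains no Chinese characters, to show the '?' substitution. The class name CharsetRoundTrip and the helpers hex and lossy are made up for illustration, and the sketch assumes that the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Render bytes as upper-case hex so the encoded form can be compared by eye.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode to ISO-8859-1 and decode again; unmappable characters are replaced on the way out.
    static String lossy(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u4E2D\u6587\u6D4B\u8BD5"; // the four characters of the running example
        Charset gbk = Charset.forName("GBK");      // assumption: GBK is available in this JDK

        byte[] encoded = text.getBytes(gbk);
        System.out.println(hex(encoded));                          // D6D0CEC4B2E2CAD4
        System.out.println(new String(encoded, gbk).equals(text)); // true

        System.out.println(lossy(text));                           // ????
    }
}
```

The Unicode string survives the GBK round trip unchanged, while the ISO-8859-1 output has lost every character.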
Consider the following routine:
    // GBK code for "中文测试"
    byte[] bytes = new byte[] {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4,
            (byte) 0xB2, (byte) 0xE2, (byte) 0xCA, (byte) 0xD4};
    java.io.ByteArrayInputStream bin = new java.io.ByteArrayInputStream(bytes);
    java.io.BufferedReader reader = new java.io.BufferedReader(
            new java.io.InputStreamReader(bin, "GBK"));
    String msg = reader.readLine();
    System.out.println(msg);
Run on a system whose character set contains the four characters "中文测试" (a Chinese system, for
example), this program prints them correctly. The msg string holds the correct Unicode encoding of
"中文测试", namely "\u4E2D\u6587\u6D4B\u8BD5". When it is printed, it is converted to the default
character set of the operating system, so whether it is displayed correctly depends on that character
set. Only on a system that supports the corresponding character set is our text output correctly;
otherwise the output is garbage.
With that background, let us look at the multi-language problem in Servlet / JSP. Our goal is that a
client in any country can send information to the server through a form, the server stores it in the
database, and the client can later retrieve it and still see exactly what it sent. In practice this
means three things. The SQL statement that the server finally executes must contain the correct Unicode
encoding of the text the client sent. The encoding used between JDBC and the database must be able to
represent the text sent by the client; it is best to let JDBC talk to the database in Unicode / UTF-8,
which ensures that no information is lost. The information the server sends back to the client must
also use an encoding that loses nothing, and this too can be Unicode / UTF-8.
If you do not specify the enctype attribute of the form, the form URL-encodes the input according to the
character set of the current page and then submits it, so what the server receives is a URL-encoded
string. The URL-encoded string depends on the page's encoding. On a GB2312 page, "中文测试" is submitted
as "%D6%D0%CE%C4%B2%E2%CA%D4", where each "%" is followed by two hexadecimal digits; on a UTF-8 page it
is "%E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95", because a Chinese character takes 16 bits in GB2312 and 24
bits in UTF-8. IE 4 and later browsers in China, Japan and Korea all support UTF-8, and UTF-8 certainly
covers the languages of these three countries, so if we let the HTML pages use UTF-8 we support at
least these three languages.
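To see how the page encoding changes what is submitted, here is a minimal sketch that uses the standard java.net.URLEncoder to mimic what the browser does for the two page encodings mentioned above. The class name FormEncodingDemo and the helper encode are hypothetical, and the sketch assumes the JDK ships the GBK charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FormEncodingDemo {
    // URL-encode text as a browser would for a page in the given character set.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            // Assumption of this sketch: the charset name is one the JDK knows.
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4E2D\u6587\u6D4B\u8BD5"; // the four characters of the running example

        System.out.println(encode(text, "GBK"));   // %D6%D0%CE%C4%B2%E2%CA%D4
        System.out.println(encode(text, "UTF-8")); // %E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95
    }
}
```

The same four characters give 8 escaped bytes for GBK and 12 for UTF-8, so the server cannot decode the submission unless it knows which encoding the page used.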
However, even if our HTML / JSP pages use UTF-8, the application server may not know it. If the request
sent by the browser carries no charset information, the most the server can do is read the
Accept-Language request header, and that header alone does not reveal which encoding the browser used.
As a result the application server cannot parse the submitted content correctly. Why? Because all
strings in Java are 16-bit Unicode, the job of HttpServletRequest.getParameter(String) is to convert
the URL-encoded information submitted by the client into a Unicode string. Some servers can only assume
that the client's encoding is the same as their own and simply call URLDecoder.decode(String) to decode
it directly. If the client's encoding happens to match the server's, the resulting string is correct;
otherwise, if the submitted string contains non-ASCII characters, the result is garbage.
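The failure described above can be reproduced outside a servlet container. The sketch below decodes a UTF-8 submission once with the wrong character set and once with the right one, using the standard java.net.URLDecoder; it also shows that a string mis-decoded as ISO-8859-1 can be repaired, because ISO-8859-1 maps every byte to exactly one character. The class name DecodeMismatchDemo and the helpers decode and recover are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class DecodeMismatchDemo {
    // Decode a URL-encoded string, interpreting the escaped bytes in the given charset.
    static String decode(String urlEncoded, String charset) {
        try {
            return URLDecoder.decode(urlEncoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Turn a string that was wrongly decoded as ISO-8859-1 back into bytes and re-read it as UTF-8.
    static String recover(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String submitted = "%E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95"; // sent by a UTF-8 page

        String wrong = decode(submitted, "ISO-8859-1"); // server guesses its own encoding
        String right = decode(submitted, "UTF-8");      // server knows the page encoding

        System.out.println(wrong.length());               // 12: one garbage char per byte
        System.out.println(right.length());               // 4: the four Chinese characters
        System.out.println(recover(wrong).equals(right)); // true
    }
}
```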
In the solution I propose, the UTF-8 encoding is specified explicitly, so this problem is avoided.
We can write a decode method of our own:
    public static String decode(String s, String encoding) throws Exception {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '+':
                    sb.append(' ');
                    break;
                case '%':
                    try {
                        sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                    } catch (NumberFormatException e) {
                        throw new IllegalArgumentException();
                    }
                    i += 2;
                    break;
                default:
                    sb.append(c);
                    break;
            }
        }
        // Undo conversion to external encoding
        String result = sb.toString();
        byte[] inputBytes = result.getBytes("8859_1");
        return new String(inputBytes, encoding);
    }

This method lets us specify the encoding; passing UTF8 meets our needs. For example, decoding
"%E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95" yields the correct Unicode string for "中文测试".

The problem now is that we must get hold of the URL-encoded string submitted by the client. For a form
whose method is GET, the submitted information can be read with HttpServletRequest.getQueryString(). For
a form whose method is POST, it can only be read from the ServletInputStream. In fact, the first time
the standard getParameter method is called, the information submitted by the form is read, and the
ServletInputStream cannot be read a second time. So we must read and parse the submitted information
before getParameter is called for the first time.

This is what I did: I built a servlet base class that overrides the service method and reads and parses
the content submitted by the form before calling the parent's service method. Please see the source code
below:

    package com.hto.servlet;

    import javax.servlet.http.HttpServletRequest;
    import java.util.*;

    /**
     * Reads URL-encoded form data from a request and decodes it with an explicit character set.
     * Creation date: (2001-2-4 15:43:46)
     * @author: Qian Weichun
     */
    public class UTF8ParameterReader {
        // parameter name -> list of decoded values
        Hashtable<String, ArrayList<String>> pairs = new Hashtable<String, ArrayList<String>>();

        /**
         * Parses the query string and the first line of the request body as UTF-8.
         */
        public UTF8ParameterReader(HttpServletRequest request) throws java.io.IOException {
            super();
            parse(request.getQueryString());
            parse(request.getReader().readLine());
        }

        /**
         * Parses the query string and the first line of the request body in the given encoding.
         */
        public UTF8ParameterReader(HttpServletRequest request, String encoding)
                throws java.io.IOException {
            super();
            parse(request.getQueryString(), encoding);
            parse(request.getReader().readLine(), encoding);
        }

        /** Decodes a URL-encoded string as UTF-8. */
        public static String decode(String s) throws Exception {
            return decode(s, "UTF8");
        }

        /** Decodes a URL-encoded string whose escaped bytes are in the given encoding. */
        public static String decode(String s, String encoding) throws Exception {
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                switch (c) {
                    case '+':
                        sb.append(' ');
                        break;
                    case '%':
                        try {
                            sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                        } catch (NumberFormatException e) {
                            throw new IllegalArgumentException();
                        }
                        i += 2;
                        break;
                    default:
                        sb.append(c);
                        break;
                }
            }
            // Undo conversion to external encoding
            String result = sb.toString();
            byte[] inputBytes = result.getBytes("8859_1");
            return new String(inputBytes, encoding);
        }

        /**
         * Returns the first value of the named parameter, or null if it was not submitted.
         * Creation date: (2001-2-4 17:30:59)
         * @return java.lang.String
         * @param name java.lang.String
         */
        public String getParameter(String name) {
            if (pairs == null || !pairs.containsKey(name)) return null;
            return pairs.get(name).get(0);
        }

        /**
         * Returns the names of all submitted parameters.
         * Creation date: (2001-2-4 17:28:17)
         * @return java.util.Enumeration
         */
        public Enumeration<String> getParameterNames() {
            if (pairs == null) return null;
            return pairs.keys();
        }

        /**
         * Returns all values of the named parameter, or null if it was not submitted.
         * Creation date: (2001-2-4 17:33:40)
         * @return java.lang.String[]
         * @param name java.lang.String
         */
        public String[] getParameterValues(String name) {
            if (pairs == null || !pairs.containsKey(name)) return null;
            ArrayList<String> al = pairs.get(name);
            String[] values = new String[al.size()];
            for (int i = 0; i < al.size(); i++) values[i] = al.get(i);
            return values;
        }

        /**
         * Parses a URL-encoded string of name=value pairs as UTF-8.
         * Creation date: (2001-2-4 20:34:37)
         * @param urlenc java.lang.String
         */
        private void parse(String urlenc) throws java.io.IOException {
            parse(urlenc, "UTF8");
        }

        /**
         * Parses a URL-encoded string of name=value pairs in the given encoding.
         * Creation date: (2001-2-4 20:34:37)
         * @param urlenc java.lang.String
         */
        private void parse(String urlenc, String encoding) throws java.io.IOException {
            if (urlenc == null) return;
            StringTokenizer tok = new StringTokenizer(urlenc, "&");
            try {
                while (tok.hasMoreTokens()) {
                    String aPair = tok.nextToken();
                    int pos = aPair.indexOf("=");
                    String name = null;
                    String value = null;
                    if (pos != -1) {
                        name = decode(aPair.substring(0, pos), encoding);
                        value = decode(aPair.substring(pos + 1), encoding);
                    } else {
                        name = aPair;
                        value = "";
                    }
                    if (pairs.containsKey(name)) {
                        ArrayList<String> values = pairs.get(name);
                        values.add(value);
                    } else {
                        ArrayList<String> values = new ArrayList<String>();
                        values.add(value);
                        pairs.put(name, values); // store the list of values, not the single value
                    }
                }
            } catch (Exception e) {
                throw new java.io.IOException(e.getMessage());
            }
        }
    }

This class reads and stores the information submitted by the form and implements the commonly used
getParameter methods.

    package com.hto.servlet;

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    /**
     * Servlet base class that parses form data as UTF-8 before the request is dispatched.
     * Creation date: (2001-2-5 8:28:20)
     * @author: Qian Weichun
     */
    public class UTFBaseServlet extends HttpServlet {
        public static final String PARAMS_ATTR_NAME = "PARAMS_ATTR_NAME";

        /**
         * Process incoming HTTP GET requests
         *
         * @param request Object that encapsulates the request to the servlet
         * @param response Object that encapsulates the response from the servlet
         */
        public void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            performTask(request, response);
        }

        /**
         * Process incoming HTTP POST requests
         *
         * @param request Object that encapsulates the request to the servlet
         * @param response Object that encapsulates the response from the servlet
         */
        public void doPost(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            performTask(request, response);
        }

        /**
         * Reads a parameter as a java.sql.Date.
         * Creation date: (2001-2-5 8:52:43)
         * @return java.sql.Date
         * @param request javax.servlet.http.HttpServletRequest
         * @param name java.lang.String
         * @param required boolean
         * @param defValue java.sql.Date
         */
        public static java.sql.Date getDateParameter(HttpServletRequest request, String name,
                boolean required, java.sql.Date defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return java.sql.Date.valueOf(value);
        }

        /**
         * Reads a parameter as a double.
         * Creation date: (2001-2-5 8:52:43)
         * @return double
         * @param request javax.servlet.http.HttpServletRequest
         * @param name java.lang.String
         * @param required boolean
         * @param defValue double
         */
        public static double getDoubleParameter(HttpServletRequest request, String name,
                boolean required, double defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return Double.parseDouble(value);
        }

        /**
         * Reads a parameter as a float.
         * Creation date: (2001-2-5 8:52:43)
         * @return float
         * @param request javax.servlet.http.HttpServletRequest
         * @param name java.lang.String
         * @param required boolean
         * @param defValue float
         */
        public static float getFloatParameter(HttpServletRequest request, String name,
                boolean required, float defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return Float.parseFloat(value);
        }

        /**
         * Reads a parameter as an int.
         * Creation date: (2001-2-5 8:52:43)
         * @return int
         * @param request javax.servlet.http.HttpServletRequest
         * @param name java.lang.String
         * @param required boolean
         * @param defValue int
         */
        public static int getIntParameter(HttpServletRequest request, String name,
                boolean required, int defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return Integer.parseInt(value);
        }

        /**
         * Reads a parameter as a string, preferring the UTF8ParameterReader stored in the request.
         * Creation date: (2001-2-5 8:43:36)
         * @return java.lang.String
         * @param request javax.servlet.http.HttpServletRequest
         * @param name java.lang.String
         * @param required boolean
         * @param defValue java.lang.String
         */
        public static String getParameter(HttpServletRequest request, String name,
                boolean required, String defValue) throws ServletException {
            if (request.getAttribute(UTFBaseServlet.PARAMS_ATTR_NAME) != null) {
                UTF8ParameterReader params =
                        (UTF8ParameterReader) request.getAttribute(UTFBaseServlet.PARAMS_ATTR_NAME);
                if (params.getParameter(name) != null) return params.getParameter(name);
                if (required)
                    throw new ServletException("The Parameter " + name + " Required But Not Provided!");
                else return defValue;
            } else {
                if (request.getParameter(name) != null) return request.getParameter(name);
                if (required)
                    throw new ServletException("The Parameter " + name + " Required But Not Provided!");
                else return defValue;
            }
        }

        /**
         * Returns the servlet info string.
         */
        public String getServletInfo() {
            return super.getServletInfo();
        }

        /**
         * Reads a parameter as a java.sql.Timestamp.
         * Creation date: (2001-2-5 8:52:43)
         * @return java.sql.Timestamp
         * @param request javax.servlet.http.HttpServletRequest
         * @param name java.lang.String
         * @param required boolean
         * @param defValue java.sql.Timestamp
         */
        public static java.sql.Timestamp getTimestampParameter(HttpServletRequest request,
                String name, boolean required, java.sql.Timestamp defValue) throws ServletException {
            String value = getParameter(request, name, required, String.valueOf(defValue));
            return java.sql.Timestamp.valueOf(value);
        }

        /**
         * Initializes the servlet.
         */
        public void init() {
            // insert code to initialize the servlet here
        }

        /**
         * Process incoming requests for information
         *
         * @param request Object that encapsulates the request to the servlet
         * @param response Object that encapsulates the response from the servlet
         */
        public void performTask(HttpServletRequest request, HttpServletResponse response) {
            try {
                // Insert user code from here.
            } catch (Throwable theException) {
                // uncomment the following line when unexpected exceptions
                // are occurring to aid in debugging the problem.
                // theException.printStackTrace();
            }
        }

        /**
         * Parses the form data into a UTF8ParameterReader, then dispatches to the parent class.
         * Creation date: (2001-2-5 8:31:54)
         * @param request javax.servlet.ServletRequest
         * @param response javax.servlet.ServletResponse
         * @exception javax.servlet.ServletException The exception description.
         * @exception java.io.IOException The exception description.
         */
        public void service(ServletRequest request, ServletResponse response)
                throws javax.servlet.ServletException, java.io.IOException {
            String content = request.getContentType();
            if (content == null
                    || content.toLowerCase().startsWith("application/x-www-form-urlencoded"))
                request.setAttribute(PARAMS_ATTR_NAME,
                        new UTF8ParameterReader((HttpServletRequest) request));
            super.service(request, response);
        }
    }

This is the servlet base class. It overrides the service method of the parent class and, before calling
the parent's service method, creates a UTF8ParameterReader object that holds the information submitted
by the form. It then saves this object as an attribute of the request object and calls the parent's
service method.

For servlets that inherit from this class, note that the "standard" getParameter can no longer read POST
data, because the data has already been read from the ServletInputStream by this point. Use the
getParameter methods provided by this class instead.

What remains is the output problem: we want to convert the output into a UTF-8 byte stream.
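To make that conversion concrete, here is a minimal sketch of doing it by hand with plain java.io classes, outside any servlet container: a PrintWriter is layered over an OutputStreamWriter that is told to encode as UTF-8. The class name Utf8OutputDemo and the helper toUtf8 are hypothetical, and a ByteArrayOutputStream stands in for the response stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class Utf8OutputDemo {
    // Write text through a PrintWriter whose underlying writer encodes characters as UTF-8.
    static byte[] toUtf8(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the response stream
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8));
        out.print(text);
        out.flush(); // push buffered characters through the encoder into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf8("\u4E2D\u6587\u6D4B\u8BD5");

        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        System.out.println(bytes.length); // 12: three bytes per Chinese character
        System.out.println(hex);          // E4B8ADE69687E6B58BE8AF95
    }
}
```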
If the charset is specified as UTF-8 when setting the Content-Type, and the output is then written
through the PrintWriter, these conversions happen automatically. In a servlet, set it like this (before
obtaining the PrintWriter):
    response.setContentType("text/html; charset=UTF-8");
In a JSP, set it like this:
    <%@ page contentType="text/html; charset=UTF-8" %>
This ensures that the output is a UTF-8 stream; whether it is displayed correctly then depends on the
client.
I also have a class that handles forms whose content is multipart/form-data. Its constructor can specify
the charset used by the page, with UTF-8 as the default. For reasons of space the source code is not
posted here; if you are interested, mail vividq@china.com to discuss.