Solution to jsp garbled problem
2009-07-15 10:32
1. A JSP page shows garbled characters when the page does not specify a character set encoding. The solution: specify the encoding with the following directive at the top of the page: <%@ page contentType="text/html; charset=gb2312" %>
2. Garbled characters in the database. In this case, Chinese characters inserted into the database are stored as garbled characters, or they come out garbled when read back and displayed. The solution is as follows:
Add the character set encoding to the database connection string (the same applies to data source connections):
String Url = "jdbc:mysql://localhost/digitgulf?user=root&password=root&useUnicode=true&characterEncoding=GB2312";
And use the following code in the page:
response.setContentType("text/html;charset=gb2312");
request.setCharacterEncoding("gb2312");
3. Garbled characters when passing Chinese as a parameter. When a Chinese string is passed as a parameter to another page, it also arrives garbled. The solution is as follows:
Encode the parameter when passing it, for example "RearshRes.jsp?keywords=" + java.net.URLEncoder.encode(keywords)
Then receive it on the target page with keywords = new String(request.getParameter("keywords").getBytes("8859_1")); Note that this form decodes the bytes with the platform default charset; pass a charset name as the second argument to new String to make the choice explicit.
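The round trip described in item 3 can be tried outside a servlet container. The following is only a sketch under stated assumptions: the class name UrlParamDemo and its method names are made up for illustration, UTF-8 stands in for GB2312, and containerDecode merely simulates a container that decodes the query string as ISO-8859-1. The Chinese text is written with Unicode escapes so the source file itself cannot be garbled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlParamDemo {
    // Sending side: percent-encode the value with an explicit charset.
    public static String encodeParam(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    // Simulation only: a container that decodes the query string as ISO-8859-1.
    public static String containerDecode(String raw) throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, "ISO-8859-1");
    }

    // Receiving side: get the original bytes back, then decode them with the right charset.
    public static String recover(String garbled) throws UnsupportedEncodingException {
        return new String(garbled.getBytes("ISO-8859-1"), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String keywords = "\u4e2d\u6587"; // two Chinese characters
        String encoded = encodeParam(keywords);
        String garbled = containerDecode(encoded);
        System.out.println(encoded);
        System.out.println(recover(garbled).equals(keywords));
    }
}
```

The recovery works because ISO-8859-1 maps every byte to exactly one character, so getBytes("ISO-8859-1") returns the bytes unchanged.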
The core problem of garbled characters is still the problem of character set encoding. As long as this is mastered, general garbled code problems can be solved.
------------------------------------------
Ever since I first came into contact with Java and JSP, I have been dealing with the problem of garbled Chinese characters in Java. It has finally been completely solved, and here I share the experience with everyone.
1. The origin of the Java Chinese problem
Java's core and class files are based on unicode, which makes Java programs good cross-platform, but also brings some troubles with Chinese garbled characters. There are two main reasons: the garbled code problem caused by the compilation of Java and JSP files themselves and the garbled code problem caused by the interaction of Java programs with other media.
First of all, Java (including JSP) source files are likely to contain Chinese, and Java and JSP source files are saved as byte streams. If the encoding used when compiling Java and JSP into class files differs from the encoding of the source file, garbled characters appear. To avoid this kind of problem, it is best not to write Chinese in Java files (comments do not take part in compilation, so Chinese there does not matter). If you must, compile manually with the parameter -encoding GBK or -encoding gb2312; for JSP, adding <%@ page contentType="text/html;charset=GBK"%> or <%@ page contentType="text/html;charset=gb2312"%> to the file header basically solves this kind of problem.
This article focuses on the second type: garbled characters generated when Java programs interact with other storage media. Many storage media, such as databases, files, and streams, are based on byte streams. When a Java program interacts with them, conversion between characters (char) and bytes (byte) takes place; for example, data submitted from a form on a page shows up garbled in the Java program.
If the encoding method used in the above conversion process is inconsistent with the original encoding of the bytes, garbled characters are likely to appear.
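The char/byte conversion described above can be reproduced with nothing but the JDK. This is a minimal sketch, not part of the original article: the class name CharsetMismatchDemo and the helper roundTrips are hypothetical, and UTF-8 is used in place of GBK only because it is guaranteed to be available on every JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Encodes text to bytes with one charset, decodes with another,
    // and reports whether the original text survives the trip.
    public static boolean roundTrips(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith).equals(text);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two Chinese characters

        // Same charset on both sides: the text survives.
        System.out.println(roundTrips(chinese, StandardCharsets.UTF_8, StandardCharsets.UTF_8));

        // Bytes written as UTF-8 but read as ISO-8859-1: garbled.
        System.out.println(roundTrips(chinese, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));

        // ISO-8859-1 cannot represent Chinese at all: each character is
        // replaced by '?' while encoding, so the data is lost for good.
        System.out.println(roundTrips(chinese, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));

        // Plain ASCII is identical in both charsets, which is why English text never breaks.
        System.out.println(roundTrips("abc", StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```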
2. Solution
For the popular Tomcat, there are two solutions:
1) Change D:\Tomcat\conf\server.xml and specify the browser encoding format as "Simplified Chinese".
The method is to find the Connector tag and add the URIEncoding attribute to it:
<Connector port="8080" maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true" URIEncoding='GBK' />
The URIEncoding='GBK' attribute is the part I added.
You can verify whether your change is successful like this: Before making the change, in your IE browser where the garbled page appears, click the menu "View | Encoding" and you will find that "Western Europe (ISO)" is selected. After the change, click the menu "View | Encoding" and you will find that "Simplified Chinese (GB2312)" is selected.
2) Update the Java program. My program is as follows:
public class ThreeParams extends HttpServlet {
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
response.setContentType("text/html; charset=GBK");
...
}
}
The response.setContentType line is required: it makes the response be written as GBK and tells the browser to decode it as GBK. With both the page content and the browser's display mode set to GBK, there are no garbled characters.
Complete solution to Chinese under tomcat
I have been developing a project these days, with Tomcat as the server. After reading many articles and opinions, I finally got everything working. But a good memory is not as good as a bad pen, so I wrote it down to keep myself from forgetting, and to give a reference to anyone who runs into the same problem:
(1) The JSP page is in Chinese, but when you look at it, it is garbled:
The solution is to put <%@ page language="java" contentType="text/html;charset=GBK" %> at the top of the JSP page. The cause is the encoding used when the JSP is converted into a Java file: on some servers the default is ISO-8859-1. If Chinese is typed directly into a JSP, the JSP engine treats it as ISO-8859-1 and the text is certain to break. You can confirm this by looking at the intermediate Java file generated by Jasper.
(2) When the Request object is used to read Chinese characters submitted by the client, garbled characters appear:
The solution is to configure a filter, that is, a Servlet filter. The code is as follows:
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CharacterEncodingFilter implements Filter {
    private FilterConfig config;
    // Default used when web.xml does not supply an "encoding" init-param
    private String encoding = "ISO8859_1";

    public void destroy() {
        System.out.println(config);
        config = null;
    }

    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        // Must run before any request parameter is read
        request.setCharacterEncoding(encoding);
        chain.doFilter(request, response);
    }

    public void init(FilterConfig config) throws ServletException {
        this.config = config;
        // Override the default with the "encoding" init-param, if present
        String s = config.getInitParameter("encoding");
        if (s != null) {
            encoding = s;
        }
    }
}
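Configure web.xml. The filter-class must be the fully qualified name of the filter class, and the encoding init-param replaces the filter's ISO8859_1 default:
<filter>
    <filter-name>CharacterEncodingFilter</filter-name>
    <filter-class>CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>GBK</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>CharacterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>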
If the problem still occurs, check whether your form is submitted with get. Generally there is no problem with post; if you use get, see situation (3) below.
There is also the processing of information containing Chinese characters. The processing code is:
package dbJavaBean;

public class CodingConvert {
    public CodingConvert() {
    }

    // Reinterprets a string that was wrongly decoded as ISO8859_1 as GB2312
    public String toGb(String uniStr) {
        String gbStr = "";
        if (uniStr == null) {
            uniStr = "";
        }
        try {
            byte[] tempByte = uniStr.getBytes("ISO8859_1");
            gbStr = new String(tempByte, "GB2312");
        } catch (Exception ex) {
            // Ignored: an empty string is returned on failure
        }
        return gbStr;
    }

    // The reverse: exposes the GB2312 bytes one char per byte as ISO8859_1
    public String toUni(String gbStr) {
        String uniStr = "";
        if (gbStr == null) {
            gbStr = "";
        }
        try {
            byte[] tempByte = gbStr.getBytes("GB2312");
            uniStr = new String(tempByte, "ISO8859_1");
        } catch (Exception ex) {
            // Ignored: an empty string is returned on failure
        }
        return uniStr;
    }
}
You can also convert directly: encode the received string with ISO-8859-1 into a byte array, then build a new String object from that array. For example:
String str = request.getParameter("girl");
byte[] b = str.getBytes("ISO-8859-1");
str = new String(b); // decodes with the platform default charset
Through the above conversion, any information submitted can be displayed correctly.
(3) When a form is submitted with get, request.getParameter("name") returns garbled characters on the server. Setting the Filter following Tomcat's method does not work, and neither does request.setCharacterEncoding("GBK");. The problem lies in how the parameters are transferred: if you handle the request in the servlet's doGet(HttpServletRequest request, HttpServletResponse response) method, then even if you first write:
request.setCharacterEncoding("GBK");
response.setContentType("text/html;charset=GBK");
It still does not work; the Chinese returned is still garbled! If you change this method to doPost(HttpServletRequest request, HttpServletResponse response), everything is OK.
Similarly, when using two JSP pages to process form input, the reason why Chinese can be displayed is because the post method is used to pass it, and changing it to the get method still does not work.
It can be seen that you need to pay attention when using the doGet() method in servlet or the get method in JSP. After all, this involves passing parameter information through the browser, which is likely to cause conflicts or mismatches in commonly used character sets.
The solution is:
1) Open Tomcat's server.xml file, find the Connector block, and add the following attribute:
URIEncoding="GBK"
The complete block should look as follows:
<Connector port="8080" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" redirectPort="8443" acceptCount="100" debug="0" connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="GBK"/>
2) Restart tomcat, everything is OK.
You can find out why this attribute is needed by studying the file $TOMCAT_HOME/webapps/tomcat-docs/config/http.html. Note that if you use UTF-8 here, garbled characters appear in Tomcat during transmission. If that doesn't work, change to another character set.
(4) There are Chinese characters on the JSP page and on its buttons, but when the page is viewed through the server they appear garbled:
The solution is as follows. First, localized message text should not be written directly in the JSP file; it should be obtained from a Resource Bundle through the <bean:message> tag. Put your Chinese text in the Application.properties file, placed under WEB-INF/classes/*.
For example, if the page has two labels, name and age, first create an Application.properties whose content is name=姓名 and age=年龄, and put the file under WEB-INF/classes/properties/.
Then create a Chinese resource file from Application.properties, say Application_cn.properties. The JDK provides the native2ascii command, which performs this character encoding conversion. In a DOS window, go to the directory containing Application.properties and run:
native2ascii -encoding gbk Application.properties Application_cn.properties
This reads Application.properties as GBK and generates Application_cn.properties with the following content: name=\u59d3\u540d age=\u5e74\u9f84.
Configure it in Struts-config.xml: <message-resources parameter="properties.Application_cn"/>. At this point more than half the work is done.
Finally, write <%@ page language="java" contentType="text/html;charset=GBK" %> on the JSP page, and write <bean:message key="name"/> where the name label goes. The Chinese text for name then appears on the page. The same goes for age, and Chinese characters on buttons are handled in the same way.
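To make the native2ascii step in (4) less opaque, here is a rough Java approximation of what the conversion does to each value. It is a sketch only: UnicodeEscaper and escape are hypothetical names, and the real tool also handles reading the input file in the encoding given on the command line, which this snippet skips by starting from an in-memory String.

```java
public class UnicodeEscaper {
    // Replaces every non-ASCII character with a backslash-u escape of four
    // lowercase hex digits, leaving ASCII characters untouched.
    public static String escape(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 128) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The "name" entry from the example, with its two-character Chinese value
        System.out.println(escape("name=\u59d3\u540d"));
    }
}
```

The escaped file contains only ASCII, so it reads the same no matter which encoding the server assumes when loading it.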
(5) The text written to the database is garbled:
Solution: configure a filter, that is, a Servlet filter. The code is the same as in (2).
If you connect to the database directly through JDBC, the connection string is as follows: jdbc:mysql://localhost:3306/workshopdb?useUnicode=true&characterEncoding=GBK. This ensures that the text in the database is not garbled.
If you connect through a data source, the same applies. If it is configured correctly, Chinese that you enter is stored as Chinese in the database. Note that the page that displays the data also needs the line <%@ page language="java" contentType="text/html;charset=GBK" %>. Also note that some front-end staff write their pages in Dreamweaver and, after writing a Form, rename the page to a jsp. One thing to watch there is the form's submission method in Dreamweaver: check it and set it explicitly, because JSP forms can be submitted in two ways, POST and GET, and the two differ considerably in how the submitted data is encoded. This is explained later.
Article summary:
Here we mainly talk about the solution to the jsp garbled problem
1. The most basic problem of garbled characters.
PHPCE.CN, Design Manual
3. How to handle garbled characters in the form get submission method.
If you submit Chinese with the get method, the page that receives the parameters also shows garbled characters. This is again caused by Tomcat's internal encoding format, iso8859-1: Tomcat handles the Chinese characters with the default encoding for get, iso8859-1, and they are appended to the URL in that form, so the parameters received by the target page are garbled.
Solution:
A. Use method A from the post case (item 2): take each received parameter and transcode it from iso8859-1.
B. Get submits through the URL, and the data has already been encoded as iso8859-1 before it goes into the URL. To influence this encoding, add the useBodyEncodingForURI="true" attribute to the Connector node in server.xml. This attribute makes Tomcat handle get parameters with the encoding set by request.setCharacterEncoding("UTF-8"), so they are treated as utf-8 and the receiving page gets them correctly. Change the Connector in D:\Tomcat\conf\server.xml as follows:
<Connector port="8080"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
debug="0" connectionTimeout="20000" useBodyEncodingForURI="true"
disableUploadTimeout="true" URIEncoding="UTF-8"/>
But I think what really happens is that Tomcat then processes the data once more according to the URIEncoding="UTF-8" set in this Connector; since the data is already utf-8, nothing changes. When the parameter is read from the URL, the receiving page decodes it according to URIEncoding="UTF-8".
You can learn why this is needed by studying the file $TOMCAT_HOME/webapps/tomcat-docs/config/http.html. You can verify whether your change took effect like this: before the change, in the IE browser showing the garbled page, click the menu "View | Encoding" and you will find that "Western Europe (ISO)" is selected. After the change, click "View | Encoding" again and you will find that "Simplified Chinese (GB2312)" is selected.
4. Garbled characters when uploading files. When uploading files, the form is set to enctype="multipart/form-data", so the data is submitted as a stream. If you use Apache's upload component, you will see a lot of garbled text. This is because early versions of Apache's commons-fileupload.jar have a bug in how they take out and decode Chinese characters: with this submission method, decoding automatically uses Tomcat's default encoding format, iso-8859-1. The symptoms are: special symbols such as periods and commas become garbled; if the number of Chinese characters is odd, garbled characters appear; if it is even, parsing is normal.
Solution: Download commons-fileupload-1.1.1.jar. This version of the jar has solved these bugs.
However, when extracting the content you still need to transcode the extracted characters from iso8859-1 to utf-8. After that, all Chinese characters and other characters come out correctly.
5. Garbled parameters received from URL requests made in Java code
The encoding of the URL depends on the URIEncoding="UTF-8" setting mentioned above. If this encoding is set, every Chinese parameter put into a URL must be encoded; otherwise the Chinese parameter values received are all garbled. For example, take a redirect such as response.sendRedirect("/a.jsp?name=Zhang Dawei"); (with the name written in Chinese characters), where a.jsp reads the value directly with
String name = request.getParameter("name"); What you get is garbled characters. Because utf-8 is required, the redirect has to be written like this:
response.sendRedirect("/a.jsp?name=" + URLEncoder.encode("Zhang Dawei", "utf-8"));
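Building such a redirect target can be isolated in a small helper so the encoding step is never forgotten. The sketch below is illustrative only: RedirectUrlBuilder and withParam are hypothetical names, and the servlet call itself is left out so the snippet runs with the JDK alone; in a servlet the result would be passed to response.sendRedirect.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class RedirectUrlBuilder {
    // Appends one query parameter to a path, percent-encoding the value as UTF-8
    // so it matches a Connector configured with URIEncoding="UTF-8".
    public static String withParam(String path, String name, String value)
            throws UnsupportedEncodingException {
        return path + "?" + name + "=" + URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // A single Chinese character (the surname Zhang) as the parameter value
        System.out.println(withParam("/a.jsp", "name", "\u5f20"));
        // ASCII letters pass through unchanged; a space becomes '+'
        System.out.println(withParam("/a.jsp", "name", "Zhang Dawei"));
    }
}
```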
What happens if URIEncoding="UTF-8" is not set? Then the default encoding, iso8859-1, is used, and the problem comes back. First, if the number of Chinese characters in the parameter value is odd, it is parsed normally; if it is even, the final character is garbled. Also, if the last character is an English one, the value is parsed normally, but Chinese punctuation marks are still garbled. As an expedient, if your parameters contain no Chinese punctuation, you can append an English symbol to the end of the parameter value to avoid the garbling, and strip that symbol after reading the parameter. It is a workaround you can make do with.
6. Garbled parameters in URL requests made from script code. Script code can also control page redirection, which again involves attaching parameters and parsing them on the receiving page. If a Chinese parameter is not encoded as specified by URIEncoding="UTF-8", the receiving page gets garbled Chinese. Encoding in script is troublesome: you need a script file that implements the encoding, and you call its method to encode the Chinese characters.
7. Garbled jsp files opened in MyEclipse. For an existing project, the Jsp files may be stored as utf-8. A freshly installed Eclipse opens files as iso8859-1 by default, so the Chinese characters in the jsp are garbled. This is relatively easy to solve: in the preferences of Eclipse 3.1, go to general->editor and set the file opening encoding to utf-8. Eclipse automatically reopens the file with the new encoding, and the Chinese characters display normally.
8. About the garbled code when opening the HTML page in eclipse. Since most pages are produced by Dreamweaver, their storage format is different from the recognition of eclipse.
Generally, in this case, create a new jsp in eclipse, copy the page content directly from dreamweaver and paste it into the jsp.
This is the simplest garbled-character problem, and newcomers usually run into it. It is caused by inconsistent page encodings.
<%@ page language="java" pageEncoding="UTF-8"%>
<%@ page contentType="text/html;charset=iso8859-1"%>
<html>
<head>
<title>Chinese problem</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
i am a good person
</body>
</html>
There are three encodings here.
The first is the storage format of the jsp file. Eclipse saves the file in this encoding and compiles the jsp file, including the Chinese characters inside it, accordingly.
The second is the decoding format. Because a file saved as UTF-8 is decoded as iso8859-1, any Chinese in it is certain to be garbled; the two must be consistent. The line containing the second encoding does not have to be there, but the default is also ISO8859-1, so without this line "I am a good person" would still be garbled. It must be consistent.
The third controls how the browser decodes the page. If the previous two are consistent and correct, this encoding does not matter. Some web pages appear garbled because the browser cannot determine which encoding to use: pages are sometimes embedded in other pages, the browser confuses the encodings, and garbled characters appear.
2. Garbled characters received after a form is submitted with the Post method. This is also a common problem, and it too is caused by Tomcat's internal encoding format, iso8859-1. That is, when the form is posted without a submission encoding being set, the data is handled as iso8859-1, while the receiving jsp reads it as utf-8, resulting in garbled characters. Here are several solutions and a comparison.
A. Convert the encoding when receiving the parameter:
String str = new String(request.getParameter("something").getBytes("ISO-8859-1"),"utf-8"); In this case every parameter must be transcoded this way, which is very troublesome, but you do get the Chinese characters.
B. At the beginning of the receiving page, call request.setCharacterEncoding("UTF-8") to set the character set of the submitted content to UTF-8. Pages that receive the parameter then need no transcoding and can use
String str = request.getParameter("something"); directly to get the Chinese parameter. But this statement has to be executed on every page. This method only works for post submissions; it has no effect on file uploads submitted with enctype="multipart/form-data". Those two kinds of garbled characters are explained separately.
C. To avoid writing request.setCharacterEncoding("UTF-8") on every page, it is recommended to do the encoding in a filter. There are many examples of this online; please look them up yourself.