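Abstract: This article addresses problems with displaying Chinese text and storing it in a database that are caused by character-set encoding conflicts.
Author: StephenCat.WJJ @ GZ, April 13, 2006.
Author's statement: The author accepts no responsibility for any effects or consequences of using the content of this article. When reposting, please credit this blog as the source. The author retains copyright.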
Test environment for this article:
Runtime: JDK 1.5 update 5, Eclipse 3.1.2, MyEclipse 4.1.1 GA, Tomcat 5.5.9, MySQL 5.0
Frameworks: Struts 1.2, Hibernate 3.0
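Eclipse defaults to ISO-8859-1, the single-byte Western European encoding, for the files involved here (JSP pages and .properties resource files). ISO-8859-1 cannot represent Chinese characters, which need a multi-byte encoding such as GB2312 or GBK. The solution is to use UTF-8 consistently everywhere.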
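The limitation can be reproduced outside Eclipse. The following is a minimal sketch, not part of the original setup: the class name CharsetRoundTrip and the method roundTrip are hypothetical, and it uses the Charset-based String methods, which need Java 6 or later. It encodes the label "用户名" to bytes and decodes it again, once with UTF-8 and once with ISO-8859-1.

```java
import java.nio.charset.Charset;

public class CharsetRoundTrip {

    /** Encodes the text to bytes in the named charset, then decodes those bytes again. */
    public static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        byte[] bytes = text.getBytes(charset); // unmappable characters become '?'
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The Chinese label for "user name", written as Unicode escapes so that
        // the encoding of this source file does not matter.
        String text = "\u7528\u6237\u540d";

        System.out.println(text.equals(roundTrip(text, "UTF-8")));      // true
        System.out.println(text.equals(roundTrip(text, "ISO-8859-1"))); // false
        System.out.println(roundTrip(text, "ISO-8859-1"));              // ???
    }
}
```

Once the characters have been replaced by question marks, the original text cannot be recovered, which is why every layer has to agree on UTF-8 before the data passes through it.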
The steps are as follows:
1. Declare UTF-8 as the JSP character set
Add this line at the top of every JSP file:
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>
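2. Create a preliminary resource file, ApplicationResources_PRE.properties
1) Use Eclipse to create an empty ApplicationResources_PRE.properties file.
2) In the Package Explorer, right-click the newly created resource file and choose "Properties" to open the properties dialog.
3) Under "Text file encoding", select UTF-8.
4) Click "OK" or "Apply".
5) Edit this preliminary resource file and enter the Chinese internationalization messages.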
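3. Generate the final ApplicationResources.properties
1) Assume the JDK is installed in C:\Java\jdk1.5.0_05
2) Assume both the preliminary and the final internationalization resource files are kept in F:\workspace\SomeProject\src
3) Open a command prompt and run the following commands: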
cd C:\Java\jdk1.5.0_05\bin
F:
cd \workspace\SomeProject\src
c:native2ascii -encoding UTF-8 ApplicationResources_PRE.properties ApplicationResources.properties
c:native2ascii -encoding UTF-8 ApplicationResources_PRE.properties ApplicationResources_zh.properties
4) This generates a new ApplicationResources.properties (overwriting the original file), whose content may look like this:
login.username=\u7528\u6237\u540d
login.password=\u5bc6\u7801
\ufffd\ufffd\ufffd\ufffd
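which is equivalent to the following in the preliminary resource file:
login.username=用户名[Enter]
login.password=密码[Enter]
As long as the last line ends with [Enter], the automatically generated \ufffd\ufffd\ufffd\ufffd will not affect what the page displays.
5) In Eclipse, right-click ApplicationResources.properties and ApplicationResources_zh.properties in turn in the Package Explorer and choose "Properties" to open the properties dialog. Under "Text file encoding", select UTF-8, then click "OK" to save.
6) Redeploy the web application to the application server.
7) After these steps, the Chinese display problem in JSP pages is solved.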
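To see why the escaped file still yields Chinese text at run time, here is a small sketch under stated assumptions. The class EscapedPropertiesDemo and its two methods are hypothetical. toAsciiEscapes only approximates what native2ascii does, by escaping every character above 127, and load reads the result through java.util.Properties, whose stream-based load method expects ISO-8859-1 input and decodes Unicode escapes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Properties;

public class EscapedPropertiesDemo {

    /** Replaces every non-ASCII character with a four-digit Unicode escape. */
    public static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    /** Loads properties from ASCII content the way a .properties file is read. */
    public static Properties load(String asciiContent) {
        Properties props = new Properties();
        byte[] bytes = asciiContent.getBytes(Charset.forName("ISO-8859-1"));
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Content of the preliminary file, with the Chinese labels from step 4.
        String pre = "login.username=\u7528\u6237\u540d\n"
                   + "login.password=\u5bc6\u7801\n";

        String ascii = toAsciiEscapes(pre);
        System.out.print(ascii); // the escaped lines shown in step 4

        Properties props = load(ascii);
        System.out.println("\u7528\u6237\u540d".equals(props.getProperty("login.username"))); // true
        System.out.println("\u5bc6\u7801".equals(props.getProperty("login.password")));       // true
    }
}
```

The escaped form is pure ASCII, so it reads the same under any encoding setting, and the Chinese labels come back when the properties are loaded.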
Supplement: sample DOS batch file
(place it in the F:\workspace\SomeProject directory)
cd C:\Java\jdk1.5.0_05\bin
f:
cd \workspace\Xnews\src\org\stephencat\xnews
echo Creating ApplicationResources.properties
echo.
cd c:
echo.
cd f:
echo.
echo c:native2ascii -encoding UTF-8 ApplicationResources_xx.properties ApplicationResources.properties
echo c:native2ascii -encoding UTF-8 ApplicationResources_xx.properties ApplicationResources_zh.properties
echo.
c:native2ascii -encoding UTF-8 ApplicationResources_xx.properties ApplicationResources.properties
c:native2ascii -encoding UTF-8 ApplicationResources_xx.properties ApplicationResources_zh.properties
echo.
echo Done.
echo.
pause
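The following steps solve the problems of submitting Chinese text through a form and storing it in the database
4. Create a Filter class
A template file can be found in %Tomcat installation directory%\webapps\servlets-examples\WEB-INF\classes\filters
The content of the Filter class is as follows: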
package org.stephencat.struts.util;

import javax.servlet.*;
import java.io.IOException;

/**
 * <p>Filter that sets the character encoding to be used in parsing the
 * incoming request, either unconditionally or only if the client did not
 * specify a character encoding. Configuration of this filter is based on
 * the following initialization parameters:</p>
 * <ul>
 * <li><strong>encoding</strong> - The character encoding to be configured
 *     for this request, either conditionally or unconditionally based on
 *     the <code>ignore</code> initialization parameter. This parameter
 *     is required, so there is no default.</li>
 * <li><strong>ignore</strong> - If set to "true", any character encoding
 *     specified by the client is ignored, and the value returned by the
 *     <code>selectEncoding()</code> method is set. If set to "false",
 *     <code>selectEncoding()</code> is called <strong>only</strong> if the
 *     client has not already specified an encoding. By default, this
 *     parameter is set to "true".</li>
 * </ul>
 *
 * <p>Although this filter can be used unchanged, it is also easy to
 * subclass it and make the <code>selectEncoding()</code> method more
 * intelligent about what encoding to choose, based on characteristics of
 * the incoming request (such as the values of the <code>Accept-Language</code>
 * and <code>User-Agent</code> headers, or a value stashed in the current
 * user's session).</p>
 *
 * @author <a href="mailto:jwtronics@yahoo.com">John Wong</a>
 *
 * @version $Id: SetCharacterEncodingFilter.java,v 1.1 2002/04/10 13:59:27 johnwong Exp $
 */
public class SetCharacterEncodingFilter implements Filter {

    // ----------------------------------------------------- Instance Variables

    /**
     * The default character encoding to set for requests that pass through
     * this filter.
     */
    protected String encoding = null;

    /**
     * The filter configuration object we are associated with. If this value
     * is null, this filter instance is not currently configured.
     */
    protected FilterConfig filterConfig = null;

    /**
     * Should a character encoding specified by the client be ignored?
     */
    protected boolean ignore = true;

    // --------------------------------------------------------- Public Methods

    /**
     * Take this filter out of service.
     */
    public void destroy() {
        this.encoding = null;
        this.filterConfig = null;
    }

    /**
     * Select and set (if specified) the character encoding to be used to
     * interpret request parameters for this request.
     *
     * @param request The servlet request we are processing
     * @param response The servlet response we are creating
     * @param chain The filter chain we are processing
     *
     * @exception IOException if an input/output error occurs
     * @exception ServletException if a servlet error occurs
     */
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws IOException, ServletException {
        // Conditionally select and set the character encoding to be used
        if (ignore || (request.getCharacterEncoding() == null)) {
            String encoding = selectEncoding(request);
            if (encoding != null)
                request.setCharacterEncoding(encoding);
        }
        // Pass control on to the next filter
        chain.doFilter(request, response);
    }

    /**
     * Place this filter into service.
     *
     * @param filterConfig The filter configuration object
     */
    public void init(FilterConfig filterConfig) throws ServletException {
        this.filterConfig = filterConfig;
        this.encoding = filterConfig.getInitParameter("encoding");
        String value = filterConfig.getInitParameter("ignore");
        if (value == null)
            this.ignore = true;
        else if (value.equalsIgnoreCase("true"))
            this.ignore = true;
        else if (value.equalsIgnoreCase("yes"))
            this.ignore = true;
        else
            this.ignore = false;
    }

    // ------------------------------------------------------ Protected Methods

    /**
     * Select an appropriate character encoding to be used, based on the
     * characteristics of the current request and/or filter initialization
     * parameters. If no character encoding should be set, return
     * <code>null</code>.
     * <p>
     * The default implementation unconditionally returns the value configured
     * by the <strong>encoding</strong> initialization parameter for this
     * filter.
     *
     * @param request The servlet request we are processing
     */
    protected String selectEncoding(ServletRequest request) {
        return (this.encoding);
    }
} // EOC
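The effect of request.setCharacterEncoding() can be illustrated without a servlet container. When a request does not state its encoding, the Servlet specification has the container fall back to ISO-8859-1 for parsing parameters. The sketch below only imitates that decoding step with java.net.URLDecoder; it is not how Tomcat is implemented internally, and the class FormDecodingDemo and its methods are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {

    /** Percent-encodes a form value the way a browser would for a page in the given charset. */
    public static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Decodes a percent-encoded form value using the given charset. */
    public static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u7528\u6237\u540d"; // Chinese for "user name"

        // A UTF-8 page submits the value as nine percent-encoded bytes.
        String submitted = encode(original, "UTF-8");
        System.out.println(submitted); // %E7%94%A8%E6%88%B7%E5%90%8D

        // Decoded with the charset the filter sets: the text is intact.
        System.out.println(original.equals(decode(submitted, "UTF-8"))); // true

        // Decoded with the ISO-8859-1 fallback: each byte becomes its own character.
        System.out.println(decode(submitted, "ISO-8859-1").length()); // 9
    }
}
```

Setting the encoding in a filter, before any code reads a parameter, means the container decodes the submitted bytes as UTF-8 from the start.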
5. Edit the web.xml site configuration file
Add the following:
<filter>
    <filter-name>Set Character Encoding</filter-name>
    <filter-class>org.stephencat.struts.util.SetCharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>ignore</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>Set Character Encoding</filter-name>
    <servlet-name>action</servlet-name>
</filter-mapping>
6. Set the default character encoding of the MySQL database to UTF-8