Garbled Request Parameters in Web Applications

Background

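Developers who work in languages outside the Western European family constantly have to deal with character-encoding problems.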

Encoding and decoding in Java

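    @Test
    public void coderTest() throws UnsupportedEncodingException {
        // Encode with GBK: each character becomes two percent-escaped bytes.
        String str = URLEncoder.encode("中国人民", "GBK");
        System.out.println(str);

        // Decode the GBK bytes as UTF-8: the charsets do not match, so the text is corrupted.
        str = URLDecoder.decode(str, "UTF-8");
        System.out.println(str);

        // Re-encode the corrupted text as ISO-8859-1; the original bytes are not recovered.
        str = URLEncoder.encode(str, "ISO-8859-1");
        System.out.println(str);

        // Decoding with yet another charset (BIG5) cannot bring the text back either.
        str = URLDecoder.decode(str, "BIG5");
        System.out.println(str);
    }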


Browser side

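An HTML page can declare its character set, for example: Content-Type: text/html; charset=UTF-8.
The declaration has two effects:
1. Parameters submitted with GET or POST are encoded with this charset.
2. The page text is rendered with this charset. For example, if the declaration is changed to ISO-8859-1 while the page itself is stored as UTF-8, every non-ASCII character on the page is displayed as garbage.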
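The first effect can be imitated in plain Java. The sketch below is only an approximation of what a browser does when it submits a form from a UTF-8 page; the class and method names (FormEncodingDemo, encodeFormValue) are hypothetical, and the sample string is written with Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: it mimics how a browser encodes a form value.
class FormEncodingDemo {

    static final String SAMPLE = "\u4e2d\u56fd\u4eba\u6c11"; // 中国人民

    // Percent-encode a form value using the charset declared by the page.
    static String encodeFormValue(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + pageCharset, e);
        }
    }

    public static void main(String[] args) {
        // On a UTF-8 page each of these characters is sent as three escaped bytes.
        System.out.println("name=" + encodeFormValue(SAMPLE, "UTF-8"));
        // prints: name=%E4%B8%AD%E5%9B%BD%E4%BA%BA%E6%B0%91
    }
}
```

The server only ever sees these escaped bytes; nothing in them says which charset produced them, so the server has to be told (or has to guess) how to decode them.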

Server side

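request.setCharacterEncoding() sets the charset used to parse request parameters (response has a method of the same name, which sets the charset of the response body instead).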

Defaults

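Servlet containers decode request parameters as ISO-8859-1 unless told otherwise. Tomcat 7, used in this article, applies that default to both the URL and the request body (Tomcat 8 and later decode the URL as UTF-8 by default). When a page declares no charset, browsers fall back to a locale-dependent legacy encoding, which for Western European locales is ISO-8859-1 or its superset windows-1252.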

Why the text gets garbled

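Text gets garbled when the browser and the server use different encodings: the browser turns characters into bytes with one charset, and the server turns those bytes back into characters with another.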
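A minimal sketch of that mismatch, assuming the browser side uses UTF-8 and the server side uses ISO-8859-1. The class and method names (MismatchDemo, transcode) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one side encodes, the other side decodes.
class MismatchDemo {

    static final String ORIGINAL = "\u4e2d\u56fd\u4eba\u6c11"; // 中国人民

    // Turn text into bytes with one charset and back into text with another.
    static String transcode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String garbled = transcode(ORIGINAL, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        // ISO-8859-1 maps every byte to one character, so 12 UTF-8 bytes become 12 characters.
        System.out.println(garbled.length());          // 12
        System.out.println(garbled.equals(ORIGINAL));  // false

        String intact = transcode(ORIGINAL, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(ORIGINAL));   // true
    }
}
```

When both sides agree on the charset the round trip is lossless; the four-character string only turns into twelve unrelated characters when they disagree.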

Solutions

Solution 1: configure the encoding the HTTP server uses to parse the URL

<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
    <form method="get" action="world">
        Name: <input type="text" name="name"><br>
        <button>Send GET request</button>
    </form><br><br>
    <form method="post" action="world">
        Name: <input type="text" name="name"><br>
        <button>Send POST request</button>
    </form>
</body>
</html>


      <plugin>
        <groupId>org.apache.tomcat.maven</groupId>
        <artifactId>tomcat7-maven-plugin</artifactId>
        <version>2.2</version>
        <configuration>
          <path>/hello</path>
          <port>8090</port>
          <uriEncoding>UTF-8</uriEncoding>
          <server>tomcat7</server>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>run</goal>
            </goals>
          </execution>
        </executions>
      </plugin>


package com.test;

import org.junit.Test;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

@WebServlet("/world")
public class MyServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        String name = request.getParameter("name");
        System.out.println("name: " + name); // prints "name: 中国人民": the query string is decoded as UTF-8
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        String name = request.getParameter("name");
        System.out.println("name: " + name); // prints garbled text such as "name: ä¸å½äººæ°": the POST body is still decoded as ISO-8859-1
    }
}


The key setting is <uriEncoding>UTF-8</uriEncoding>.

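This approach only works for GET requests, because all it does is tell the HTTP server to decode the URL as UTF-8. URL parsing is the HTTP server's job, and Tomcat ships with an HTTP server built in. A POST body is not part of the URL, so it is unaffected, which is why doPost above still prints garbled text.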

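Solution 2: call String.getBytes() with an explicit charset to recover the string's original bytes, then build a new string from those bytes with the correct charset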

For example, if the browser sends UTF-8 and the web container decodes with its ISO-8859-1 default, the value can be repaired like this:

    String name = request.getParameter("name");
    name = new String(name.getBytes("ISO-8859-1"), "UTF-8");

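This approach works for both GET and POST requests, provided the container really did decode the value as ISO-8859-1. If the value was already decoded correctly (for instance a GET parameter when uriEncoding is set to UTF-8), converting it again corrupts it.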
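The conversion can be wrapped in a small helper and checked outside a servlet container. This is a sketch under the assumption that the browser sent UTF-8 and the container decoded it as ISO-8859-1; ParameterRepair and repair are hypothetical names, and StandardCharsets is used here only to avoid the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: undoes an ISO-8859-1 decode of bytes that were really UTF-8.
class ParameterRepair {

    static String repair(String misdecoded) {
        if (misdecoded == null) {
            return null; // the parameter was not present in the request
        }
        // ISO-8859-1 maps each char back to the byte it came from, so no data is lost here.
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u56fd\u4eba\u6c11"; // 中国人民

        // Simulate the container: UTF-8 bytes wrongly decoded as ISO-8859-1.
        String asSeenByServlet =
                new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(repair(asSeenByServlet).equals(original)); // true

        // Repairing a value that was never garbled destroys it:
        // the characters do not exist in ISO-8859-1 and become '?'.
        System.out.println(repair(original)); // ????
    }
}
```

The second call shows why the repair must only be applied to values that are known to be mis-decoded.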

Solution 3: use HttpServletRequest.setCharacterEncoding() to specify the encoding used to parse request parameters

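For example, if the browser sends the request as UTF-8, the server must decode it as UTF-8 as well. To do that, run the following statement before reading any request value: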

    request.setCharacterEncoding("UTF-8");

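This only works for POST requests, because setCharacterEncoding() only overrides the encoding of the request body. The URL is handled by the HTTP server rather than the web container: with GET, the HTTP server has already decoded the parameters in the URL as ISO-8859-1 one step earlier, so they are already garbled and setting the encoding afterwards has no effect.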
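Why the call has to come before the first parameter is read can be imitated without a servlet container. The sketch below is not Tomcat's implementation; BodyParsingSketch and parseBody are hypothetical names, and the method only mimics a container that decodes a form body with the configured encoding, or with ISO-8859-1 when none has been set.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the container's form-body parser.
class BodyParsingSketch {

    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Parse "a=1&b=2", decoding with the given encoding (null means "not set").
    static Map<String, String> parseBody(String body, String encoding) {
        String charset = (encoding != null) ? encoding : DEFAULT_ENCODING;
        Map<String, String> params = new LinkedHashMap<>();
        try {
            for (String pair : body.split("&")) {
                int eq = pair.indexOf('=');
                String key = (eq < 0) ? pair : pair.substring(0, eq);
                String value = (eq < 0) ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(key, charset), URLDecoder.decode(value, charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + charset, e);
        }
        return params;
    }

    public static void main(String[] args) {
        // Body as a browser would send it from a UTF-8 page.
        String body = "name=%E4%B8%AD%E5%9B%BD%E4%BA%BA%E6%B0%91";

        // No encoding set: every byte becomes one ISO-8859-1 character.
        System.out.println(parseBody(body, null).get("name").length()); // 12

        // Encoding set before parsing: the four original characters come back.
        System.out.println(parseBody(body, "UTF-8").get("name").equals("\u4e2d\u56fd\u4eba\u6c11")); // true
    }
}
```

Once the body has been parsed with the wrong charset the decoded strings are already stored, which matches the requirement to set the encoding before any request value is read.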

posted @ 2017-11-28 20:22 zhuangrunwei

www.cnblogs.com/zhuangrunwei