Detecting the character encoding of a string

Add the dependency

        <dependency>
            <groupId>com.googlecode.juniversalchardet</groupId>
            <artifactId>juniversalchardet</artifactId>
            <version>1.0.3</version>
        </dependency>

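Wrap it in a utility class

import org.mozilla.universalchardet.UniversalDetector;

public class CharsetUtil {

    /**
     * Detects the charset of a string given as a byte array.
     *
     * @param bytes the string's bytes
     * @return the detected charset name, or "UTF-8" if none was detected
     */
    public static String getCharset(byte[] bytes) {
        String defaultCharset = "UTF-8";
        UniversalDetector detector = new UniversalDetector(null);
        detector.handleData(bytes, 0, bytes.length);
        detector.dataEnd();
        // Read the result before reset(), which clears the detected charset.
        String detectedCharset = detector.getDetectedCharset();
        detector.reset();
        return detectedCharset == null ? defaultCharset : detectedCharset;
    }

}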
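
getCharset returns a charset name as a String, and the detector may report names that the running JVM does not support. Before decoding the bytes it can help to turn that name into a java.nio.charset.Charset with a fallback. The following is only a minimal sketch using the JDK alone; the class CharsetNames and its methods toCharsetOrDefault and decode are hypothetical helpers, not part of juniversalchardet.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of juniversalchardet.
class CharsetNames {

    // Resolves a charset name to a Charset, or returns the fallback when the
    // name is null, syntactically illegal, or unsupported by this JVM.
    static Charset toCharsetOrDefault(String name, Charset fallback) {
        if (name == null) {
            return fallback;
        }
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException is a subclass of IllegalArgumentException.
            return fallback;
        }
    }

    // Decodes bytes using the detected name, falling back to UTF-8.
    static String decode(byte[] bytes, String detectedName) {
        return new String(bytes, toCharsetOrDefault(detectedName, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // In real use the name would come from CharsetUtil.getCharset(bytes).
        System.out.println(decode(bytes, "UTF-8"));
    }
}
```

With this in place, a caller could write something like CharsetNames.decode(bytes, CharsetUtil.getCharset(bytes)) and not worry about an unsupported name throwing at decode time.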

Verify

    @Test
    void getCharset() {
        String hello = "hello, world";
        System.out.println("hello charset: " + CharsetUtil.getCharset(hello.getBytes()));
    }

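Output (depends on the project's charset setting, because getBytes() with no argument encodes with the platform default charset):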

hello charset: UTF-8
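
The test above calls getBytes() without arguments, so the bytes handed to the detector depend on the platform default charset. Passing an explicit charset makes the input reproducible. The sketch below, using only the JDK, shows why this matters: pure ASCII text such as "hello, world" encodes to the same bytes in UTF-8 and ISO-8859-1, while non-ASCII text does not. The class name ExplicitEncoding and the method encodedLength are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names are not from the original post.
class ExplicitEncoding {

    // Number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "hello, world";
        String chinese = "\u4f60\u597d"; // the two characters of "ni hao"

        // ASCII-only text yields identical bytes in both encodings,
        // so a detector cannot tell them apart.
        boolean sameBytes = Arrays.equals(
                ascii.getBytes(StandardCharsets.UTF_8),
                ascii.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("ASCII bytes identical: " + sameBytes);

        // Non-ASCII text differs: 3 bytes per character in UTF-8,
        // 2 bytes per character in UTF-16LE.
        System.out.println("UTF-8 length: " + encodedLength(chinese, StandardCharsets.UTF_8));
        System.out.println("UTF-16LE length: " + encodedLength(chinese, StandardCharsets.UTF_16LE));
    }
}
```

For a more meaningful check of CharsetUtil.getCharset, it is probably worth feeding it non-ASCII text encoded with an explicitly chosen charset instead of a plain ASCII greeting.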
posted @ 漠孤烟