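Garbled WeChat Nicknames: Solution
Background
When user info is pulled through web-page authorization, the nickname comes back garbled.
Cause:
The interface was called without specifying a charset, so the response was decoded with the default charset, ISO-8859-1, which cannot represent Chinese characters or many special characters.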
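The effect can be reproduced without calling WeChat at all. The following is a minimal sketch, assuming a sample nickname and a hypothetical class name `MojibakeDemo`: it takes the UTF-8 bytes of a nickname and decodes them once with ISO-8859-1 and once with UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Decode UTF-8 bytes with the wrong charset (ISO-8859-1): every byte becomes one char.
    static String decodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Decode the same bytes with the correct charset (UTF-8).
    static String decodeAsUtf8(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String nickname = "\u5FAE\u4FE1"; // sample nickname "微信", chosen for illustration
        System.out.println(decodeAsLatin1(nickname)); // garbled: six Latin-1 chars
        System.out.println(decodeAsUtf8(nickname));   // the original two characters
    }
}
```

Each Chinese character here occupies three bytes in UTF-8, so the wrongly decoded string is six characters long instead of two.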
Original code
/**
 * Web-page authorization: pull user info.
 *
 * @param accessToken web-page authorization token (note: not the same as the official account token)
 * @param openId      the user's openId
 * @return the parsed response, or null if the call failed
 */
public @Nullable JSONObject getSnsUserInfo(String accessToken, String openId) {
    String requestUrl = StrUtil.format("https://api.weixin.qq.com/sns/userinfo?access_token={}&openid={}&lang=zh_CN", accessToken, openId);
    log.info("getSnsUserInfo request url: {}", requestUrl);
    try {
        // No charset is specified, so the body is decoded with the default charset (ISO-8859-1)
        String responseStr = restTemplate.getForObject(requestUrl, String.class);
        JSONObject response = JSON.parseObject(responseStr);
        log.info("getSnsUserInfo response: {}", response);
        boolean isSuccess = checkResponseIsSuccess(response, "getSnsUserInfo");
        if (isSuccess) {
            return response;
        }
    } catch (Exception e) {
        log.info("getSnsUserInfo failed: {}", e.getMessage());
    }
    return null;
}
Solution:
New data
Specify the UTF-8 charset when sending the request.
Improved code
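/**
 * Web-page authorization: pull user info.
 *
 * @param accessToken web-page authorization token (note: not the same as the official account token)
 * @param openId      the user's openId
 * @return the parsed response, or null if the call failed
 */
public @Nullable JSONObject getSnsUserInfo(String accessToken, String openId) {
    // SNS_USER_INFO_URL is a constant holding the sns/userinfo URL template
    String requestUrl = StrUtil.format(SNS_USER_INFO_URL, accessToken, openId);
    log.info("getSnsUserInfo request url: {}", requestUrl);
    try {
        // Put a UTF-8 StringHttpMessageConverter at the front of the RestTemplate's converter list.
        // Register it only once: adding it on every call would keep growing the list.
        // Ideally this setup lives where the RestTemplate is created.
        List<HttpMessageConverter<?>> converters = restTemplate.getMessageConverters();
        if (converters.isEmpty() || !(converters.get(0) instanceof StringHttpMessageConverter)) {
            StringHttpMessageConverter stringConverter = new StringHttpMessageConverter(StandardCharsets.UTF_8);
            stringConverter.setSupportedMediaTypes(Collections.singletonList(MediaType.TEXT_PLAIN));
            converters.add(0, stringConverter);
        }
        // Build the request headers: Accept: text/plain and Accept-Charset: UTF-8
        HttpHeaders headers = new HttpHeaders();
        headers.setAccept(Collections.singletonList(MediaType.TEXT_PLAIN));
        headers.set(HttpHeaders.ACCEPT_CHARSET, "UTF-8");
        // getForObject cannot carry custom headers, so send them with exchange
        HttpEntity<Void> requestEntity = new HttpEntity<>(headers);
        ResponseEntity<String> responseEntity = restTemplate.exchange(requestUrl, HttpMethod.GET, requestEntity, String.class);
        String responseStr = responseEntity.getBody();
        JSONObject response = JSON.parseObject(responseStr);
        log.info("getSnsUserInfo response: {}", response);
        boolean isSuccess = checkResponseIsSuccess(response, "getSnsUserInfo");
        if (isSuccess) {
            return response;
        }
    } catch (Exception e) {
        log.info("getSnsUserInfo failed: {}", e.getMessage());
    }
    return null;
}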
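The rule behind this fix can be sketched without Spring: use the charset named in the response's Content-Type if there is one, and otherwise fall back to UTF-8 rather than ISO-8859-1. The class `BodyDecoder` and its methods are hypothetical names for illustration, not part of RestTemplate or the WeChat API, and the header parsing is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BodyDecoder {

    // Pick the charset from a Content-Type value such as "text/plain; charset=ISO-8859-1".
    // Falls back to UTF-8 when no usable charset parameter is present (assumed policy).
    static Charset resolveCharset(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive check for a "charset=" prefix
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Unknown or malformed charset name: ignore it and use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Decode raw response bytes using the resolved charset.
    static String decode(byte[] body, String contentType) {
        return new String(body, resolveCharset(contentType));
    }

    public static void main(String[] args) {
        byte[] body = "{\"nickname\":\"\u5FAE\u4FE1\"}".getBytes(StandardCharsets.UTF_8);
        // No charset in the header, so the body is decoded as UTF-8
        System.out.println(decode(body, "text/plain"));
    }
}
```

With RestTemplate, a comparable effect could be had by requesting the body as `byte[]` and decoding it explicitly, which avoids touching the shared converter list.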
Historical data
Nicknames that were already stored after being wrongly decoded as ISO-8859-1 are converted back to UTF-8.
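@Test
public void test() {
    // A UTF-8 nickname that was wrongly decoded as ISO-8859-1 (every char is in U+0000..U+00FF)
    String wrongEncodedString = "Má´\u0087á´\u0087á´\u009B \u00EA\u00A6\u00BFá\u00AD\u0084 .";
    if (isISO88591(wrongEncodedString)) {
        String newStr = convertStrCharset(wrongEncodedString);
        System.out.println(newStr);
        // Result: Mᴇᴇᴛ ꦿ᭄ .
    }
}

/**
 * Returns true if every character of the string can be represented in ISO-8859-1.
 * Characters outside that range are replaced by '?' when encoding, which breaks the round trip.
 */
private boolean isISO88591(String str) {
    byte[] byteArr = str.getBytes(StandardCharsets.ISO_8859_1);
    String convertedStr = new String(byteArr, StandardCharsets.ISO_8859_1);
    // Compare the original string with the round-tripped string
    return str.equals(convertedStr);
}

private String convertStrCharset(String str) {
    try {
        // Recover the original bytes, assuming the string was wrongly decoded as ISO-8859-1
        byte[] bytes = str.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes as UTF-8 to get the intended string
        return new String(bytes, StandardCharsets.UTF_8);
    } catch (Exception e) {
        log.warn("convertStrCharset failed, errorMsg: {}", e.getMessage());
    }
    return str;
}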
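One caveat with the check above: a nickname that legitimately consists of Latin-1 characters, such as "café", also passes `isISO88591`, and converting it would damage it. A possible refinement, sketched here under the assumption that a stored value should be changed only when its bytes form valid UTF-8, is to decode strictly and keep the original on failure. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class NicknameRepair {

    // Returns the repaired string if str looks like UTF-8 decoded as ISO-8859-1,
    // otherwise returns str unchanged.
    static String repairIfMojibake(String str) {
        boolean hasNonAscii = false;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c > 0xFF) {
                return str; // cannot be the result of an ISO-8859-1 decode
            }
            if (c > 0x7F) {
                hasNonAscii = true;
            }
        }
        if (!hasNonAscii) {
            return str; // pure ASCII is identical in both charsets
        }
        byte[] bytes = str.getBytes(StandardCharsets.ISO_8859_1);
        // Strict decoder: report invalid byte sequences instead of substituting U+FFFD
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return str; // bytes are not valid UTF-8, so treat str as genuine Latin-1 text
        }
    }

    public static void main(String[] args) {
        System.out.println(repairIfMojibake("M\u00E1\u00B4\u0087\u00E1\u00B4\u0087\u00E1\u00B4\u009B")); // repaired
        System.out.println(repairIfMojibake("caf\u00E9")); // left unchanged
    }
}
```

This is still a heuristic: a short Latin-1 string can happen to be valid UTF-8, so it is worth reviewing a sample of the converted rows before updating the whole table.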
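P.S.
ISO-8859-1 cannot represent the characters of every language, in particular Asian languages such as Chinese, Japanese, and Korean. Those languages need a different encoding, for example UTF-8 or UTF-16. A Unicode encoding, usually UTF-8, is generally the recommended choice.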
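Whether a given string fits into a charset can be checked directly with the JDK's `CharsetEncoder.canEncode`. A small sketch, with a hypothetical class name and sample strings chosen for illustration:

```java
import java.nio.charset.StandardCharsets;

public class EncodableCheck {

    // True if every character of s can be represented in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    // True if s can be encoded as UTF-8 (any well-formed Unicode string can).
    static boolean fitsUtf8(String s) {
        return StandardCharsets.UTF_8.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String chinese = "\u5FAE\u4FE1"; // "微信"
        System.out.println(fitsLatin1(chinese)); // false
        System.out.println(fitsUtf8(chinese));   // true
    }
}
```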