Fixing garbled Chinese text in Nginx logs with Python 3

Chinese text in an Nginx log comes out garbled as escape sequences, as shown below:

{\x22code\x22: \x22000\x22, \x22msg\x22: \x22\x5Cu6210\x5Cu529f\x22, \x22data\x22: {\x22store_id\x22: 322589}, \x22subcode\x22: \x22100000\x22}

Decoding with Python 3

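import json

# Pasted as a normal (non-raw) string literal, so Python itself interprets
# the escapes: \x22 becomes " and \x5C becomes a backslash.
msg = """
{\x22code\x22: \x22000\x22, \x22msg\x22: \x22\x5Cu6210\x5Cu529f\x22, \x22data\x22: {\x22store_id\x22: 322589}, \x22subcode\x22: \x22100000\x22}
"""
# raw_unicode_escape maps each code point below 256 back to one byte, which
# reassembles any \xNN-escaped UTF-8 bytes before they are decoded as UTF-8.
res_obj = json.loads(msg.encode('raw_unicode_escape').decode('utf8'))
# ensure_ascii=False prints Chinese characters as-is instead of \uXXXX.
print(json.dumps(res_obj, ensure_ascii=False))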

The result is as follows:

{"code": "000", "msg": "成功", "data": {"store_id": 322589}, "subcode": "100000"}
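
The snippet above works because the log line is pasted into a Python string literal, where Python interprets the \xNN escapes. When lines are read from a log file, those escapes arrive as literal text and have to be converted explicitly. A minimal sketch of that case, assuming the field contains only Nginx's default \xNN escaping and UTF-8 data; the helper name unescape_nginx is hypothetical:

```python
import json
import re

# Matches the \xNN escapes that Nginx writes under its default escaping.
_HEX_ESCAPE = re.compile(rb'\\x([0-9A-Fa-f]{2})')


def unescape_nginx(field: str) -> str:
    """Turn literal \\xNN sequences back into bytes, then decode as UTF-8."""
    raw = field.encode('utf-8')
    # Replace each escape with the single byte it stands for (one pass only,
    # so an escaped backslash is never re-interpreted).
    raw = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return raw.decode('utf-8')


# A raw string stands in for a line read from the log file.
line = r'{\x22code\x22: \x22000\x22, \x22msg\x22: \x22\x5Cu6210\x5Cu529f\x22, \x22data\x22: {\x22store_id\x22: 322589}, \x22subcode\x22: \x22100000\x22}'
res_obj = json.loads(unescape_nginx(line))
print(json.dumps(res_obj, ensure_ascii=False))
```

The same helper also handles Chinese text that Nginx escaped byte by byte, for example \xE6\x88\x90\xE5\x8A\x9F.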

Summary

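  • With its default escaping, Nginx writes ", \ and every byte below 32 or above 126 in a logged variable as \xXX, so UTF-8 Chinese text ends up as hex escapes. In the sample above it is the quotes (\x22) and the backslash of the application's \uXXXX escapes (\x5C) that were rewritten.
  • In Python 3, encoding and then decoding, msg.encode('raw_unicode_escape').decode('utf8'), completes the conversion once the log line has been pasted into a normal (non-raw) string literal, where Python itself interprets the \xNN escapes. Text read directly from a log file still contains those escapes as literal characters and needs them converted first.
  • Nginx can keep Chinese text readable in the log: add escape=json (available since Nginx 1.11.8) when defining the access log format. Non-ASCII bytes are then written unchanged, and only ", \ and control characters are escaped, in JSON style. For example: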
log_format  main escape=json '$remote_addr - $remote_user [$time_local] "$request" ' 
'$status $body_bytes_sent "$http_referer" ' 
'"$http_user_agent" "$http_x_forwarded_for"';  
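
With escape=json a logged JSON body is still escaped once (" becomes \" and \ becomes \\), so it needs one unescaping step before it can be parsed. A minimal sketch, assuming the field has already been extracted from a log line written with escape=json; the helper name parse_json_escaped and the sample value are hypothetical:

```python
import json


def parse_json_escaped(field: str):
    """Parse a JSON document that Nginx logged with escape=json."""
    # Wrapping the field in quotes makes it a valid JSON string literal,
    # so the first loads() undoes Nginx's escaping.
    original = json.loads('"' + field + '"')
    # The second loads() parses the application's own JSON.
    return json.loads(original)


# Hypothetical field value as it would appear in the log file.
field = r'{\"code\": \"000\", \"msg\": \"成功\"}'
print(parse_json_escaped(field))
```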
posted on 2022-01-20 15:25  JentZhang