python基础篇----字符串unicode

python中处理中文常要用到unicode，因为较容易遇到字符串编码的问题，我一般都是将字符串统一转成unicode去处理

python中定义一个unicode字符串，可以在字符串前面加u：

str=u"hello world"

python中定义不转义的字符串，可以在字符串前面加r：

path=r"c:\programfile\test"

解码将其他字符串格式转为unicode：

ret=str.decode("gb2312")
ret=str.decode("ascii")
ret=str.decode("utf-8")

编码将unicode字符转为其他字符串格式：

ret=str.encode(“gb2312”)
ret=str.encode("ascii")
ret=str.encode("utf-8")

chardet判断字符串为何种编码格式：

encode = chardef.detect（str）
print encode['encoding']

字符串格式化%s

print "test for %s, value is %d"%("format", 123)

一般在py文件开始的时候都加上#encoding=utf-8，避免文件中有中文乱码

处理字符串问题最主要是知道字符串输入的时候是什么格式，在输入的时候处理好字符串，处理过程就好办了

posted @ 2016-08-27 09:46 crazymanpj 阅读(1094) 评论(0) 收藏举报

刷新页面返回顶部

crazymanpj