中文unicode范围及unicode编解码
中文unicode范围 : [\u4e00-\u9fa5]
普通字符串可以用多种方式编码成Unicode字符串,具体要看你究竟选择了哪种编码:
unicodestring = u"Hello world"
# 将Unicode转化为普通Python字符串:"encode"
utf8string = unicodestring.encode("utf-8") asciistring = unicodestring.encode("ascii") isostring = unicodestring.encode("ISO-8859-1") utf16string = unicodestring.encode("utf-16")
# 将普通Python字符串转化为Unicode:"decode"
plainstring1 = unicode(utf8string, "utf-8") plainstring2 = unicode(asciistring, "ascii") plainstring3 = unicode(isostring, "ISO-8859-1") plainstring4 = unicode(utf16string, "utf-16") assert plainstring1 == plainstring2 == plainstring3 == plainstring4
每天一小步,人生一大步!Good luck~