UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 4: ordinal not in range(128)
Rohit Agarwal's notes. Source: https://notes.rohitagarwal.org/2013/05/28/fixing-unicodedecodeerror-in-python.html
Fixing UnicodeDecodeError in Python
>>> a = "He said, “Hi, there.” She didn't reply."
>>> type(a)
<type 'str'>
>>> a
"He said, \xe2\x80\x9cHi, there.\xe2\x80\x9d She didn't reply."
>>> print a
He said, “Hi, there.” She didn't reply.
a is a str: a byte string encoded in utf-8.
>>> b = unicode(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)
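This does not work because ascii is the default encoding in Python 2. With no encoding argument, unicode(a) tries to decode a as ascii, and the byte 0xe2 is outside the ascii range.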
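The failure can be reproduced and inspected in Python 3 as well. The following is a minimal sketch, assuming Python 3, where bytes plays the role of Python 2's str and the ascii decode has to be requested explicitly. The variable name err is only for illustration.

```python
# Python 3 sketch: bytes stands in for the Python 2 str from the session above.
a = "He said, \u201cHi, there.\u201d She didn't reply.".encode("utf-8")

try:
    a.decode("ascii")  # roughly what Python 2's unicode(a) attempts implicitly
    err = None
except UnicodeDecodeError as exc:
    err = exc

# The exception records which codec failed, at which offset, and on which byte.
print(err.encoding, err.start, hex(a[err.start]))
```

The offset and byte reported here line up with the "byte 0xe2 in position 9" part of the traceback: position 9 is the first byte of the opening curly quote.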
>>> b = unicode(a, "utf-8")
>>> type(b)
<type 'unicode'>
>>> b
u"He said, \u201cHi, there.\u201d She didn't reply."
>>> print b
He said, “Hi, there.” She didn't reply.
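b is not a str. It is a unicode object, which holds code points rather than bytes, so it has no encoding of its own. You can encode it with whichever encoding you need.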
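To illustrate that the text itself carries no encoding, here is a small sketch, assuming Python 3 (where str is the counterpart of Python 2's unicode). The same text is encoded two ways; the choice of utf-16-le as the second codec is arbitrary.

```python
# Python 3 sketch: one piece of text, two different byte representations.
b = "\u201cHi, there.\u201d"

as_utf8 = b.encode("utf-8")       # curly quotes take 3 bytes each
as_utf16 = b.encode("utf-16-le")  # every character here takes 2 bytes

# Number of characters, then the byte length under each encoding.
print(len(b), len(as_utf8), len(as_utf16))

# Decoding each byte string with its own codec recovers the same text.
same_text = as_utf8.decode("utf-8") == as_utf16.decode("utf-16-le") == b
```

The byte strings differ, but both decode back to the same text, which is why the encoding is a property of the bytes and not of the unicode object.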
>>> c = b.encode("utf-8")
>>> type(c)
<type 'str'>
>>> c
"He said, \xe2\x80\x9cHi, there.\xe2\x80\x9d She didn't reply."
>>> print c
He said, “Hi, there.” She didn't reply.
c is now the same as a. It is a str encoded in utf-8, and we created it by encoding a unicode object.
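The round trip can be checked directly. This is a sketch under the assumption of Python 3, with a written as a bytes literal that matches the repr shown in the session above.

```python
# Python 3 sketch: decode utf-8 bytes to text, then encode back to bytes.
a = b"He said, \xe2\x80\x9cHi, there.\xe2\x80\x9d She didn't reply."

b = a.decode("utf-8")  # text (Python 2: unicode object)
c = b.encode("utf-8")  # bytes again (Python 2: str)

print(c == a)
```

As long as the same codec is used in both directions, decoding and re-encoding gives back the original bytes.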
>>> d = a.encode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)
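a is already encoded in utf-8. Calling encode on a str makes Python 2 first decode a into a unicode object and then encode the result. The decoding step fails because it assumes the default encoding, ascii. That is why we get a UnicodeDecodeError even though we called encode.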
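The two-step behaviour can be imitated explicitly. The sketch below assumes Python 3, where bytes has no encode method, so the implicit step is spelled out. py2_style_encode is a hypothetical helper written for this illustration, not a real Python API.

```python
def py2_style_encode(byte_string, target, default="ascii"):
    """Hypothetical helper: mimic Python 2's str.encode by decoding with
    the default codec first and then encoding to the target codec."""
    return byte_string.decode(default).encode(target)

a = "\u201cHi\u201d".encode("utf-8")  # utf-8 bytes containing non-ascii

try:
    py2_style_encode(a, "utf-8")
    failed_step = None
except UnicodeDecodeError:
    failed_step = "decode"  # the hidden decode step is what breaks
except UnicodeEncodeError:
    failed_step = "encode"

print(failed_step)
```

Pure-ascii input passes through this helper unchanged, which is why the problem in Python 2 often only shows up once non-ascii data arrives.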
>>> e = a.decode("utf-8")
>>> type(e)
<type 'unicode'>
>>> e
u"He said, \u201cHi, there.\u201d She didn't reply."
>>> print e
He said, “Hi, there.” She didn't reply.
Now e is the same as b. It is a unicode object.
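The constructor form and the method form give the same result. As a rough Python 3 analogue (an assumption, since the session above is Python 2), str(a, "utf-8") corresponds to unicode(a, "utf-8"):

```python
# Python 3 sketch: two equivalent ways to decode utf-8 bytes into text.
a = b"He said, \xe2\x80\x9cHi, there.\xe2\x80\x9d"

b = str(a, "utf-8")    # constructor form, like unicode(a, "utf-8") in Python 2
e = a.decode("utf-8")  # method form

print(b == e)
```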
>>> f = a.decode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)
This just demonstrates what we said earlier: a cannot be decoded as ascii.
>>> g = b.encode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 9: ordinal not in range(128)
Note that this is a UnicodeEncodeError, not a UnicodeDecodeError. A unicode object that contains characters outside the ascii range cannot be encoded as ascii.
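If ascii output is really required, encode accepts an error handler as a second argument. The sketch below assumes Python 3 and a shortened sample string. Which handler is appropriate depends on whether losing or escaping characters is acceptable, and none of them is part of the original notes.

```python
# Python 3 sketch: strict ascii encoding fails, error handlers give lossy output.
b = "He said, \u201cHi\u201d"

try:
    b.encode("ascii")  # default handler is "strict"
    position = None
except UnicodeEncodeError as exc:
    position = exc.start  # index of the first character that cannot be encoded

replaced = b.encode("ascii", "replace")          # unencodable characters become ?
ignored = b.encode("ascii", "ignore")            # unencodable characters are dropped
escaped = b.encode("ascii", "backslashreplace")  # unencodable characters become \uXXXX

print(position, replaced, ignored, escaped)
```

All three handlers change or discard information, so encoding to utf-8 remains the safer choice whenever the consumer can accept it.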
posted on 2019-12-30 17:01 by zhangmingda