Common Python Web Scraping Pitfalls and How to Fix Them

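1. HTTP Error 403: Forbidden when making a request

Some sites reject requests that carry urllib's default User-Agent. Send a browser-like User-Agent header with the request: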

import urllib.request

# url is the address of the page being fetched
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}
req = urllib.request.Request(url=url, headers=headers)
html = urllib.request.urlopen(req).read()

Details: https://www.2cto.com/kf/201309/242273.html
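
The same fix can be wrapped in a small helper so that every request carries the header. This is only a minimal sketch: the names `BROWSER_UA`, `build_request` and `fetch` are made up for illustration, and returning `None` on a 403 is an assumption about how a caller might want to handle the error.

```python
import urllib.error
import urllib.request

# Hypothetical default; any browser-like User-Agent string should work here.
BROWSER_UA = ("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) "
              "Gecko/20100101 Firefox/23.0")


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url=url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the server answers 403."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 403:
            # Still forbidden: the site probably checks more than the User-Agent.
            return None
        raise
```

If `fetch` still returns `None`, the site is likely checking something besides the User-Agent (cookies, Referer, request rate), which this sketch does not cover.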

2. Python UnicodeEncodeError: 'gbk' codec can't encode character when saving HTML content

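When no encoding is given, open() in text mode uses the platform's default encoding, which is gbk on Chinese-language Windows, and gbk cannot represent every character that may appear in a page. Replace

f = open("out.html", "w")

with

f = open("out.html", "w", encoding='utf-8')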

Details: http://www.jb51.net/article/64816.htm
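
As a rough sketch of the fix in use, the helper below always writes with an explicit UTF-8 encoding and closes the file through a `with` block. The function name `save_html` and its default path are made up for illustration.

```python
def save_html(text, path="out.html"):
    """Write text to path as UTF-8, regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


if __name__ == "__main__":
    # '\xa0' (non-breaking space) is a typical character that gbk cannot encode.
    save_html("<p>price:\xa0100</p>")
```

When reading the file back, pass the same `encoding='utf-8'` to `open()`, otherwise the platform default is used again.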

posted @ 2018-01-06 16:26 程序生(Codey)
