Scraping a recruitment site with Scrapy: a problem encountered when converting items to dict
Pipelines code:
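import json

class TencentJsonPipeline(object):
    def __init__(self):
        # Binary mode: this is what triggers the error shown below
        self.file = open('tencent.json', 'wb')

    def process_item(self, item, spider):
        # json.dumps returns a str, not bytes
        content = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(content)
        return item

    # Scrapy calls close_spider(spider) when the spider finishes;
    # a method named close_project would never be invoked
    def close_spider(self, spider):
        self.file.close()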
Error:
    self.file.write(content)
TypeError: a bytes-like object is required, not 'str'
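This is a basic encoding issue. In Python 3, json.dumps returns a str, but a file opened in binary mode ('wb') only accepts bytes. The JSON file should be opened in text mode ('w') with encoding='utf-8' instead of 'wb'.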
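To make the cause concrete, here is a minimal sketch that reproduces the error outside Scrapy and shows two ways around it. The sample record and the temporary file path are made up for illustration; only the standard library is used.

```python
import json
import os
import tempfile

# Made-up sample data standing in for a scraped item
record = {"position": "后端工程师", "location": "深圳"}
content = json.dumps(record, ensure_ascii=False) + "\n"  # a str, not bytes

path = os.path.join(tempfile.mkdtemp(), "demo.json")

# Binary mode rejects str: this reproduces the TypeError from the pipeline
with open(path, "wb") as f:
    try:
        f.write(content)
        error = None
    except TypeError as exc:
        error = str(exc)

# Option 1: text mode with an explicit encoding (the fix used in this post)
with open(path, "w", encoding="utf-8") as f:
    f.write(content)

# Option 2: stay in binary mode, but encode the str to bytes yourself
with open(path, "ab") as f:
    f.write(content.encode("utf-8"))

# Both writes produce the same UTF-8 JSON line
with open(path, encoding="utf-8") as f:
    lines = [json.loads(line) for line in f]
```

Either option works; text mode keeps the encoding decision in one place (the open call), which is why it is the simpler choice for a pipeline.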
Corrected code:
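import json

class TencentJsonPipeline(object):
    def __init__(self):
        # Text mode with an explicit encoding, so write() accepts str
        self.file = open('tencent.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # ensure_ascii=False keeps Chinese characters readable in the output
        content = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(content)
        return item

    # Scrapy calls close_spider(spider) when the spider finishes
    def close_spider(self, spider):
        self.file.close()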
With this change the spider runs normally.
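As a variation, the file can be opened in Scrapy's open_spider hook rather than in __init__, so that opening and closing are paired. The sketch below is an assumption-laden illustration, not the code from this post: the class name JsonLinesPipeline and the path parameter are hypothetical, and the pipeline is driven by hand with a plain dict and spider=None so it can run without Scrapy installed.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical pipeline: writes one UTF-8 JSON object per line."""

    def __init__(self, path="tencent.json"):
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Text mode with an explicit encoding, so write() accepts str
        self.file = open(self.path, "w", encoding="utf-8")

    def process_item(self, item, spider):
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item

    def close_spider(self, spider):
        self.file.close()


# Drive the hooks by hand, in the order Scrapy would call them
out_path = os.path.join(tempfile.mkdtemp(), "tencent.json")
pipeline = JsonLinesPipeline(out_path)
pipeline.open_spider(spider=None)
pipeline.process_item({"position": "数据分析师"}, spider=None)  # made-up item
pipeline.close_spider(spider=None)

# Read the file back to confirm the line is valid JSON
with open(out_path, encoding="utf-8") as f:
    raw = f.read()
saved = [json.loads(line) for line in raw.splitlines()]
```

Reading the output back like this is also a quick way to check that each line parses and that non-ASCII text was written unescaped.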
Reference: https://stackoverflow.com/questions/44682018/typeerror-object-of-type-bytes-is-not-json-serializable