04-Data Storage - Post Category - 不是霉蛋

<4> pipeline

Summary: """Scrapy: saving data through a pipeline""" from scrapy.exporters import CsvItemExporter class CsvPipeline: def __init__(self): # initialize file storage self.file = open('filename.csv', ' ...

posted @ 2022-11-02 16:30 不是霉蛋 Views (51) Comments (0) Recommendations (0)

<3> MongoDB Storage

Summary: from pymongo import MongoClient class Spider(object): def __init__(self): # store the data in the database try: self.client = MongoClient('localhost', 27017) self.sina_db ...

posted @ 2022-11-02 16:20 不是霉蛋 Views (36) Comments (0) Recommendations (0)

<2> MySQL Storage

Summary: import mysql.connector """Data model class""" class QingHuaModel(object): def __init__(self, title, time, contents): self.title = title self.time = time self.con ...

posted @ 2022-11-02 16:14 不是霉蛋 Views (39) Comments (0) Recommendations (0)
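
The summary above is cut off before any database code appears, so the following is only a rough sketch of how a model like `QingHuaModel` might be written to a table. It assumes the three fields shown in the excerpt and deliberately swaps in the standard-library `sqlite3` module for `mysql.connector` so that it runs without a MySQL server. The table name `qinghua`, the `save_all` helper, and the `demo` function are hypothetical and do not come from the original post.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass
class QingHuaModel:
    """Data model with the three fields visible in the excerpt."""
    title: str
    time: str
    contents: str


def save_all(conn, models):
    """Insert every model as one row (hypothetical helper, not from the post)."""
    # Table name and column types are assumptions made for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS qinghua (title TEXT, time TEXT, contents TEXT)"
    )
    # Parameterized query: values are passed separately from the SQL text.
    conn.executemany(
        "INSERT INTO qinghua (title, time, contents) VALUES (?, ?, ?)",
        [astuple(m) for m in models],
    )
    conn.commit()


def demo():
    # In-memory database, used here only as a stand-in for a MySQL connection.
    conn = sqlite3.connect(":memory:")
    save_all(conn, [QingHuaModel("notice", "2022-11-02", "body text")])
    rows = conn.execute("SELECT title, time, contents FROM qinghua").fetchall()
    conn.close()
    return rows
```

With `mysql.connector` the same idea would go through a cursor and use `%s` placeholders instead of `?`; the connection parameters would have to be supplied by the reader.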

<1> CSV Storage

Summary: """Method 1 (within Scrapy):""" scrapy crawl <spider name> -o <CSV file to save> """Method 2 (commonly used):""" from scrapy.exporters import CsvItemExporter class CsvPipeline: def __init__(self): # ...

posted @ 2022-11-02 16:08 不是霉蛋 Views (48) Comments (0) Recommendations (0)
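
Since the pipeline code in the summary is truncated, here is a minimal sketch of what a CSV-writing pipeline could look like. It is not the author's implementation: it uses the standard-library `csv` module instead of `CsvItemExporter` so that it runs without Scrapy installed. It does follow the hook names Scrapy calls on item pipelines (`open_spider`, `process_item`, `close_spider`). The field names `title`, `time`, `contents` and the `demo` function are assumptions made for illustration.

```python
import csv
import os
import tempfile


class CsvPipeline:
    """Pipeline-shaped CSV writer built on the standard-library csv module."""

    def __init__(self, path="filename.csv", fields=("title", "time", "contents")):
        # The field names are hypothetical; use the fields of your own items.
        self.path = path
        self.fields = list(fields)
        self.file = None
        self.writer = None

    def open_spider(self, spider=None):
        # newline="" prevents blank lines between rows on Windows.
        self.file = open(self.path, "w", newline="", encoding="utf-8")
        self.writer = csv.DictWriter(
            self.file, fieldnames=self.fields, extrasaction="ignore"
        )
        self.writer.writeheader()

    def process_item(self, item, spider=None):
        # Write one row per item, then pass the item on unchanged.
        self.writer.writerow(dict(item))
        return item

    def close_spider(self, spider=None):
        self.file.close()


def demo():
    # Drive the pipeline by hand, the way Scrapy would call the hooks.
    path = os.path.join(tempfile.mkdtemp(), "demo.csv")
    pipeline = CsvPipeline(path)
    pipeline.open_spider()
    pipeline.process_item({"title": "a", "time": "2022-11-02", "contents": "x"})
    pipeline.close_spider()
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

In a real Scrapy project the class would also need to be listed under `ITEM_PIPELINES` in `settings.py` before the hooks are called.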
