scrapy (4): Using redis

The project source code is available on my GitHub: https://github.com/corolcorona/spider_scrapy

 

1. Run the following command to install the scrapy-redis module

pip install scrapy-redis

2. settings.py

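(If the setting DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter" is missing, Scrapy reports this error:

exceptions.ValueError: ("Failed to instantiate dupefilter class '%s': %s", 'scrapy.dupefilters.RFPDupeFilter', TypeError("__init__() got an unexpected keyword argument 'key'",))

As the message shows, Scrapy's default dupefilter is being instantiated with a key argument that it does not accept, so the scrapy-redis dupefilter has to be configured explicitly.)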

DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
SCHEDULER_ORDER = 'BFO'
SCHEDULER_PERSIST = True
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderPriorityQueue'
REDIS_URL = None
REDIS_HOST = '127.0.0.1'
REDIS_PORT = 6379
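
A crawl configured this way needs a redis server listening on REDIS_HOST:REDIS_PORT (as far as I know, with REDIS_URL left as None scrapy-redis uses the host and port). Below is a minimal sketch for checking that before starting the spider. The helper name redis_port_open is hypothetical, and it only tests TCP reachability with the standard library; it does not speak the redis protocol.

```python
import socket

# Values copied from the settings above.
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379


def redis_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it only shows that something is listening
    on the port, not that the listener is actually redis.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


if __name__ == "__main__":
    if redis_port_open(REDIS_HOST, REDIS_PORT):
        print("redis port is reachable")
    else:
        print("nothing is listening on %s:%d" % (REDIS_HOST, REDIS_PORT))
```

If the check fails, start the redis server (or correct REDIS_HOST / REDIS_PORT) before running the crawl.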

3. In novel.py, import the redis-backed spider class

 

from scrapy_redis.spiders import RedisSpider

 

posted @ 2017-05-03 12:35  corolcorona