Crawling novels with the Scrapy framework and storing them in a database

 http://www.cnblogs.com/GUIDAO/p/6690759.html

My steps:

1>settings.py:

BOT_NAME = 'newding'
SPIDER_MODULES = ['newding.spiders']
NEWSPIDER_MODULE = 'newding.spiders'
ROBOTSTXT_OBEY = True

ITEM_PIPELINES = {
    'newding.pipelines.NewdingPipeline': 300,
}

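The settings above are generated automatically when the project is created. Note that the project template ships ITEM_PIPELINES commented out, so it has to be uncommented to enable the pipeline.

The following settings are for the database storage stage: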

MYSQL_USER = 'root'
MYSQL_PASSWORD = '12345678'
MYSQL_HOST = '127.0.0.1'
MYSQL_PORT = '3306'
MYSQL_DB = 'xiaoshuo'
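
The post does not show the body of `NewdingPipeline`, so here is a minimal sketch of how a pipeline might pick up the `MYSQL_*` settings above and insert each item. It assumes the third-party `pymysql` driver is installed and that a table named `novels` with columns matching the item fields already exists; both the driver choice and the table name are my assumptions, not something taken from the original project.

```python
class NewdingPipeline:
    """Sketch of a pipeline that stores each scraped item in MySQL."""

    # Hypothetical table and column names; adjust to the real schema.
    INSERT_SQL = (
        "INSERT INTO novels (title, types, zijie, book_url) "
        "VALUES (%s, %s, %s, %s)"
    )

    def __init__(self, host, port, user, password, db):
        # MYSQL_PORT is stored as a string in settings.py, so convert it.
        self.conn_args = dict(
            host=host,
            port=int(port),
            user=user,
            password=password,
            database=db,
            charset="utf8mb4",
        )
        self.conn = None

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook and hands over the crawler with its settings.
        s = crawler.settings
        return cls(
            host=s.get("MYSQL_HOST"),
            port=s.get("MYSQL_PORT"),
            user=s.get("MYSQL_USER"),
            password=s.get("MYSQL_PASSWORD"),
            db=s.get("MYSQL_DB"),
        )

    def open_spider(self, spider):
        # Imported here so the module loads even without the driver installed.
        import pymysql  # assumed driver, not named in the original post
        self.conn = pymysql.connect(**self.conn_args)

    def close_spider(self, spider):
        if self.conn is not None:
            self.conn.close()

    @staticmethod
    def row_from_item(item):
        # Missing fields become None (NULL in MySQL).
        return (
            item.get("title"),
            item.get("types"),
            item.get("zijie"),
            item.get("book_url"),
        )

    def process_item(self, item, spider):
        with self.conn.cursor() as cursor:
            cursor.execute(self.INSERT_SQL, self.row_from_item(item))
        self.conn.commit()
        return item  # pass the item on to any later pipeline
```

Using a parameterized query (the `%s` placeholders) rather than string formatting keeps titles with quotes in them from breaking the insert.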
2>RUN.py
from scrapy.cmdline import execute

execute(['scrapy', 'crawl', 'newding1s'])  # run the project's crawl command
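
If you want the launcher to be a bit more flexible, one possible variation is to build the argument list in a small helper. This is only a sketch: `build_crawl_argv` and `run` are hypothetical names I made up, and the only Scrapy features relied on are `scrapy.cmdline.execute` and the `-o` option of `scrapy crawl`, which writes the scraped items to a feed file.

```python
def build_crawl_argv(spider_name, output_file=None):
    """Build the argv list that scrapy.cmdline.execute expects."""
    argv = ["scrapy", "crawl", spider_name]
    if output_file:
        # -o exports scraped items to a file; format follows the extension.
        argv += ["-o", output_file]
    return argv


def run(spider_name="newding1s", output_file=None):
    # Imported lazily so build_crawl_argv can be used without Scrapy installed.
    from scrapy.cmdline import execute
    execute(build_crawl_argv(spider_name, output_file))
```

Calling `run()` from a script in the project root (next to scrapy.cfg) should behave like the one-liner above, while `run(output_file="books.json")` would additionally dump the items to a JSON file.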
3>items.py
import scrapy
class NewdingItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    title = scrapy.Field()
    types = scrapy.Field()
    zijie = scrapy.Field()
    book_url = scrapy.Field()
posted @ 2017-06-30 13:57  航林