Scraping novels with the Scrapy framework and storing them in a database

http://www.cnblogs.com/GUIDAO/p/6690759.html

My steps:

1> settings.py:

BOT_NAME = 'newding'
SPIDER_MODULES = ['newding.spiders']
NEWSPIDER_MODULE = 'newding.spiders'
ROBOTSTXT_OBEY = True

ITEM_PIPELINES = {
    'newding.pipelines.NewdingPipeline': 300,
}

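The settings above are generated automatically when the project is created (ITEM_PIPELINES is commented out in the generated file and has to be uncommented).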

The following settings are for the stage that stores the data in the database:

MYSQL_USER = 'root'
MYSQL_PASSWORD = '12345678'
MYSQL_HOST = '127.0.0.1'
MYSQL_PORT = '3306'
MYSQL_DB = 'xiaoshuo'
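
The post does not show how these MYSQL_* values are consumed, so here is a minimal sketch of one way to turn them into connection arguments. It assumes the PyMySQL driver (the post does not name one); the helper name `mysql_connect_kwargs` and the `utf8mb4` charset are my own choices for illustration. Note that MYSQL_PORT is stored as a string above, so the sketch casts it to an int to be safe.

```python
def mysql_connect_kwargs(settings):
    """Map the MYSQL_* settings to keyword arguments for pymysql.connect().

    `settings` can be Scrapy's settings object or any mapping with .get().
    """
    return {
        "host": settings.get("MYSQL_HOST"),
        # settings.py stores the port as the string '3306'; cast it to int.
        "port": int(settings.get("MYSQL_PORT")),
        "user": settings.get("MYSQL_USER"),
        "password": settings.get("MYSQL_PASSWORD"),
        "database": settings.get("MYSQL_DB"),
        # Assumption: the novel text is Chinese, so use a full UTF-8 charset.
        "charset": "utf8mb4",
    }


# A plain dict stands in for the Scrapy settings here.
example_settings = {
    "MYSQL_USER": "root",
    "MYSQL_PASSWORD": "12345678",
    "MYSQL_HOST": "127.0.0.1",
    "MYSQL_PORT": "3306",
    "MYSQL_DB": "xiaoshuo",
}
kwargs = mysql_connect_kwargs(example_settings)
```

With PyMySQL installed, `pymysql.connect(**kwargs)` would then open the connection.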

2> RUN.py:

from scrapy.cmdline import execute

execute(['scrapy', 'crawl', 'newding1s'])  # run the project's crawl command

3> items.py:
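
A slightly more defensive variant of RUN.py could look like the sketch below. It assumes RUN.py sits in the project root next to scrapy.cfg, and the helper names `crawl_argv` and `run` are hypothetical. Scrapy locates the project through scrapy.cfg, so the sketch switches to the script's own directory first; this matters when the file is started from an IDE with a different working directory. Spider arguments are passed with the `-a NAME=VALUE` option of `scrapy crawl`.

```python
import os


def crawl_argv(spider_name, **spider_args):
    """Build the argument list for `scrapy crawl <spider> [-a key=value ...]`."""
    argv = ["scrapy", "crawl", spider_name]
    for key, value in spider_args.items():
        argv += ["-a", "{}={}".format(key, value)]
    return argv


def run(spider_name, **spider_args):
    """Run the crawl from the directory that contains this script."""
    # Imported here so that crawl_argv() has no Scrapy dependency.
    from scrapy.cmdline import execute

    # Assumption: this file lives next to scrapy.cfg in the project root.
    os.chdir(os.path.dirname(os.path.abspath(__file__)))
    execute(crawl_argv(spider_name, **spider_args))


# In RUN.py one would then call: run("newding1s")
```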

import scrapy


class NewdingItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    # pass
    title = scrapy.Field()
    types = scrapy.Field()
    zijie = scrapy.Field()
    book_url = scrapy.Field()
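
The settings register `newding.pipelines.NewdingPipeline`, but the post stops before showing it. The following is only a sketch of what such a pipeline might look like, not the author's code. It assumes the PyMySQL driver and a hypothetical table named `novels` whose columns match the four item fields; neither is given in the post.

```python
FIELDS = ("title", "types", "zijie", "book_url")  # fields declared in NewdingItem
TABLE = "novels"  # hypothetical table name, not given in the post


def build_insert(item, table=TABLE, fields=FIELDS):
    """Return a parameterized INSERT statement and its values for one item."""
    columns = ", ".join(fields)
    placeholders = ", ".join(["%s"] * len(fields))
    sql = "INSERT INTO {} ({}) VALUES ({})".format(table, columns, placeholders)
    values = tuple(item.get(name) for name in fields)
    return sql, values


class NewdingPipeline:
    def __init__(self, connect_kwargs):
        self.connect_kwargs = connect_kwargs
        self.conn = None

    @classmethod
    def from_crawler(cls, crawler):
        # Read the MYSQL_* values defined in settings.py.
        s = crawler.settings
        return cls({
            "host": s.get("MYSQL_HOST"),
            "port": int(s.get("MYSQL_PORT")),  # stored as a string in settings.py
            "user": s.get("MYSQL_USER"),
            "password": s.get("MYSQL_PASSWORD"),
            "database": s.get("MYSQL_DB"),
            "charset": "utf8mb4",
        })

    def open_spider(self, spider):
        import pymysql  # assumed driver, imported lazily
        self.conn = pymysql.connect(**self.connect_kwargs)

    def close_spider(self, spider):
        if self.conn is not None:
            self.conn.close()

    def process_item(self, item, spider):
        sql, values = build_insert(item)
        # The driver fills in the %s placeholders, so values are escaped for us.
        with self.conn.cursor() as cursor:
            cursor.execute(sql, values)
        self.conn.commit()
        return item
```

Passing the values separately from the SQL text, instead of formatting them into the string, avoids quoting problems with titles that contain apostrophes or other special characters.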

posted @ 2017-06-30 13:57 by 航林
