Scrapy crawler: settings.py

# Obey robots.txt rules
ROBOTSTXT_OBEY = False  # do not obey the site's robots.txt rules

# See also autothrottle settings and docs
DOWNLOAD_DELAY = 3  # wait 3 seconds between downloads so the crawl does not overload (effectively attack) the site
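
The comment above points to the AutoThrottle settings. As a rough sketch of how they could sit next to DOWNLOAD_DELAY, the fragment below uses Scrapy's documented setting names; the numeric values are illustrative assumptions, not taken from the original project.

```python
# settings.py fragment; the values here are illustrative assumptions.
DOWNLOAD_DELAY = 3               # base delay in seconds, as in the original settings
RANDOMIZE_DOWNLOAD_DELAY = True  # Scrapy default: wait between 0.5x and 1.5x DOWNLOAD_DELAY

AUTOTHROTTLE_ENABLED = True            # adapt the delay to the latency the site actually shows
AUTOTHROTTLE_START_DELAY = 3           # initial delay in seconds
AUTOTHROTTLE_MAX_DELAY = 30            # upper bound on the delay when the server is slow
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0  # average number of parallel requests per remote server
```

With AutoThrottle enabled, DOWNLOAD_DELAY still acts as the lower bound on the delay, so the 3-second setting keeps its protective effect.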

# Override the default request headers:
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',  # default headers sent with every request
    'Accept-Language': 'en',
}
# Configure item pipelines
# See https://doc.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
    'xiaoshuo.pipelines.XiaoshuoPipeline': 300,  # the lower the number, the higher the priority (runs earlier)
}
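
To make the priority rule concrete, here is a small sketch in plain Python (it does not import Scrapy). `CleanupPipeline` is a hypothetical second pipeline added only for illustration, and `pipeline_order` is a helper invented for this example; it mimics the documented behaviour that items pass through pipelines from the lowest value to the highest.

```python
# CleanupPipeline is hypothetical; only XiaoshuoPipeline appears in the original settings.
ITEM_PIPELINES = {
    'xiaoshuo.pipelines.XiaoshuoPipeline': 300,
    'xiaoshuo.pipelines.CleanupPipeline': 100,
}


def pipeline_order(pipelines):
    """Return pipeline paths in the order items would pass through them (ascending value)."""
    return sorted(pipelines, key=pipelines.get)


order = pipeline_order(ITEM_PIPELINES)
for path in order:
    print(ITEM_PIPELINES[path], path)
```

Here the hypothetical cleanup step (100) would see each item before XiaoshuoPipeline (300). Values are customarily chosen in the 0 to 1000 range.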

FEED_EXPORT_ENCODING = 'utf-8'  # export feeds as UTF-8 so non-ASCII text (e.g. Chinese) is not garbled
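
The "garbled" output this setting addresses is typically escaped text in JSON feeds. The sketch below reproduces the effect with the standard library's `json` module rather than Scrapy's own exporter, so treat it as an analogy; the sample item is made up for illustration.

```python
import json

item = {'title': '小说'}  # hypothetical scraped item with a Chinese title

# ASCII-safe output: non-ASCII characters become \uXXXX escape sequences.
escaped = json.dumps(item)

# UTF-8 friendly output: characters are written as-is.
readable = json.dumps(item, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data; the difference is only in how readable the exported file is when opened directly.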
posted @ 2018-09-25 15:37  ShadowXie