Python3 Web Crawlers (2)

Basic usage of the Scrapy framework

Install the scrapy module:

pip install scrapy
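
A quick sanity check that the install worked: Scrapy ships a command-line tool, and asking it for its version should print the installed release.

scrapy version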

Change into the directory where the project should live and create the project:

scrapy startproject scrapyDemo
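
The command generates a project skeleton. With a recent Scrapy release it should look roughly like the layout below (exact files may vary slightly between versions):

scrapyDemo/
    scrapy.cfg            # project configuration file
    scrapyDemo/           # the project's Python package
        __init__.py
        items.py          # item definitions
        middlewares.py    # spider / downloader middlewares
        pipelines.py      # item pipelines
        settings.py       # project settings
        spiders/          # spiders go in this directory
            __init__.py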

Go into the spiders directory and create scrapy_demo.py:

import scrapy
from bs4 import BeautifulSoup

class tsSpider(scrapy.Spider):
    # The spider's name, referenced later by "scrapy crawl demo"
    name = "demo"

    def start_requests(self):
        # Seed URL(s) plus a browser-like User-Agent header
        urls = ['https://www.cnblogs.com/', ]
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}
        for url in urls:
            yield scrapy.Request(url=url, headers=headers, callback=self.parse)

    def parse(self, response):
        # Parse the downloaded page and print each post title
        soup = BeautifulSoup(response.body, "html.parser")
        titles = soup.find_all("a", "titlelnk")
        for title in titles:
            print(title.string)
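
BeautifulSoup is not strictly needed here; as a minimal alternative sketch, the same titles can be pulled with Scrapy's built-in CSS selectors (response.css is part of the standard Response API):

    def parse(self, response):
        # Same extraction using Scrapy's own selectors instead of BeautifulSoup
        for title in response.css("a.titlelnk::text").extract():
            print(title)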

Run the spider from inside the project directory:

scrapy crawl demo
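
The spider above only prints the titles. As a rough sketch, if parse yields dicts instead of printing, Scrapy's feed exporter can save the results to a file (the filename titles.json is just an example):

    def parse(self, response):
        soup = BeautifulSoup(response.body, "html.parser")
        for title in soup.find_all("a", "titlelnk"):
            # Yielding dicts lets the -o feed export collect them
            yield {"title": title.string}

Then run with an output file:

scrapy crawl demo -o titles.json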

 

posted @ 2018-04-22 16:09  背向我煮面