fbhell - 博客园 (cnblogs)

June 12, 2022

Summary:
ROBOTSTXT_OBEY = False
LOG_LEVEL = "ERROR"
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86…

posted @ 2022-06-12 11:19 fbhell
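
The three settings in the excerpt would normally sit in a Scrapy project's settings.py. The following is a minimal sketch under that assumption. The User-Agent value is a shortened placeholder, since the post's full string is cut off in the summary.

```python
# settings.py (sketch): names mirror the excerpt above.

# Do not download or honor robots.txt before crawling.
ROBOTSTXT_OBEY = False

# Only emit log messages at ERROR level or above.
LOG_LEVEL = "ERROR"

# Default User-Agent header sent with every request.
# Placeholder value: the post's full string is truncated in the summary.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; WOW64)"
```

With LOG_LEVEL set to "ERROR", the crawl output stays quiet unless something fails, which is probably why the post pairs it with the other two settings.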

Basics

Summary:
Install scrapy
pip install wheel
Download Twisted from https://www.lfd.uci.edu/~gohlke/pythonlibs/
pip install Twisted-xxxxxx
pip install scrapy
Create a project: scrapy s…

posted @ 2022-06-12 11:19 fbhell

Coroutines + event loop + callback binding

Summary:
import asyncio

async def request(url):
    print("requesting url")
    print("done")

c = request("www.baidu.com")
loop = asyncio.get_event_loop()
# create the coroutine
…

posted @ 2022-06-12 11:15 fbhell
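
Since the excerpt stops before the callback is bound, here is one way the three pieces in the title could fit together. It is a sketch, not the post's code: it uses asyncio.run and asyncio.create_task instead of the get_event_loop call in the excerpt, and on_done and results are hypothetical names.

```python
import asyncio

results = []


async def request(url):
    # Stand-in for a real network call; sleeping hands control back to the loop.
    print("requesting", url)
    await asyncio.sleep(0.1)
    print("done", url)
    return url


def on_done(task):
    # Invoked by the event loop once the task has finished.
    results.append(task.result())


async def main():
    # Wrap the coroutine in a task, bind the callback, then wait for it.
    task = asyncio.create_task(request("www.baidu.com"))
    task.add_done_callback(on_done)
    await task


asyncio.run(main())
```

The callback receives the finished task itself, so the coroutine's return value is read through task.result() rather than passed in directly.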

Thread pool

Summary:
from multiprocessing.dummy import Pool

pool = Pool(4)
pool.map(function, iterable)

pool.close()
pool.join()

posted @ 2022-06-12 11:15 fbhell
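
To make the pattern concrete, the sketch below fills in the two placeholders with a hypothetical blocking function, fetch, and a small list of inputs. The sleep only simulates I/O; a real crawler would put its download call there.

```python
import time
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API


def fetch(page):
    # Hypothetical stand-in for a blocking download.
    time.sleep(0.1)
    return page * 2


pool = Pool(4)                           # four worker threads
results = pool.map(fetch, [1, 2, 3, 4])  # blocks until every call returns, keeps input order
pool.close()                             # no more work will be submitted
pool.join()                              # wait for the workers to exit
```

Because the work is I/O-bound, the four calls overlap and the whole map takes roughly as long as one call rather than four.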

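Regular expression basic syntax

Summary: Basic syntax: \ is the escape character and removes a character's special meaning; '\'' matches a single quote. . matches exactly one arbitrary character (other than a newline, by default); .* matches any string of length 0 or more. Modifiers: re.I makes matching case-insensitive; re.L performs locale-aware matching; re.M enables multi-line matching and affects ^ and $; re.S makes . …

posted @ 2022-06-12 11:15 fbhell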
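
A few lines of Python's re module can illustrate the escape character and three of the modifiers listed above. The sample strings are made up for the demonstration.

```python
import re

text = "Hello\nworld"

# A backslash removes the special meaning of ".", so only the literal dot matches.
literal_dot = re.findall(r"\.", "a.b")

# re.I: case-insensitive matching.
case_insensitive = re.findall(r"hello", text, re.I)

# re.M: ^ (and $) match at the start (and end) of every line.
line_starts = re.findall(r"^\w+", text, re.M)

# By default "." does not match a newline, so this search fails...
without_dotall = re.search(r"Hello.*world", text)

# ...while re.S lets "." match the newline as well.
with_dotall = re.search(r"Hello.*world", text, re.S)
```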

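# Crawler workflow

Summary:
# UA spoofing
# specify the url
# send the request, passing in the search term
# get the data
# persist the data
https://curlconverter.com/#python
What each module does: requests sends requests, fetches data, and handles things such as automatic decoding.
response = requests.get(url)
response.te…

posted @ 2022-06-12 11:14 fbhell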
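
The five commented steps could be arranged roughly as follows with requests. This is a sketch only: fetch, save, and HEADERS are hypothetical names, the User-Agent value is a shortened placeholder, and the timeout and raise_for_status call are additions that the excerpt does not show.

```python
import requests

# Step 1: UA spoofing (placeholder value).
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64)"}


def fetch(url, params=None):
    # Steps 2-4: specify the url, send the request with any search
    # parameters, and return the decoded response body.
    response = requests.get(url, headers=HEADERS, params=params, timeout=10)
    response.raise_for_status()
    return response.text


def save(text, path):
    # Step 5: persist the data to disk.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

A call might look like save(fetch("https://example.com/search", params={"q": "python"}), "result.html"), where the URL and the parameter name are made up for illustration.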

KFC store locations

Summary:
import requests
list=[]
def ua(place="北京",pageIndex="1"):
    cookies = {
        'route-cell': 'ksa',
        'ASP.NET_SessionId': 'unlvrjaq405kxftmopzeerp2'
…

posted @ 2022-06-12 11:13 fbhell

Douban scraping

Summary:
import requests
list=[]
def ua(start=0):
    cookies = {
        'll': '"118151"',
        'bid': 'JGmehAcUHh0',
        '_pk_ref.100001.4cf6': '%5B%22%22%2C%22%22%2C1649677087%2…

posted @ 2022-06-12 11:12 fbhell

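Python file modes

Summary:
r+: read and write; writes overwrite existing content in place; raises an error if the file does not exist.
a+: opens the file for reading and appending.
w+: read and write; creates the file if it does not exist (and truncates it if it does).

posted @ 2022-06-12 11:11 fbhell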
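
The differences between the three modes are easiest to see on one small file. The sketch below works in a temporary directory, and the file name and contents are arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# r+ refuses to open a file that does not exist yet.
try:
    open(path, "r+")
    r_plus_opened = True
except FileNotFoundError:
    r_plus_opened = False

# w+ creates the file (or empties an existing one) and allows reading back.
with open(path, "w+") as f:
    f.write("hello world")
    f.seek(0)
    after_w = f.read()

# r+ starts at position 0 and overwrites in place without truncating.
with open(path, "r+") as f:
    f.write("HELLO")
    f.seek(0)
    after_r = f.read()

# a+ always writes at the end of the file, wherever the read position is.
with open(path, "a+") as f:
    f.write("!")
    f.seek(0)
    after_a = f.read()
```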

UA spoofing

Summary:
# UA spoofing
headers={
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
    ,"Conn…

posted @ 2022-06-12 11:11 fbhell
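
To check what a spoofed header looks like on an outgoing request without actually sending one, the sketch below attaches the User-Agent from the excerpt to a standard-library urllib Request. This swaps in urllib purely so the example runs offline; the other posts on this page use requests, where the same dict would be passed as headers=headers. The target URL is a placeholder.

```python
from urllib.request import Request

UA = (
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
)
headers = {"User-Agent": UA}

# Placeholder URL; building the Request object does not touch the network.
req = Request("https://example.com/", headers=headers)

# urllib stores header names capitalized, so the key becomes "User-agent".
sent_ua = req.get_header("User-agent")
```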