pu369com

[Python Web Scraping] Using requests with headers copied from Chrome

Code:

import requests
import json

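# The text inside the triple quotes was copied in full from the request headers of the first page request, found in the Network tab of Chrome DevTools after refreshing the page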
headers = """
GET /wui/index.html HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
Cache-Control: max-age=0
Connection: keep-alive
Cookie: <redacted>
If-Modified-Since: Sat, 30 Jul 2022 08:07:38 GMT
If-None-Match: "8Jo<redacted>X"
Referer: http://<redacted>/index.html
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.102 Safari/537.36 Edg/104.0.1293.63
"""

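# Strip leading/trailing whitespace and split on newlines
headers = headers.strip().split('\n')
# Build the dict with a comprehension: split each line on the first colon only, so values
# that contain colons (URLs, timestamps) stay intact, and skip lines without a colon,
# such as the request line "GET /wui/index.html HTTP/1.1"
headers = {k.strip(): v.strip() for k, v in (x.split(':', 1) for x in headers if ':' in x)}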
#print(json.dumps(headers,indent=4))
url = 'http://<redacted>/api'
res = requests.get(url,headers=headers).text
print(res)
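
The parsing step can also be wrapped in a small reusable function. The following is only a sketch: the function name `parse_raw_headers`, the sample header block, and the host `example.com` are made up for illustration and are not part of the original script. It splits each line on the first colon only and skips lines that are not ordinary headers, such as the request line or lines that start with a colon (the form HTTP/2 pseudo-headers take when copied from the browser).

```python
def parse_raw_headers(raw: str) -> dict:
    """Turn a header block copied from the browser's Network tab into a dict."""
    parsed = {}
    for line in raw.strip().splitlines():
        # partition() splits on the first colon only, so colons inside
        # values (URLs, timestamps) are preserved.
        name, sep, value = line.partition(":")
        # No colon (request line) or empty name (line starting with ":"):
        # not an ordinary header, so skip it.
        if not sep or not name.strip():
            continue
        parsed[name.strip()] = value.strip()
    return parsed


# Hypothetical sample; example.com stands in for the real host.
sample = """
GET /wui/index.html HTTP/1.1
Accept-Encoding: gzip, deflate
If-Modified-Since: Sat, 30 Jul 2022 08:07:38 GMT
Referer: http://example.com/index.html
"""

headers = parse_raw_headers(sample)
print(headers)
```

The resulting dict can be passed straight to `requests.get(url, headers=headers)`, as in the script above.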

 

 

References: https://baijiahao.baidu.com/s?id=1729201298117686282&wfr=spider&for=pc

https://blog.csdn.net/kylner/article/details/125299214

posted on 2022-08-26 11:47  pu369com
