The requests module
(1) GET requests
(1) Sending a GET request
| import requests |
| |
| url = "https://www.baidu.com/" |
| |
| |
| response = requests.get(url) |
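- Commonly used attributes and methods of the response object include:
status_code
: The HTTP status code of the response; 200 means the request succeeded, 404 means the page was not found, and so on.
text
: The response body decoded to a string, typically the HTML returned by the server.
json()
: Parses the response body as JSON and returns the corresponding Python object (usually a dict or a list).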
| import requests |
| from fake_useragent import UserAgent |
| |
| url = 'https://www.bqgbb.cc/' |
| |
| headers = { |
| 'User-Agent': UserAgent().random |
| } |
| |
| res = requests.get(url=url, headers=headers) |
| response = res.text |
| print(response) |
| print(res.status_code) |
(2) GET requests with query parameters
| import requests |
| from fake_useragent import UserAgent |
| |
| url = 'https://www.baidu.com/s' |
| |
| headers = { |
| 'User-Agent': UserAgent().random |
| } |
| |
| params = { |
| 'wd': '周杰伦' |
| } |
| |
| res = requests.get(url=url, headers=headers, params=params) |
| print(res.text) |
| print(res.request.url) |
| |
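To see how `params` ends up in the URL without sending anything, the request can be prepared locally. This is a minimal sketch: it uses `requests.Request(...).prepare()`, which builds the final URL through the same encoding step as `requests.get`, and it reuses the URL and keyword from the example above.

```python
import requests

# Build the request locally; nothing is sent over the network.
prepared = requests.Request(
    "GET",
    "https://www.baidu.com/s",
    params={"wd": "周杰伦"},
).prepare()

# Non-ASCII values are UTF-8 percent-encoded into the query string.
print(prepared.url)
```

This is the same value that `res.request.url` reports after a real request, which makes it handy for checking a query string before hitting the server.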
(3) GET requests with cookies
| import requests |
| from fake_useragent import UserAgent |
| |
| url = 'https://search.jd.com/Search' |
| |
| headers = { |
| 'User-Agent': UserAgent().random, |
| 'Cookie': '' # paste the Cookie string copied from your logged-in browser session here |
| } |
| params = { |
| 'keyword': '手机' |
| } |
| |
| res = requests.get(url=url, headers=headers, params=params) |
| print(res.text) |
| print(res.request.url) |
| |
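Besides a raw `Cookie` header, requests also accepts a dict through the `cookies=` parameter. The sketch below is one possible way to convert a cookie string copied from the browser into that dict; `parse_cookie_string` is a hypothetical helper and the cookie value is a placeholder, not a real JD session.

```python
import requests


def parse_cookie_string(raw):
    """Turn 'a=1; b=2' (as copied from browser dev tools) into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        if "=" in pair:
            # Split on the first '=' only, since values may contain '='.
            name, value = pair.strip().split("=", 1)
            cookies[name] = value
    return cookies


# Placeholder cookie string; a real one comes from a logged-in browser session.
cookies = parse_cookie_string("sessionid=abc123")

# Prepare locally to inspect the header that would be sent.
prepared = requests.Request(
    "GET",
    "https://search.jd.com/Search",
    params={"keyword": "手机"},
    cookies=cookies,
).prepare()
print(prepared.headers["Cookie"])
```

Passing `cookies=cookies` to `requests.get` has the same effect as the header approach, and keeps the cookie values separate from the other headers.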
(2) POST requests
(1) Sending a POST request
| import requests |
| |
| url = 'http://www.aa7a.cn/user.php' |
| |
| data = { |
| 'username': 'heart', |
| 'password': '123', |
| 'captcha': '3aeh', |
| 'remember': '1', |
| 'ref': 'http://www.aa7a.cn', |
| 'act': 'act_login' |
| } |
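| # A login form like this one normally expects form-encoded fields, so send them with data= (json= would send a JSON body instead) |
| res = requests.post(url=url, data=data) |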
| print(res.text) |
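The choice between `data=` and `json=` changes both the request body and the `Content-Type` header. The following sketch prepares the same payload both ways without sending it, so the difference can be inspected offline; the shortened payload is only for illustration.

```python
import json
import requests

payload = {"username": "heart", "password": "123"}
url = "http://www.aa7a.cn/user.php"

# data= sends an HTML-form style body (key=value pairs joined by &).
form_req = requests.Request("POST", url, data=payload).prepare()
# json= serializes the dict to a JSON document instead.
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"], form_req.body)
print(json_req.headers["Content-Type"], json_req.body)
```

Which one a server accepts depends on the endpoint: classic HTML login forms usually read form fields, while JSON APIs usually read a JSON body.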
(3) Session objects that carry cookies automatically
| import requests |
| from fake_useragent import UserAgent |
| |
| headers = { |
| 'User-Agent': UserAgent().random, |
| } |
| url = "https://xueqiu.com/" |
| |
| session = requests.Session() |
| response1 = session.get(url=url, headers=headers) |
| |
| tag_url = "https://stock.xueqiu.com/v5/stock/batch/quote.json?symbol=SH000001,SZ399001,SZ399006,SH000688,SH000016,SH000300,BJ899050,HKHSI,HKHSCEI,HKHSTECH,.DJI,.IXIC,.INX" |
| response = session.get(url=tag_url, headers=headers) |
| print(response.text) |
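What the session does between the two calls above can be sketched offline. In a real run the first response fills `session.cookies` through its `Set-Cookie` headers; here a cookie is added by hand instead, and both the cookie name and the User-Agent string are hypothetical.

```python
import requests

session = requests.Session()
# Headers set on the session are merged into every request it sends.
session.headers.update({"User-Agent": "demo-agent/1.0"})  # hypothetical UA string
# Normally the server sets cookies via Set-Cookie; here one is added by hand.
session.cookies.set("token", "abc")  # hypothetical cookie

# prepare_request merges the session state into the request without sending it.
prepared = session.prepare_request(requests.Request("GET", "https://xueqiu.com/"))
print(prepared.headers["User-Agent"])
print(prepared.headers["Cookie"])
```

Because headers can live on the session, `headers=headers` does not have to be repeated on every `session.get` call.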
(4) Response attributes
(1) String format
response.text
: The response body decoded to a string.
| import requests |
| |
| response = requests.get('https://www.baidu.com') |
| |
| data = response.text |
| print(data, type(data)) |
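(2) Binary format
response.content
: The response body as raw bytes; suited to non-text responses such as images and video. (The value is a Python bytes object, not hexadecimal; printing it merely shows non-printable bytes as \x escapes.)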
| import requests |
| |
| response = requests.get('https://pic.netbian.com/uploads/allimg/240322/232300-171112098057a5.jpg') |
| |
| data = response.content |
| print(data, type(data)) |
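The `\x..` sequences in the printed output are only how Python displays a `bytes` object. A small offline sketch, using the first four bytes of a typical JPEG file as sample data, shows the difference between the bytes themselves and a hex string; `looks_like_jpeg` is a hypothetical helper.

```python
# The first bytes of a typical JPEG file (its "magic number"), used as sample data.
data = bytes([0xFF, 0xD8, 0xFF, 0xE0])

print(type(data))    # <class 'bytes'>
print(data)          # repr uses \x escapes for non-printable bytes
print(data.hex())    # explicit conversion to a hex string
print(list(data))    # the same bytes as integers in the range 0..255


def looks_like_jpeg(content):
    """Cheap sanity check before saving response.content as a .jpg file."""
    return content[:3] == b"\xff\xd8\xff"
```

A check like `looks_like_jpeg(response.content)` can catch the case where a site returns an HTML error page instead of the image.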
(3) JSON format
response.json()
: Parses the response body as JSON and returns the resulting Python object.
| import requests |
| from fake_useragent import UserAgent |
| |
| headers = { |
| 'User-Agent': UserAgent().random, |
| } |
| url = "https://xueqiu.com/" |
| |
| session = requests.Session() |
| response1 = session.get(url=url, headers=headers) |
| |
| tag_url = "https://stock.xueqiu.com/v5/stock/batch/quote.json?symbol=SH000001,SZ399001,SZ399006,SH000688,SH000016,SH000300,BJ899050,HKHSI,HKHSCEI,HKHSTECH,.DJI,.IXIC,.INX" |
| response = session.get(url=tag_url, headers=headers) |
| print(response.json(), type(response.json())) |
| |
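(4) Encoding of the response body
response.encoding
: The encoding used to decode response.text. requests infers it from the response headers, and you can assign a different value to override it.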
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| res = requests.get(url=url) |
| print(res.encoding) |
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| res = requests.get(url=url) |
| |
| print(res.text) |
| print(res.encoding) |
| |
| # Override the inferred encoding; response.text is decoded again with the new value |
| res.encoding = 'utf-8' |
| print(res.text) |
| print(res.encoding) |
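Why changing `res.encoding` fixes garbled text can be shown with plain bytes. requests falls back to ISO-8859-1 for `text/*` responses whose headers name no charset, so a UTF-8 page is decoded with the wrong table. The sketch below reproduces that effect offline with an arbitrary sample string.

```python
raw = "百度一下".encode("utf-8")  # what the server actually sends: bytes

# Decoding with the wrong charset produces mojibake instead of an error.
wrong = raw.decode("iso-8859-1")
# Decoding with the right charset recovers the original text.
right = raw.decode("utf-8")

print(wrong)
print(right)
```

When the correct charset is not known in advance, `res.encoding = res.apparent_encoding` lets requests guess it from the body; that guess is a heuristic and can be wrong for short responses.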
(5) Response status code
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| res = requests.get(url=url) |
| print(res.status_code) |
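Printing the status code is usually only the first step; code that depends on a successful response tends to check it. The sketch below shows `response.ok` and `raise_for_status()` on a hand-built `Response` object so that it runs offline; the URL is hypothetical.

```python
import requests

# A hand-built Response stands in for a real one so the sketch runs offline.
resp = requests.Response()
resp.status_code = 404
resp.url = "https://www.baidu.com/missing"  # hypothetical URL

print(resp.ok)  # False for any 4xx/5xx status

try:
    resp.raise_for_status()  # raises HTTPError for 4xx/5xx, returns None otherwise
except requests.exceptions.HTTPError as exc:
    error = exc
    print("request failed:", exc)
```

With a real request the pattern is the same: call `res.raise_for_status()` right after `requests.get(...)` and handle `HTTPError` where a failure should be reported.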
(6) Response headers
response.headers
: The response headers, returned as a case-insensitive dict-like object.
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| res = requests.get(url=url) |
| print(res.headers) |
| |
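The headers object behaves like a dict whose keys ignore case. This can be tried offline with `CaseInsensitiveDict`, the type requests uses for `response.headers`; the header value below is sample data.

```python
from requests.structures import CaseInsensitiveDict

# The same type requests uses for response.headers.
headers = CaseInsensitiveDict({"Content-Type": "text/html; charset=utf-8"})

print(headers["content-type"])       # lookup ignores case
print(headers.get("CONTENT-TYPE"))
print(headers.get("X-Missing"))      # None when the header is absent
print(dict(headers))                 # plain dict, original casing preserved
```

Using `.get()` rather than indexing avoids a `KeyError` for headers that a server does not always send.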
(7) Response cookies
response.cookies
: The cookies returned by the server.
response.cookies.get_dict()
: Converts the cookies to a plain dict.
response.cookies.items()
: Returns the cookies as a list of (name, value) tuples.
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| res = requests.get(url=url) |
| print(res.cookies) |
| print(res.cookies.get_dict()) |
| print(res.cookies.items()) |
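`response.cookies` is a `RequestsCookieJar`. The sketch below builds one by hand to show what the three calls above return; the cookie name and value are hypothetical stand-ins for whatever the server actually sets.

```python
from requests.cookies import RequestsCookieJar

jar = RequestsCookieJar()
# Hypothetical cookie; a real jar is filled from Set-Cookie response headers.
jar.set("BDORZ", "27315", domain=".baidu.com", path="/")

print(jar.get_dict())      # {'BDORZ': '27315'}
print(jar.items())         # [('BDORZ', '27315')]
print(jar.get("BDORZ"))    # 27315
print(jar.get("missing"))  # None
```

The dict from `get_dict()` can be passed straight back as `cookies=` on a later request when a full `Session` is not needed.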
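(8) URL of the current request/response
response.url
: The URL of the current response.
- This is the URL of the resource that was actually returned once the HTTP request completed.
- When redirects occur, this attribute holds the URL of the final page.
- For example, if you request site A and it redirects to site B, then
response.url
is the URL of site B.
response.request.url
: The URL of the request that produced this response.
- response.request is the prepared request that was actually sent for this response, so its URL includes any encoded query parameters.
- When redirects are followed, it belongs to the last request in the chain, so it normally matches response.url rather than the address you first asked for.
- After a redirect, the address you originally requested is available from the first entry of response.history (for example response.history[0].url).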
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| response = requests.get(url=url) |
| print(response.url) |
| print(response.request.url) |
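(9) Redirect history of the current request
response.history
: If redirects occurred, a list of the intermediate Response objects, oldest first; the url of each one is an address that was redirected. The list is empty when no redirect happened.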
| import requests |
| |
| url = 'https://www.baidu.com' |
| |
| response = requests.get(url=url) |
| print(response.history) |
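Combining `response.history` with `response.url` gives the full path a request took. The sketch below uses hand-built `Response` objects to simulate an http to https redirect so that it runs offline; `redirect_chain` is a hypothetical helper and example.com is a placeholder.

```python
import requests


def redirect_chain(response):
    """Return every URL visited, from the first request to the final response."""
    return [hop.url for hop in response.history] + [response.url]


# Hand-built responses simulate an http -> https redirect so this runs offline.
first = requests.Response()
first.status_code = 301
first.url = "http://example.com/"

final = requests.Response()
final.status_code = 200
final.url = "https://example.com/"
final.history = [first]

print(redirect_chain(final))
print([hop.status_code for hop in final.history])
```

To inspect a redirect without following it, pass `allow_redirects=False` to `requests.get`; the response is then the 3xx itself and its `Location` header holds the target.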
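(10) Iterating over binary data
response.iter_content()
: Iterates over the response body in chunks (suited to binary data such as video and images). Pass stream=True to the request so the body is downloaded as you iterate instead of all at once.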
| import requests |
| |
| tag_url = "https://pic.netbian.com/uploads/allimg/240322/232300-171112098057a5.jpg" |
| |
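| # stream=True defers downloading the body until it is iterated |
| response = requests.get(tag_url, stream=True) |
| |
| file_path = "./a.jpg" |
| with open(file_path, 'wb') as f: |
|     # Read the body 1024 bytes at a time |
|     for chunk in response.iter_content(chunk_size=1024): |
|         # Skip empty chunks |
|         if chunk: |
|             f.write(chunk) |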
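For larger files the same loop can be wrapped in a reusable function. This is a minimal sketch, not the only way to do it: `save_chunks` and `download` are hypothetical helper names, and the chunk size and timeout are arbitrary choices.

```python
import requests


def save_chunks(chunks, file_path):
    """Write an iterable of byte chunks to file_path; return bytes written."""
    written = 0
    with open(file_path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


def download(url, file_path, chunk_size=8192):
    """Stream url to disk without holding the whole body in memory."""
    # stream=True defers the body download until iter_content is consumed.
    with requests.get(url, stream=True, timeout=10) as response:
        response.raise_for_status()
        return save_chunks(response.iter_content(chunk_size=chunk_size), file_path)
```

Calling `download(tag_url, "./a.jpg")` would fetch the image from the example; using the response as a context manager releases the connection even if writing fails.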
(5) SSL certificate verification
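| import requests |
| url = 'https://ssr2.scrape.center/' |
| # Certificates are verified by default; if the site's certificate cannot be verified, this raises requests.exceptions.SSLError |
| response = requests.get(url) |
| print(response.status_code) |
| import requests |
| url = 'https://ssr2.scrape.center/' |
| # verify=False skips certificate verification (requests then emits an InsecureRequestWarning) |
| response = requests.get(url,verify=False) |
| print(response.status_code) |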
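The two snippets can be combined into a helper that reports a verification failure instead of crashing. This is a sketch under assumptions: `fetch_status` and `fetch_status_insecure` are hypothetical names, the timeout is arbitrary, and disabling verification is meant for testing only.

```python
import requests
import urllib3


def fetch_status(url, verify=True):
    """Return the status code, or None if certificate verification fails."""
    try:
        return requests.get(url, verify=verify, timeout=10).status_code
    except requests.exceptions.SSLError as exc:
        print("certificate verification failed:", exc)
        return None


def fetch_status_insecure(url):
    """Skip verification (testing only) and silence the resulting warning."""
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    return fetch_status(url, verify=False)
```

`verify` also accepts a path to a CA bundle file, which is the safer option when a site uses a private or self-signed certificate authority.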
(6) Using proxies
| proxies = {'protocol': 'protocol://IP:port'} |
| import requests |
| from fake_useragent import UserAgent |
| |
| proxies = { |
| 'http': 'http://221.6.139.190:9002' |
| } |
| |
| url = 'http://httpbin.org/get' |
| |
| headers = { |
| 'User-Agent': UserAgent().random |
| } |
| response = requests.get(url=url,headers=headers,proxies=proxies,timeout=3) |
| print(response.text) |
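The `proxies` dict is keyed by the scheme of the target URL, so a dict with only an `http` entry is not used for `https://` URLs. The sketch below builds a dict that covers both schemes and wraps the request so a dead or slow proxy does not crash the script; `build_proxies` and `fetch_via_proxy` are hypothetical helpers.

```python
import requests


def build_proxies(host, port):
    """Route both http:// and https:// URLs through one HTTP proxy."""
    address = f"http://{host}:{port}"
    return {"http": address, "https": address}


def fetch_via_proxy(url, proxies, timeout=3):
    """Return the response text, or None if the proxy is dead or too slow."""
    try:
        return requests.get(url, proxies=proxies, timeout=timeout).text
    except (requests.exceptions.ProxyError,
            requests.exceptions.ConnectTimeout,
            requests.exceptions.ReadTimeout) as exc:
        print("proxy request failed:", exc)
        return None


# The address below is the one from the example above and may no longer work.
proxies = build_proxies("221.6.139.190", 9002)
print(proxies)
```

Requesting `http://httpbin.org/get` through `fetch_via_proxy` and checking the `origin` field of the returned JSON is a quick way to confirm that the proxy IP, not your own, reached the server.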