The Python requests module

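The requests module:

requests is a third-party Python library for sending HTTP requests. It is widely used in web scrapers and covers most needs of HTTP-based API testing.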

pip install requests

import requests

1. Request methods

Send an HTTP GET request to Baidu:

response = requests.get("https://www.baidu.com")

Send a POST request:
response = requests.post('http://httpbin.org/post', data={'key': 'value'})
Send a PUT request:
response = requests.put('http://httpbin.org/put', data={'key': 'value'})
Send a DELETE request:
response = requests.delete('http://httpbin.org/delete')
Send a HEAD request:
response = requests.head('http://httpbin.org/get')
Send an OPTIONS request:
response = requests.options('http://httpbin.org/get')
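Each of these helpers is a thin wrapper around requests.request(method, url, ...). As a minimal sketch, the snippet below builds a request without sending it, using requests.Request and prepare(), so you can inspect what would go on the wire. The helper name build_prepared is hypothetical and not part of requests.

```python
import requests


def build_prepared(method, url, **kwargs):
    """Hypothetical helper: build a PreparedRequest without sending it."""
    return requests.Request(method, url, **kwargs).prepare()


prepared = build_prepared("post", "http://httpbin.org/post", data={"key": "value"})
print(prepared.method)                   # POST
print(prepared.url)                      # http://httpbin.org/post
print(prepared.body)                     # key=value
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
```

Nothing is sent here, so this is a convenient way to check form encoding and headers before pointing the code at a real service.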

2. Passing URL parameters

params = {'key1': 'value1', 'key2': 'value2'}
response = requests.get(url='https://www.baidu.com',params=params)
print(response.url) # e.g. https://www.baidu.com/?key1=value1&key2=value2

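Note that any dictionary key whose value is None is not added to the URL's query string.

A value can also be a list, in which case the key is repeated once per item: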

params = {'key1': 'value1', 'key2': ['value2', 'value3']}
response = requests.get(url='https://www.baidu.com',params=params)
print(response.url) # e.g. https://www.baidu.com/?key1=value1&key2=value2&key2=value3
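The two rules above (None values are dropped, list values repeat the key) can be checked without any network access. This is a small sketch that prepares a GET request and reads back its URL; build_url is a hypothetical helper name.

```python
import requests


def build_url(base_url, params):
    """Hypothetical helper: return the URL requests would use for a GET."""
    return requests.Request("GET", base_url, params=params).prepare().url


# A key whose value is None is left out of the query string.
print(build_url("https://www.baidu.com", {"key1": "value1", "key2": None}))
# https://www.baidu.com/?key1=value1

# A list value repeats the key once per item.
print(build_url("https://www.baidu.com", {"key1": "value1", "key2": ["value2", "value3"]}))
# https://www.baidu.com/?key1=value1&key2=value2&key2=value3
```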

For a POST request, form fields go in the data parameter rather than in the URL:

data = {'username': 'zhangsan', 'password': '123456'}
response = requests.post(url='https://www.baidu.com', data=data)

3. Response content

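As in the earlier examples, every request returns a Response object that holds the remote server's reply.
For example, read the page content returned by the server as text:
text = response.text
Check the encoding that requests uses to decode response.text:
html_encoding = response.encoding
Read the response body as raw bytes:
content = response.content
Parse the body as JSON:
html_json = response.json()
Get the response status code:
status_code = response.status_code

Get the response headers:

headers = response.headers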
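To see these attributes side by side without depending on an external site, here is a sketch that starts a throwaway HTTP server on localhost and queries it. The handler class, the JSON payload, and the helper name fetch_from_local_server are all assumptions made up for this demo.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class JsonHandler(BaseHTTPRequestHandler):
    """Hypothetical demo handler: answers every GET with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"message": "hello"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_from_local_server():
    server = HTTPServer(("127.0.0.1", 0), JsonHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        with requests.Session() as session:
            session.trust_env = False  # ignore proxy environment variables for this local call
            return session.get(url, timeout=5)
    finally:
        server.shutdown()
        server.server_close()


response = fetch_from_local_server()
print(response.status_code)              # 200
print(response.encoding)                 # utf-8
print(response.headers["Content-Type"])  # application/json; charset=utf-8
print(response.text)                     # {"message": "hello"}
print(response.content)                  # b'{"message": "hello"}'
print(response.json())                   # {'message': 'hello'}
```

Note that response.json() raises an exception when the body is not valid JSON, so real code usually checks the status code or content type first.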

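4. Custom request headers

Some requests need customized headers. For example, when collecting data with a scraper, we may need to change the User-Agent header frequently so that the server is less likely to fingerprint the client: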

url = 'https://www.baidu.com'
headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.109 Safari/537.36'}
response = requests.get(url=url, headers=headers)
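One possible way to rotate the User-Agent is to cycle through a fixed pool. The sketch below only prepares requests and does not send them; the pool contents and the generator name rotating_headers are illustrative assumptions.

```python
import itertools

import requests

# Hypothetical pool of User-Agent strings; replace with values that suit your use case.
USER_AGENTS = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.109 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36",
]


def rotating_headers(user_agents):
    """Yield header dictionaries, cycling through the given User-Agent strings forever."""
    for user_agent in itertools.cycle(user_agents):
        yield {"User-Agent": user_agent}


header_source = rotating_headers(USER_AGENTS)
for _ in range(3):
    prepared = requests.Request("GET", "https://www.baidu.com", headers=next(header_source)).prepare()
    # Header names are case-insensitive, so "user-agent" finds the "User-Agent" entry.
    print(prepared.headers["user-agent"])
```

In real use the same dictionary would be passed as headers= to requests.get or to a session.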

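5. Other parameters

verify: whether to verify the server's TLS certificate for this request, which matters for HTTPS. Verification is enabled by default and should normally stay on. Passing verify=False disables it and leaves the connection open to man-in-the-middle attacks, so reserve it for testing against hosts you control: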

response = requests.get("https://www.baidu.com", verify=False)

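allow_redirects: whether this request follows redirects. requests follows redirects by default for every method except HEAD, much as a browser follows address changes, so passing allow_redirects=True on a GET only makes the default explicit: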

response = requests.get("https://www.baidu.com", allow_redirects=True)

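timeout: how many seconds to wait for the server. A single number applies to both the connect phase and each read from the socket. It is not a limit on the total time to download the response. Instead, an exception is raised if the server has not responded within timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.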

response = requests.get('https://www.baidu.com', timeout=5)
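timeout also accepts a (connect, read) tuple, and the two phases raise different exceptions. The following sketch provokes a read timeout locally with a socket that accepts connections but never replies. The helper name get_with_timeout and the timeout values are assumptions chosen for the demo.

```python
import socket

import requests


def get_with_timeout(url, connect_timeout, read_timeout):
    """Return the response, or a short label describing which timeout fired."""
    try:
        with requests.Session() as session:
            session.trust_env = False  # ignore proxy environment variables in this demo
            return session.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        return "connect timeout"
    except requests.exceptions.ReadTimeout:
        return "read timeout"


# Demo server: a socket that takes TCP connections into its backlog but never replies.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
try:
    result = get_with_timeout(f"http://127.0.0.1:{port}/", connect_timeout=3.05, read_timeout=0.5)
finally:
    listener.close()
print(result)
```

Because the TCP connection is established but no bytes ever come back, the read timeout is the one expected to fire here. Both exceptions derive from requests.exceptions.Timeout, which can be caught when the distinction does not matter.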

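history: the redirect history of a request. It is a list of the Response objects created while following redirects, ordered from oldest to most recent, and it is empty when no redirect occurred: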

response = requests.get("https://www.baidu.com")
history_info = response.history
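To make the interaction between allow_redirects and history concrete, this sketch runs a tiny local server whose /old path redirects to /new. The handler and its paths are invented for the demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical demo handler: /old redirects to /new, everything else returns 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"arrived"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"
try:
    with requests.Session() as session:
        session.trust_env = False  # ignore proxy environment variables for this local call
        followed = session.get(base + "/old", timeout=5)
        not_followed = session.get(base + "/old", timeout=5, allow_redirects=False)
finally:
    server.shutdown()
    server.server_close()

print(followed.status_code)                       # 200
print([r.status_code for r in followed.history])  # [302]
print(followed.url == base + "/new")              # True
print(not_followed.status_code)                   # 302
print(not_followed.history)                       # []
```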

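6. Advanced usage

Session objects: a Session lets you persist certain parameters across requests. It also keeps cookies across all requests made from the same Session instance and uses urllib3's connection pooling. If you send several requests to the same host, the underlying TCP connection is reused, which can give a significant performance gain. When you call requests.get and the other module-level functions directly, a new TCP connection is created for every request. Session objects are mainly useful when state such as cookies has to be kept between requests.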

session = requests.Session()
response = session.get("https://www.baidu.com")

Or use it as a context manager, which closes the session when the block exits:
with requests.Session() as s:
    s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
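As a rough illustration of what "persisting parameters" means, the sketch below sets a header and a cookie on a session and then prepares a request through it without sending anything. The header name X-Demo-Token and its value are made up for the example.

```python
import requests

session = requests.Session()
session.headers.update({"X-Demo-Token": "abc123"})  # hypothetical header, sent with every request
session.cookies.set("sessioncookie", "123456789")   # cookie kept on the session

# prepare_request merges session-level settings into the individual request.
request = requests.Request("GET", "http://httpbin.org/cookies", headers={"Accept": "application/json"})
prepared = session.prepare_request(request)

print(prepared.headers["X-Demo-Token"])  # abc123
print(prepared.headers["Accept"])        # application/json
print(prepared.headers["Cookie"])        # sessioncookie=123456789
session.close()
```

Per-request values are merged with the session's values, and a per-request header with the same name overrides the session-level one.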

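Event hooks
Requests has a hook system that you can use to manipulate parts of the request process or to handle signal events. Available hooks:
response:
The response generated from a request.
You can assign a hook function to an individual request by passing a {hook_name: callback_function} dictionary to the hooks request parameter: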

hooks=dict(response=print_url)
callback_function receives a chunk of data as its first argument. For the response hook, that is the Response object.

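def print_url(r, *args, **kwargs):
    print(r.url)
Your callback must handle its own exceptions. In current versions of requests, an unhandled exception raised inside a hook is not swallowed and propagates to the code that made the request.

If the callback returns a value, that value replaces the data that was passed in. If it returns nothing, the data is left unchanged.

Let's print the request URL at runtime: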

>>> requests.get('http://httpbin.org', hooks=dict(response=print_url))
http://httpbin.org/
<Response [200]>

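The hook mechanism described above can also be tried against a throwaway local server, so the result does not depend on an external site. In this sketch the handler class and the hook name record_url are assumptions made up for the demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

seen_urls = []


def record_url(response, *args, **kwargs):
    """Response hook: remember each URL; returning None leaves the response unchanged."""
    seen_urls.append(response.url)


class OkHandler(BaseHTTPRequestHandler):
    """Hypothetical demo handler that answers every GET with 200 and a short body."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), OkHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/hello"
try:
    with requests.Session() as session:
        session.trust_env = False  # ignore proxy environment variables for this local call
        response = session.get(url, timeout=5, hooks={"response": record_url})
finally:
    server.shutdown()
    server.server_close()

print(seen_urls == [url])    # True
print(response.status_code)  # 200
```

Collecting values in a list rather than printing them makes the hook easy to check in a test.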
Proxies: if you need to use a proxy, you can configure an individual request by passing the proxies argument to any request method:
```python
import requests

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}

requests.get("http://example.org", proxies=proxies)
```
You can also configure proxies through the HTTP_PROXY and HTTPS_PROXY environment variables.
$ export HTTP_PROXY="http://10.10.1.10:3128"
$ export HTTPS_PROXY="http://10.10.1.10:1080"
$ python
>>> import requests
>>> requests.get("http://example.org")

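If your proxy requires HTTP Basic Auth, use the http://user:password@host/ syntax:

proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}

To set a proxy for a specific scheme and host, use scheme://hostname as the key. It is matched against the scheme and the exact hostname of the request.

proxies = {'http://10.20.1.128': 'http://10.10.1.10:5323'}

Note that proxy URLs must include the scheme.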
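To see which proxy a given URL would resolve to, one option is the select_proxy utility in requests.utils. Treat this as an illustrative sketch: select_proxy is a helper function rather than part of the main documented API, and the combined proxies dictionary below is assembled from the examples above.

```python
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
    "http://10.20.1.128": "http://10.10.1.10:5323",
}

# The scheme://hostname key wins over the plain scheme key for that host.
print(select_proxy("http://10.20.1.128/status", proxies))  # http://10.10.1.10:5323
print(select_proxy("http://example.org/", proxies))        # http://10.10.1.10:3128
print(select_proxy("https://example.org/", proxies))       # http://10.10.1.10:1080
```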

Posted on 2022-09-05 15:55 by yanmay
