The requests library

1. Introduction to requests

2. GET requests

3. POST requests

4. Other request methods

5. Advanced usage

5.1 Getting the response as JSON

5.2 Getting the raw socket response

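1. Introduction to requests

Python has several libraries for making HTTP requests, such as urllib and requests.

requests vs. urllib:

In Python 2, urllib and urllib2 were separate modules; Python 3 merged them into a single urllib package. requests is built on urllib3, a third-party library that is distinct from the standard-library urllib.
The slogan of requests is "HTTP for Humans", and compared with the more verbose urllib, its API is concise and easy to read.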

2. GET requests

>>> import requests

# Fetch a web page with the get method
>>> resp = requests.get("http://www.baidu.com")

# The call returns a Response object
>>> resp
<Response [200]>

# Status code
>>> resp.status_code
200

# Final URL of the request
>>> resp.url
'http://www.baidu.com/'

# resp.text is resp.content decoded to a string using resp.encoding
>>> print(resp.text[:100])
<!DOCTYPE html>
<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charse

# Encoding used to decode the response (guessed from the response headers)
>>> resp.encoding
'ISO-8859-1'

# Response body as bytes
>>> print(resp.content[:100])
b'<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charse'

Passing query parameters to get:

# Option 1: pass the parameters as a dict
>>> payload = {"key1":"value1", "key2":"value2"}
>>> resp = requests.get("http://httpbin.org/get", params=payload)
>>> print(resp.text)
{
  "args": {
    "key1": "value1",
    "key2": "value2"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5fb685f6-23b6c5e864d8dc4e41e8de27"
  },
  "origin": "113.116.22.63",
  "url": "http://httpbin.org/get?key1=value1&key2=value2"
}


# Option 2: a dict whose values may be lists (the key is repeated once per item)
>>> payload = {"key1":"value1", "key2":["value2", "value3"]}
>>> resp = requests.get("http://httpbin.org/get", params=payload)
>>> resp.url
'http://httpbin.org/get?key1=value1&key2=value2&key2=value3'
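The way params ends up in the URL can also be checked without sending anything, by building the request and inspecting its prepared form. The following is a minimal sketch along those lines; it reuses the httpbin.org URL from above, and the key3 entry is a made-up addition that illustrates how keys with a None value are left out of the query string.

```python
import requests

# Hypothetical payload: key3 is added only to show that None values are omitted
payload = {"key1": "value1", "key2": ["value2", "value3"], "key3": None}

# Request(...).prepare() builds the final request without sending it
prepared = requests.Request("GET", "http://httpbin.org/get", params=payload).prepare()

print(prepared.url)
# http://httpbin.org/get?key1=value1&key2=value2&key2=value3
```

Preparing a request this way is handy for debugging URL encoding, since it needs no network access.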

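3. POST requests

A POST request can carry its data in two ways:

Form submission: pass a dict or a sequence of 2-tuples as data
Non-form submission: send the data as JSON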

Example 1: two ways to submit a form

# Option 1: pass a dict
>>> resp = requests.post("http://httpbin.org/post", data={"key": "value"})
>>> print(resp.text)
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "key": "value"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "9",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5fb67ae5-6c15961202281a1d70522539"
  },
  "json": null,
  "origin": "113.116.22.63",
  "url": "http://httpbin.org/post"
}


# Option 2: pass a tuple of 2-tuples (useful when one key has several values)
>>> payload = (('key1', 'value1'), ('key1', 'value2'))
>>> resp = requests.post("http://httpbin.org/post", data=payload)
>>> print(resp.text)
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "key1": [
      "value1",
      "value2"
    ]
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "23",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5fb67b74-716bca001516d46950d0d762"
  },
  "json": null,
  "origin": "113.116.22.63",
  "url": "http://httpbin.org/post"
}

Example 2: non-form submission

>>> import json
>>> import requests

# Option 1: serialize the payload yourself with json.dumps
# (no Content-Type header is set in this case, as the echoed headers show)
>>> url = 'http://httpbin.org/post'
>>> payload = {'some': 'data'}
>>> resp = requests.post(url, data=json.dumps(payload))
>>> print(resp.text)
{
  "args": {},
  "data": "{\"some\": \"data\"}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "16",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5fb67c87-78a1dd216e987f0226d5b97a"
  },
  "json": {
    "some": "data"
  },
  "origin": "113.116.22.63",
  "url": "http://httpbin.org/post"
}


# Option 2: use the built-in json parameter
# (requests serializes the payload and sets Content-Type: application/json)
>>> resp = requests.post(url, json=payload)
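The difference between the three ways of passing a POST body shows up in the Content-Type header and in the encoded body. As a rough illustration, the sketch below prepares each variant without sending it and prints both; the variable names form_req, raw_req, and json_req are chosen here for the example only.

```python
import json

import requests

url = "http://httpbin.org/post"
payload = {"some": "data"}

# Build (but do not send) the three variants of the POST request
form_req = requests.Request("POST", url, data=payload).prepare()
raw_req = requests.Request("POST", url, data=json.dumps(payload)).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

variants = [("data=dict", form_req), ("data=str", raw_req), ("json=dict", json_req)]
for name, req in variants:
    # Show which Content-Type requests picked and what the body looks like
    print(name, "|", req.headers.get("Content-Type"), "|", req.body)
```

A dict passed as data is form-encoded, a string passed as data is sent as-is with no Content-Type, and the json parameter serializes the payload and labels it as application/json.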

4. Other request methods

# put: replace the target resource with the data sent by the client
>>> r = requests.put('http://httpbin.org/put', data={'key':'value'})
>>> print("put:", r.text)
put: {
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "key": "value"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "9",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5fb6808c-7842d5b450d1777139efab8e"
  },
  "json": null,
  "origin": "113.116.22.63",
  "url": "http://httpbin.org/put"
}

# delete: ask the server to delete the specified resource
>>> r = requests.delete('http://httpbin.org/delete')
>>> print("delete:", r.text)
delete: {
  "args": {},
  "data": "",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "0",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.23.0",
    "X-Amzn-Trace-Id": "Root=1-5fb6808e-042c04c61a6257820e4ff404"
  },
  "json": null,
  "origin": "113.116.22.63",
  "url": "http://httpbin.org/delete"
}

# head: like get, but the response has no body; used to fetch the headers only
>>> r = requests.head('http://httpbin.org/get')
>>> print("head:", r.text)
head:
>>> print(r.headers)
{'Date': 'Thu, 19 Nov 2020 14:26:23 GMT', 'Content-Type': 'application/json', 'Content-Length': '306', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true'}

# options: ask the server which methods and options it supports for the resource
>>> r = requests.options('http://httpbin.org/get')
>>> print("options:", r.text)
options:
>>> print(r.headers)
{'Date': 'Thu, 19 Nov 2020 14:26:24 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '0', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Allow': 'OPTIONS, HEAD, GET', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, PATCH, OPTIONS', 'Access-Control-Max-Age': '3600'}

5. Advanced usage

5.1 Getting the response as JSON

import requests

r = requests.get('https://api.github.com/events')
print(r.json())  # Parses the JSON body into Python objects; here, a list of dicts
print(type(r.json()))  # <class 'list'>
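When the body is not valid JSON, r.json() raises an exception instead of returning data. One way to guard against that is sketched below. The helper name parse_json_or_none is hypothetical, and the two responses are built by hand (by setting the private _content attribute, a testing shortcut rather than normal usage) so that the example runs without network access.

```python
import requests


def parse_json_or_none(resp):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return resp.json()
    except ValueError:  # the JSON decode errors raised here are ValueError subclasses
        return None


# Stand-in responses built by hand so the sketch runs offline
good = requests.Response()
good.status_code = 200
good._content = b'[{"id": "1", "type": "PushEvent"}]'

bad = requests.Response()
bad.status_code = 200
bad._content = b"<html>not json</html>"

print(parse_json_or_none(good))  # [{'id': '1', 'type': 'PushEvent'}]
print(parse_json_or_none(bad))   # None
```

In real code the response would come from requests.get(...) or a similar call, and only the try/except part would be needed.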

5.2 Getting the raw socket response

# stream=True is required to read from resp.raw
>>> resp = requests.get("https://api.github.com/events", stream=True)
>>> print(type(resp.raw))
<class 'urllib3.response.HTTPResponse'>
>>> print(resp.raw)
<urllib3.response.HTTPResponse object at 0x000001E3F2A0C2B0>
>>> print(resp.raw.read())  # Raw bytes from the stream, not decoded (still gzip-compressed here)
b'\x1f\x8b\x08\x00\x00\x00\x00\ ......

Saving the streamed data to a file:

>>> resp = requests.get("https://api.github.com/events", stream=True)
>>> with open("e:\\file.txt", "wb") as f:
...     for chunk in resp.iter_content(1000):
...         f.write(chunk)
...
2748
2853
4761
4835
4691
4066
5545
7525
4489
2732
3259
2115
>>> with open("e:\\file.txt") as f:
...     print(f.read(50))
...
[{"id":"14250730635","type":"PushEvent","actor":{"...]
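The same pattern can be written as a script that closes the connection and the file reliably. The sketch below is one possible version under stated assumptions: it downloads from a throwaway local HTTP server (the Handler class and BODY data are made up for the example) instead of api.github.com, so that it runs without internet access, and it writes to a temporary directory rather than e:\file.txt.

```python
import http.server
import os
import tempfile
import threading

import requests

BODY = b"0123456789" * 500  # 5000 bytes of made-up data for the local test server


class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # silence per-request logging


# Port 0 lets the OS pick a free port
server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

target = os.path.join(tempfile.mkdtemp(), "download.bin")
session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for the loopback address

# The with block releases the connection even if writing fails
with session.get(url, stream=True, timeout=5) as resp:
    resp.raise_for_status()
    with open(target, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1024):
            f.write(chunk)

server.shutdown()
server.server_close()
print(os.path.getsize(target))  # 5000
```

Because stream=True defers downloading the body, only one chunk at a time needs to be held in memory, which matters for large files.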

5.3 Setting request headers

>>> url = "http://api.github.com/some/endpoint"
>>> headers = {"user-agent": "my-app/0.0.1"}  # Custom User-Agent identifying the client and its version
>>> r = requests.get(url, headers=headers)

5.4 Uploading files

Option 1:

import requests

url = 'http://httpbin.org/post'

# Open the file in binary mode; the with block closes it after the upload
with open('e:\\test.xlsx', 'rb') as f:
    r = requests.post(url, files={'file': f})
print(r.text)

Option 2: explicitly set the filename, content type, and headers

import requests

url = 'http://httpbin.org/post'

with open('e:\\test.xlsx', 'rb') as f:
    # Tuple layout: (filename, file object, content type, extra headers for this part)
    files = {'file': ('report.xls', f, 'application/vnd.ms-excel', {'Expires': '0'})}
    r = requests.post(url, files=files)
print(r.text)

It is recommended to open files in binary mode. requests may try to provide the Content-Length header for you, and when it does, the value is set to the number of bytes in the file. If the file is opened in text mode, errors may occur.
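To make the byte-count point concrete, the sketch below prepares a multipart upload without sending it and compares the Content-Length header with the encoded body. It is only an illustration: an in-memory io.BytesIO object stands in for a file opened with 'rb', and the filename note.txt and its content are made up.

```python
import io

import requests

content = "héllo".encode("utf-8")  # 6 bytes, but only 5 characters
fake_file = io.BytesIO(content)    # in-memory stand-in for open(path, "rb")

prepared = requests.Request(
    "POST",
    "http://httpbin.org/post",
    files={"file": ("note.txt", fake_file, "text/plain")},
).prepare()

print(prepared.headers["Content-Type"])    # multipart/form-data; boundary=...
print(prepared.headers["Content-Length"])  # length in bytes of the encoded body
print(len(prepared.body))                  # the same number
```

The header counts bytes, not characters, which is why a character count taken from a text-mode file can disagree with what is actually sent.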

5.5 Status codes

import requests

r = requests.get('http://httpbin.org/get')
print(r.status_code)  # 200
print(r.status_code == requests.codes.ok)  # Compare against a named code: True

# raise_for_status() raises HTTPError for 4xx/5xx responses and returns None otherwise
print(r.raise_for_status())  # None

r = requests.get('https://www.cnblogs.com/dinex.indd')
print(r.raise_for_status())  # Raises: ...404 Client Error: Not Found...
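In practice raise_for_status() is usually wrapped in try/except rather than printed. A minimal sketch of that pattern follows; the helper name describe is hypothetical, and the responses are constructed by hand with only a status code set so the example runs offline.

```python
import requests


def describe(resp):
    """Return a short label for a response based on raise_for_status()."""
    try:
        resp.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        # The failing response is attached to the exception
        return f"failed: {exc.response.status_code}"
    return "ok"


# Hand-built responses; real code would get these from requests.get(...) or similar
ok_resp = requests.Response()
ok_resp.status_code = requests.codes.ok         # 200

missing = requests.Response()
missing.status_code = requests.codes.not_found  # 404

print(describe(ok_resp))       # ok
print(describe(missing))       # failed: 404
print(ok_resp.ok, missing.ok)  # True False
```

The ok attribute offers the same check as a boolean when an exception is not wanted.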

5.6 Getting response headers

import requests

r = requests.get('https://api.github.com/events')
print(r.headers)  # A dict-like object with case-insensitive keys
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))  # Same value: lookups ignore case
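Both lookups above return the same value because r.headers is a case-insensitive mapping. The behavior can be tried offline with the CaseInsensitiveDict class from requests.structures, as in this small sketch; the header values are made up for illustration.

```python
from requests.structures import CaseInsensitiveDict

# Made-up header values; r.headers is the same kind of mapping
headers = CaseInsensitiveDict({"Content-Type": "application/json", "Server": "gunicorn"})

print(headers["content-type"])      # application/json
print(headers.get("CONTENT-TYPE"))  # application/json
print("server" in headers)          # True
print(list(headers))                # ['Content-Type', 'Server'] (original casing is kept)
```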

5.7 Getting and sending cookies

Getting cookies:

import requests

url = 'https://www.baidu.com'
r = requests.get(url)
print(r.cookies)  # A dict-like RequestsCookieJar: <RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]>
for k, v in r.cookies.items():
    print(k, v)  # BDORZ 27315

Sending cookies:

import requests

url = 'http://httpbin.org/cookies'
cookies = dict(cookies_are='working')

r = requests.get(url, cookies=cookies)
print(r.text)  # {"cookies":{"cookies_are":"working"}}

Setting cookies scoped to different paths:

import requests

jar = requests.cookies.RequestsCookieJar()
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')

url = 'http://httpbin.org/cookies'
r = requests.get(url, cookies=jar)
print(r.text)  # Only the cookie whose path matches is sent: {"cookies":{"tasty_cookie":"yum"}}
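Which cookie is attached depends on the request path. The sketch below shows this without any network traffic by preparing one request per path and reading the resulting Cookie header; the sent dictionary is just a helper name for the example.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set("tasty_cookie", "yum", domain="httpbin.org", path="/cookies")
jar.set("gross_cookie", "blech", domain="httpbin.org", path="/elsewhere")

# Prepare (without sending) one request per path and record its Cookie header
sent = {}
for path in ("/cookies", "/elsewhere"):
    prepared = requests.Request("GET", f"http://httpbin.org{path}", cookies=jar).prepare()
    sent[path] = prepared.headers.get("Cookie")

print(sent["/cookies"])    # tasty_cookie=yum
print(sent["/elsewhere"])  # gross_cookie=blech
```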

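5.8 Request timeouts

import requests

# With such a tiny timeout the call raises a timeout exception (a subclass of requests.exceptions.Timeout)
requests.get('http://github.com', timeout=0.001)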
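A timeout normally needs to be caught rather than left to crash the program. The following sketch shows one way to do that and to tell connect timeouts from read timeouts. To stay runnable offline it assumes a local listening socket that accepts connections but never replies, which is a made-up stand-in for a slow server.

```python
import socket

import requests

# A listening socket that never answers: connections succeed (they wait in the
# backlog), but no response is ever sent, so the read phase times out
silent = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent.bind(("127.0.0.1", 0))
silent.listen(1)
url = f"http://127.0.0.1:{silent.getsockname()[1]}/"

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for the loopback address

try:
    # timeout=(connect timeout, read timeout), both in seconds
    session.get(url, timeout=(1, 0.2))
    outcome = "completed"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timed out"
except requests.exceptions.ReadTimeout:
    outcome = "read timed out"
finally:
    silent.close()

print(outcome)  # read timed out
```

Both exception classes derive from requests.exceptions.Timeout, so a single except clause for that base class is enough when the distinction does not matter.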

5.9 Getting redirect response data

import requests

# head does not follow redirects by default, so enable them explicitly
r = requests.head('http://github.com', allow_redirects=True)
print(r.url)  # Final URL: 'https://github.com/'
print(r.history[0].url)  # URL before the redirect: http://github.com/
print(r.history)  # List of intermediate responses: [<Response [301]>]

Disabling redirects:

import requests

r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code)  # 301
print(r.history)  # []

5.10 Session

A Session object lets you persist certain parameters across requests. It also persists cookies across all requests made from the same Session instance.

import requests

s = requests.Session()

# This request makes the server set a cookie, which the session stores
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
# Cookies received from the previous request are sent automatically with the next one
r = s.get("http://httpbin.org/cookies")

print(r.text)  # {"cookies": {"sessioncookie": "123456789"}}

Setting default auth and headers on a session:

import requests

s = requests.Session()
s.auth = ('username', 'passwd')
# Add a default header that is sent with every request from this session
s.headers.update({'x-test': 'true'})

# both 'x-test' and 'x-test2' are sent
r = s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})
print(r.text)

# both 'x-test' and 'x-test3' are sent ('x-test2' is not, since per-request headers do not persist)
r = s.get('http://httpbin.org/headers', headers={'x-test3': 'true'})
print(r.text)
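How session defaults and per-request headers are merged can be inspected offline with Session.prepare_request, which applies the session settings to a request without sending it. The sketch below is an illustration under that approach; it also shows that passing None for a header in the per-request dict drops the session default for that one request.

```python
import requests

s = requests.Session()
s.headers.update({"x-test": "true"})  # session-level default header

# prepare_request merges session defaults into the request without sending it
first = s.prepare_request(
    requests.Request("GET", "http://httpbin.org/headers", headers={"x-test2": "true"})
)
# A None value removes the session default for this request only
second = s.prepare_request(
    requests.Request("GET", "http://httpbin.org/headers", headers={"x-test": None})
)

print("x-test" in first.headers, "x-test2" in first.headers)    # True True
print("x-test" in second.headers, "x-test2" in second.headers)  # False False
print(s.headers["x-test"])  # true (the session default itself is unchanged)
```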

posted @ 2020-11-26 23:06 Juno3550
