The Requests module
Requests is an HTTP library written in Python on top of urllib3 and released under the Apache License 2.0.
Request methods:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
# Each helper sends one request with the matching HTTP method.
print(requests.get(url))
print(requests.post(url))
print(requests.put(url))
print(requests.head(url))
print(requests.options(url))
print(requests.delete(url))

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
response = requests.get(url)
print(type(response))        # the Response class
print(response.status_code)  # numeric HTTP status
print(response.text)         # body decoded to str
print(response.cookies)      # cookies set by the server
Basic usage:
get:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
response = requests.get(url)
print(response.text)
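get (with parameters):
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

# The parameters are written directly into the query string.
url = "http://httpbin.org/get?name=cc&age=24"
response = requests.get(url)
print(response.text)
Data can also be passed in the URL query string without building it by hand: Requests accepts the parameters as a dictionary through the params keyword argument and encodes them into the URL.
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
data = {
    "name": "cc",
    "age": "24"
}
# Requests appends ?name=cc&age=24 to the URL.
response = requests.get(url, params=data)
print(response.text)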
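To see what the params dictionary turns into, the request can be prepared without being sent. The following is a minimal sketch that needs no network access; requests.Request and its prepare() method belong to the Requests API, while the variable names are only illustrative.

```python
import requests

# Build the request without sending it, so no network access is needed.
data = {"name": "cc", "age": "24"}
prepared = requests.Request("GET", "http://httpbin.org/get", params=data).prepare()

# The dictionary has been URL-encoded and appended as the query string.
print(prepared.url)  # http://httpbin.org/get?name=cc&age=24
```

The printed URL is the same one that was written by hand in the earlier listing.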
json:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests
import json

url = "http://httpbin.org/get"
response = requests.get(url)
print(type(response.text))        # str
print(response.json())            # body parsed as JSON
print(type(response.json()))      # dict
print(json.loads(response.text))  # same result, parsed by hand
Fetching binary data:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://github.com/favicon.ico"
response = requests.get(url)
print(type(response.text), type(response.content))  # str, bytes
print(response.text)
print(response.content)
# Write the raw bytes to disk; the with block closes the file.
with open("favicon.ico", "wb") as f:
    f.write(response.content)
Adding headers:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
headers = {
    "user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36"
}
response = requests.get(url, headers=headers)
print(response.text)
POST requests:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://httpbin.org/post"
data = {
    "name": "cc",
    "age": 24
}
# data= sends the dictionary as a form-encoded body.
response = requests.post(url, data=data)
print(response.text)
help(requests.post)  # prints the documentation for requests.post
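The difference between a form-encoded body and a JSON body can be checked offline by preparing the POST request instead of sending it. This is a small sketch; the json= keyword is part of the Requests API but is not used in the original listing, and the variable names are illustrative.

```python
import json
import requests

url = "http://httpbin.org/post"
payload = {"name": "cc", "age": 24}

# Prepare (but do not send) the same payload in two encodings.
form_request = requests.Request("POST", url, data=payload).prepare()
json_request = requests.Request("POST", url, json=payload).prepare()

print(form_request.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_request.body)                     # name=cc&age=24
print(json_request.headers["Content-Type"])  # application/json
print(json.loads(json_request.body))         # the payload, decoded from the JSON body
```

With data= the server receives form fields; with json= it receives a JSON document, so the choice depends on what the endpoint expects.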
Checking the status:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.jianshu.com"
headers = {
    "user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36"
}
response = requests.get(url, headers=headers)
# if response.status_code == 200:  # equivalent check
if response.status_code == requests.codes.ok:  # check the status
    print(response.text)
else:
    print("Request failed")
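As a complement to comparing against requests.codes.ok, the sketch below shows the named status codes and the raise_for_status() method, which raises an exception for 4xx and 5xx responses. To stay offline it builds a Response object by hand and sets its status code; that hand-built object is an assumption made for illustration, since real responses come from a request.

```python
import requests

# requests.codes maps readable names to numeric status codes.
print(requests.codes.ok)         # 200
print(requests.codes.not_found)  # 404

# Hand-built response, used here only to avoid a network call.
fake = requests.Response()
fake.status_code = 404

try:
    fake.raise_for_status()  # raises HTTPError for 4xx/5xx status codes
    outcome = "ok"
except requests.exceptions.HTTPError:
    outcome = "http error"

print(outcome)  # http error
```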
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

response = requests.get("http://www.baidu.com")
# Print the type and value of the main response attributes.
print(type(response.status_code), response.status_code)
print(type(response.headers), response.headers)
print(type(response.cookies), response.cookies)
print(type(response.url), response.url)
print(type(response.history), response.history)
Advanced Requests operations:
cookie:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
response = requests.get(url)
print(response.cookies)
# Iterate over the cookies as name/value pairs.
for key, value in response.cookies.items():
    print(key + "=" + value)
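Session persistence:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://httpbin.org/cookies"
session = requests.Session()
# The cookie set by this request is stored on the session...
session.get("http://httpbin.org/cookies/set/number/1234567890")
# ...and sent again automatically with the next request.
response = session.get(url)
print(response.text)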
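The way a session carries cookies from one request to the next can also be observed without a network call. In this sketch the cookie is placed in the session's cookie jar by hand (an assumption standing in for the Set-Cookie response in the listing above), and the next request is prepared through the session to show the resulting Cookie header.

```python
import requests

session = requests.Session()
# Stand-in for a cookie that a server would normally set.
session.cookies.set("number", "1234567890")

# Preparing a request through the session merges in the stored cookies.
request = requests.Request("GET", "http://httpbin.org/cookies")
prepared = session.prepare_request(request)

print(prepared.headers.get("Cookie"))  # number=1234567890
```

A plain requests.get() call has no such jar, which is why two independent calls do not share cookies.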
Certificate verification:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "https://www.12306.cn"
response = requests.get(url)
print(response.status_code)

Traceback (most recent call last):
  File "E:\python3.6\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "E:\python3.6\lib\site-packages\urllib3\connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "E:\python3.6\lib\site-packages\urllib3\connectionpool.py", line 850, in _validate_conn
    conn.connect()
  File "E:\python3.6\lib\site-packages\urllib3\connection.py", line 346, in connect
    _match_hostname(cert, self.assert_hostname or hostname)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.12306.cn', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname 'www.12306.cn' doesn't match either of 'webssl.chinanetcenter.com', 'i.l.inmobicdn.net', '*.fn-mart.com', 'www.1zhe.com', 'dl.jphbpk.gxpan.cn', 'dl.givingtales.gxpan.cn', 'dl.toyblast.gxpan.cn', 'dl.sds.gxpan.cn', 'download.ctrip.com', 'mh.tiancity.com', 'cdn.hxjyios.iwan4399.com', 'ios.hxjy.iwan4399.com', 'gjzx.
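The error can be avoided by passing verify=False, which turns off certificate verification for that request:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "https://www.12306.cn"
response = requests.get(url, verify=False)  # skip certificate verification
print(response.text)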
A warning is still printed:
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
The warning can be silenced with urllib3.disable_warnings():
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests
import urllib3

url = "https://www.12306.cn"
urllib3.disable_warnings()  # suppress InsecureRequestWarning
response = requests.get(url, verify=False)
print(response.status_code)
print(response.text)
Method 2:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "https://www.12306.cn"
# cert takes a (certificate, key) pair of local file paths;
# the 12306 certificate has to be downloaded first.
response = requests.get(url, cert=("/path/server.crt", "/path/key"))
print(response.status_code)
print(response.text)
File upload:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://httpbin.org/post"
# Open the file in binary mode; the with block closes it afterwards.
with open("favicon.ico", "rb") as f:
    files = {"file": f}
    response = requests.post(url, files=files)
print(response.text)
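What the files argument produces on the wire can be inspected by preparing the upload instead of sending it. The sketch below uses an in-memory io.BytesIO object as a stand-in for favicon.ico (an assumption, so that the example runs without the file) and passes the file name explicitly as a (name, file object) tuple.

```python
import io
import requests

# In-memory stand-in for open("favicon.ico", "rb").
fake_file = io.BytesIO(b"not a real icon")
files = {"file": ("favicon.ico", fake_file)}

prepared = requests.Request("POST", "http://httpbin.org/post", files=files).prepare()

# The body is multipart/form-data with a generated boundary.
print(prepared.headers["Content-Type"])
# The form field carries the file name alongside the raw bytes.
print(b'filename="favicon.ico"' in prepared.body)
```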
IP proxies:
import requests

url = "https://www.baidu.com"
# One proxy per URL scheme.
proxies = {
    "http": "http://127.0.0.1:3456",
    "https": "http://127.0.0.1:2123"
}
response = requests.get(url, proxies=proxies)
print(response.status_code)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "https://www.baidu.com"
# A proxy that requires a user name and password.
# The "https" entry is needed because the URL uses https.
proxies = {
    "http": "http://user:password@127.0.0.1:1234",
    "https": "http://user:password@127.0.0.1:1234",
}
response = requests.get(url, proxies=proxies)
print(response.status_code)
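If your proxy uses SOCKS, install the extra dependency first with pip install "requests[socks]":
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "https://www.baidu.com"
# A SOCKS5 proxy that requires a user name and password.
# The "https" entry is needed because the URL uses https.
proxies = {
    "http": "socks5://user:password@127.0.0.1:1234",
    "https": "socks5://user:password@127.0.0.1:1234",
}
response = requests.get(url, proxies=proxies)
print(response.status_code)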
Timeout settings:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests

url = "http://www.baidu.com"
# timeout is in seconds and covers both connecting and reading.
response = requests.get(url, timeout=0.1)
print(response.status_code)
Authentication settings:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests
from requests.auth import HTTPBasicAuth

url = "http://www.baidu.com"
# Send the user name and password using HTTP Basic authentication.
response = requests.get(url, auth=HTTPBasicAuth("user", "123"))
print(response.status_code)
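HTTPBasicAuth works by adding an Authorization header to the request, which can be confirmed offline by preparing the request. This sketch also shows the tuple shorthand auth=("user", "123"), which Requests treats as Basic authentication. The httpbin.org/basic-auth URL is an assumption chosen for illustration; nothing is actually sent to it.

```python
import base64
import requests
from requests.auth import HTTPBasicAuth

url = "http://httpbin.org/basic-auth/user/123"  # illustrative URL, never contacted

explicit = requests.Request("GET", url, auth=HTTPBasicAuth("user", "123")).prepare()
shorthand = requests.Request("GET", url, auth=("user", "123")).prepare()

# Basic auth is "Basic " followed by base64("user:password").
expected = "Basic " + base64.b64encode(b"user:123").decode("ascii")

print(explicit.headers["Authorization"] == expected)  # True
print(shorthand.headers["Authorization"] == explicit.headers["Authorization"])  # True
```

Because the credentials are only base64-encoded, not encrypted, Basic authentication is normally used over HTTPS.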
Exception handling: (http://www.python-requests.org/en/master/api/#exceptions)
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests
from requests.exceptions import ReadTimeout, HTTPError, RequestException

try:
    url = "http://httpbin.org/get"
    response = requests.get(url, timeout=0.1)
    response.raise_for_status()  # raises HTTPError for 4xx/5xx responses
except ReadTimeout:
    print("Timeout")
except HTTPError:
    print("HTTP error")
except RequestException:
    # Base class of all Requests exceptions, so it must come last.
    print("Request error")
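Whether the ReadTimeout branch fires against a remote site depends on network conditions, so it can help to reproduce it locally. The sketch below is one way to do that under stated assumptions: SlowHandler and fetch are hypothetical helpers written for this example, and the server is a throwaway standard-library http.server bound to 127.0.0.1 that deliberately answers more slowly than the client's timeout.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests
from requests.exceptions import ReadTimeout, RequestException


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers later than the client will wait."""

    def do_GET(self):
        time.sleep(1)  # longer than the 0.2 s timeout used below
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"late")
        except OSError:
            pass  # the client has already closed the connection

    def log_message(self, *args):
        pass  # keep the example output quiet


def fetch(url, timeout):
    """Return a short label describing how the request ended."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return "OK"
    except ReadTimeout:
        return "Timeout"
    except RequestException:
        return "Request error"


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

result = fetch("http://127.0.0.1:%d/" % server.server_port, timeout=0.2)
print(result)  # Timeout

server.shutdown()
```

Catching ReadTimeout before RequestException matters because the more specific exception is a subclass of the general one.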