Getting Started with the Requests Library

The Requests Library

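Seven main methods
(1) requests.request(): builds and sends a request; the other six methods are built on it
(2) requests.get(): sends an HTTP GET request
(3) requests.head(): sends an HTTP HEAD request
(4) requests.post(): sends an HTTP POST request
(5) requests.put(): sends an HTTP PUT request
(6) requests.patch(): sends an HTTP PATCH request
(7) requests.delete(): sends an HTTP DELETE request

Attributes of the Response object: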

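Attribute	Description
r.status_code	HTTP status code of the response; 200 means the request succeeded, 404 means the resource was not found
r.text	Response body as a string, i.e. the content of the page at the URL
r.encoding	Encoding of the response body, guessed from the HTTP headers
r.apparent_encoding	Encoding of the response body, inferred from the content itself (a fallback encoding)
r.content	Response body in binary (bytes) form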
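To make the relationship between r.content, r.encoding and r.text more concrete, here is a small offline sketch. It builds a Response object by hand instead of fetching a page; writing to the private `_content` attribute and the sample string are assumptions made purely for illustration, not something normal code should do.

```python
import requests

# Build a Response by hand so the example runs without network access.
# Assigning to the private _content attribute is for illustration only.
r = requests.Response()
r.status_code = 200
r._content = "中文页面".encode("utf-8")

# When the headers declare text/* content without a charset,
# requests falls back to ISO-8859-1, which garbles UTF-8 bytes.
r.encoding = "ISO-8859-1"
garbled = r.text

# Setting the right encoding makes r.text decode the same bytes correctly.
r.encoding = "utf-8"
readable = r.text

print(type(r.content).__name__, garbled != readable, readable)
```

This is why the framework in these notes assigns r.apparent_encoding to r.encoding before reading r.text.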

General code framework for fetching a web page:

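import requests

def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        # Raise requests.HTTPError if the status code is 4xx or 5xx
        r.raise_for_status()
        # Decode the body using the encoding detected from the content
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return "Something Wrong!!!"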
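The framework hides the reason for a failure behind a single message. As a rough sketch of an alternative, the hypothetical `fetch_text` below (not part of the original notes) separates HTTP status errors from other request failures. The URL without a scheme is chosen on purpose: requests rejects it before any network traffic happens, so the example runs offline.

```python
import requests

def fetch_text(url, timeout=30):
    """Hypothetical variant of getHTMLText that reports what went wrong."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.exceptions.HTTPError as exc:
        # Raised by raise_for_status() for 4xx and 5xx responses
        return f"HTTP error: {exc}"
    except requests.exceptions.RequestException as exc:
        # Base class for timeouts, connection errors, invalid URLs, ...
        return f"Request failed: {type(exc).__name__}"

# A URL with no scheme fails while the request is being prepared.
result = fetch_text("python123.io/ws")
print(result)
```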

Request control parameters in requests:

(1) params: a dictionary or byte sequence, appended to the URL as query parameters

kv = {'key1': 'value1', 'key2': 'value2'}
r = requests.request('GET', 'http://python123.io/ws', params=kv)
print(r.url)
# http://python123.io/ws?key1=value1&key2=value2
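The URL construction can also be checked without sending anything. The sketch below uses requests.Request(...).prepare(), which builds the request but does not send it; the list-valued parameter is an extra case added here for illustration.

```python
import requests

kv = {'key1': 'value1', 'key2': 'value2'}

# prepare() builds the final URL, headers and body without sending the request.
prepared = requests.Request('GET', 'http://python123.io/ws', params=kv).prepare()
print(prepared.url)

# A list value is expanded into one key=value pair per item.
multi = requests.Request('GET', 'http://python123.io/ws', params={'key': ['a', 'b']}).prepare()
print(multi.url)
```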

(2) data: a dictionary, byte sequence, or file object, sent as the body of the request

kv = {'key1': 'value1', 'key2': 'value2'}
r = requests.request('POST', 'http://python123.io/ws', data=kv)
body = 'body content'
r = requests.request('POST', 'http://python123.io/ws', data=body)

(3) json: data in JSON format, sent as the body of the request

kv = {'key1': 'value1'}
r = requests.request('POST', 'http://python123.io/ws', json=kv)
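The difference between data and json is easiest to see in what actually gets sent. The following sketch prepares both requests without sending them and inspects the Content-Type header and the body; the variable names are made up for this example.

```python
import json
import requests

kv = {'key1': 'value1'}
url = 'http://python123.io/ws'

# data=dict is form-encoded; json=dict is serialized as JSON.
form_req = requests.Request('POST', url, data=kv).prepare()
json_req = requests.Request('POST', url, json=kv).prepare()

print(form_req.headers['Content-Type'], form_req.body)
print(json_req.headers['Content-Type'], json.loads(json_req.body))
```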

(4) headers: a dictionary of custom HTTP headers

hd = {'user-agent': 'Chrome/10'}
r = requests.request('POST', 'http://python123.io/ws', headers=hd)
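As a quick offline check that the custom header ends up on the request, the sketch below prepares the request without sending it. Header lookup on the prepared request is case-insensitive, so the lowercase key from the dictionary can be read back in any casing.

```python
import requests

hd = {'user-agent': 'Chrome/10'}
prepared = requests.Request('POST', 'http://python123.io/ws', headers=hd).prepare()

# Header names are matched case-insensitively.
print(prepared.headers['User-Agent'])
```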

(5) files: a dictionary, used to upload files

fs = {'file': open('data.xls', 'rb')}
r = requests.request('POST', 'http://python123.io/ws', files=fs)
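A file upload is sent as a multipart/form-data body. The sketch below shows this without touching the disk or the network: it passes a (filename, content) tuple instead of an open file, and both the filename and the content are made up for the example.

```python
import requests

# (filename, content) tuple; both values are hypothetical sample data.
fs = {'file': ('data.txt', b'hello requests')}
prepared = requests.Request('POST', 'http://python123.io/ws', files=fs).prepare()

content_type = prepared.headers['Content-Type']
print(content_type.split(';')[0])           # media type without the boundary
print(b'hello requests' in prepared.body)   # file content is embedded in the body
```

When passing a real file opened with open(), a with block is a reasonable way to make sure it is closed after the request.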

(6) timeout: the timeout for the request, in seconds

r = requests.request('GET', 'http://www.baidu.com', timeout=10)

(7) proxies: a dictionary specifying proxy servers for the request; login credentials can be included

pxs = {'http': 'http://user:pass@10.10.10.1:1234',
       'https': 'https://10.10.10.1:4321'}
r = requests.request('GET', 'http://www.baidu.com', proxies=pxs)

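(8) allow_redirects: True/False, defaults to True; switch for following redirects
(9) stream: True/False, defaults to False; when False, the response body is downloaded immediately
(10) verify: True/False, defaults to True; switch for verifying the SSL certificate
(11) cert: path to a local SSL certificate
(12) cookies: a dictionary or CookieJar; the cookies sent with the request
(13) auth: a tuple; enables HTTP authentication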
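As an illustration of the last two parameters, the sketch below prepares a request with auth and cookies and prints the headers they produce. Nothing is sent; the credentials and the cookie value are hypothetical.

```python
import requests

prepared = requests.Request(
    'GET', 'http://python123.io/ws',
    auth=('user', 'pass'),           # hypothetical credentials for HTTP Basic auth
    cookies={'session': 'abc123'},   # hypothetical cookie
).prepare()

# auth=(user, password) becomes a Basic Authorization header (base64 of "user:pass").
print(prepared.headers['Authorization'])
# cookies become a Cookie header.
print(prepared.headers['Cookie'])
```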

Posted 2019-01-06 15:59 by JeffreyLee