爬虫二之Requests

requests

实例引入

import requests

response  = requests.get('https://www.baidu.com')
response.status_code
response.text
response.cookies

请求方式

post()
put()
delete()
head()
options()

请求

基本get请求
带参数get请求

data = {'name':'germey', 'age':'22}
response = request.get('http://httpbin.org/get', params=data)
print(respones.text)

解析json

response.json()

获取二进制数据

response.content
response=request.get('https://github.com/favicon.ico')
f = open('favicon.ico', 'wb')
f.write(response.content)
f.close()

添加headers

headers={
    'User-Agent':'Mozilla/5.0 (Macintosh; intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
}
response = request.get('https://www.zhihu.com/explore'. headers=headers_

POST请求

data = {}
headers = {}
response = request.post('http://httpbin.org/post', data=data, headers=headers)

response属性

status_code
headers
cookies
url
history

高级操作

文件上传

files = {'file':open('favicon.ico','rb')}
response = request.get('http://httpbin.org/post', files=files)

获取cookie

for key,value in response.cookies.items():
    print(key + '=' + value)

会话维持

requests.get('http://httpbin.org/cookies/set/number/123456789)
response = requests.get('http://httpbin.org/cookies')

上述方法无法得到想要的cookie

s = requests.Session()
s.get(...)
response = s.get(...)

证书验证

暂时不看。如果发生情况则添加参数 verify=False

代理设置

proxies={}
response = requests.get(' ', proxies=proxies)

超时设置

from requests.exceptions import ReadTimeout

try:
    #some codes
except ReadTimeout:
    print('Timeout')

认证设置

request.get(...,auth={'user','123'})

异常处理

posted @ 2019-07-12 16:55  鬼鬼果果  阅读(196)  评论(0编辑  收藏  举报