Python Requests Usage Notes, Part 2

V. Making Requests with Parameters

 

1. Plain parameters

>>> payload = {'key1': 'value1', 'key2': 'value2'}

>>> r = requests.get("http://httpbin.org/get", params=payload)

>>> r.url

u'http://httpbin.org/get?key2=value2&key1=value1'

>>> r.text

u'{\n  "url": "http://httpbin.org/get?key2=value2&key1=value1",\n  ...}'
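The query string can also be inspected without sending anything. The sketch below (Python 3 and a current Requests release assumed) prepares a request offline and shows two documented behaviors of params: a list value repeats the key, and a None value is dropped. The key names are illustrative only.

```python
import requests

# Build the request locally; nothing is sent over the network.
payload = {'key1': 'value1', 'key2': ['value2', 'value3'], 'unused': None}
prepared = requests.Request('GET', 'http://httpbin.org/get', params=payload).prepare()

# A list value repeats the key; a None value is left out of the URL.
print(prepared.url)
```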

 

 

2. Parameters containing Chinese characters

 

>>> payload = {'key1': 'value1', 'key2': u'中文'}

>>> r = requests.get("http://httpbin.org/get", params=payload)

>>> r.url

u'http://httpbin.org/get?key2=%E4%B8%AD%E6%96%87&key1=value1'

>>> r.text

u'{\n  "url": "http://httpbin.org/get?key2=中文&key1=value1",\n  ...}'
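How a non-ASCII value ends up in the URL can be checked offline. This is a minimal sketch, assuming Python 3 and a current Requests release: the prepared URL carries the value as percent-escaped UTF-8, and the standard library's unquote turns it back into readable text.

```python
import requests
from urllib.parse import unquote

payload = {'key1': 'value1', 'key2': '中文'}
prepared = requests.Request('GET', 'http://httpbin.org/get', params=payload).prepare()

# Non-ASCII values are encoded as UTF-8 and then percent-escaped.
print(prepared.url)

# unquote() reverses the escaping for display purposes.
print(unquote(prepared.url))
```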

 

3. JSON payloads

>>> import json

>>> url = 'https://api.github.com/some/endpoint'

>>> payload = {'some': 'data'}

>>> headers = {'content-type': 'application/json'}

 

>>> r = requests.post(url, data=json.dumps(payload), headers=headers)
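Newer Requests releases (2.4.2 and later, as far as I know) also accept a json= keyword that serializes the payload and sets the Content-Type header in one step. The sketch below compares the two forms offline by preparing the requests without sending them; body_as_dict is a helper introduced here only for the comparison.

```python
import json
import requests

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}

# Manual form: serialize the payload yourself and set the header explicitly.
manual = requests.Request(
    'POST', url,
    data=json.dumps(payload),
    headers={'content-type': 'application/json'},
).prepare()

# Shortcut form: json= serializes the payload and sets Content-Type.
shortcut = requests.Request('POST', url, json=payload).prepare()

def body_as_dict(prepared):
    """Decode a prepared JSON body, which may be str or bytes."""
    body = prepared.body
    if isinstance(body, bytes):
        body = body.decode('utf-8')
    return json.loads(body)

print(body_as_dict(manual) == body_as_dict(shortcut))
print(shortcut.headers['Content-Type'])
```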

 

 

VI. Working with Files

 

1. Downloading a file

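import requests
from io import BytesIO  # the image is binary data; io.BytesIO exists on Python 2.6+ and 3
from PIL import Image

r = requests.get('http://img4.cache.netease.com/travel/2013/4/7/2013040718512699794.jpg')
# r.content is the raw response body as bytes
i = Image.open(BytesIO(r.content))
i.save('1.jpg')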
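When the file only needs to be saved and not decoded, PIL can be skipped and the body written to disk in chunks. The following is a sketch under assumptions (Python 3, a response opened with stream=True); save_response_body and the example.org URL are hypothetical names used for illustration.

```python
import requests

def save_response_body(response, path, chunk_size=8192):
    """Write a streamed response body to `path`; return the bytes written."""
    response.raise_for_status()  # do not save an error page as the file
    written = 0
    with open(path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            written += len(chunk)
    return written

# Intended use (performs a real download, so it is left commented out):
# r = requests.get('http://example.org/picture.jpg', stream=True)
# save_response_body(r, 'picture.jpg')
```

Because the body is consumed chunk by chunk, memory use stays bounded by chunk_size regardless of the file size.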

 

2. Uploading a file

>>> url = 'http://httpbin.org/post'

>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files)

>>> r.text

 

3. Setting the file name explicitly when uploading

>>> url = 'http://httpbin.org/post'

>>> files = {'file': ('report.xls', open('report.xls', 'rb'))}

>>> r = requests.post(url, files=files)

>>> r.text

 

4. Sending a string as if it were a file

>>> url = 'http://httpbin.org/post'

>>> files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}

>>> r = requests.post(url, files=files)

>>> r.text
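To see what the files argument actually puts on the wire, the request can be prepared without sending it. This sketch (Python 3 assumed) uses the three-element tuple form, which adds a content type for the part; 'text/csv' is a value chosen here for illustration.

```python
import requests

url = 'http://httpbin.org/post'
# (file name, content, content type); the content may be a string or a file object.
files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n', 'text/csv')}
prepared = requests.Request('POST', url, files=files).prepare()

# The request header announces a multipart body and its boundary.
print(prepared.headers['Content-Type'])

# The body carries the file name, the part's content type, and the data.
print(prepared.body.decode('utf-8'))
```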

 

5. Streaming uploads: a large file can be uploaded without loading all of it into memory

with open('massive-body', 'rb') as f:  # binary mode, so the bytes are sent unchanged

    requests.post('http://some.url/streamed', data=f)
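How Requests frames a streamed body depends on whether it can determine the length. The sketch below (Python 3 assumed; the temporary file and the chunks generator are stand-ins introduced here) prepares two requests offline: a file object gets a Content-Length header, while a generator is sent with chunked transfer encoding.

```python
import os
import tempfile
import requests

url = 'http://some.url/streamed'

# A small temporary file stands in for a large one.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'0123456789' * 100)

# A file object has a known size, so Content-Length is set.
with open(tmp.name, 'rb') as f:
    from_file = requests.Request('POST', url, data=f).prepare()
    file_headers = dict(from_file.headers)
os.remove(tmp.name)

def chunks():
    """Yield the body piece by piece; the total length is unknown up front."""
    yield b'first part, '
    yield b'second part'

# A generator has no length, so the body is sent chunk-encoded.
from_generator = requests.Request('POST', url, data=chunks()).prepare()

print(file_headers.get('Content-Length'))
print(from_generator.headers.get('Transfer-Encoding'))
```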

 

6. Streaming downloads

r = requests.post('https://stream.twitter.com/1/statuses/filter.json',

    data={'track': 'requests'}, auth=('username', 'password'), stream=True)

 

for line in r.iter_lines():

    if line: # filter out keep-alive new lines

        print(json.loads(line))  # parenthesized form runs on both Python 2 and 3
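The same loop can be wrapped in a small generator so that callers receive parsed objects instead of raw lines. This is a sketch assuming Python 3.6 or newer (where json.loads accepts bytes); iter_json_lines and the example.org endpoint are hypothetical names.

```python
import json
import requests

def iter_json_lines(response):
    """Yield one decoded JSON object per non-empty line of a streamed response."""
    for line in response.iter_lines():
        if line:  # skip keep-alive blank lines
            yield json.loads(line)

# Intended use, with a hypothetical endpoint that emits one JSON object per line:
# r = requests.get('http://example.org/stream', stream=True)
# for item in iter_json_lines(r):
#     print(item)
```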

 

 

VII. Proxy Settings

 

1. The proxies argument

proxies = {

  "http": "http://10.10.1.10:3128",

  "https": "http://10.10.1.10:1080",

}

requests.get("http://example.org", proxies=proxies)

 

>>> r = requests.get('http://ifconfig.me/ip')

>>> r.text

u'116.226.xx.xxx\n'

>>> proxies = {

...   "http": "http://175.136.xxx.xx",

... }

>>> r = requests.get("http://ifconfig.me/ip", proxies=proxies)

>>> r.text

u'175.136.xxx.xx\n'

 

2. Environment variables

$ export HTTP_PROXY="http://10.10.1.10:3128"

$ export HTTPS_PROXY="http://10.10.1.10:1080"

$ python

>>> import requests

>>> requests.get("http://example.org")

 

3. If the proxy uses HTTP Basic Auth, put the credentials in the proxy URL:

proxies = {

    "http": "http://user:pass@10.10.1.10:3128/",

}
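Credentials that contain characters such as '@' or ':' would break the URL if pasted in directly. One possible approach, sketched here for Python 3, is to percent-escape them first and register the proxy on a Session so every request made through it uses the same setting. proxy_url is a hypothetical helper, and the user name and password are made up.

```python
from urllib.parse import quote
import requests

def proxy_url(user, password, host, port, scheme='http'):
    """Build a proxy URL, percent-escaping characters such as '@' or ':'."""
    return '%s://%s:%s@%s:%d/' % (
        scheme, quote(user, safe=''), quote(password, safe=''), host, port)

session = requests.Session()
# Proxies stored on the session apply to all requests sent through it.
session.proxies.update({'http': proxy_url('user', 'p@ss:word', '10.10.1.10', 3128)})
print(session.proxies['http'])
```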

 

 

VIII. Session

 

To keep state across multiple requests, use the Session object that requests provides.

>>> s = requests.Session()

>>> s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')

<Response [200]>

>>> r = s.get("http://httpbin.org/cookies")

>>> r.text

u'{\n  "cookies": {\n    "sessioncookie": "123456789"\n  }\n}'
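The state a session carries can be observed without any network traffic. In this sketch (Python 3 assumed) a default header and a cookie are stored on the session, and prepare_request shows that both are attached to the next outgoing request. The 'x-test' header is an illustrative name.

```python
import requests

with requests.Session() as s:
    # Headers set here are merged into every request the session sends.
    s.headers.update({'x-test': 'true'})
    # Cookies in the session jar are sent to matching domains and paths.
    s.cookies.set('sessioncookie', '123456789', domain='httpbin.org', path='/')
    prepared = s.prepare_request(requests.Request('GET', 'http://httpbin.org/cookies'))

print(prepared.headers['x-test'])
print(prepared.headers['Cookie'])
```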

 

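Session also supports Keep-Alive: requests made through the same session reuse connections automatically. Note that a connection is returned to the pool for reuse only after the entire response body has been read, so take care when streaming.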

 

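 IX. Exceptions and Errors

  1. Network problems (DNS failure, refused connection, and so on) raise ConnectionError.

  2. A rare invalid HTTP response raises HTTPError. An ordinary error status such as 404 is still a response that requests can parse, so it does not raise by itself.

  3. A request that times out raises Timeout.

  4. Too many redirects (301 and similar) raise TooManyRedirects.

  5. Every exception that requests raises explicitly inherits from requests.exceptions.RequestException.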
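The hierarchy above suggests one way to structure error handling: catch RequestException once and then sort the error into a category. The sketch below assumes Python 3; classify and fetch are hypothetical helpers, and the category names are arbitrary. It also calls raise_for_status(), which is the opt-in way to turn a 4xx or 5xx status into an HTTPError.

```python
import requests
from requests import exceptions as exc

def classify(error):
    """Map a requests exception to a coarse category name."""
    # Timeout is checked first because ConnectTimeout inherits from
    # both ConnectionError and Timeout.
    if isinstance(error, exc.Timeout):
        return 'timeout'
    if isinstance(error, exc.TooManyRedirects):
        return 'too many redirects'
    if isinstance(error, exc.ConnectionError):
        return 'network'
    if isinstance(error, exc.HTTPError):
        return 'http'
    return 'other'

def fetch(url, timeout=5):
    """Return (response, None) on success or (None, category) on failure."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # raises HTTPError for 4xx and 5xx statuses
        return r, None
    except exc.RequestException as e:
        return None, classify(e)

# A URL without a scheme is rejected before any network access happens.
print(fetch('example.org'))
```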

 [Reposted from] EryxLee's blog: http://blog.sina.com.cn/eryxlee

posted @ 2015-07-02 11:01  钟灵.毓秀