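Using the requests library in Python 3

The synchronous HTTP library requests is very handy for testing and for simple crawlers, so today I'll walk through it. I got burned badly by using it before I really knew it, so read the documentation carefully: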

http://docs.python-requests.org/zh_CN/latest/user/quickstart.html

 

1. The simplest, most common usage

GET request

import requests

response = requests.get('http://httpbin.org/get')
print(response.text)

# Output
{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.21.0"
  }, 
  "origin": "xx.xx.xx.xx", 
  "url": "http://httpbin.org/get"
}
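Query-string arguments for a GET go in the params argument. As a minimal sketch, the snippet below builds the request without sending it and prints the resulting URL; the parameter names and values are made up for illustration.

```python
import requests

# Build the request without sending it, then inspect the final URL.
prepared = requests.Request(
    'GET',
    'http://httpbin.org/get',
    params={'name': 'happy_codes', 'page': 1},  # hypothetical query parameters
).prepare()

print(prepared.url)  # http://httpbin.org/get?name=happy_codes&page=1
```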

 

POST request

form = {'name': 'happy_codes'}
response = requests.post('http://httpbin.org/post', data=form)
print(response.text)

# Output: the form data comes back under "form"
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "name": "happy_codes"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "16", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.21.0"
  }, 
  "json": null, 
  "origin": "xx.xx.xx.xx", 
  "url": "http://httpbin.org/post"
}
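The data argument sends a form-encoded body, while the json argument sends a JSON body. The sketch below prepares both requests without sending them, so the difference in Content-Type and body can be compared locally; it reuses the same field as the example above.

```python
import json
import requests

url = 'http://httpbin.org/post'
payload = {'name': 'happy_codes'}

# Prepare (but do not send) the same payload in two encodings.
form_req = requests.Request('POST', url, data=payload).prepare()
json_req = requests.Request('POST', url, json=payload).prepare()

print(form_req.headers['Content-Type'])  # application/x-www-form-urlencoded
print(form_req.body)                     # name=happy_codes
print(json_req.headers['Content-Type'])  # application/json
print(json.loads(json_req.body))         # the payload parsed back from the JSON body
```

With data= the server sees the values under "form"; with json= httpbin would report them under "json" instead.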

 

2. Adding a User-Agent, cookies, and a proxy

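Besides a dict, the cookies argument also accepts a CookieJar object. A raw cookie string is not accepted by the cookies argument, but it can be sent as-is through the Cookie header.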
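As a rough illustration of the ways to attach cookies, the sketch below prepares three requests without sending them: one with a dict, one with a RequestsCookieJar, and one with a raw cookie string placed in the Cookie header. The cookie names and values are the ones used later in this section.

```python
import requests
from requests.cookies import RequestsCookieJar

url = 'http://httpbin.org/get'

# 1. A plain dict
from_dict = requests.Request(
    'GET', url, cookies={'STM': '1545720205', 'haha': '123'}).prepare()

# 2. A cookie jar object
jar = RequestsCookieJar()
jar.set('STM', '1545720205')
jar.set('haha', '123')
from_jar = requests.Request('GET', url, cookies=jar).prepare()

# 3. A raw cookie string, sent through the Cookie header
from_header = requests.Request(
    'GET', url, headers={'Cookie': 'STM=1545720205; haha=123'}).prepare()

for prepared in (from_dict, from_jar, from_header):
    print(prepared.headers['Cookie'])
```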

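proxies = {'http': 'http://127.0.0.1', 'https': 'http://127.0.0.1'}

The keys say which proxy to use for http URLs and for https URLs. If a key is misspelled (for example 'http:' instead of 'http'), requests does not raise an error; it simply does not use the proxy. Keep that in mind.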
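One way to check a proxies dict before relying on it is requests.utils.select_proxy, which performs the key lookup for a given URL. This is only a sketch of the lookup step; a real request also takes environment proxy settings into account.

```python
from requests.utils import select_proxy

url = 'http://httpbin.org/get'

misspelled = {'http:': 'http://127.0.0.1'}  # note the stray colon in the key
correct = {'http': 'http://127.0.0.1'}

print(select_proxy(url, misspelled))  # None: no key matches the URL scheme
print(select_proxy(url, correct))     # http://127.0.0.1
```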

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                         "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}

cookies = {"STM": "1545720205", 'haha': '123'}

response = requests.get('http://httpbin.org/get', headers=headers, cookies=cookies,
                        proxies={'http': 'http://125.123.122.10:42207', 'https': 'http://125.123.122.10:42207'})

print(response.text)

# Output
{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Cookie": "STM=1545720205; haha=123", 
    "Host": "httpbin.org", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
  }, 
  "origin": "125.123.122.10:42207", 
  "url": "http://httpbin.org/get"
}

 

 

Everything you can pass is listed in the docstring of requests.request, and GET and POST accept the same arguments:

def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send
        in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How many seconds to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
            the server's TLS certificate, or a string, in which case it must be a path
            to a CA bundle to use. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

    Usage::

      >>> import requests
      >>> req = requests.request('GET', 'https://httpbin.org/get')
      >>> req
      <Response [200]>
    """
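Of these arguments, timeout is worth setting on every call, since without it a request can wait indefinitely. The helper below is a minimal sketch, not part of requests: the name fetch and the default (connect, read) timeout values are assumptions chosen for illustration.

```python
import requests


def fetch(url, timeout=(3.05, 10)):
    """Return the response body as text, or None if the request fails.

    `fetch` is a hypothetical helper; `timeout` is a
    (connect timeout, read timeout) tuple in seconds.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx status codes into exceptions
    except requests.exceptions.RequestException as exc:
        # Covers timeouts, connection errors, invalid URLs and HTTP errors.
        print('request failed:', exc)
        return None
    return response.text
```

Calling fetch('http://httpbin.org/get') returns the same text as the first example when the request succeeds, and None when it fails.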

 

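3. Using the Session class

The Session class keeps a session alive so that several requests can share the same cookies, headers, and proxies.

headers -> a dict-like object; update it with session.headers.update(headers)

cookies -> a cookie jar (RequestsCookieJar); update it with session.cookies.set(key, value)

proxies -> a dict; update it by direct assignment, session.proxies = proxies

Send requests with session.get()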

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                         "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}

proxy = '59.61.38.48:34719'

# Create a requests.Session instance
session = requests.Session()
session.headers.update(headers)
session.cookies.set('STM', '1231214')
session.cookies.set('S', '123123')
proxies = {
    'http': 'http://%s' % proxy,
    'https': 'http://%s' % proxy
}
session.proxies = proxies
print(session.get('http://httpbin.org/get').text)

# Output

{
"args": {},
"headers": {
  "Accept": "*/*",
  "Accept-Encoding": "gzip, deflate",
  "Cache-Control": "max-age=259200",
  "Connection": "close",
  "Cookie": "S=123123; STM=1231214",
  "Host": "httpbin.org",
  "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
},
"origin": "59.61.38.48",
"url": "http://httpbin.org/get"
}
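Session-level settings are merged with whatever is passed for an individual request. The sketch below shows that merge locally with session.prepare_request, without sending anything; the User-Agent string and the X-Trace header are hypothetical values used only for illustration.

```python
import requests

with requests.Session() as session:
    session.headers.update({'User-Agent': 'my-test-agent/0.1'})  # hypothetical UA
    session.cookies.set('STM', '1231214')

    # A per-request header is added on top of the session headers.
    req = requests.Request('GET', 'http://httpbin.org/get',
                           headers={'X-Trace': 'abc'})  # hypothetical header
    prepared = session.prepare_request(req)

    print(prepared.headers['User-Agent'])  # my-test-agent/0.1
    print(prepared.headers['X-Trace'])     # abc
    print(prepared.headers['Cookie'])      # STM=1231214
```

Using the session as a context manager closes its pooled connections when the block ends.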

 

posted @ 2019-01-30 10:21  happy_codes