8251gr

Python Requests 最详细教程!爬虫必会之!

requests 是Python中一个非常出名的库,它极大的简化了 Python中进行HTTP请求的流程,我们来看一个简单的例子:

In [1]: import requests

In [2]: requests.get("https://jiajunhuang.com")
Out[2]: <Response [200]>

只需要两行便可以发起一个HTTP请求,多么的简单。

针对HTTP协议的 GET , POST , PUT , DELETE 等方法, requests 分别有:

requests.get
requests.options
requests.head
requests.post
requests.put
requests.patch
requests.delete

等对应的方法,他们都是 requests.request 的便捷版,也就是说,调用 requests.get 其实相当于调用 requests.request("GET", xxx) 。

接下来我们看看 requests.request 都能接受哪些参数,这其实也就是上述对应的 get 等方法能传入的参数:

def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send
        in the body of the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How many seconds to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
            the server's TLS certificate, or a string, in which case it must be a path
            to a CA bundle to use. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

    Usage::

      >>> import requests
      >>> req = requests.request('GET', 'https://httpbin.org/get')
      <Response [200]>
    """

    # By using the 'with' statement we are sure the session is closed, thus we
    # avoid leaving sockets open which can trigger a ResourceWarning in some
    # cases, and look like a memory leak in others.
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)

我们来看看这些便捷函数的参数:

  • 第一个参数都是 url ,这就是要请求的 url ,例如 https://jiajunhuang.com ,是必填的。
  • 参数 params 是query string,它可以是一个字典,或者一个list,list的内容是一堆的tuple,也可以是bytes。
  • 参数 data 是 HTTP 请求中的 body ,它可以是一个字典,或者一个list,list的内容是一堆的tuple,也可以是bytes或者文件
  • 参数 json 是为了方便请求而提供的参数,它其实相当于 data ,但是会自动把请求的 Content-Type 设置为 application/json
  • 参数 headers 是 HTTP 请求中的头部,它是一个字典
  • 参数 cookies 是所携带的 Cookie ,它可以是一个字典或者 CookieJar 的实例
  • 参数 files 是要上传的文件,它可以是一个字典,字典的内容是tuple
  • 参数 auth 用于开启 HTTP 请求的认证
  • 参数 timeout 是超时时间
  • 参数 allow_redirects 是否允许重定向,它是一个布尔值
  • 参数 proxies 是是否使用代理
  • 参数 verify 是否检查服务端的证书,传布尔值或者 CA 的路径
  • 参数 stream 是布尔值,代表是否以流的方式读取结果
  • 参数 cert 传入客户端的SSL证书

而 requests 中, HTTP 请求的返回结果是 Response 的实例,包括方法:

  • json 用于把返回结果自动转换成 JSON
  • ok 用于检查返回的状态码是否是 400 以下,注意,ok是一个property
  • text 用于以文本方式输出返回结果,注意,text是一个 property

因此,我们可以看看常见的两种用法,一种是请求接口并且转换成 JSON :

import requests

resp = requests.get("https://api.jiajunhuang.com/v1/ip")
print(resp.json())

另外一种是打印状态码并且打印出响应结果:

import requests

resp = requests.get("https://api.jiajunhuang.com/v1/ip")
print(resp.status_code)
print(resp.text)

posted on   牛二娃  阅读(398)  评论(0编辑  收藏  举报

相关博文:
阅读排行:
· 阿里最新开源QwQ-32B,效果媲美deepseek-r1满血版,部署成本又又又降低了!
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 开源Multi-agent AI智能体框架aevatar.ai,欢迎大家贡献代码
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· AI技术革命,工作效率10个最佳AI工具
< 2025年3月 >
23 24 25 26 27 28 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5

导航

统计

点击右上角即可分享
微信分享提示