Python requests raises requests.exceptions.SSLError: HTTPSConnectionPool

Error message

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\api.py", line 72, in get
        return request('get', url, params=params, **kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\api.py", line 58, in request
        return session.request(method=method, url=url, **kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 508, in request
        resp = self.send(prep, **send_kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 640, in send
        history = [resp for resp in gen] if allow_redirects else []
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 218, in resolve_redirects
        **adapter_kwargs
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 618, in send
        r = adapter.send(request, **kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\adapters.py", line 506, in send
        raise SSLError(e, request=request)
    requests.exceptions.SSLError: HTTPSConnectionPool(host='www.baidu.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)'),))


 

Process

Test 1

GET without specifying headers (the last call adds a Firefox User-Agent):

    >>> import requests
    >>> requests.get('http://www.baidu.com/')
    <Response [200]>
    >>> requests.get('http://www.baidu.com/')
    <Response [200]>
    >>> requests.get('http://www.baidu.com/')
    <Response [200]>
    >>> header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1',}
    >>> requests.get('http://www.baidu.com/', headers=header)
    <Response [200]>


Test 2

With the User-Agent header set to a Firefox string:

    >>> header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1',}
    >>> requests.get('http://www.baidu.com/', headers=header)
    <Response [200]>
    >>> requests.get('http://www.baidu.com/', headers=header)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\api.py", line 72, in get
        return request('get', url, params=params, **kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\api.py", line 58, in request
        return session.request(method=method, url=url, **kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 508, in request
        resp = self.send(prep, **send_kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 640, in send
        history = [resp for resp in gen] if allow_redirects else []
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 218, in resolve_redirects
        **adapter_kwargs
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\sessions.py", line 618, in send
        r = adapter.send(request, **kwargs)
      File "D:\python\lib\site-packages\requests-2.18.3-py2.7.egg\requests\adapters.py", line 506, in send
        raise SSLError(e, request=request)
    requests.exceptions.SSLError: HTTPSConnectionPool(host='www.baidu.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)'),))


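Analysis

Symptom: the first GET succeeds, but the second GET raises an error.

Difference: the User-Agent is not the same.

Analysis: the error says SSL certificate verification failed, so the failing request must have gone over HTTPS. We explicitly requested an http URL, though, so the likely explanation is that after reaching the site we were redirected to https://www.baidu.com/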

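Verification

First, turn off certificate verification for the GET. If it stays on, the request always fails and the redirect information cannot be retrieved.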

    >>> response = requests.get('http://www.baidu.com/', headers=header, verify=False)
    D:\python\lib\site-packages\urllib3\connectionpool.py:858: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
      InsecureRequestWarning)
    >>> response.history
    [<Response [302]>]
    >>> response.url
    u'https://www.baidu.com/'


Without specifying a User-Agent:

    >>> response = requests.get('http://www.baidu.com/', verify=False)
    >>> response.history
    []
    >>> response.url
    u'http://www.baidu.com/'
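
The two checks above read response.history and response.url by hand. As a rough sketch, the same inspection can be wrapped in two small helpers. The names redirect_chain and switched_to_https are made up for this post, and the code only assumes that the object passed in has the history, status_code and url attributes of a requests response.

```python
def redirect_chain(response):
    """Return one (status_code, url) pair per hop, final response last.

    `response` is expected to be a requests.Response, or anything that
    exposes the same `history`, `status_code` and `url` attributes.
    """
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def switched_to_https(response):
    """True if the chain started on http:// and ended on https://."""
    chain = redirect_chain(response)
    first_url, last_url = chain[0][1], chain[-1][1]
    return first_url.startswith("http://") and last_url.startswith("https://")
```

Passing in the response from requests.get('http://www.baidu.com/', headers=header, verify=False) lists every hop in order, which makes an http to https switch easy to spot.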


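Conclusion

When the User-Agent header is specified, Baidu's server redirects the request to the https URL. Certificate verification for that HTTPS request fails, which is what raises the SSL error.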
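
A possible way to confirm the redirect without turning off verification is to not follow it at all. The sketch below uses the allow_redirects=False option of requests, so only the plain http hop is made and no TLS handshake happens. The helper name first_hop, the optional session argument and the 10 second timeout are assumptions for illustration, not part of the original session.

```python
import requests


def first_hop(url, headers=None, session=None):
    """Request `url` once, without following redirects.

    Returns (status_code, location), where location is the value of the
    Location header, or None when the server did not redirect.
    """
    if session is None:
        session = requests.Session()
    # allow_redirects=False stops requests from chasing the 302 itself;
    # timeout=10 is an arbitrary value chosen for this sketch.
    response = session.get(url, headers=headers, allow_redirects=False, timeout=10)
    return response.status_code, response.headers.get("Location")
```

Calling first_hop('http://www.baidu.com/', headers=header) and getting back a 3xx status with an https:// location would match the conclusion above.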

Solutions

Method 1:

Specify the SSL certificate when making the GET request. See the references for details.
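
A minimal sketch of Method 1, assuming you have a CA bundle file that can validate the server's certificate. requests accepts such a path through its verify parameter. The helper name get_with_ca_bundle, the optional session argument, the timeout and the example path are all placeholders invented here.

```python
import requests


def get_with_ca_bundle(url, ca_bundle, headers=None, session=None):
    """GET `url`, verifying the server certificate against `ca_bundle`.

    `ca_bundle` is the path to a PEM file of trusted CA certificates.
    """
    if session is None:
        session = requests.Session()
    # verify=<path> keeps verification on but uses the given CA bundle;
    # timeout=10 is an arbitrary value chosen for this sketch.
    return session.get(url, headers=headers, verify=ca_bundle, timeout=10)
```

For example: get_with_ca_bundle('http://www.baidu.com/', '/path/to/ca-bundle.pem', headers=header), where the path is hypothetical and must point at a real bundle on your machine.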

Method 2:

Disable certificate verification. See the references for details.
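
A minimal sketch of Method 2. It passes verify=False as in the verification step and also silences the InsecureRequestWarning through urllib3.disable_warnings. The helper name get_unverified, the optional session argument and the timeout are assumptions made for this example. With verification off, the connection is no longer protected against a forged certificate, so this is probably only suitable for debugging.

```python
import requests
import urllib3


def get_unverified(url, headers=None, session=None):
    """GET `url` with certificate verification switched off (debugging only)."""
    # Suppress the InsecureRequestWarning that urllib3 would otherwise emit.
    # Note that this setting applies to the whole process.
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    if session is None:
        session = requests.Session()
    # verify=False skips certificate checks entirely;
    # timeout=10 is an arbitrary value chosen for this sketch.
    return session.get(url, headers=headers, verify=False, timeout=10)
```

For example: get_unverified('http://www.baidu.com/', headers=header).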

 

References

[User-Agent strings for various browsers] http://www.useragentstring.com/pages/useragentstring.php

[SSL certificate verification] http://docs.python-requests.org/zh_CN/latest/user/advanced.html#ssl

posted @ 2018-06-28 19:55  烨来风雨声