pip3 install beautifulsoup4 fails with "There was a problem confirming the ssl certificate"
chenhuimingdeMacBook-Pro:groceryList Mch$ sudo pip3 install beautifulsoup4
The directory '/Users/Mch/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/Users/Mch/Library/Caches/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting beautifulsoup4
Could not fetch URL https://pypi.python.org/simple/beautifulsoup4/: There was a problem confirming the ssl certificate: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:590) - skipping
Could not find a version that satisfies the requirement beautifulsoup4 (from versions: )
No matching distribution found for beautifulsoup4
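The TLSV1_ALERT_PROTOCOL_VERSION alert generally means the server refused the TLS version the client offered. One way to see what the local interpreter supports is to ask Python's ssl module, as in the minimal sketch below. The helper name tls_report is made up for illustration, and the sketch assumes that the python3 you run it with is the same interpreter pip3 uses.

```python
import ssl
import sys


def tls_report():
    """Collect basic facts about this interpreter's TLS support."""
    return {
        "python": sys.version.split()[0],
        # The OpenSSL build this Python is linked against
        "openssl": ssl.OPENSSL_VERSION,
        # ssl.HAS_TLSv1_2 exists on Python 3.7+; on older versions fall
        # back to checking for the protocol constant
        "tls_1_2": getattr(ssl, "HAS_TLSv1_2", hasattr(ssl, "PROTOCOL_TLSv1_2")),
    }


if __name__ == "__main__":
    for key, value in tls_report().items():
        print("{}: {}".format(key, value))
```

If tls_1_2 is reported as False, or the OpenSSL version is very old, a newer Python build is a plausible fix.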
Solution:
* Reinstall Python? (a newer build should bring newer TLS support)
brew reinstall python
==> Reinstalling python
==> Installing dependencies for python: gdbm
==> Installing python dependency: gdbm
==> Downloading https://homebrew.bintray.com/bottles/gdbm-1.17.el_capitan.bottle.tar.gz
######################################################################## 100.0%
==> Pouring gdbm-1.17.el_capitan.bottle.tar.gz
🍺 /usr/local/Cellar/gdbm/1.17: 20 files, 586.2KB
==> Installing python
==> Downloading https://homebrew.bintray.com/bottles/python-3.7.0.el_capitan.bottle.1.tar.gz
######################################################################## 100.0%
==> Pouring python-3.7.0.el_capitan.bottle.1.tar.gz
==> /usr/local/Cellar/python/3.7.0/bin/python3 -s setup.py --no-user-cfg install --force --verbose --install-scripts=/usr/local/Cellar/p
==> /usr/local/Cellar/python/3.7.0/bin/python3 -s setup.py --no-user-cfg install --force --verbose --install-scripts=/usr/local/Cellar/p
==> /usr/local/Cellar/python/3.7.0/bin/python3 -s setup.py --no-user-cfg install --force --verbose --install-scripts=/usr/local/Cellar/p
==> Caveats
Python has been installed as
  /usr/local/bin/python3
Unversioned symlinks `python`, `python-config`, `pip` etc. pointing to
`python3`, `python3-config`, `pip3` etc., respectively, have been installed into
  /usr/local/opt/python/libexec/bin
If you need Homebrew's Python 2.7 run
  brew install python@2
Pip, setuptools, and wheel have been installed. To update them run
  pip3 install --upgrade pip setuptools wheel
You can install Python packages with
  pip3 install <package>
They will install into the site-package directory
  /usr/local/lib/python3.7/site-packages
See: https://docs.brew.sh/Homebrew-and-Python
==> Summary
🍺 /usr/local/Cellar/python/3.7.0: 4,787 files, 102MB
==> Caveats
==> python
Python has been installed as
  /usr/local/bin/python3
Unversioned symlinks `python`, `python-config`, `pip` etc. pointing to
`python3`, `python3-config`, `pip3` etc., respectively, have been installed into
  /usr/local/opt/python/libexec/bin
If you need Homebrew's Python 2.7 run
  brew install python@2
Pip, setuptools, and wheel have been installed. To update them run
  pip3 install --upgrade pip setuptools wheel
You can install Python packages with
  pip3 install <package>
They will install into the site-package directory
  /usr/local/lib/python3.7/site-packages
See: https://docs.brew.sh/Homebrew-and-Python
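* Install pip3 (this fails: pip3 is not a Homebrew formula, and the caveats above show that pip is installed together with python)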
brew install pip3
Error: No available formula with the name "pip3"
==> Searching for a previously deleted formula (in the last month)...
Error: No previously deleted formula found.
==> Searching for similarly named formulae...
Error: No similarly named formulae found.
==> Searching taps...
==> Searching taps on GitHub...
Warning: Error searching on GitHub: GitHub
The GitHub credentials in the macOS keychain may be invalid.
Clear them with:
printf "protocol=https\nhost=github.com\n" | git credential-osxkeychain erase
Or create a personal access token:
https://github.com/settings/tokens/new?scopes=gist,public_repo&description=Homebrew
and then set the token as: export HOMEBREW_GITHUB_API_TOKEN="your_new_token"
Error: No formulae found in taps.
* Clear the GitHub credentials in the macOS keychain
printf "protocol=https\nhost=github.com\n" | git credential-osxkeychain erase
* Create a personal access token
https://github.com/settings/tokens/new?scopes=gist,public_repo&description=Homebrew
* Click the "Generate token" button
https://github.com/settings/tokens
Copy the 40-character token that is shown.
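* Add the token to ~/.bash_profile (edit the file with emacs, or append the line with echo), then reload the file. In the second command, !$ expands to the last argument of the previous command, here ~/.bash_profile.
echo 'export HOMEBREW_GITHUB_API_TOKEN="your_token"' >> ~/.bash_profile
. !$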
* Try installing beautifulsoup4 again
pip3 install beautifulsoup4
Collecting beautifulsoup4
Downloading https://files.pythonhosted.org/packages/fe/62/720094d06cb5a92cd4b3aa3a7c678c0bb157526a95c4025d15316d594c4b/beautifulsoup4-4.6.1-py3-none-any.whl (89kB)
100% |████████████████████████████████| 92kB 180kB/s
Installing collected packages: beautifulsoup4
Successfully installed beautifulsoup4-4.6.1
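* Upgrade pip
sudo curl https://bootstrap.pypa.io/get-pip.py | python3
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/bin/pip'
Consider using the `--user` option or check the permissions.
In this pipeline sudo applies only to curl; python3 still runs as the normal user, hence the permission error. Taking ownership of /usr/local/bin and running the command again works:
sudo chown -hR `whoami`:staff /usr/local/bin/
Output: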
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1604k  100 1604k    0     0   443k      0  0:00:03  0:00:03 --:--:--  445k
Collecting pip
  Using cached https://files.pythonhosted.org/packages/5f/25/e52d3f31441505a5f3af41213346e5b6c221c9e086a166f3703d2ddaf940/pip-18.0-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.0
    Uninstalling pip-18.0:
      Successfully uninstalled pip-18.0
Successfully installed pip-18.0
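The --user hint in the error message points to an alternative that needs neither sudo nor a chown: ask the interpreter to upgrade its own pip into the per-user site directory. The sketch below only builds that command. pip_upgrade_command is an illustrative name, not part of pip, and whether a per-user install suits your setup is an assumption to check.

```python
import sys


def pip_upgrade_command(user_install=True):
    """Build the command that upgrades pip for the running interpreter."""
    # "python -m pip" makes sure pip runs under this exact interpreter
    cmd = [sys.executable, "-m", "pip", "install", "--upgrade"]
    if user_install:
        # Install into the per-user site directory, so no sudo is needed
        cmd.append("--user")
    cmd.append("pip")
    return cmd


if __name__ == "__main__":
    print(" ".join(pip_upgrade_command()))
    # To actually run it:
    #   import subprocess
    #   subprocess.run(pip_upgrade_command(), check=True)
```

Inside a virtual environment pip typically refuses --user, so there the command would be built with pip_upgrade_command(user_install=False).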
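* Importing bs4 with "from bs4 import BeautifulSoup" may still fail with "ModuleNotFoundError: No module named 'bs4'". In that case, install the bs4 module from within PyCharm.
Press Cmd + , (on a Windows keyboard, the Windows logo key together with the comma key), or choose PyCharm => Preferences
On Windows: File => Settings
* Click the + (install) button at the bottom left of the right-hand pane
Search for "bs", then click "Install Package"
This line no longer raises an error: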
from bs4 import BeautifulSoup
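A ModuleNotFoundError after a successful pip3 install often means the code is running under a different interpreter than the one pip installed into; PyCharm projects commonly use their own virtual environment. The sketch below reports which interpreter is running and whether it can see a given module. The helper name is_installed is made up for illustration.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the running interpreter can locate module_name."""
    # find_spec returns None when a top-level module cannot be found
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("bs4 visible:", is_installed("bs4"))
```

Running this both from the terminal and from inside PyCharm shows whether the two use the same interpreter.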
* Create a Python file
#!/usr/local/bin/python3
# -*- coding: UTF-8 -*-

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())
print(bsObj.h1)
Run:
/Users/Mch/PycharmProjects/BeautifulSoup/venv/bin/python /Users/Mch/PycharmProjects/BeautifulSoup/index.py
/Users/Mch/PycharmProjects/BeautifulSoup/index.py:8: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 8 of the file /Users/Mch/PycharmProjects/BeautifulSoup/index.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.

  bsObj = BeautifulSoup(html.read())
<h1>An Interesting Title</h1>

Process finished with exit code 0
* The output contains a warning (not an error) pointing at line 8
Following the hint:
To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.
The constructor is defined in /Users/Mch/PycharmProjects/BeautifulSoup/venv/lib/python3.7/site-packages/bs4/__init__.py:
def __init__(self, markup="", features=None, builder=None, parse_only=None, from_encoding=None, exclude_encodings=None, **kwargs):
features is the second positional parameter, so pass 'html.parser' as the second argument on line 8:
bsObj = BeautifulSoup(html.read(), 'html.parser')
Run again:
/Users/Mch/PycharmProjects/BeautifulSoup/venv/bin/python /Users/Mch/PycharmProjects/BeautifulSoup/index.py
<h1>An Interesting Title</h1>
Process finished with exit code 0
The warning is gone.
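The name html.parser refers to the HTML parser in Python's standard library, which BeautifulSoup drives when asked for it. For a sense of what that parser provides by itself, here is a small sketch that pulls the first h1 out of a page using only the standard library. The class FirstH1, the function first_h1 and the sample markup are made up for illustration; BeautifulSoup's own tree building is far more capable.

```python
from html.parser import HTMLParser


class FirstH1(HTMLParser):
    """Collect the text of the first <h1> element."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False   # currently inside the first <h1>?
        self.done = False    # has the first <h1> been closed?
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.done:
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            self.done = True

    def handle_data(self, data):
        if self.in_h1:
            self.text += data


def first_h1(markup):
    """Return the text of the first <h1> in markup, or None."""
    parser = FirstH1()
    parser.feed(markup)
    parser.close()
    return parser.text if parser.done else None


if __name__ == "__main__":
    sample = "<html><body><h1>An Interesting Title</h1><div>text</div></body></html>"
    print(first_h1(sample))
```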
* try ... except ... else
#!/usr/local/bin/python3
# -*- coding: UTF-8 -*-

from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

# import socket
# socket.setdefaulttimeout(3)

try:
    # html = urlopen("http://pythonscraping.com/pages/page1.html", None, 3)  # set the timeout to 3 s
    # A host that cannot be resolved raises URLError rather than HTTPError, so catch both
    html = urlopen("http://nosuchurl", None, 3)
except (HTTPError, URLError) as e:
    print(e)
else:
    # Runs only when urlopen succeeded
    bsObj = BeautifulSoup(html.read(), 'html.parser')
    print(bsObj.body.div)
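Which exception urlopen raises depends on where the request fails: an HTTP error status such as 404 or 500 surfaces as HTTPError, while a failure to reach the server at all, such as a host name that cannot be resolved, surfaces as URLError. HTTPError is a subclass of URLError, so it has to be caught first when the two are handled separately. The sketch below illustrates this without touching the network. fetch, its opener parameter and the two fake openers are hypothetical names introduced so that a stand-in for urlopen can be passed in.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch(url, opener=urlopen, timeout=3):
    """Return (body, error_message); exactly one of the two is None."""
    try:
        response = opener(url, None, timeout)
    except HTTPError as e:
        # The server answered, but with an error status
        return None, "HTTP error {}".format(e.code)
    except URLError as e:
        # The server could not be reached at all
        return None, "unreachable: {}".format(e.reason)
    else:
        return response.read(), None


def fake_unreachable(url, data, timeout):
    """Stand-in for urlopen that simulates an unresolvable host."""
    raise URLError("no such host")


def fake_not_found(url, data, timeout):
    """Stand-in for urlopen that simulates a 404 response."""
    raise HTTPError(url, 404, "Not Found", {}, None)


if __name__ == "__main__":
    print(issubclass(HTTPError, URLError))
    print(fetch("http://nosuchurl", opener=fake_unreachable))
    print(fetch("http://example.invalid/missing", opener=fake_not_found))
```

Swapping the order of the two except clauses would make the HTTPError branch unreachable, because the URLError clause would match first.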
* Connecting reliably
#!/usr/local/bin/python3
# -*- coding: UTF-8 -*-

from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup


def getTitle(url):
    # Network problems: HTTP error status or unreachable server
    try:
        html = urlopen(url, None, 3)
    except (HTTPError, URLError):
        return None
    # Missing tags: bsObj.body is None when the page has no <body>
    try:
        bsObj = BeautifulSoup(html.read(), "html.parser")
        title = bsObj.body.h1
    except AttributeError:
        return None
    return title


title = getTitle("http://www.pythonscraping.com/pages/page1.html")
if title is None:
    print("Title could not be found")
else:
    print(title)
Run:
python3 ./index.py
<h1>An Interesting Title</h1>
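The AttributeError branch in getTitle covers a missing intermediate tag: BeautifulSoup returns None for a tag it cannot find, and asking None for a further attribute raises AttributeError. The same pattern can be shown with plain Python objects. safe_chain and the SimpleNamespace stand-ins below are illustrative only and do not use BeautifulSoup.

```python
from types import SimpleNamespace


def safe_chain(obj, *names):
    """Follow obj.a.b.c..., returning None if any link is missing."""
    try:
        for name in names:
            obj = getattr(obj, name)
    except AttributeError:
        # Raised, for example, when an earlier link evaluated to None
        return None
    return obj


# Stand-ins for a parsed page with and without a <body>
page = SimpleNamespace(body=SimpleNamespace(h1="<h1>An Interesting Title</h1>"))
page_without_body = SimpleNamespace(body=None)

if __name__ == "__main__":
    print(safe_chain(page, "body", "h1"))
    print(safe_chain(page_without_body, "body", "h1"))
```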
* Python library downloads
https://www.lfd.uci.edu/~gohlke/pythonlibs/