Running Chrome headless + Selenium + Python on Ubuntu 16.04 / CentOS 7
On Ubuntu 16.04:
0x01 Install Chrome
1 Add the download repository to the system sources list
sudo wget http://www.linuxidc.com/files/repo/google-chrome.list -P /etc/apt/sources.list.d/
2 Import Google's package signing key
wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
3 Update the package index
sudo apt-get update
4 Install Chrome
sudo apt-get install google-chrome-stable
5 Check the install path
cd /usr/bin/
xxaq@xxaq-System-Product-Name-Invalid-entry-length-16-Fixed-up-to-11:/usr/bin$ ll google*
lrwxrwxrwx 1 root root 31 2月 24 08:59 google-chrome -> /etc/alternatives/google-chrome*
lrwxrwxrwx 1 root root 32 2月 22 11:06 google-chrome-stable -> /opt/google/chrome/google-chrome*
6 Check the installed version
xxaq@xxaq-System-Product-Name-Invalid-entry-length-16-Fixed-up-to-11:/usr/bin$ google-chrome --version
Google Chrome 64.0.3282.186
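The version check above can also be done from Python, which is handy later when picking a matching chromedriver. The following is a minimal sketch, not part of the original setup: the function names are made up for illustration, and it assumes Python 3 (for shutil.which) and a binary named google-chrome on PATH.

```python
import re
import shutil
import subprocess


def parse_chrome_version(output):
    """Extract (major, minor, build, patch) from `google-chrome --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no version number found in: %r" % output)
    return tuple(int(part) for part in match.groups())


def installed_chrome_version(binary="google-chrome"):
    """Run the Chrome binary found on PATH; return its parsed version or None if missing."""
    path = shutil.which(binary)
    if path is None:
        return None
    output = subprocess.check_output([path, "--version"]).decode("utf-8", "replace")
    return parse_chrome_version(output)


if __name__ == "__main__":
    # Uses the exact string printed in step 6
    print(parse_chrome_version("Google Chrome 64.0.3282.186"))  # (64, 0, 3282, 186)
```

The first element of the tuple is the major version (64 here), which is the number the chromedriver compatibility notes refer to.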
7 google-chrome fails at startup
xxaq@xxaq-System-Product-Name-Invalid-entry-length-16-Fixed-up-to-11:~$ google-chrome --headless http://www.baidu.com
[0224/090953.953883:ERROR:instance.cc(49)] Unable to locate service manifest for metrics
[0224/090953.954581:ERROR:service_manager.cc(890)] Failed to resolve service name: metrics
[0224/090953.957121:FATAL:nss_util.cc(631)] NSS_VersionCheck("3.26") failed. NSS >= 3.26 is required. Please upgrade to the latest NSS, and if you still get this error, contact your distribution maintainer.
Aborted (core dumped)
[0100/000000.152280:ERROR:broker_posix.cc(43)] Invalid node channel message
Cause: the installed NSS library is older than the 3.26 that Chrome requires and needs to be upgraded.
Fix:
sudo apt install --reinstall libnss3
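If Chrome is launched from a script, the NSS failure can be recognized in its error output before deciding to reinstall libnss3. A rough sketch, assuming the message format stays as shown in the log above; the helper name and the regular expression are illustrative only.

```python
import re

# Matches the FATAL line printed by Chrome in the log above
NSS_ERROR = re.compile(r'NSS_VersionCheck\("([\d.]+)"\) failed')


def required_nss_version(chrome_stderr):
    """Return the NSS version Chrome asked for, or None if the log has no NSS failure."""
    match = NSS_ERROR.search(chrome_stderr)
    return match.group(1) if match else None


if __name__ == "__main__":
    line = ('[0224/090953.957121:FATAL:nss_util.cc(631)] '
            'NSS_VersionCheck("3.26") failed. NSS >= 3.26 is required.')
    print(required_nss_version(line))  # 3.26
```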
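0x02 Install chromedriver
1 Download:
Download the chromedriver release that matches your Chrome version.
Compatibility tables compiled by other people can be found online; you can also read the notes file under each version at http://chromedriver.storage.googleapis.com/index.html:
Chrome 64 corresponds to chromedriver 2.35.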
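The version lookup can be sketched as a small table plus a URL builder. This is only an illustration: the table contains just the one pairing stated above (extend it from the notes files), the function name is hypothetical, and the URL layout (base/version/archive) is an assumption about how the storage bucket is organized.

```python
# Only the pairing stated in the text; add more entries from the notes files as needed.
CHROME_TO_CHROMEDRIVER = {64: "2.35"}

# Assumed layout of the bucket linked above: <base>/<driver version>/<archive name>
BASE_URL = "http://chromedriver.storage.googleapis.com"


def chromedriver_url(chrome_major, archive="chromedriver_linux64.zip"):
    """Build the download URL for the chromedriver matching a Chrome major version."""
    if chrome_major not in CHROME_TO_CHROMEDRIVER:
        raise ValueError("no chromedriver version recorded for Chrome %d" % chrome_major)
    return "%s/%s/%s" % (BASE_URL, CHROME_TO_CHROMEDRIVER[chrome_major], archive)


if __name__ == "__main__":
    print(chromedriver_url(64))
```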
2 Setup:
Unzip the downloaded zip file directly
unzip chromedriver_linux64.zip
Copy it into /usr/bin/
sudo cp chromedriver /usr/bin/
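The unzip step can also be done from Python. One thing to watch: Python's zipfile module does not restore Unix permission bits, so the extracted binary has to be marked executable by hand. A minimal sketch, with a hypothetical function name and the destination directory left up to the caller:

```python
import os
import stat
import zipfile


def extract_chromedriver(zip_path, dest_dir, member="chromedriver"):
    """Extract `member` from the archive into dest_dir and mark it executable."""
    with zipfile.ZipFile(zip_path) as archive:
        extracted = archive.extract(member, dest_dir)
    # zipfile drops the executable bit, so add it back for user, group and others
    mode = os.stat(extracted).st_mode
    os.chmod(extracted, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return extracted
```

Extracting into /usr/bin/ this way would still need root, just like the sudo cp above; a directory in the user's home works without it as long as executable_path points there.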
0x03 Calling it from Python
1 Install selenium
sudo pip install selenium
2 Test demo
# encoding: utf-8
import os

from selenium import webdriver

# Run Chrome without a visible window
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-gpu')

# Path where chromedriver was copied in 0x02
chromedriver = "/usr/bin/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver

driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chromedriver)
# driver = webdriver.Chrome(executable_path=chromedriver)
driver.get("https://stackoverflow.com")
# print() with parentheses works on both Python 2 and Python 3
print(driver.page_source)
driver.save_screenshot('screen.png')
driver.quit()
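In the demo, driver.quit() is only reached if every call before it succeeds; if driver.get() raises, the headless Chrome process is left running. One possible way around this is a small context manager. This is a sketch with a made-up helper name, not something Selenium provides under that name:

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver via factory() and always call quit() on it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and on exceptions, so Chrome is always shut down
        driver.quit()


# Usage with the objects from the demo (requires selenium and chromedriver):
# with managed_driver(lambda: webdriver.Chrome(chrome_options=chrome_options,
#                                              executable_path=chromedriver)) as driver:
#     driver.get("https://stackoverflow.com")
#     driver.save_screenshot('screen.png')
```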
On CentOS 7:
Installing Chrome and chromedriver works the same way as above.
Note that when using it you need to add
chrome_options.add_argument("--no-sandbox")
otherwise the following error is raised
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
(Driver info: chromedriver=2.35.528139 (47ead77cb35ad2a9a83248b292151462a66cd881),platform=Linux 3.10.0-693.el7.x86_64 x86_64)
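Cause:
Chrome itself reports: Please start Google Chrome as a normal user. If you need to run Chrome as root for development purposes, rerun Chrome with the --no-sandbox flag.
Ubuntu did not show this error because the session there was not logged in as root.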
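Since the flag is only needed when running as root, one option is to add it conditionally rather than always. A minimal sketch, assuming a POSIX system (os.geteuid() is not available on Windows); the function name is hypothetical:

```python
import os


def chrome_arguments(running_as_root=None):
    """Return the Chrome flags for a headless run, adding --no-sandbox only for root."""
    if running_as_root is None:
        running_as_root = os.geteuid() == 0  # POSIX only
    arguments = ["--headless"]
    if running_as_root:
        arguments.append("--no-sandbox")
    return arguments


# Usage with Selenium:
# for argument in chrome_arguments():
#     chrome_options.add_argument(argument)
```

This keeps the sandbox enabled for normal users, as on the Ubuntu machine above, and disables it only where Chrome would otherwise refuse to start.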
Test demo:
# encoding: utf-8
import os

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-gpu')
# Required when Chrome is started as root
chrome_options.add_argument("--no-sandbox")

chromedriver = "/usr/bin/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver

driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chromedriver)
# driver = webdriver.Chrome(executable_path=chromedriver)
driver.get("https://stackoverflow.com")
# print() with parentheses works on both Python 2 and Python 3
print(driver.page_source)
driver.save_screenshot('screen.png')
driver.quit()