
Chrome-Headless for PHP

Project repository: https://github.com/chrome-php/chrome

<?php

$url = "https://g.cn";
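// Windows example (single quotes keep the backslashes literal):
// $bin = 'D:\bin\ChromePortable\chrome.exe';
$bin = 'chromium'; // Linux: binary name resolved via PATH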

# When running as root, Chrome needs --no-sandbox: either add it in
# chrome-php/chrome/src/Browser/BrowserProcess.php, or pass 'noSandbox' => true
# in the options array given to createBrowser().
$args = ['noSandbox' => function_exists('posix_getuid') && posix_getuid() === 0]; // true only for root (check with whoami)

require_once('chrome-php/autoload.php');
use HeadlessChromium\BrowserFactory;
$browserFactory = new BrowserFactory($bin);

# start headless Chrome
//$browser = $browserFactory->createBrowser(['keepAlive' => true, 'noSandbox' => $args['noSandbox']]);
$browser = $browserFactory->createBrowser($args);

try {
    # create a new page and navigate to a URL
    $page = $browser->createPage();
    $page->navigate($url)->waitForNavigation();

    # get page title
    $pageTitle = $page->evaluate('document.title')->getReturnValue();

    # get page content (the ./cache directory must already exist)
    $pageContent = $page->evaluate('document.documentElement.innerHTML')->getReturnValue();
    file_put_contents('./cache/gcn.html', $pageContent);
    // $html = $page->getHtml(); # alternative; can produce a more compact page
    // file_put_contents('./cache/gcn.html', $html);

    # screenshot - Say "Cheese"!😄
    // $page->screenshot()->saveToFile('./cache/bar.png');

    # pdf
    // $page->pdf(['printBackground' => false])->saveToFile('./cache/bar.pdf');
} finally {
    # bye
    $browser->close();
}
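
The root check in the script can be isolated into a small function, which makes it easier to reuse and to run on systems without the posix extension (Windows, for example). The following is only a sketch: `buildBrowserOptions()` is a hypothetical helper name, not part of chrome-php, and the only option it sets is the `noSandbox` key already mentioned above.

```php
<?php
// Hypothetical helper (not part of chrome-php): builds the options array
// intended for BrowserFactory::createBrowser().
function buildBrowserOptions(?int $uid): array
{
    return [
        // Disable the sandbox only when the effective user is root (uid 0).
        'noSandbox' => $uid === 0,
    ];
}

// posix_getuid() does not exist on Windows or when the posix extension
// is missing, so fall back to null ("unknown user") in that case.
$uid = function_exists('posix_getuid') ? posix_getuid() : null;
$options = buildBrowserOptions($uid);

// Intended use, assuming $browserFactory was created as in the script above:
// $browser = $browserFactory->createBrowser($options);
```

Passing the uid in as a parameter, rather than calling `posix_getuid()` inside the function, keeps the helper easy to check for both the root and non-root cases.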

 

Installing Chrome on Debian:

apt install -y chromium
chromium --no-sandbox --version    # or: google-chrome --no-sandbox --version
chromium --headless --disable-gpu --no-sandbox --dump-dom https://g.cn    # google-chrome accepts the same flags


Using google-chrome is not recommended; it is more troublesome to set up:
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
apt install ./google-chrome-stable_current_amd64.deb
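
Since either `chromium` or `google-chrome` may end up installed, the PHP script could pick whichever binary is actually present instead of hard-coding `$bin`. The sketch below is one possible approach, not something chrome-php provides: `findBrowserBinary()` is a hypothetical helper that scans the directories listed in `PATH`, and the candidate names are simply the two commands used above.

```php
<?php
// Hypothetical helper: returns the full path of the first candidate that
// exists as an executable file in one of the given directories, or null.
function findBrowserBinary(array $candidates, array $dirs): ?string
{
    foreach ($candidates as $name) {
        foreach ($dirs as $dir) {
            if ($dir === '') {
                continue; // skip empty PATH entries
            }
            $path = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
            if (is_file($path) && is_executable($path)) {
                return $path;
            }
        }
    }
    return null;
}

// Split PATH into directories; getenv() returns false when PATH is unset.
$dirs = explode(PATH_SEPARATOR, (string) getenv('PATH'));
$bin = findBrowserBinary(['chromium', 'google-chrome'], $dirs);

echo $bin ?? 'no browser binary found', PHP_EOL;
```

The resulting `$bin` can then be passed to `new BrowserFactory($bin)` as in the script above; when it is `null`, the script should stop with an error rather than try to launch a browser.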

 

Posted on 2023-04-24 03:52 by 伊索