Using Puppeteer

1. Install

# --ignore-scripts skips the Chromium download during installation
$ npm i --save puppeteer --ignore-scripts
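
Because the install above skips the Chromium download, a launch will usually need to point at a browser that is already on the machine. The following is a minimal sketch under that assumption: `executablePath` is a real `puppeteer.launch` option, while the helper name `buildLaunchOptions` and the example path `/usr/bin/google-chrome` are hypothetical and will differ per system.

```javascript
// Hypothetical helper: merges an explicit browser path into the launch options.
// The helper name and the example path below are assumptions for illustration.
function buildLaunchOptions(executablePath, extra = {}) {
  if (!executablePath) {
    throw new Error('executablePath is required when Chromium was not downloaded');
  }
  return { ...extra, executablePath };
}

const options = buildLaunchOptions('/usr/bin/google-chrome', { headless: true });
console.log(options);

// Usage (needs puppeteer installed and a browser at that path):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch(options);
```

Failing early on a missing path keeps the error close to its cause instead of surfacing later as a launch failure.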

2. API

  • See the official documentation for the full list of options.

2.1 New browser

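  • Create a browser object
const browser = await puppeteer.launch({
    slowMo: 200,             // Slow each Puppeteer operation down by 200 ms
    timeout: 15000,          // Max time (ms) to wait for the browser to start
    ignoreHTTPSErrors: true, // Ignore HTTPS errors during navigation
    headless: false,         // Show the browser window
    devtools: true,          // Open developer tools
    defaultViewport: {
        width: 1280,
        height: 1000,
    },
});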

2.2 Close browser

  • Close browser
await browser.close();

2.3 New page

  • Create a page object
const page = await browser.newPage();

2.4 Close page

  • Close page
await page.close();
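
The four calls so far (launch, new page, close page, close browser) are easiest to keep paired with try/finally, so that nothing is left running when a step throws. This is only a sketch: `withPage` is a hypothetical helper, and it takes the puppeteer module as a parameter so the same code can be exercised with a stand-in object.

```javascript
// Sketch: run a task with a fresh page and always release resources.
// `withPage` is a hypothetical helper; `puppeteerLike` is any object whose
// launch() resolves to a browser exposing newPage() and close().
async function withPage(puppeteerLike, launchOptions, task) {
  const browser = await puppeteerLike.launch(launchOptions);
  try {
    const page = await browser.newPage();
    try {
      return await task(page);
    } finally {
      await page.close(); // runs even if the task throws
    }
  } finally {
    await browser.close(); // runs even if newPage() or the task throws
  }
}

// Usage (needs puppeteer installed):
// const puppeteer = require('puppeteer');
// const title = await withPage(puppeteer, { headless: true }, async page => {
//   await page.goto('https://example.com');
//   return page.title();
// });
```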
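
2.5 Set page cookie

  • Cookie format
     ...cookies <...Object>
             name <string> required
             value <string> required
             url <string>
             domain <string>
             path <string>
             expires <number> Unix time in seconds.
             httpOnly <boolean>
             secure <boolean>
             sameSite <"Strict"|"Lax">

  • Set page cookie
const fs = require('fs');

// cookieFilePath points to a JSON file containing an array of cookie objects
const saved = JSON.parse(fs.readFileSync(cookieFilePath, 'utf8'));
if (Array.isArray(saved) && saved.length > 0) await page.setCookie(...saved);

2.6 Get page cookies

  • Get the cookies for the current page URL
const cookies = await page.cookies();

2.7 Delete page cookies

  • Delete cookies (deleteCookie expects the cookie objects to remove)
await page.deleteCookie(...cookies);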
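
The cookie file read above has to come from somewhere; one plausible way is to dump the result of `page.cookies()` after logging in and reload it on the next run. The sketch below assumes that workflow. `saveCookies`, `loadCookies`, and the temp-file name are hypothetical, and dropping cookies whose `expires` is already in the past is a design choice, not something Puppeteer requires.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write an array of cookie objects to a JSON file.
function saveCookies(filePath, cookies) {
  fs.writeFileSync(filePath, JSON.stringify(cookies, null, 2), 'utf8');
}

// Hypothetical helper: read cookies back, skipping ones that have expired.
function loadCookies(filePath, nowSeconds = Date.now() / 1000) {
  if (!fs.existsSync(filePath)) return [];
  const cookies = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  // Keep session cookies (no expires, or a non-positive value) and unexpired ones.
  return cookies.filter(c => !(c.expires > 0) || c.expires > nowSeconds);
}

const file = path.join(os.tmpdir(), 'cookies-example.json');
saveCookies(file, [
  { name: 'session', value: 'abc', domain: 'example.com' },
  { name: 'old', value: 'x', domain: 'example.com', expires: 1 },
]);
const restored = loadCookies(file);
console.log(restored.map(c => c.name));

// With a real page:
//   saveCookies(file, await page.cookies());
//   await page.setCookie(...loadCookies(file));
```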

2.8 Open url

  await page.goto('https://www.facebook.com', {
      timeout: 50000,
      waitUntil: ['networkidle0'] // Navigation is done once there have been no network connections for at least 500 ms
  })
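
On pages that keep connections open, waiting for `networkidle0` can run into the timeout, in which case `page.goto` rejects. A small retry wrapper is one way to cope with that. The sketch below is an assumption about how one might do it; `gotoWithRetry` and its `attempts` parameter are hypothetical and not part of Puppeteer.

```javascript
// Hypothetical helper: retry page.goto a few times before giving up.
async function gotoWithRetry(page, url, options = {}, attempts = 3) {
  if (attempts < 1) throw new RangeError('attempts must be at least 1');
  let lastError;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await page.goto(url, options);
    } catch (err) {
      lastError = err; // remember the failure and try again
      console.warn(`goto attempt ${i}/${attempts} failed: ${err.message}`);
    }
  }
  throw lastError; // every attempt failed
}

// Usage with a real page:
// await gotoWithRetry(page, 'https://www.facebook.com', {
//   timeout: 50000,
//   waitUntil: ['networkidle0'],
// });
```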

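2.9 Query the DOM

// Wait for the element to appear in the DOM
await page.waitForSelector('li > div > div[aria-label]', { timeout: 20000 });

// Query a single element by selector (resolves to null if nothing matches)
const btn = await page.$('span div[aria-label]:nth-child(1)');

// Click the element
await btn.click();

// Query all elements matching a selector
const doms = await page.$$('li > div > div[aria-label]');

// Read the text content of the first div inside btn
const val = await btn.$eval('div', el => el.textContent);

// Pause for 1000 ms. waitForTimeout is deprecated in newer Puppeteer releases;
// there, `await new Promise(resolve => setTimeout(resolve, 1000))` does the same job.
await page.waitForTimeout(1000);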
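
When the goal is to read data from every match rather than to click one element, `page.$$eval` can do the query and the extraction in one round trip. The sketch below assumes the interesting value is the `aria-label` attribute used in the selectors above; `collectAriaLabels` is a hypothetical helper name.

```javascript
// Hypothetical helper built on page.$$eval, which runs the callback inside
// the browser with an array of all elements matching the selector.
async function collectAriaLabels(page, selector) {
  // The callback is serialized and sent to the browser, so it must not
  // reference variables from the surrounding Node.js scope.
  return page.$$eval(selector, els =>
    els.map(el => el.getAttribute('aria-label')).filter(Boolean)
  );
}

// Usage with a real page:
// const labels = await collectAriaLabels(page, 'li > div > div[aria-label]');
// console.log(labels);
```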

...

3. Note

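  • The node queries above use CSS selectors, the same ones accepted by document.querySelector and document.querySelectorAll in the browser console; see the documentation for other selector types.
  • While using the API I found that some methods are buggy and do not return the expected values, and many of the related GitHub issues remain unresolved.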

4. Appendix

posted @ 2020-11-25 14:48  落叶&不随风