Python Playwright Study Notes (Part 2)

1. Emulating a mobile device
playwright.devices provides preset device descriptors that can be used to configure an emulated device.

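import asyncio
from playwright.async_api import async_playwright

async def run(playwright):
    # Look up the preset descriptor (viewport, user agent, etc.) for iPhone 12
    iphone_12 = playwright.devices['iPhone 12']
    browser = await playwright.webkit.launch(headless=False)
    # Apply the descriptor to a new browser context
    context = await browser.new_context(
        **iphone_12,
    )
    # Pages opened from this context are emulated as an iPhone 12
    page = await context.new_page()
    await page.goto('https://example.com')
    await browser.close()

async def main():
    async with async_playwright() as playwright:
        await run(playwright)

asyncio.run(main())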
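
Because each entry in playwright.devices is an ordinary dict, it can be combined with extra context options before being unpacked into new_context. The following is a minimal sketch of that idea. The helper name context_options and the sample descriptor values are made up for illustration; a real descriptor would come from playwright.devices['iPhone 12'].

```python
def context_options(device: dict, **overrides) -> dict:
    """Return a copy of a device descriptor with extra options applied on top."""
    options = dict(device)      # copy so the shared descriptor is not mutated
    options.update(overrides)   # explicit overrides win over the preset values
    return options


# Stand-in descriptor with made-up values, only to show the merge.
sample_device = {
    "user_agent": "ExampleAgent/1.0",
    "viewport": {"width": 390, "height": 664},
    "is_mobile": True,
}

options = context_options(sample_device, locale="zh-CN")
print(options["locale"], options["viewport"]["width"])
```

With a real browser, the merged dict would be passed as browser.new_context(**context_options(iphone_12, locale="zh-CN")), where locale is an existing new_context option.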

2. Listening for responses

page.on registers listeners for page events, such as the page closing, requests, and responses.

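from playwright.sync_api import sync_playwright

def on_response(response):
    # Only handle successful responses from the API path of interest
    if '/api/xx/' in response.url and response.status == 200:
        print(response.json())

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    # Register the listener before navigating so no response is missed
    page.on('response', on_response)
    page.goto('https://xx.com/')
    # Wait until the network is idle so pending API calls can finish
    page.wait_for_load_state('networkidle')
    browser.close()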
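
The handler above assumes that every matching response carries a JSON body; response.json() raises an error otherwise. A hedged sketch of a more defensive handler follows. The helper names is_target and parse_json are hypothetical, and '/api/xx/' is the same placeholder path as in the example above. The body is read with response.text() so it can be parsed by hand.

```python
import json


def is_target(url: str, status: int, fragment: str = "/api/xx/") -> bool:
    """True for successful responses whose URL contains the fragment."""
    return fragment in url and status == 200


def parse_json(body: str):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


def on_response(response):
    # `response` is the object Playwright passes to a 'response' listener.
    if is_target(response.url, response.status):
        data = parse_json(response.text())
        if data is not None:
            print(data)
```

It is registered the same way as before: page.on('response', on_response).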

3. Reading and writing cookies

import json
import os

from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)

    # Load previously saved cookies; on the first run cookie.json does not exist yet
    storage_state = None
    if os.path.exists("cookie.json"):
        with open("cookie.json") as f:
            storage_state = json.load(f)
    context = browser.new_context(storage_state=storage_state)

    # Open new page
    page = context.new_page()

    # Go to the login page
    page.goto("http://www.xx.com/login")

    # Click and fill input[name="email"] (replace "account" with a real account)
    page.click('input[name="email"]')
    page.fill('input[name="email"]', "account")

    # Click and fill input[name="password"] (replace "password" with a real password)
    page.click('input[name="password"]')
    page.fill('input[name="password"]', "password")

    # Click button:has-text("Login")
    page.click('button:has-text("Login")')
    # assert page.url == "http://www.xx.com/"

    # Save the cookies obtained after logging in
    storage = context.storage_state()
    with open("cookie.json", "w") as f:
        json.dump(storage, f)

    page.close()
    context.close()
    browser.close()


with sync_playwright() as playwright:
    run(playwright)
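
The saved cookie.json holds the storage state as a dict with a "cookies" list (each entry has fields such as name, value, and domain) and an "origins" list. As a small sketch of reading that data back, the hypothetical helper cookies_for below collects the cookies for one domain into a name-to-value mapping. The sample data is invented for illustration.

```python
def cookies_for(storage: dict, domain: str) -> dict:
    """Map cookie names to values for cookies set on `domain` or its subdomains."""
    result = {}
    for cookie in storage.get("cookies", []):
        # Cookie domains may start with a dot, e.g. ".xx.com"
        cookie_domain = cookie.get("domain", "").lstrip(".")
        if cookie_domain == domain or cookie_domain.endswith("." + domain):
            result[cookie["name"]] = cookie["value"]
    return result


# Invented sample in the same shape as a saved storage state.
sample_storage = {
    "cookies": [
        {"name": "session", "value": "abc", "domain": ".xx.com", "path": "/"},
        {"name": "other", "value": "1", "domain": "example.org", "path": "/"},
    ],
    "origins": [],
}

result = cookies_for(sample_storage, "xx.com")
print(result)
```

In practice, storage would be the dict loaded from cookie.json or the return value of context.storage_state().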

4. Getting a page's HTML

In Playwright, the page.content() method returns the HTML of the current page. The following simple example shows how to fetch a page's HTML with Playwright:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto('https://example.com')
        # content() returns the full HTML document as a string
        html = await page.content()
        print(html)
        await browser.close()

asyncio.run(main())

In the example above, we first create a Playwright instance and launch a Chromium browser. Next, we open a new page and navigate to https://example.com. Finally, we call page.content() to get the current page's HTML and print it.

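Note that page.content() returns a string containing the entire HTML document. If you need the HTML of one specific element, locate the element with page.query_selector() and pass it to page.evaluate(), which runs JavaScript in the page and can return the element's outerHTML. For example, the following code gets the HTML of a page's heading (h1) element: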

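import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto('https://example.com')
        # query_selector returns None when nothing matches
        title_element = await page.query_selector('h1')
        if title_element is not None:
            # Run JavaScript in the page, passing the element handle as the argument
            title_html = await page.evaluate('el => el.outerHTML', title_element)
            print(title_html)
        await browser.close()

asyncio.run(main())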
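
Since page.content() returns a plain string, an element can also be pulled out afterwards without another call into the browser. The sketch below uses the standard library's html.parser to collect the text of the first h1. The class name FirstH1, the function first_h1_text, and the sample HTML are illustrative assumptions; pages with complex or malformed markup may need a dedicated HTML library instead.

```python
from html.parser import HTMLParser


class FirstH1(HTMLParser):
    """Collect the text inside the first <h1> element."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False   # currently inside the first <h1>
        self.done = False    # the first <h1> has already been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.done:
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            self.done = True

    def handle_data(self, data):
        if self.in_h1:
            self.parts.append(data)


def first_h1_text(html: str) -> str:
    """Return the stripped text of the first <h1>, or '' if there is none."""
    parser = FirstH1()
    parser.feed(html)
    return "".join(parser.parts).strip()


# Sample string standing in for the value returned by page.content().
sample_html = "<html><body><h1>Example Domain</h1><p>text</p><h1>Second</h1></body></html>"
print(first_h1_text(sample_html))
```

This trades precision for simplicity: it works on the string snapshot, so it does not see changes made to the page after page.content() was called.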
Posted by 德尔菲殿堂