[Python] Parsing HTML pages with requests_html

1. Official site

https://pypi.org/project/requests-html/

 

2. GitHub

https://github.com/kennethreitz/requests-html

 

3. Installation

pip install requests-html

 

 

4. Using HTMLSession

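from requests_html import HTMLSession

# Browser-like User-Agent, sent instead of the default requests one
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'}
download_url = 'https://registry.npmmirror.com/binary.html?path=chromedriver/'

session = HTMLSession()

# Pass the headers explicitly; otherwise the dict above is never used
r = session.get(download_url, headers=headers)

# Execute the page's JavaScript in headless Chromium
# (the first call downloads Chromium, so it can take a while)
r.html.render()

# find() takes a CSS selector and returns a list of matching Element objects
print(r.html.find('html'))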
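Printing the `html` element only shows that rendering worked. As a rough sketch of a next step, the rendered page's links could be collected and filtered. The helper names `pick_version_links` and `fetch_links` are hypothetical, and the pattern assumes that version directories on the listing page are shown as link text such as `114.0.5735.90/`, which has not been checked against the live page.

```python
import re

# Assumption: version directories on the listing look like "114.0.5735.90/".
VERSION_DIR = re.compile(r'^\d+(\.\d+)+/?$')


def pick_version_links(links):
    """Keep (text, href) pairs whose text looks like a version directory."""
    return [(text.strip(), href) for text, href in links
            if VERSION_DIR.match(text.strip())]


def fetch_links(url, headers=None):
    """Render the page and return (text, href) for every <a> that has an href."""
    # Imported here so pick_version_links() stays usable without requests-html.
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(url, headers=headers)
    r.html.render()  # runs the page's JavaScript, as in the example above
    return [(a.text, a.attrs['href'])
            for a in r.html.find('a') if 'href' in a.attrs]
```

With this in place, `pick_version_links(fetch_links(download_url, headers))` should return only the entries that look like version directories, provided the assumption about the page layout holds.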

 

5. Using AsyncHTMLSession

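from requests_html import AsyncHTMLSession

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'}
download_url = 'https://registry.npmmirror.com/binary.html?path=chromedriver/'

asession = AsyncHTMLSession()


async def get_webdriver():
    # Pass the headers explicitly; otherwise the dict above is never used
    r = await asession.get(download_url, headers=headers)
    # arender() is the awaitable counterpart of render()
    await r.html.arender()
    return r


# run() takes coroutine functions (not coroutine objects)
# and returns a list with one result per function
results = asession.run(get_webdriver)
print(results)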
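The async session pays off mainly when several pages are fetched at once. A minimal sketch of that, assuming a list of URLs is available: `make_fetcher` and `fetch_all` are hypothetical helper names, not part of requests-html. Each coroutine returns its URL together with the response, because the list returned by `run()` may not follow the order of the arguments.

```python
def make_fetcher(session, url, headers=None):
    """Return a coroutine function that fetches and renders one URL."""
    async def fetch():
        r = await session.get(url, headers=headers)
        await r.html.arender()
        return url, r  # keep the URL so results can be matched up later
    return fetch


def fetch_all(urls, headers=None):
    """Fetch several pages concurrently and return a {url: response} dict."""
    urls = list(urls)
    if not urls:
        return {}  # nothing to schedule

    # Imported here so make_fetcher() works with any session-like object.
    from requests_html import AsyncHTMLSession

    asession = AsyncHTMLSession()
    # run() expects coroutine functions, one per page to fetch.
    pairs = asession.run(*(make_fetcher(asession, u, headers) for u in urls))
    return dict(pairs)
```

Calling `fetch_all([download_url], headers)` would then give a dictionary keyed by URL, so a page can be looked up directly regardless of which request finished first.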

 

 

 

Reference links:

https://www.cnblogs.com/angelyan/p/13913926.html

https://www.cnblogs.com/Sadusky/p/12887215.html

https://www.cnblogs.com/pythonywy/p/11694967.html

https://baijiahao.baidu.com/s?id=1701142223076604985&wfr=spider&for=pc

Posted by 代码诠释的世界