[Python] Parsing HTML pages with requests_html
1. Official site
https://pypi.org/project/requests-html/
2. GitHub
https://github.com/kennethreitz/requests-html
3. Installation
pip install requests-html
4. Using HTMLSession
from requests_html import HTMLSession

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'
}
download_url = 'https://registry.npmmirror.com/binary.html?path=chromedriver/'

session = HTMLSession()
# Pass the headers explicitly; otherwise the dict defined above is never sent
r = session.get(download_url, headers=headers)
r.html.render()  # this call executes the JS in the page (the first call downloads Chromium)
print(r.html.find('html'))
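Once the page has been rendered, the links it contains are usually what you want next. The following is a minimal sketch, not part of the original post: the helper names filter_links and fetch_rendered_links are hypothetical, and it assumes that the links of interest contain a known keyword. It relies on the absolute_links property that requests_html exposes on r.html.

```python
def filter_links(links, keyword):
    """Return the links that contain `keyword`, sorted for stable output."""
    return sorted(link for link in links if keyword in link)


def fetch_rendered_links(url, headers=None):
    """Render `url` (JavaScript included) and return its absolute links as a set.

    Hypothetical helper. requests_html is imported here so that
    filter_links stays usable even when the library is not installed.
    """
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(url, headers=headers)
    r.html.render()  # executes the page's JavaScript before links are read
    return r.html.absolute_links
```

For example, filter_links(fetch_rendered_links(download_url, headers), 'chromedriver') would list only the links whose URL contains 'chromedriver'. Whether that keyword matches the links on the mirror page is an assumption you should check against the actual output.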
5. Using AsyncHTMLSession
from requests_html import AsyncHTMLSession

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'
}
download_url = 'https://registry.npmmirror.com/binary.html?path=chromedriver/'

asession = AsyncHTMLSession()

async def get_webdriver():
    # Pass the headers explicitly; otherwise the dict defined above is never sent
    r = await asession.get(download_url, headers=headers)
    await r.html.arender()  # async counterpart of render()
    return r

# run() takes coroutine functions (not coroutine objects) and returns a list of their results
results = asession.run(get_webdriver)
print(results)
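The async session pays off when several pages are fetched at once, because run() accepts more than one coroutine function. The sketch below is an illustration under assumptions, not code from the original post: fetch_title and fetch_all are hypothetical helper names. Each result carries its URL because run() collects results as tasks complete, so the order of the returned list is not guaranteed to match the input order.

```python
from functools import partial


async def fetch_title(asession, url):
    """Fetch and render `url`, returning (url, title text or None)."""
    r = await asession.get(url)
    await r.html.arender()  # async rendering, executes the page's JavaScript
    title = r.html.find('title', first=True)
    return url, (title.text if title is not None else None)


def fetch_all(urls):
    """Render all `urls` concurrently and return a list of (url, title) pairs.

    Hypothetical helper. requests_html is imported here so that
    fetch_title can be exercised with any compatible session object.
    """
    from requests_html import AsyncHTMLSession

    asession = AsyncHTMLSession()
    # run() calls each argument with no parameters, so bind them up front
    tasks = [partial(fetch_title, asession, url) for url in urls]
    return asession.run(*tasks)
```

Calling fetch_all with a list of URLs returns one (url, title) pair per page; converting the result with dict() gives a lookup by URL.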
References:
https://www.cnblogs.com/angelyan/p/13913926.html
https://www.cnblogs.com/Sadusky/p/12887215.html
https://www.cnblogs.com/pythonywy/p/11694967.html
https://baijiahao.baidu.com/s?id=1701142223076604985&wfr=spider&for=pc