Splash
Official documentation:
https://splash.readthedocs.io/en/stable/index.html
Common endpoints (API)
1. render.html
Format:
http://10.63.32.49:8050/render.html?url=https://www.baidu.com&wait=3.0&timeout=4
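timeout: rendering timeout in seconds; the default is 30.
wait: time in seconds to wait after the page has loaded so that it can finish updating (AJAX requests, JS loading, setInterval/setTimeout callbacks, etc.).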
import requests

# Pass Splash arguments via params so that requests URL-encodes the target URL
r = requests.get(
    url='http://10.63.32.49:8050/render.html',
    params={'url': 'https://www.baidu.com'},
)
print(r.text)
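The wait and timeout arguments described above are ordinary query parameters. The following is a minimal sketch of building the request URL by hand; render_html_url is a hypothetical helper name, not part of Splash, and the host is the one used throughout these notes.

```python
from urllib.parse import urlencode

SPLASH = 'http://10.63.32.49:8050'  # Splash address used in these notes


def render_html_url(target, wait=0.0, timeout=30.0):
    """Build a render.html URL (hypothetical helper, not a Splash API).

    urlencode percent-encodes `target`, so any query string it carries
    cannot be confused with Splash's own arguments.
    """
    query = urlencode({'url': target, 'wait': wait, 'timeout': timeout})
    return '{}/render.html?{}'.format(SPLASH, query)
```

The result can be passed straight to requests.get(), for example requests.get(render_html_url('https://www.baidu.com', wait=3.0, timeout=4)).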
2. render.png
http://10.63.32.49:8050/render.png?url=https://www.sina.com.cn/&wait=5&width=320&height=240
Image format: .png
width and height determine the size of the image.
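import requests

# The target URL has its own query string (&enc=..., &pvid=...), so pass it
# via params; requests percent-encodes it and Splash receives it intact
r = requests.get(
    url='http://10.63.32.49:8050/render.png',
    params={'url': 'https://search.jd.com/Search?keyword=iphone&enc=utf-8&pvid=5f6243644eeb4510b26cffce889d77a8'},
)
# print(r.content)
with open(file='jd.png', mode='wb') as f:
    f.write(r.content)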
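To check that the response really is a PNG of the expected size, the dimensions can be read from the image header. This is a small sketch based on the PNG file layout (8-byte signature followed by the IHDR chunk); png_size is a hypothetical helper, not something Splash provides.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, b'IHDR',
    # then width and height as big-endian 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b'IHDR':
        raise ValueError('not a PNG image')
    return struct.unpack('>II', data[16:24])
```

With a response r from render.png, png_size(r.content) gives the actual width and height, which can be compared with the values that were requested.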
3. render.jpeg
http://10.63.32.49:8050/render.jpeg?url=https://www.sina.com.cn/&wait=5&quality=30
Image format: JPEG
quality: JPEG quality, an integer from 0 to 100 (default 75); values above 95 should be avoided.
width and height are also supported.
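The quality rule above can be enforced before the request is sent. The following is only a sketch: render_jpeg_url is a hypothetical helper, and capping the value at 95 is a choice made here to follow the note above, not behaviour of Splash itself.

```python
from urllib.parse import urlencode

SPLASH = 'http://10.63.32.49:8050'  # Splash address used in these notes


def render_jpeg_url(target, quality=75, wait=0.0):
    """Build a render.jpeg URL with a validated quality (hypothetical helper)."""
    if not isinstance(quality, int) or not 0 <= quality <= 100:
        raise ValueError('quality must be an integer from 0 to 100')
    quality = min(quality, 95)  # local choice: never ask for more than 95
    query = urlencode({'url': target, 'wait': wait, 'quality': quality})
    return '{}/render.jpeg?{}'.format(SPLASH, query)
```

For example, render_jpeg_url('https://www.sina.com.cn/', quality=30, wait=5) produces a URL equivalent to the one shown above, with the target URL percent-encoded.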
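4. render.har
http://10.63.32.49:8050/render.har?url=https%3A%2F%2Fsearch.jd.com%2FSearch%3Fkeyword%3Diphone%26enc%3Dutf-8%26pvid%3D5f6243644eeb4510b26cffce889d77a8
The target URL is percent-encoded here because it contains its own & separators; left unencoded, enc and pvid would be read as arguments to Splash rather than as part of the target URL.
Returns the HAR data recorded while the page loads. The result is JSON.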
import json
import requests

# params lets requests percent-encode the target URL and its query string
r = requests.get(
    url='http://10.63.32.49:8050/render.har',
    params={'url': 'https://search.jd.com/Search?keyword=iphone&enc=utf-8&pvid=5f6243644eeb4510b26cffce889d77a8'},
)
# print(r.json())
with open(file='a.json', mode='w') as f:
    f.write(json.dumps(r.json()))
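Once the HAR data is available as a dict, it can be summarised. The sketch below assumes the standard HAR layout (log.entries, each entry with a request and a response); summarize_har is a hypothetical helper name.

```python
from collections import Counter


def summarize_har(har):
    """Return (status code counts, URLs of failed requests) for a HAR dict."""
    entries = har['log']['entries']
    status_counts = Counter(e['response']['status'] for e in entries)
    # Treat any 4xx/5xx response as a failed request
    failed = [e['request']['url'] for e in entries
              if e['response']['status'] >= 400]
    return status_counts, failed
```

With the response from the example above, summarize_har(r.json()) shows at a glance how many requests the page made and which of them failed.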
5. render.json
http://10.63.32.49:8050/render.json?url=https%3A%2F%2Fsearch.jd.com%2FSearch%3Fkeyword%3Diphone%26enc%3Dutf-8%26pvid%3D5f6243644eeb4510b26cffce889d77a8
Combines the functionality of the other render endpoints and returns the result as JSON.
import json
import requests

# params lets requests percent-encode the target URL and its query string
r = requests.get(
    url='http://10.63.32.49:8050/render.json',
    params={'url': 'https://search.jd.com/Search?keyword=iphone&enc=utf-8&pvid=5f6243644eeb4510b26cffce889d77a8'},
)
# print(r.json())
with open(file='b.json', mode='w') as f:
    f.write(json.dumps(r.json()))
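render.json only includes the page HTML or a screenshot when asked to, for instance by adding html=1 and png=1 to the request. The sketch below assumes such a request was made and that the screenshot arrives base64-encoded under the png key; unpack_render_json is a hypothetical helper name.

```python
import base64


def unpack_render_json(result):
    """Return (html, png_bytes) from a render.json result dict.

    Either value is None when the matching option (html=1 / png=1)
    was not part of the request.
    """
    html = result.get('html')
    png_b64 = result.get('png')
    png = base64.b64decode(png_b64) if png_b64 else None
    return html, png
```

For example, after requests.get(..., params={'url': target, 'html': 1, 'png': 1}), calling unpack_render_json(r.json()) yields the HTML as a string and the screenshot as bytes ready to be written to a file.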
6. execute
http://10.63.32.49:8050/execute?lua_source=
Purpose: run a Lua script to interact with the page.
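from urllib.parse import quote
import requests

# Minimal Lua script: main() is the entry point and its return value is the response
lua = """
function main(splash)
    return "hello"
end
"""
lua_url = quote(lua)  # percent-encode the script for use in the query string
# print(lua_url)
r = requests.get('http://10.63.32.49:8050/execute?lua_source={}'.format(lua_url))
print(r.text)
The response is the value returned by the Lua script.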
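A script that actually interacts with a page usually loads it, waits, and then returns something. The following is a sketch under a few assumptions: the script is sent as a JSON body in a POST request rather than in the query string, the extra keys url and wait are read by the script through args, and execute_payload is a hypothetical helper name.

```python
import json

# Lua script: load the page, wait for it to settle, return title and HTML
LUA_SCRIPT = """
function main(splash, args)
    assert(splash:go(args.url))
    assert(splash:wait(args.wait))
    return {
        title = splash:evaljs("document.title"),
        html = splash:html(),
    }
end
"""


def execute_payload(target, wait=1.0):
    """Build the JSON body for a POST to /execute (hypothetical helper)."""
    return {'lua_source': LUA_SCRIPT, 'url': target, 'wait': wait}


# The payload must be JSON-serialisable to be sent as a request body
body = json.dumps(execute_payload('https://www.baidu.com', wait=0.5))
```

It could then be sent with requests.post('http://10.63.32.49:8050/execute', json=execute_payload('https://www.baidu.com')). Because the script returns a Lua table, the response should be a JSON object with title and html keys.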