Splash Lua scripts (Splash is served at http://localhost:8050, i.e. on port 8050)
Entry point and return value
| function main(splash, args) |
| splash:go("http://www.baidu.com") |
| splash:wait(0.5) |
| local title = splash:evaljs("document.title") |
| return {title=title} |
| end |
The JavaScript snippet is passed in through the evaljs() method. Evaluating document.title yields the page title, which is assigned to the title variable and then returned.
Asynchronous processing
Splash handles tasks asynchronously rather than strictly one after another.
| function main(splash, args) |
| local example_urls = {"www.baidu.com", "www.taobao.com", "www.zhihu.com"} |
| local urls = args.urls or example_urls |
| local results = {} |
| for index, url in ipairs(urls) do |
| local ok, reason = splash:go("http://" .. url) |
| if ok then |
| splash:wait(2) |
| results[url] = splash:png() |
| end |
| end |
| return results |
| end |
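- wait(2) waits for 2 seconds.
- Strings are concatenated with the .. operator.
- go() returns the status of the page load.
Result: if a page responds with a 4xx or 5xx status code, ok is nil and no screenshot is returned for that page.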
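The reason value that go() returns alongside ok can be kept for pages that fail to load. The following is a minimal sketch, assuming the caller passes a list of hosts in args.urls (the same assumption the script above makes); the fallback host and the "loaded" / "failed" strings are illustrative only.

```lua
function main(splash, args)
  -- args.urls is assumed to be supplied by the caller; fall back to one host.
  local urls = args.urls or {"www.baidu.com"}
  local report = {}
  for _, url in ipairs(urls) do
    -- go() returns ok, plus a reason string when loading fails.
    local ok, reason = splash:go("http://" .. url)
    if ok then
      splash:wait(0.5)
      report[url] = "loaded"
    else
      report[url] = "failed: " .. tostring(reason)
    end
  end
  return report
end
```

Returning a short status per URL instead of a screenshot keeps the response small while still showing which pages could not be loaded.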
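Properties of the splash object
args property
The args property gives access to the parameters that were supplied when the script was loaded.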
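As a small sketch of how the parameters can be read, the script below assumes the request carries a url parameter (for example, the URL typed into the Splash UI before running the script). The second argument of main and splash.args give access to the same parameters.

```lua
function main(splash, args)
  -- args (second argument of main) and splash.args hold the same parameters.
  local url = args.url
  assert(url == splash.args.url)
  assert(splash:go(url))
  return {url = splash:url()}
end
```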
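js_enabled property
The js_enabled property is the switch that controls JavaScript execution in Splash.
Set it to true or false to control whether JavaScript code is executed; the default is true.
| function main(splash, args) |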
| splash:go("https://www.baidu.com") |
| splash.js_enabled = false |
| local title = splash:evaljs("document.title") |
| return {title=title} |
| end |
- go() loads the page.
- js_enabled = false disables JavaScript execution.
Result (evaljs() fails because JavaScript execution has been disabled):
| HTTP Error 400 (Bad Request) |
| |
| Type: ScriptError -> JS_ERROR |
| |
| Error happened while executing Lua script |
| |
| [string "function main(splash, args) |
| ..."]:4: unknown JS error: None |
| |
| { |
| "type": "ScriptError", |
| "info": { |
| "splash_method": "evaljs", |
| "line_number": 4, |
| "js_error_message": null, |
| "type": "JS_ERROR", |
| "error": "unknown JS error: None", |
| "message": "[string \"function main(splash, args)\r...\"]:4: unknown JS error: None", |
| "source": "[string \"function main(splash, args)\r...\"]" |
| }, |
| "description": "Error happened while executing Lua script", |
| "error": 400 |
| } |
resource_timeout property
The resource_timeout property sets the timeout for loading, in seconds.
| function main(splash) |
| splash.resource_timeout = 0.1 |
| assert(splash:go('https://www.taobao.com')) |
| return splash:png() |
| end |
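- png() returns a screenshot of the page.
- resource_timeout = 0.1 sets the load timeout to 0.1 s; if no response arrives within that time, go() fails and the assert() raises an error.
images_enabled property
The images_enabled property controls whether images are loaded; they are loaded by default. Pages load considerably faster with images disabled.
| function main(splash, args) |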
| splash.images_enabled = false |
| assert(splash:go('https://www.jd.com')) |
| return {png=splash:png()} |
| end |
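With this script the requested page is rendered without its images.
plugins_enabled property
The plugins_enabled property controls whether browser plugins (such as Flash) are enabled. It defaults to false, meaning plugins are disabled.
| splash.plugins_enabled = true  -- or false |
scroll_position property
The scroll_position property scrolls the page vertically or horizontally.
| function main(splash, args) |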
| assert(splash:go('https://www.taobao.com')) |
| splash.scroll_position = {y=400} |
| return {png=splash:png()} |
| end |
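The example above only scrolls vertically. A horizontal offset can be given through the x key in the same table; the following sketch uses arbitrary offsets of 100 and 200 pixels purely for illustration.

```lua
function main(splash, args)
  assert(splash:go("https://www.taobao.com"))
  -- Scroll 100 px to the right and 200 px down before taking the screenshot.
  splash.scroll_position = {x=100, y=200}
  return {png=splash:png()}
end
```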
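Methods of the splash object
go() method: requests a URL
- go() parameters
| ok, reason = splash:go{url, baseurl=nil, headers=nil, http_method="GET", body=nil, formdata=nil} |
| |
| baseurl: optional base URL used to resolve relative resource paths |
| headers: optional request headers |
| http_method: optional, "GET" by default; "POST" is also supported |
| body: optional request body for a POST request |
| formdata: optional form data for a POST request, sent as application/x-www-form-urlencoded |
- go() example
| function main(splash, args) |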
| local ok, reason = splash:go{"http://httpbin.org/post", http_method="POST", body="name=Germey"} |
| if ok then |
| return splash:html() |
| end |
| end |
Output:
| <html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{ |
| "args": {}, |
| "data": "", |
| "files": {}, |
| "form": { |
| "name": "Germey" |
| }, |
| "headers": { |
| "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", |
| "Accept-Encoding": "gzip, deflate", |
| "Accept-Language": "en,*", |
| "Content-Length": "11", |
| "Content-Type": "application/x-www-form-urlencoded", |
| "Host": "httpbin.org", |
| "Origin": "null", |
| "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1" |
| }, |
| "json": null, |
| "origin": "120.239.195.171, 120.239.195.171", |
| "url": "https://httpbin.org/post" |
| } |
| </pre></body></html> |
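The same POST can also be expressed with the formdata parameter, which takes a Lua table instead of a pre-encoded body string. This is a minimal sketch against the same httpbin.org endpoint; returning the reason in an error field is an illustrative choice, not something the original example does.

```lua
function main(splash, args)
  -- formdata is sent as application/x-www-form-urlencoded.
  local ok, reason = splash:go{
    "http://httpbin.org/post",
    http_method = "POST",
    formdata = {name = "Germey"},
  }
  if not ok then
    return {error = reason}
  end
  return splash:html()
end
```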
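wait() method: controls how long to wait on the page
- wait() parameters
| ok, reason = splash:wait{time, cancel_on_redirect=false, cancel_on_error=true} |
| |
| time: how long to wait, in seconds |
| cancel_on_redirect: optional, false by default; if true, waiting stops when a redirect happens |
| cancel_on_error: optional, true by default; if true, waiting stops when a page load error occurs |
- wait() example
| function main(splash) |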
| splash:go("https://www.taobao.com") |
| splash:wait(2) |
| return {html=splash:html()} |
| end |
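The optional parameters and the return values of wait() can be used together. The sketch below passes the arguments by name and reports whether the wait ran to completion; the field names in the returned table (waited_fully, reason, url) are made up for this illustration.

```lua
function main(splash, args)
  assert(splash:go("https://www.taobao.com"))
  -- Wait up to 2 seconds, but stop early if the page redirects.
  local ok, reason = splash:wait{time=2, cancel_on_redirect=true}
  return {
    waited_fully = ok == true,
    reason = reason,   -- nil when the wait completed normally
    url = splash:url(),
  }
end
```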
jsfunc() method
jsfunc() turns a JavaScript function into a Lua callable, so a function defined in JavaScript can be called directly from the Lua script.
| function main(splash, args) |
| local get_div_count = splash:jsfunc([[ |
| function () { |
| var body = document.body; |
| var divs = body.getElementsByTagName('div'); |
| return divs.length; |
| } |
| ]]) |
| splash:go("https://www.baidu.com") |
| return ("There are %s DIVs"):format( |
| get_div_count()) |
| end |
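evaljs() method
Executes JavaScript code and returns the result of the last JavaScript statement.
| result = splash:evaljs(js) |
runjs() method
runjs() is similar to evaljs(), but it is meant for performing actions or declaring functions rather than for returning a value.
| function main(splash, args) |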
| splash:go("https://www.baidu.com") |
| splash:runjs("foo = function() { return 'bar' }") |
| local result = splash:evaljs("foo()") |
| return result |
| end |
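One practical difference between the two methods is what they return: runjs() reports whether the snippet ran, while evaljs() hands back the value of the expression. A small sketch, in which the window.counter variable is invented for the example:

```lua
function main(splash, args)
  assert(splash:go("https://www.baidu.com"))
  -- runjs() reports success or failure instead of returning the JS value.
  local ok, err = splash:runjs("window.counter = 41; window.counter += 1;")
  if not ok then
    return {error = err}
  end
  -- evaljs() returns the value of the evaluated expression.
  return {counter = splash:evaljs("window.counter")}
end
```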
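autoload() method: registers objects (JavaScript code or libraries) that are loaded automatically on every page visit
- autoload() parameters
| ok, reason = splash:autoload{source_or_url, source=nil, url=nil} |
| |
| source_or_url: JavaScript source code or the URL of a JavaScript file |
| source: JavaScript source code |
| url: URL of a JavaScript file |
- autoload() example
| function main(splash, args) |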
| splash:autoload([[ |
| function get_document_title(){ |
| return document.title; |
| } |
| ]]) |
| splash:go("https://www.baidu.com") |
| return splash:evaljs("get_document_title()") |
| end |
Result:
| Splash Response: "百度一下,你就知道" |
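autoload() also accepts the URL of a script, which is a convenient way to make a library available on every page. The sketch below loads jQuery from its CDN and reads the version back; the target page is only an example.

```lua
function main(splash, args)
  -- Load jQuery from a URL so that it is available on the page.
  assert(splash:autoload("https://code.jquery.com/jquery-2.1.3.min.js"))
  assert(splash:go("https://www.taobao.com"))
  local version = splash:evaljs("$.fn.jquery")
  return "jQuery version: " .. version
end
```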
call_later() method
Schedules a callback to run after a given delay; the pending call can be cancelled before it runs through the cancel() method of the returned timer.
| function main(splash, args) |
| local snapshots = {} |
| local timer = splash:call_later(function() |
| snapshots["a"] = splash:png() |
| splash:wait(1.0) |
| snapshots["b"] = splash:png() |
| end, 0.2) |
| splash:go("https://www.taobao.com") |
| splash:wait(3.0) |
| return snapshots |
| end |
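The timer returned by call_later() can be cancelled while the callback is still pending. A minimal sketch, where the fired flag exists only to show that the callback never ran:

```lua
function main(splash, args)
  local fired = false
  -- Schedule a callback 1 second from now.
  local timer = splash:call_later(function()
    fired = true
  end, 1.0)
  -- Cancel it while it is still pending, so the callback never runs.
  timer:cancel()
  splash:wait(2.0)
  return {fired = fired}
end
```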
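http_get() method
Sends an HTTP GET request.
- http_get() parameters
| response = splash:http_get{url, headers=nil, follow_redirects=true} |
| |
| url: the URL to request |
| headers: optional request headers |
| follow_redirects: optional, true by default; whether redirects are followed |
- http_get() example
| function main(splash, args) |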
| local treat = require("treat") |
| local response = splash:http_get("http://httpbin.org/get") |
| return { |
| html=treat.as_string(response.body), |
| url=response.url, |
| status=response.status |
| } |
| end |
Output:
| html: String (length 347) |
| { |
| "args": {}, |
| "headers": { |
| "Accept-Encoding": "gzip, deflate", |
| "Accept-Language": "en,*", |
| "Host": "httpbin.org", |
| "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1" |
| }, |
| "origin": "120.239.195.171, 120.239.195.171", |
| "url": "https://httpbin.org/get" |
| } |
| status: 200 |
| url: "http://httpbin.org/get" |
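http_post() method
Sends an HTTP POST request.
- http_post() parameters
| response = splash:http_post{url, headers=nil, follow_redirects=true, body=nil} |
| |
| url: the URL to request |
| headers: optional request headers |
| follow_redirects: optional, true by default; whether redirects are followed |
| body: optional request body |
- http_post() example
| function main(splash, args) |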
| local treat = require("treat") |
| local json = require("json") |
| local response = splash:http_post{"http://httpbin.org/post", |
| body=json.encode({name="Germey"}), |
| headers={["content-type"]="application/json"} |
| } |
| return { |
| html=treat.as_string(response.body), |
| url=response.url, |
| status=response.status |
| } |
| end |
set_content() method: sets the content of the page
| function main(splash) |
| assert(splash:set_content("<html><body><h1>hello</h1></body></html>")) |
| return splash:png() |
| end |
html() method: gets the source code of the page
The following gets the source of https://httpbin.org/get.
| function main(splash, args) |
| splash:go("https://httpbin.org/get") |
| return splash:html() |
| end |
png() method: gets a screenshot of the page in PNG format
| function main(splash, args) |
| splash:go("https://www.taobao.com") |
| return splash:png() |
| end |
jpeg() method: gets a screenshot of the page in JPEG format
| function main(splash, args) |
| splash:go("https://www.taobao.com") |
| return splash:jpeg() |
| end |
har() method: gets a record of the page loading process
| function main(splash, args) |
| splash:go("https://www.baidu.com") |
| return splash:har() |
| end |
url() method: gets the URL of the page currently being visited
| function main(splash, args) |
| splash:go("https://www.baidu.com") |
| return splash:url() |
| end |
get_cookies() method: gets the cookies of the current page
| function main(splash, args) |
| splash:go("https://www.baidu.com") |
| return splash:get_cookies() |
| end |
Output:
| 0: Object |
| domain: ".baidu.com" |
| expires: "2087-08-08T12:53:28Z" |
| httpOnly: false |
| name: "BAIDUID" |
| path: "/" |
| secure: false |
| value: "B556658F0EAB497638556503063F6AEE:FG=1" |
| 1: Object |
| domain: ".baidu.com" |
| expires: "2087-08-08T12:53:28Z" |
| httpOnly: false |
| name: "BIDUPSID" |
| path: "/" |
| secure: false |
| value: "B556658F0EAB497638556503063F6AEE" |
| 2: Object |
| domain: ".baidu.com" |
| expires: "2087-08-08T12:53:28Z" |
| httpOnly: false |
| name: "PSTM" |
| path: "/" |
| secure: false |
| value: "1563701961" |
| 3: Object |
| domain: ".baidu.com" |
| httpOnly: false |
| name: "delPer" |
| path: "/" |
| secure: false |
| value: "0" |
| 4: Object |
| domain: "www.baidu.com" |
| httpOnly: false |
| name: "BD_HOME" |
| path: "/" |
| secure: false |
| value: "0" |
| 5: Object |
| domain: ".baidu.com" |
| httpOnly: false |
| name: "H_PS_PSSID" |
| path: "/" |
| secure: false |
| value: "29547_1434_21089_18560_29522_29518_28518_29099_28833_29220_26350_29459" |
| 6: Object |
| domain: "www.baidu.com" |
| expires: "2019-07-31T09:39:21Z" |
| httpOnly: false |
| name: "BD_UPN" |
| path: "/" |
| secure: false |
| value: "143354" |
add_cookie() method: adds a cookie for the current page
- add_cookie() parameters
| cookies = splash:add_cookie{name, value, path=nil, domain=nil, expires=nil, httpOnly=nil, secure=nil} |
- add_cookie() example
| function main(splash) |
| splash:add_cookie{"sessionid", "237465ghgfsd", "/", domain="http://example.com"} |
| splash:go("http://example.com/") |
| return splash:html() |
| end |
Output:
| <!DOCTYPE html><html><head> |
| <title>Example Domain</title> |
| |
| <meta charset="utf-8"> |
| <meta http-equiv="Content-type" content="text/html; charset=utf-8"> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| <style type="text/css"> |
| body { |
| background-color: #f0f0f2; |
| margin: 0; |
| padding: 0; |
| font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif; |
| |
| } |
| div { |
| width: 600px; |
| margin: 5em auto; |
| padding: 50px; |
| background-color: #fff; |
| border-radius: 1em; |
| } |
| a:link, a:visited { |
| color: #38488f; |
| text-decoration: none; |
| } |
| @media (max-width: 700px) { |
| body { |
| background-color: #fff; |
| } |
| div { |
| width: auto; |
| margin: 0 auto; |
| border-radius: 0; |
| padding: 1em; |
| } |
| } |
| </style> |
| </head> |
| |
| <body> |
| <div> |
| <h1>Example Domain</h1> |
| <p>This domain is established to be used for illustrative examples in documents. You may use this |
| domain in examples without prior coordination or asking for permission.</p> |
| <p><a href="http://www.iana.org/domains/example">More information...</a></p> |
| </div> |
| |
| |
| </body></html> |
clear_cookies() method: clears all cookies
| function main(splash) |
| splash:go("https://www.baidu.com/") |
| splash:clear_cookies() |
| return splash:get_cookies() |
| end |
get_viewport_size() method: gets the width and height of the page
| function main(splash) |
| splash:go("https://www.baidu.com/") |
| return splash:get_viewport_size() |
| end |
set_viewport_size() method: sets the width and height of the page
- set_viewport_size() parameters
| splash:set_viewport_size(width, height) |
- set_viewport_size() example
| function main(splash) |
| splash:set_viewport_size(400, 700) |
| assert(splash:go("http://cuiqingcai.com")) |
| return splash:png() |
| end |
set_viewport_full() method: displays the page in full (the viewport is resized to fit the whole page)
| function main(splash) |
| splash:set_viewport_full() |
| assert(splash:go("http://cuiqingcai.com")) |
| return splash:png() |
| end |
set_user_agent() method: sets the browser's User-Agent
| function main(splash) |
| splash:set_user_agent('Splash') |
| splash:go("http://httpbin.org/get") |
| return splash:html() |
| end |
set_custom_headers() method: sets the request headers
| function main(splash) |
| splash:set_custom_headers({ |
| ["User-Agent"] = "Splash", |
| ["Site"] = "Splash", |
| }) |
| splash:go("http://httpbin.org/get") |
| return splash:html() |
| end |
Output:
| <html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{ |
| "args": {}, |
| "headers": { |
| "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", |
| "Accept-Encoding": "gzip, deflate", |
| "Accept-Language": "en,*", |
| "Host": "httpbin.org", |
| "Site": "Splash", |
| "User-Agent": "Splash" |
| }, |
| "origin": "120.239.195.171, 120.239.195.171", |
| "url": "https://httpbin.org/get" |
| } |
| </pre></body></html> |
select() method: selects the first node that matches; the argument is a CSS selector
| function main(splash) |
| splash:go("https://www.baidu.com/") |
| input = splash:select("#kw") |
| input:send_text('Splash') |
| splash:wait(3) |
| return splash:png() |
| end |
select_all() method
Selects all nodes that match (the argument is a CSS selector).
| function main(splash) |
| local treat = require('treat') |
| assert(splash:go("http://quotes.toscrape.com/")) |
| assert(splash:wait(0.5)) |
| local texts = splash:select_all('.quote .text') |
| local results = {} |
| for index, text in ipairs(texts) do |
| results[index] = text.node.innerHTML |
| end |
| return treat.as_array(results) |
| end |
Output:
| Array[10] |
| 0: |
| 1: |
| 2: “There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.” |
| 3: |
| 4: |
| 5: |
| 6: |
| 7: |
| 8: |
| 9: |

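mouse_click() method
Simulates a mouse click. The arguments are the x and y coordinates of the click. Alternatively, a node can be selected first and the method called on that node.
| function main(splash) |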
| splash:go("https://www.baidu.com/") |
| input = splash:select("#kw") |
| input:send_text('Splash') |
| submit = splash:select('#su') |
| submit:mouse_click() |
| splash:wait(3) |
| return splash:png() |
| end |
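The script first selects the input box and types some text, then selects the submit button and calls mouse_click() on it to submit the query. It then waits three seconds and returns a screenshot.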
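The coordinate form of mouse_click() can be sketched as follows. The script asks the page for the position of the "#su" button through jsfunc() and clicks slightly inside it; the helper name get_position and the 2-pixel offset are assumptions made for this illustration.

```lua
function main(splash)
  assert(splash:go("https://www.baidu.com/"))
  -- Ask the page for the top-left corner of the "#su" button.
  local get_position = splash:jsfunc([[
    function () {
      var rect = document.querySelector('#su').getClientRects()[0];
      return {x: rect.left, y: rect.top};
    }
  ]])
  local pos = get_position()
  -- Click a few pixels inside the button.
  splash:mouse_click(pos.x + 2, pos.y + 2)
  splash:wait(3)
  return splash:png()
end
```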