Calling the Splash API from Python

render.html

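The render.html endpoint returns the HTML of a page after its JavaScript has been rendered. Its address is the address where Splash is running followed by the endpoint name.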

For example:

http://0.0.0.0:8050/render.html?url=https://www.baidu.com&wait=5

http://0.0.0.0:8050 + render.html + https://www.baidu.com + wait=5

import requests

url = 'http://0.0.0.0:8050/render.html?url=https://www.baidu.com&wait=5'
response = requests.get(url)
print(response.text)
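Running this passes a url parameter that tells the endpoint which page to render. Splash waits 5 seconds and then returns the source of the rendered page.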
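One detail worth checking when the URL is built by hand: if the target page has its own query string, any `&` in it is read as a separator between Splash parameters. The sketch below compares hand-built and encoded URLs using only the standard library. The target URL and the `SPLASH` address are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SPLASH = 'http://localhost:8050'  # assumed address of a local Splash instance

# Hypothetical target whose own query string contains '&'
target = 'https://www.baidu.com/s?wd=splash&pn=10'

# Hand-built URL: the target is pasted in unencoded
naive = f'{SPLASH}/render.html?url={target}&wait=5'

# Encoded URL: urlencode() escapes the target so it stays one parameter
encoded = f'{SPLASH}/render.html?' + urlencode({'url': target, 'wait': 5})

# Parse both query strings to see which parameters they really contain
naive_params = parse_qs(urlsplit(naive).query)
encoded_params = parse_qs(urlsplit(encoded).query)

print(naive_params)    # 'pn' shows up as a separate parameter
print(encoded_params)  # 'url' holds the complete target URL
```

With requests, passing a dict through `requests.get(endpoint, params={...})` applies the same encoding.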

render.png

The render.png endpoint captures a screenshot of a page. The width and height parameters control its dimensions, and the response is the binary data of a PNG image.

For example:

http://localhost:8050/render.png?url=https://www.jd.com&wait=5&width=1000&height=700
import requests

url = 'http://localhost:8050/render.png?url=https://www.jd.com&wait=5&width=1000&height=700'
response = requests.get(url)
with open('jd.png', 'wb') as f:
    f.write(response.content)
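If the request fails, the response body will not be an image, and writing it to disk produces a file that will not open. A minimal sketch of a guard is shown below. It assumes a check of the 8-byte PNG signature is enough, and `save_png` is a hypothetical helper, not part of Splash or requests.

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file


def save_png(data, path):
    """Write data to path, refusing anything that is not PNG data."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError('response body is not PNG data')
    with open(path, 'wb') as f:
        f.write(data)
```

In the example above, this would be called as `save_png(response.content, path)` in place of the `with open(...)` block.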

render.jpeg

The render.jpeg endpoint is similar to render.png, with an additional quality parameter that sets the image quality.
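A small sketch of how the quality parameter fits into the URL is shown below. It reuses the page and dimensions from the render.png example, and the three quality values are arbitrary picks for comparison.

```python
base = ('http://localhost:8050/render.jpeg'
        '?url=https://www.jd.com&wait=5&width=1000&height=700')

# One URL per quality setting, to compare the resulting file sizes
urls = {q: f'{base}&quality={q}' for q in (30, 75, 100)}

for quality, url in urls.items():
    print(quality, url)
```

Each URL can be fetched with requests.get and the content written to a .jpg file, as in the render.png example.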

render.har

The render.har endpoint returns the HAR (HTTP Archive) data for a page load. The response is a large JSON document describing the requests made while the page loaded.

import requests

url = 'http://localhost:8050/render.har?url=https://www.jd.com&wait=5'
response = requests.get(url)
with open('jd.json', 'wb') as f:
    f.write(response.content)
# Running this creates a file named jd.json
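Because the HAR output is so large, it is usually easier to reduce it to a short summary than to read it whole. The sketch below assumes the standard HAR layout, where each item in log.entries has a request and a response. `summarize_har` is a hypothetical helper, and `sample_har` is hand-written stand-in data, not real Splash output.

```python
def summarize_har(har):
    """Return a (status, url) pair for every entry in a HAR document."""
    return [
        (entry['response']['status'], entry['request']['url'])
        for entry in har['log']['entries']
    ]


# Hand-written stand-in for a real HAR document, for illustration only
sample_har = {
    'log': {
        'entries': [
            {'request': {'url': 'https://www.jd.com/'},
             'response': {'status': 200}},
            {'request': {'url': 'https://www.jd.com/favicon.ico'},
             'response': {'status': 404}},
        ]
    }
}

for status, url in summarize_har(sample_har):
    print(status, url)
```

For real data, load the file saved above with json.load and pass the resulting dict to summarize_har.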

execute endpoint

The execute endpoint is very powerful: it runs a Lua script passed in the lua_source parameter.

  1. Example 1

    import requests
    from urllib.parse import quote
    
    # Lua script
    lua = '''
    function main(splash)
        return 'hello'
    end
    '''
    
    # quote() URL-encodes the Lua script, which is passed as the lua_source parameter
    url = 'http://localhost:8050/execute?lua_source=' + quote(lua)
    
    response = requests.get(url)
    print(response.text)
    print(quote(lua))
    
    # Output:
    hello
    %0Afunction%20main%28splash%29%0A%20%20%20%20return%20%27hello%27%0Aend%0A
    
  2. Example 2

    import requests
    from urllib.parse import quote
    
    lua = '''
    function main(splash, args)
      local treat = require("treat")
      local response = splash:http_get("http://httpbin.org/get")
        return {
        html=treat.as_string(response.body),
        url=response.url,
        status=response.status
        }
    end
    '''
    
    url = 'http://localhost:8050/execute?lua_source=' + quote(lua)
    response = requests.get(url)
    print(response.text)
    print(quote(lua))
    
    # Output:
    {"html": "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept-Encoding\": \"gzip, deflate\", \n    \"Accept-Language\": \"en,*\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\": \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1\"\n  }, \n  \"origin\": \"120.239.195.171, 120.239.195.171\", \n  \"url\": \"https://httpbin.org/get\"\n}\n", "status": 200, "url": "http://httpbin.org/get"}
    %0Afunction%20main%28splash%2C%20args%29%0A%20%20local%20treat%20%3D%20require%28%22treat%22%29%0A%20%20local%20response%20%3D%20splash%3Ahttp_get%28%22http%3A//httpbin.org/get%22%29%0A%20%20%20%20return%20%7B%0A%20%20%20%20html%3Dtreat.as_string%28response.body%29%2C%0A%20%20%20%20url%3Dresponse.url%2C%0A%20%20%20%20status%3Dresponse.status%0A%20%20%20%20%7D%0Aend%0A
    
  3. Lua script

    function main(splash, args)
      local treat = require("treat")
      local response = splash:http_get("http://httpbin.org/get")
        return {
        html=treat.as_string(response.body),
        url=response.url,
        status=response.status
        }
    end
    
    // Output:
    Object
    html: String (length 347)
    {
      "args": {}, 
      "headers": {
        "Accept-Encoding": "gzip, deflate", 
        "Accept-Language": "en,*", 
        "Host": "httpbin.org", 
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1"
      }, 
      "origin": "120.239.195.171, 120.239.195.171", 
      "url": "https://httpbin.org/get"
    }
    status: 200
    url: "http://httpbin.org/get"
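The html field in the result above is a string that itself contains JSON, so reading it from Python takes two decoding steps. The sketch below uses an abridged copy of the Example 2 output, with the headers omitted, rather than a live request.

```python
import json

# Abridged copy of the Example 2 output (headers and origin omitted)
raw = r'{"html": "{\n  \"args\": {}, \n  \"url\": \"https://httpbin.org/get\"\n}\n", "status": 200, "url": "http://httpbin.org/get"}'

result = json.loads(raw)            # outer object built from the Lua table
body = json.loads(result['html'])   # the html field is itself a JSON string

print(result['status'])
print(result['url'])
print(body['url'])
```

With a live request, `response.json()` takes the place of the first `json.loads` call.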
    
posted @ 2019-07-21 19:59  LeeHua