Four Ways to Take a Screenshot of a URL in Java


I. XHTMLRenderer (do not use)
II. PhantomJS (what I use)
III. Puppeteer
IV. Selenium
V. Summary

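Choosing an approach

XHTMLRenderer (do not use)
PhantomJS (third-party tool, no longer maintained)
Puppeteer (developed and maintained by the Chrome team)
Selenium (supports multiple browsers and languages; the server needs Google Chrome installed)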


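I. XHTMLRenderer (do not use)

XHTMLRenderer is a Java library that renders XHTML documents to images or PDF.
Do not use it for PDF conversion either.

1. Introduction to XHTMLRenderer

XHTMLRenderer (also known as Flying Saucer) is a library for rendering XHTML and CSS content to PDF or images. Its main limitation is that it supports only part of the CSS 2.1 specification and lacks many HTML5 and CSS3 features. If your HTML relies on features the library does not support, the content may not render correctly.

XHTMLRenderer also requires its input to be well-formed XHTML: every tag must be closed, attribute values must be quoted, and so on. If the input breaks these rules, XHTMLRenderer may fail to parse it.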
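The well-formedness requirement can be checked before any rendering is attempted. The following is a minimal sketch using only the JDK's XML parser; `XhtmlCheck` and `isWellFormed` are hypothetical names that do not appear in the original project.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not from the original project): reports whether a
// string parses as well-formed XML, the precondition described above.
final class XhtmlCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            // Malformed markup, an I/O problem, or a parser configuration problem
            return false;
        }
    }

    public static void main(String[] args) {
        // Closed tags: parses
        System.out.println(isWellFormed("<html><body><br/></body></html>"));
        // <br> is never closed: rejected
        System.out.println(isWellFormed("<html><body><br></body></html>"));
        // Unquoted attribute value: rejected
        System.out.println(isWellFormed("<html><body><p class=x>hi</p></body></html>"));
    }
}
```

Markup that browsers accept without complaint, such as an unclosed `<br>`, fails this check, which is why HTML taken from real pages tends to break this approach.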

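If XHTMLRenderer cannot meet your needs, consider a different tool.
For example, PhantomJS (approach II), Puppeteer, and Selenium (the last resort) all render HTML in a headless browser. Puppeteer and Selenium drive current browsers and therefore support the latest HTML and CSS standards, while PhantomJS is no longer updated. These tools cope better with complex HTML and give you more control over the rendering process.

2. When to use it

Suitable: simple HTML that parses cleanly
Not suitable: HTML produced by complex web pages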

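3. Implementation

3.0 Dependencies

gradle

    implementation 'org.jsoup:jsoup:1.15.3'
    // Core package of XHTMLRenderer (Flying Saucer)
    // - Renders XML, XHTML, and CSS documents to PDF, images, or Swing components.
    // - It is a pure-Java renderer built around W3C standards that converts XHTML documents into printable or displayable output.
    implementation 'org.xhtmlrenderer:flying-saucer-core:9.1.18'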

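3.1 Fetching a URL's HTML with Jsoup

Jsoup is a Java library for working with HTML. It offers a convenient API for extracting and manipulating data using DOM traversal, CSS selectors, and jQuery-like methods.

Jsoup's main features:

1. Parse HTML from a URL, file, or string.
2. Find, extract, and manipulate data with DOM traversal and CSS selectors.
3. Sanitize user-submitted content to prevent cross-site scripting (XSS).
4. Output tidy HTML.

Jsoup is designed to handle every kind of HTML, from clean HTML5 to messy real-world markup.
It parses the input into the same DOM structure a browser would build, so it copes with complex HTML. It does not execute JavaScript, so what you get back is the markup before any client-side rendering.

Fetching a URL's HTML with Jsoup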

package com.cc.urlgethtml.utils;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.IOException;

/**
 * <p>Fetches the HTML of a URL with Jsoup</p>
 *
 * @author CC
 * @since 2023/11/6
 */
public class JsoupUtils {

    public static String getHtmlByJsoup(String url) {
        Document doc;
        try {
            // Downloads and parses the page; scripts on the page are not run
            doc = Jsoup.connect(url).get();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return doc.html();
    }
}

3.2 Using XHTMLRenderer

    /** <p>Convert HTML to an image with Java2DRenderer</p>
     * <ol>
     *     <li>Fetch the static HTML of the URL with Jsoup</li>
     *     <li>Render the static HTML to an image with Java2DRenderer and send it to the browser as a download</li>
     * </ol>
     *
     * <ul>
     * <li>Pros: no third-party plugin needed. Works if the page is standard, well-formed HTML</li>
     * <li>Cons: complex HTML cannot be converted and fails with an error</li>
     * </ul>
     */
    @GetMapping("/getJava2DRenderer")
    public String get1(HttpServletResponse response){

        // URL
        String url = "http://www.baidu.com";
        // Option 1: fetch the URL's HTML with Jsoup (the HTML is incomplete, and it is the markup before rendering)
//        String html = JsoupUtils.getHtmlByJsoup(url);

        // Option 2: a stand-in for HTML fetched from a URL, kept very simple.
        // In testing, even HTML taken from a rich-text editor could fail to parse.
        String html = "<html><body><h1>Hello, World!</h1></body></html>";
        // Non-ASCII file name ("file name" in Chinese), hence the URL encoding below
        String fileName = "文件名.png";

        // Convert to an image. Limitation: complex HTML cannot be converted.
        // This is org.w3c.dom.Document, not Jsoup's Document.
        Document document;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Use an explicit charset instead of the platform default
            document = builder.parse(new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }

        Java2DRenderer renderer = new Java2DRenderer(document, 480, 640);
        BufferedImage img = renderer.getImage();

        // Option 1: browser download
        try {
            response.setCharacterEncoding(StandardCharsets.UTF_8.name());
            response.setContentType("image/png;charset=".concat(StandardCharsets.UTF_8.name()));
            response.setHeader(HttpHeaders.ACCESS_CONTROL_EXPOSE_HEADERS,HttpHeaders.CONTENT_DISPOSITION);
            response.setHeader(HttpHeaders.CONTENT_DISPOSITION,
                    "attachment; filename=".concat(
                            URLEncoder.encode(fileName, StandardCharsets.UTF_8.name())
                    ));
            ServletOutputStream out = response.getOutputStream();

            // Write the BufferedImage to the output stream as PNG
            ImageIO.write(img, "png", out);
            out.close();
        }catch(Exception e){
            // Keep the original exception as the cause so the stack trace is not lost
            throw new RuntimeException(e);
        }

        // Option 2: write to a local file
        /*FSImageWriter imageWriter = new FSImageWriter();
        try {
            imageWriter.write(img, "D:\\ABC.jpg");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }*/

        return html;
    }
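The "write the BufferedImage to an output stream" step can be tried on its own, without a servlet container or Flying Saucer. This is a small sketch that uses only JDK classes; `PngBytes`, `toPng`, and `blankImage` are hypothetical names, and the blank canvas merely stands in for the image returned by `renderer.getImage()`.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not from the original project)
final class PngBytes {

    // Encodes an image as PNG in memory; ImageIO.write accepts any OutputStream,
    // so the same call works with a servlet response stream.
    static byte[] toPng(BufferedImage img) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // ImageIO.write returns false when no writer exists for the format
            if (!ImageIO.write(img, "png", out)) {
                throw new IllegalStateException("No PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Stand-in for renderer.getImage(): a plain white canvas
    static BufferedImage blankImage(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.dispose();
        return img;
    }

    public static void main(String[] args) {
        byte[] png = toPng(blankImage(480, 640));
        System.out.println("PNG size in bytes: " + png.length);
    }
}
```

Checking the boolean returned by `ImageIO.write` matters mostly for formats other than PNG, where a missing writer would otherwise produce an empty download without any exception.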


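II. PhantomJS (what I use)

This is the approach I use. PhantomJS is no longer maintained, so Puppeteer is the better choice for new work.
It is a third-party tool that Java has to invoke as an external program.
The tool must be downloaded first: https://phantomjs.org/download.html
Version I used: 2.1.1
Implemented on both Windows and Linux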

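1. Introduction to PhantomJS

PhantomJS is a headless browser that renders pages with the WebKit layout engine (the engine used by Safari and by older versions of Google Chrome). "Headless" means the browser runs without a user interface, which is useful for automation scripts and server environments.

PhantomJS's main features:

Native support for many web standards: DOM handling, CSS selectors, JSON, Canvas, and SVG.
Page automation: web pages can be loaded and manipulated through a JavaScript API.
Screen capture: pages can be rendered to PDF or to various image formats.
Network monitoring: network activity can be observed for performance analysis, debugging, and testing.

PhantomJS is powerful, but its development stopped in 2018, so it may not support the latest web standards and technologies. If you need a headless browser that is actively updated, consider Puppeteer, a library maintained by Google that drives Chromium (the open-source base of Google Chrome).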

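2. Once PhantomJS is installed (downloaded), a single command takes the screenshot

All the Java code has to do is run this command on the host system.
On both Windows and Linux, change into the bin directory of the installation.
Syntax: <Windows/Linux executable> <JS script to run> <URL to capture> <path where the screenshot is saved>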

 phantomjs.exe ……\win\examples\rasterize.js http://www.baidu.com ……Test\bddddd11.png
 
./phantomjs ……\win\examples\rasterize.js http://www.baidu.com ……Test\bddddd11.png

For example, on Windows (Linux works the same way).
If the screenshot file is saved, the setup works.

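3. Taking PhantomJS screenshots from Java

3.0 Approach

Install PhantomJS separately on Windows/Linux.
Build a different command for each operating system.
Run the command from Java with Runtime.getRuntime().exec(...), which launches PhantomJS as a child process.
The screenshot is then complete. (The image can be loaded into memory for further processing.)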
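The steps above can also be sketched with ProcessBuilder. This is a different API from the Runtime.exec call this article uses, chosen here because it makes a timeout easy to add. `ScreenshotCommand`, `build`, `run`, and the `timeoutSeconds` parameter are hypothetical names for illustration; the paths in `main` are placeholders, not the project's real layout.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not from the original project)
final class ScreenshotCommand {

    // Argument order follows the syntax shown earlier:
    // executable, JS script, URL to capture, output file
    static List<String> build(String executable, String script, String url, String outputFile) {
        return Arrays.asList(executable, script, url, outputFile);
    }

    // Starts the command and waits for it, killing it if it exceeds the timeout.
    static int run(List<String> command, File workingDir, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command)
                .directory(workingDir) // null keeps the JVM's current working directory
                .inheritIO();          // child output goes straight to this process's console
        Process process = builder.start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Command timed out: " + command);
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder paths; adjust them to wherever PhantomJS is installed
        List<String> cmd = build("phantomjs", "examples/rasterize.js",
                "https://www.baidu.com/", "out.png");
        System.out.println("Exit code: " + run(cmd, null, 60));
    }
}
```

Passing the arguments as a list means a URL or file name that contains spaces reaches PhantomJS as a single argument.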

3.1 Put the downloaded package inside the project so it is easy to call

3.2 Modify the rasterize.js file

Location: .../examples/rasterize.js

 var page = require('webpage').create(),
    system = require('system'),
    address, output, size, pageWidth, pageHeight;
 
// Optional: send a cookie with the request
//var flag = phantom.addCookie({
//        "domain": ".baidu.com" ,
//        "expires": "Fri, 01 Jan 2038 00:00:00 GMT",
//        "expiry": 2145916800,
//        "httponly": false,
//        "name": "token",
//        "path": "/",
//        "secure": false,
//        "value": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJVc2VySWQiOiJud19kemdkIiwiRXhwaXJlIjoiMjAyMy0wOC0xMCAxNDoyMzo0NyJ9.InsFJkcXI6C57r-1Oqb7PMn-OcP9k0W5lf1K896EasY"
//});
 
if (system.args.length < 3 || system.args.length > 5) {
    phantom.exit(1);
} else {
    address = system.args[1]; // URL to capture
    output = system.args[2]; // output image path
    page.viewportSize = { width: 800, height: 1800 }; // custom viewport width and height
    if (system.args.length > 3 && system.args[2].substr(-4) === ".pdf") {
        size = system.args[3].split('*');
        page.paperSize = size.length === 2 ? { width: size[0], height: size[1], margin: '0px' }
                                           : { format: system.args[3], orientation: 'portrait', margin: '1cm' };
    } else if (system.args.length > 3 && system.args[3].substr(-2) === "px") {
        size = system.args[3].split('*');
        if (size.length === 2) {
            pageWidth = parseInt(size[0], 10);
            pageHeight = parseInt(size[1], 10);
            page.viewportSize = { width: pageWidth, height: pageHeight };
            page.clipRect = { top: 0, left: 0, width: pageWidth, height: pageHeight };
        } else {
            console.log("size:", system.args[3]);
            pageWidth = parseInt(system.args[3], 10);
            pageHeight = parseInt(pageWidth * 3/4, 10); // it's as good an assumption as any
            console.log ("pageHeight:",pageHeight);
            page.viewportSize = { width: pageWidth, height: pageHeight };
        }
    }
    if (system.args.length > 4) {
        page.zoomFactor = system.args[4];
    }
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the address!');
            phantom.exit(1);
        } else {
            window.setTimeout(function () {
                page.render(output);
                phantom.exit();
            }, 3000); // wait 3 seconds before rendering so the page has time to load
        }
    });
}
 
address = system.args[1]; // URL to capture
output = system.args[2]; // output image path
page.viewportSize = { width: 1200, height: 780 }; // custom viewport width and height (2000 * 1300 works well)

3.3 Configure the PhantomJS paths in application.yml

# PhantomJS paths
phantomjs:
  binPath:
    windows: plugins/phantomjs211/win/bin/phantomjs.exe
    # Linux: the bin directory itself; the code uses it as the working directory and runs ./phantomjs there
    linux: /plugins/phantomjs211/linux/bin
  jsPath:
    windows: plugins/phantomjs211/win/examples/rasterize.js
    linux: /plugins/phantomjs211/linux/examples/rasterize.js
  imagePath:
    windows: D:/Test
    linux: /downImage

3.4 Read the plugin paths from the configuration file

 package com.cc.urlgethtml.utils;
 
import lombok.Data;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
 
/**
 * <p>Returns the configured paths for the current operating system</p>
 *
 * @author CC
 * @since 2023/11/3
 */
@Data
@Component
public class PhantomPath {
 
    @Value("${phantomjs.binPath.windows}")
    private String binPathWin;
    @Value("${phantomjs.jsPath.windows}")
    private String jsPathWin;
    @Value("${phantomjs.binPath.linux}")
    private String binPathLinux;
    @Value("${phantomjs.jsPath.linux}")
    private String jsPathLinux;
    @Value("${phantomjs.imagePath.windows}")
    private String imagePathWin;
    @Value("${phantomjs.imagePath.linux}")
    private String imagePathLinux;
 
    // Whether the current OS is Windows (anything else is treated as the server, which is assumed to run Linux (CentOS))
    private static final boolean IS_WINDOWS = System.getProperty("os.name").toLowerCase().contains("windows");
 
    public String getImagePath() {
        return IS_WINDOWS ? imagePathWin : imagePathLinux;
    }
 
    public String getBinPath() {
        return IS_WINDOWS ? binPathWin : binPathLinux;
    }
 
    public String getJsPath() {
        return IS_WINDOWS ? jsPathWin : jsPathLinux;
    }
 
    public boolean getIsWindows() {
        return IS_WINDOWS;
    }
}

3.5 The PhantomTools utility

 package com.cc.urlgethtml.utils;
 
import cn.hutool.core.util.StrUtil;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Component;
import org.springframework.util.ResourceUtils;
 
import javax.annotation.Resource;
import java.io.*;
import java.util.ArrayList;
import java.util.List;
 
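/** <p>Converts a web page to an image, given its URL</p>
 *  <p>Requires: the PhantomJS plugin and the JS script</p>
 *  <p>Limitation: for some sites, parts of the styling in the screenshot are still wrong</p>
 * @since 2023/11/6
 * @author CC
 **/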
@Data
@Slf4j
@Component
public class PhantomTools {
 
    @Resource
    private PhantomPath phantomPath;
 
    // token
    private static String cookie = "token=";
 
    /**
     * @param url page URL
     * @param imageName full image file name
     * @since 2023/11/3
     **/
    public void printUrlScreen2jpg(String url, String imageName) throws IOException{
        // Run the command to generate the image
        String[] cmd = this.getCmd(url, imageName);
        Process process;
        if (phantomPath.getIsWindows()) {
            // Single-argument form: run the command directly
            process = Runtime.getRuntime().exec(cmd);
        }else {
            String linuxPath = getBasePath().concat(phantomPath.getBinPathLinux());
            log.info("Linux working directory: {}", linuxPath);
            // Three-argument form: command, environment, working directory.
            // The empty array starts the child with no environment variables; null would inherit them.
            process = Runtime.getRuntime().exec(cmd, new String[]{}, new File(linuxPath));
        }

        // Read the process log (stdout only; stderr is not read here)
        InputStream inputStream = process.getInputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
        String msg;
        while ((msg = reader.readLine()) != null) {
            log.info("PhantomJS screenshot log: {}", msg);
        }
        log.info("PhantomJS screenshot exported: {}", imageName);
        // Close the stream and release the process
        close(process, reader);
    }
 
    // Build the command: exec accepts either a single string or a String array; an array is used here
    public String[] getCmd(String url, String imageName) throws FileNotFoundException {
        String basePath = getBasePath();

        List<String> list = new ArrayList<>();
        // Windows command
        if (phantomPath.getIsWindows()) {
            list.add(basePath.concat(phantomPath.getBinPath()));
            list.add(basePath.concat(phantomPath.getJsPath()));
            list.add(url);
            list.add(String.format("%s/%s", phantomPath.getImagePath(), imageName));
        }

        // Linux command
        else {
            list.add("./phantomjs");
            list.add(phantomPath.getJsPath());
            list.add(url);
            list.add(String.format("%s/%s", phantomPath.getImagePath(), imageName));
        }

        log.info("PhantomJS screenshot command: {}", list);
        return list.toArray(new String[0]);
    }
 
    // Resolve the plugins directory under the project root
    private static String getBasePath() throws FileNotFoundException {
        String basePath = ResourceUtils.getURL("plugins").getPath();
        return StrUtil.removePrefix(basePath, "/");
    }

    // Close the reader and destroy the process
    public static void close(Process process, BufferedReader bufferedReader) throws IOException {
        if (bufferedReader != null) {
            bufferedReader.close();
        }
        if (process != null) {
            process.destroy();
        }
    }
 
}

3.6 Test

controller

    /** <p>PhantomJS screenshot</p>
     * <p>Saves the screenshot locally; it is not sent to the browser</p>
     * @since 2023/11/6
     * @author CC
     **/
    @GetMapping("/getPhantomJs")
    public void getPhantomJs(){
        String url = "https://www.baidu.com/";
        try {
            // File name means "first image.png"
            phantomTools.printUrlScreen2jpg(url, "第一个图片.png");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

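The screenshot is produced successfully.
Limitation when capturing sites such as JD.com: much of the JavaScript does not run and many images fail to load.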


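III. Puppeteer

3.1 Introduction to Puppeteer

Puppeteer is an official Node.js library from the Google Chrome team. It provides an API for controlling Chromium or Chrome over the DevTools Protocol, and in most cases it can replace PhantomJS.

Puppeteer's main capabilities:

Generate screenshots and PDFs of pages.
Crawl a SPA (single-page application) and generate pre-rendered content (i.e. server-side rendering, "SSR").
Automate form submission, UI testing, keyboard input, and so on.
Create an up-to-date automated testing environment that uses the latest JavaScript and browser features, such as async/await and ES6.
Capture a timeline trace of a site to help diagnose performance issues.

By default Puppeteer downloads and uses a specific version of Chromium, so its support for the latest web APIs and features is very good.

3.2 Implementation (not implemented)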


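IV. Selenium

It behaves as if a user were operating the browser.

4.1 Introduction to Selenium

Selenium is a popular open-source web testing framework that lets you write scripts to automate browser actions. These scripts can test how a web application behaves in different browsers, or automate repetitive browsing tasks.

Selenium supports several programming languages, including Java, C#, Python, Ruby, and JavaScript. It also supports all major web browsers, including Chrome, Firefox, Safari, and Internet Explorer.

Selenium has two main components:

Selenium WebDriver: a library that provides an API for controlling a browser programmatically. You can use it to simulate user actions such as clicking buttons, filling in forms, and navigating between pages.
Selenium Grid: a tool for running tests in parallel on multiple machines and browsers. It is useful for testing how a web application behaves in different environments.

Although Selenium is mainly used for testing, it can also be used for web scraping to automate data collection.

4.2 Implementation (not implemented)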

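4.3 Differences between Puppeteer and Selenium

Both Puppeteer and Selenium can take web page screenshots, but each has its own strengths.

Puppeteer:

- Puppeteer is developed and maintained by the Chrome team, so it integrates tightly with Chrome and can use the latest Chrome features.
- Its API is more modern and concise. It is Promise-based and works well with modern JavaScript such as async/await.
- Its screenshot and PDF features are stronger: for example, it can easily capture a full-page screenshot or generate a PDF of a page.

Selenium:

- Selenium supports many programming languages and browsers. If you need screenshots from several browsers, or want to work in a different language, Selenium may be the better choice.
- Selenium has a large community and many third-party libraries, which can make it easier to solve problems and find existing solutions.

In short, if you only need screenshots from Chrome and are happy with a modern JavaScript API, Puppeteer is probably the better choice. If you need screenshots from several browsers, or want to use Java, Python, or another language, Selenium is probably the better choice.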


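V. Summary

1. The generated command

The command can be passed as a single string or as a String array (recommended).

2. Process

The Process class lets Java interact with operating-system processes.
Method for running a system command: Runtime.getRuntime().exec()

Runtime.getRuntime().exec(arg1)
arg1: the command to run

Runtime.getRuntime().exec(arg1, arg2, arg3)
arg1: the command to run
arg2: environment variables as name=value strings (null inherits the current process's environment)
arg3: the working directory for the program, e.g. /phantomjs211/linux/bin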
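The three-argument overload can be tried with any program that is already on the machine. The sketch below runs the JVM's own `java -version` so that it does not depend on PhantomJS being installed; `ExecDemo` and `runIn` are hypothetical names, and the choice of `java -version` is only for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper (not from the original project)
final class ExecDemo {

    // Runs cmd in workingDir with the three-argument exec overload and returns the exit code.
    // Output is not read here; a command that prints a lot needs its streams
    // drained, otherwise it can block once the pipe buffer fills up.
    static int runIn(File workingDir, String... cmd) {
        try {
            // arg2 == null: the child inherits this JVM's environment variables
            Process process = Runtime.getRuntime().exec(cmd, null, workingDir);
            return process.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for " + cmd[0], e);
        }
    }

    public static void main(String[] args) {
        String javaHome = System.getProperty("java.home");
        String javaBin = javaHome + File.separator + "bin" + File.separator + "java";
        // Absolute executable path, with java.home as the working directory
        System.out.println("Exit code: " + runIn(new File(javaHome), javaBin, "-version"));
    }
}
```

The array form is also the safer choice because the single-string form splits the command on whitespace, which breaks paths and file names that contain spaces.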

3. Source code
https://gitee.com/KakarottoChen/blog-code.git

……\blog-code\SpringBoot\UrlGetHtml

4. Reference:

http://t.csdnimg.cn/hf4Bj

Posted on 2023-11-06 16:05 by C_C_菜园

