Converting Word to PDF

1 Converting with documents4j + LibreOffice (flawed)

Approach:

1-On Windows, documents4j handles the Word-to-PDF conversion. Under the hood this library mainly relies on the Microsoft Office APIs, so it can only be used on Windows.

2-Linux has no Microsoft Office, so LibreOffice has to be installed manually and used to perform the conversion instead.

-- This program runs correctly on Windows but fails on Linux.

1.1 Installing LibreOffice

yum install libreoffice

1.2 LibreOffice conversion command

/usr/bin/libreoffice --headless --convert-to pdf srcUrl --outdir destUrl
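
The command takes a source file and an output location. As a minimal Java sketch (not the author's code), the helper below builds the same command as an argument list and computes where the PDF is expected to land. It assumes that `--outdir` names a directory and that LibreOffice keeps the source file's base name with a `.pdf` extension; the class and method names (`LibreOfficeCommand`, `build`, `expectedPdf`) and the paths in `main` are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class LibreOfficeCommand {

    /** Builds the argument list; each element reaches the process as-is, so no quoting is needed. */
    public static List<String> build(Path source, Path outDir) {
        return List.of(
                "/usr/bin/libreoffice",
                "--headless",
                "--convert-to", "pdf",
                source.toString(),
                "--outdir", outDir.toString());
    }

    /** Returns where the PDF should appear: outDir + source base name + ".pdf". */
    public static Path expectedPdf(Path source, Path outDir) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".pdf");
    }

    public static void main(String[] args) {
        Path source = Paths.get("/data/tmp/report.docx"); // hypothetical source file
        Path outDir = Paths.get("/data/tmp/out");         // hypothetical output directory
        System.out.println(String.join(" ", build(source, outDir)));
        System.out.println(expectedPdf(source, outDir));
    }
}
```

A list built this way can be passed to `new ProcessBuilder(command)`, which sidesteps the whitespace splitting that `Runtime.exec(String)` performs on a single command string.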

1.3 Java code details

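// Imports needed by the snippets in this section
// (IdUtils and baseDir come from the surrounding project)
import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Converts the source file to PDF if it is a Word document.
 * @param inputStream the Word document content
 * @param type 0 for .doc, 1 for .docx
 * @return a stream over the converted PDF
 */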
private InputStream wordToPdf(InputStream inputStream, int type) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    String os = System.getProperty("os.name").toLowerCase();
    if(os.contains("windows")) {
        IConverter converter = LocalConverter.builder().build();
        if (0 == type)
            converter.convert(inputStream).as(DocumentType.DOC).to(outputStream).as(DocumentType.PDF).execute();
        else converter.convert(inputStream).as(DocumentType.DOCX).to(outputStream).as(DocumentType.PDF).execute();
    }else if(os.contains("linux") || os.contains("unix") || os.contains("mac")){
        // Write the stream to a temporary file
        String fileExtension = (type == 0) ? ".doc" : ".docx";
        // baseDir must be an absolute path
        String srcUrl = baseDir + "/tmp/tmpDoc-" + IdUtils.randomUUID() + fileExtension;
        String destUrl = baseDir + "/tmp/tmpPdf-" + IdUtils.randomUUID() + ".pdf";
        try {
            writeToFile(inputStream, srcUrl);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
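        // Build the LibreOffice command.
        // Caveat: Runtime.exec(String) splits on whitespace and does not use a shell, so the
        // single quotes are passed to LibreOffice literally; --outdir also expects a directory.
        String command = String.format(
                "/usr/bin/libreoffice --headless --convert-to pdf '%s' --outdir '%s'",
                srcUrl, destUrl);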

        // Run the LibreOffice command
        Process process = null;
        try {
            process = Runtime.getRuntime().exec(command);
            process.waitFor(); // wait for LibreOffice to finish the conversion
            // Read the converted file back into a stream
            outputStream = readFromFile(destUrl);
            // Clean up the temporary files
            new File(srcUrl).delete();
            new File(destUrl).delete();
        } catch (IOException e) {
            throw new RuntimeException(e);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }else{
        throw new RuntimeException("Unsupported operating system");
    }
    return new ByteArrayInputStream(outputStream.toByteArray());
}

private void writeToFile(InputStream inputStream, String filePath) throws IOException {
    try (FileOutputStream fileOutputStream = new FileOutputStream(filePath)) {
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            fileOutputStream.write(buffer, 0, bytesRead);
        }
    }
}

private ByteArrayOutputStream readFromFile(String filePath) throws IOException {
    try (FileInputStream fileInputStream = new FileInputStream(filePath)) {
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = fileInputStream.read(buffer)) != -1) {
            outputStream.write(buffer, 0, bytesRead);
        }
        return outputStream;
    }
}
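
The Linux branch above reads the output file without first checking whether the external process actually succeeded, which makes failures hard to diagnose. As a hedged sketch (not the author's code), the helper below runs a command given as an argument list, enforces a timeout, and returns the exit code so the caller can fail early. `ExternalCommand` and `run` are illustrative names, and the `java -version` call in `main` is only a stand-in for the LibreOffice command.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ExternalCommand {

    /**
     * Runs a command given as an argument list and returns its exit code.
     * Throws if the process does not finish within the timeout.
     */
    public static int run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout and discard both so the child cannot block on a full pipe
        builder.redirectErrorStream(true);
        builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        Process process = builder.start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Command timed out: " + command);
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // The current JVM's launcher serves as a command that is known to exist
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        int exit = run(List.of(java, "-version"), 30);
        System.out.println("exit code: " + exit);
    }
}
```

With a helper like this, the conversion code could throw as soon as the exit code is non-zero instead of going on to read a PDF that was never written.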

2 Converting documents with aspose-words.jar

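The approach above has problems on Linux and depends on third-party software installed on the system, which is inconvenient and adds a deployment burden. Aspose seems to work without issues, apart from being paid software; with a little "magic" that should not be a big problem here.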

Official jar repository: https://releases.aspose.com/java/repo/com/aspose/aspose-words/

"Magic" reference: https://www.cnblogs.com/cxll863/p/16887080.html

Implementation details

import com.aspose.words.Document;
import com.aspose.words.License;
import com.aspose.words.SaveFormat;
import org.springframework.core.io.ClassPathResource;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

    /**
     * Converts the source file to PDF if it is a Word document.
     * @param inputStream the Word document content
     * @return a stream over the converted PDF
     */
    private InputStream wordToPdf(InputStream inputStream) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Load the license from the classpath; the stream is closed automatically.
        // new FileInputStream("src/main/resources/license.xml") only works from the project directory.
        try (InputStream fis = new ClassPathResource("license.xml").getInputStream()) {
            License license = new License();
            license.setLicense(fis);
            // Load the Word document and save it to the output stream as PDF
            Document doc = new Document(inputStream);
            doc.save(out, SaveFormat.PDF);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return new ByteArrayInputStream(out.toByteArray());
    }

This is indeed simpler and more effective than the approach above! Paid software really is different.

The jar has to be added to the project manually, and a few details need attention:

  1. Add the jar as a dependency
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>21.11</version>
    <classifier>jdk17</classifier>
    <scope>system</scope>
    <systemPath>${project.basedir}/src/main/resources/lib/aspose-words-21.11-jdk17.jar</systemPath>
</dependency>
  2. Configure the packaging plugin so that system-scope jars are included
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <includeSystemScope>true</includeSystemScope>
    </configuration>
</plugin>
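  3. Resource paths inside the jar: read license.xml from the classpath, because a filesystem path such as src/main/resources/license.xml no longer exists once the application is packaged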
ClassPathResource classPathResource = new ClassPathResource("license.xml");
InputStream fis = classPathResource.getInputStream();
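
Spring's `ClassPathResource` resolves the name through the class loader, which is why it keeps working after packaging. The same idea can be sketched with the plain JDK, as below. This is only an illustration, not the author's code: `ClasspathResources` and `open` are hypothetical names, and `main` assumes that `license.xml` sits at the classpath root.

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

public class ClasspathResources {

    /** Opens a resource by classpath name, whether it lives in a directory or inside a jar. */
    public static InputStream open(String name) throws FileNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathResources.class.getClassLoader();
        }
        InputStream in = loader.getResourceAsStream(name);
        if (in == null) {
            // getResourceAsStream signals a missing resource with null, not an exception
            throw new FileNotFoundException("Not on the classpath: " + name);
        }
        return in;
    }

    public static void main(String[] args) throws Exception {
        // Assumes license.xml is at the classpath root, e.g. under src/main/resources
        try (InputStream in = open("license.xml")) {
            System.out.println("license.xml found, first byte: " + in.read());
        }
    }
}
```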
posted @ 2024-08-24 14:03  yuqiu2004