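Converting Word to PDF in Java
Original link: https://blog.csdn.net/ka3p06/article/details/125476270
Introduction
This article addresses the need to convert Word documents to PDF in Java projects. Conversion quality depends on several factors: the library used, the fonts available, the source file (whether it was saved by WPS or Microsoft Word, the format version, and so on), and the system environment. No method works perfectly in every case; the only option is to keep trying and pick the one that fits. Three methods that implement this feature are described below.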
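Because no single method works for every file, one way to "keep trying" is to run several converters in order and stop at the first one that succeeds. The following is a minimal sketch using only the JDK; the class `ConverterChain` and the interface `WordToPdf` are hypothetical names introduced for illustration and are not part of any of the libraries discussed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: tries several Word-to-PDF converters in registration order. */
final class ConverterChain {

    /** Hypothetical callback type; each real converter would be wrapped in one of these. */
    @FunctionalInterface
    interface WordToPdf {
        void convert(String sourcePath, String targetPath) throws Exception;
    }

    // LinkedHashMap keeps the registration order, which is the order of attempts.
    private final Map<String, WordToPdf> converters = new LinkedHashMap<>();

    ConverterChain register(String name, WordToPdf converter) {
        converters.put(name, converter);
        return this;
    }

    /** Returns the name of the first converter that finishes without throwing, or empty if all fail. */
    Optional<String> convert(String sourcePath, String targetPath) {
        for (Map.Entry<String, WordToPdf> entry : converters.entrySet()) {
            try {
                entry.getValue().convert(sourcePath, targetPath);
                return Optional.of(entry.getKey());
            } catch (Exception e) {
                System.err.println("[" + entry.getKey() + "] conversion failed: " + e);
            }
        }
        return Optional.empty();
    }
}
```

A chain only works if each converter reports failure by throwing. Methods that catch and log their own exceptions would need to rethrow (or return a status) before they could be used this way.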
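1. Using Aspose
Note: Aspose requires a paid license to remove the watermark, but it works quite well. I used version 20.12. If you are interested, you can read the source of License.class in aspose-words-20.12-jdk17.jar to understand the validation logic. Some articles online describe this logic, but most of them predate 2021, and since 2022 the License validation for the same versions has changed. For the older version (18.6, tested as of this writing on 2022-06-26), the License validation logic differs little from what is described online, but the method names have changed. Do not copy other articles blindly; read the code and think it through.
Problem encountered in practice: with versions below 19.11 (I tested 18.6), the line height increases automatically after saving to PDF.
JARs for versions 20.12 and 18.6: https://download.csdn.net/download/ka3p06/85789859
Dependencies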
<repositories>
    <repository>
        <id>AsposeJavaAPI</id>
        <name>Aspose Java API</name>
        <url>https://repository.aspose.com/repo/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-words</artifactId>
        <version>20.12</version>
        <classifier>jdk17</classifier>
    </dependency>
</dependencies>
Core code
/**
 * Converts a Word document to PDF with Aspose.Words.
 * Requires the com.aspose.words.* imports and a logger named log (for example SLF4J).
 *
 * @param sourcePath source file path, e.g. /root/example.doc
 * @param targetPath target file path, e.g. /root/example.pdf
 */
public static void asposeWordToPdf(String sourcePath, String targetPath) {
    LoadOptions opts = new LoadOptions();
    // opts.setMswVersion(MsWordVersion.WORD_2016);
    opts.getLanguagePreferences().setDefaultEditingLanguage(EditingLanguage.CHINESE_PRC);
    try {
        Document doc = new Document(sourcePath, opts);
        // Reset the default paragraph format
        ParagraphFormat pf = doc.getStyles().getDefaultParagraphFormat();
        pf.clearFormatting();
        PdfSaveOptions options = new PdfSaveOptions();
        // Export the document structure
        options.setExportDocumentStructure(true);
        // Text and image compression
        options.setTextCompression(PdfTextCompression.FLATE);
        options.setImageCompression(PdfImageCompression.AUTO);
        // Accept all revisions
        doc.acceptAllRevisions();
        // Remove comments. Iterate backwards: removing a node shrinks the
        // collection, so a forward loop would skip every other comment.
        NodeCollection nc = doc.getChildNodes(NodeType.COMMENT, true);
        for (int i = nc.getCount() - 1; i >= 0; i--) {
            Node comment = nc.get(i);
            log.info("Removing comment: {}", comment.getText());
            comment.getParentNode().removeChild(comment);
        }
        // Save the Word document as PDF
        doc.save(targetPath, options);
    } catch (Exception e) {
        // Pass the exception itself so the stack trace is logged
        log.error("[aspose] Word-to-PDF conversion failed", e);
    }
}
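When elements are removed by index from a collection that reflects removals immediately, as the comment-removal step does, the iteration direction matters. The sketch below uses a plain `ArrayList` as a stand-in to show the effect; it assumes the Aspose node collection behaves as a live collection in the same way, and the class `LiveRemovalDemo` is a hypothetical name used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo: removing by index from a list that shrinks as you go. */
final class LiveRemovalDemo {

    /** Walks forward; after each removal the remaining elements shift left, so every other one is skipped. */
    static List<String> removeForward(List<String> source) {
        List<String> live = new ArrayList<>(source);
        for (int i = 0; i < live.size(); i++) {
            live.remove(i);
        }
        return live; // whatever was left behind
    }

    /** Walks backward; indexes still to be visited are unaffected by each removal. */
    static List<String> removeBackward(List<String> source) {
        List<String> live = new ArrayList<>(source);
        for (int i = live.size() - 1; i >= 0; i--) {
            live.remove(i);
        }
        return live; // empty
    }
}
```

With four elements, the forward loop leaves the second and fourth behind, while the backward loop removes all of them.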
2. Using docx4j
Dependencies
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-JAXB-Internal</artifactId>
    <version>8.2.4</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-export-fo</artifactId>
    <version>8.2.4</version>
</dependency>
Core code
/**
 * Converts a Word document to PDF with docx4j.
 * docx4j works with the OpenXML (.docx) format. Requires the docx4j imports
 * (Docx4J, WordprocessingMLPackage, org.docx4j.fonts.*), java.io.*, and a logger named log.
 *
 * @param sourcePath source file path, e.g. /root/example.docx
 * @param targetPath target file path, e.g. /root/example.pdf
 */
public static void docx4jWordToPdf(String sourcePath, String targetPath) {
    try {
        WordprocessingMLPackage pkg = Docx4J.load(new File(sourcePath));
        // Map the Chinese font names used in documents to physical fonts on this machine
        Mapper fontMapper = new IdentityPlusMapper();
        fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
        fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
        fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft Yahei"));
        fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
        fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
        fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
        fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
        fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
        fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
        fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
        fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
        fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
        fontMapper.put("等线", PhysicalFonts.get("SimSun"));
        fontMapper.put("等线 Light", PhysicalFonts.get("SimSun"));
        fontMapper.put("华文琥珀", PhysicalFonts.get("STHupo"));
        fontMapper.put("华文隶书", PhysicalFonts.get("STLiti"));
        fontMapper.put("华文新魏", PhysicalFonts.get("STXinwei"));
        fontMapper.put("华文彩云", PhysicalFonts.get("STCaiyun"));
        fontMapper.put("方正姚体", PhysicalFonts.get("FZYaoti"));
        fontMapper.put("方正舒体", PhysicalFonts.get("FZShuTi"));
        fontMapper.put("华文细黑", PhysicalFonts.get("STXihei"));
        fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
        fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
        fontMapper.put("新細明體", PhysicalFonts.get("SimSun"));
        pkg.setFontMapper(fontMapper);
        // try-with-resources closes the output stream even if the conversion fails
        try (OutputStream out = new FileOutputStream(targetPath)) {
            Docx4J.toPDF(pkg, out);
        }
    } catch (Exception e) {
        // Pass the exception itself so the stack trace is logged
        log.error("[docx4j] Word-to-PDF conversion failed", e);
    }
}
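Which converter works can depend on how the source file was actually saved, and a file's extension does not always match its content. docx4j, for instance, works with OpenXML (.docx) packages. Below is a minimal sketch, using only the JDK, of checking a file's leading bytes before choosing a converter. The class `WordFormatSniffer` and its `Format` enum are hypothetical names, and the check only identifies the container (OLE2 or ZIP); it does not prove that the file is really a Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper: guesses the container format of a Word file from its leading bytes. */
final class WordFormatSniffer {

    enum Format { OLE2_DOC, ZIP_DOCX, UNKNOWN }

    // OLE2 compound file signature, used by legacy binary .doc (and also .xls, .ppt)
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature; .docx is a ZIP package (as is any other ZIP file)
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    static Format detect(byte[] header) {
        if (startsWith(header, OLE2)) return Format.OLE2_DOC;
        if (startsWith(header, ZIP)) return Format.ZIP_DOCX;
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static Format detect(Path file) throws IOException {
        byte[] header = new byte[OLE2.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) break; // file shorter than the signature
                total += n;
            }
        }
        return detect(Arrays.copyOf(header, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A caller could, for example, route `OLE2_DOC` files to a converter that handles the binary format and reserve docx4j for `ZIP_DOCX` files.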
3. Using documents4j
Dependencies
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-local</artifactId>
    <version>1.0.3</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-transformer-msoffice-word</artifactId>
    <version>1.0.3</version>
</dependency>
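Core code
/**
 * Converts a Word document to PDF with documents4j.
 * LocalConverter delegates to a locally installed Microsoft Word, so it needs
 * Windows with Word installed. Requires the documents4j imports, java.io.*,
 * and a logger named log.
 *
 * @param sourcePath source file path, e.g. /root/example.docx
 * @param targetPath target file path, e.g. /root/example.pdf
 */
public static void documents4jWordToPdf(String sourcePath, String targetPath) {
    File inputWord = new File(sourcePath);
    File outputFile = new File(targetPath);
    IConverter converter = null;
    // try-with-resources closes both streams even if the conversion fails
    try (InputStream docxInputStream = new FileInputStream(inputWord);
         OutputStream outputStream = new FileOutputStream(outputFile)) {
        converter = LocalConverter.builder().build();
        converter.convert(docxInputStream)
                .as(DocumentType.DOCX)
                .to(outputStream)
                .as(DocumentType.PDF)
                .execute();
    } catch (Exception e) {
        // Pass the exception itself so the stack trace is logged
        log.error("[documents4j] Word-to-PDF conversion failed", e);
    } finally {
        // Release the converter's resources
        if (converter != null) {
            converter.shutDown();
        }
    }
}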