Apache POI and Word: replacing text while preserving styles, and the problem of scrambled runs.

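I happened to run into this problem, and after a lot of searching I could not find a similar case or a method that worked. Here is how I handled it.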

First, add the Apache POI dependencies

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>4.1.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.1.2</version>
        </dependency>

  
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml-schemas</artifactId>
            <version>4.1.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-scratchpad</artifactId>
            <version>4.1.2</version>
        </dependency>    

<!-- poi-tl: a friendly tool that supports docx only.
        Open source; official docs: http://deepoove.com/poi-tl/1.10.x/
        -->
        <dependency>
            <groupId>com.deepoove</groupId>
            <artifactId>poi-tl</artifactId>
            <version>1.10.0</version>
        </dependency>

The packages above are the basics. The thing to watch is the versions, because usage can differ from one version to another. poi-tl 1.10.0 needs POI 4.1.2, as the poi-tl author has stated.

Here is the code that replaces text in the Word document

    XWPFDocument document = new XWPFDocument(in);

    List<XWPFParagraph> paragraphs = document.getParagraphs();

    Map<String, String> replacements = new HashMap<>();
    // Collect the placeholders to replace. I find them with a regex because I have
    // a lot of them, and I usually write placeholders in the ${} style.
    // poi-tl could do the replacement directly and conveniently, but this uses
    // plain Apache POI because I was worried poi-tl is not mature enough yet.
    List<String> replaceFields = retrieveReplaceFields(document, regex);
    for (String replaceField : replaceFields) {
        String factField = StringUtils.substringBetween(replaceField, prefix, suffix);
        String val = retrieveData(factField, data);
        replacements.put(replaceField, val);
    }

    // Ordinary body paragraphs
    replaceInParagraphs(replacements, paragraphs);

    // Tables: every cell holds its own list of paragraphs
    Iterator<XWPFTable> iterator = document.getTablesIterator();
    while (iterator.hasNext()) {
        XWPFTable table = iterator.next();
        for (XWPFTableRow row : table.getRows()) {
            for (XWPFTableCell cell : row.getTableCells()) {
                replaceInParagraphs(replacements, cell.getParagraphs());
            }
        }
    }

    // Serialize once, then copy the bytes to the file; try-with-resources closes the stream
    ByteArrayOutputStream output = new ByteArrayOutputStream();
    document.write(output);
    try (FileOutputStream fileOut = new FileOutputStream("C:\\Users\\dato\\Desktop\\dato.docx")) {
        output.writeTo(fileOut);
    }
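The helpers `retrieveReplaceFields` and `retrieveData` are not shown in this post. As a rough sketch of what that step might look like, the following finds every `${...}` placeholder in a piece of text and looks up its value. The class name `PlaceholderScan`, the method `buildReplacements`, and the empty-string fallback for unknown fields are all assumptions made for illustration, and the sketch works on a plain string rather than on an `XWPFDocument`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the author's code: scans text for ${...} placeholders.
class PlaceholderScan {
    // Matches ${name}; group 1 captures the field name between the delimiters.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Builds a map from each placeholder found in text to its value in data. */
    static Map<String, String> buildReplacements(String text, Map<String, String> data) {
        Map<String, String> replacements = new LinkedHashMap<>();
        Matcher m = PLACEHOLDER.matcher(text);
        while (m.find()) {
            // Assumption: a field missing from data is replaced by an empty string.
            replacements.put(m.group(), data.getOrDefault(m.group(1), ""));
        }
        return replacements;
    }

    public static void main(String[] args) {
        Map<String, String> data = Map.of("name", "Alice", "date", "2022-06-16");
        System.out.println(buildReplacements("Dear ${name}, signed on ${date}.", data));
    }
}
```

With a real document, the text to scan could come from each paragraph's `getText()`, which returns the paragraph text with all runs joined, so a placeholder is found even when Word has split it across runs.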

The most important part is the replacement method:

replaceInParagraphs(replacements,cellParagraphs);

    private static long replaceInParagraphs(Map<String, String> replacements, List<XWPFParagraph> xwpfParagraphs) {
        long count = 0;
        for (XWPFParagraph paragraph : xwpfParagraphs) {
            List<XWPFRun> runs = paragraph.getRuns();
            for (Map.Entry<String, String> replPair : replacements.entrySet()) {
                String find = replPair.getKey();
                String repl = replPair.getValue();
                TextSegment found = paragraph.searchText(find, new PositionInParagraph());
                if (found != null) {
                    count++;
                    if (found.getBeginRun() == found.getEndRun()) {
                        // The whole search string is in one run
                        XWPFRun run = runs.get(found.getBeginRun());
                        // Read text element 0. getTextPosition() is the vertical offset of the
                        // text (raised/lowered), not an index, so it is not used here.
                        // This assumes each run carries a single text element, the common case.
                        String runText = run.getText(0);
                        String replaced = runText.replace(find, repl);
                        run.setText(replaced, 0);
                    } else {
                        // The search string spans more than one run:
                        // put the strings together
                        StringBuilder b = new StringBuilder();
                        for (int runPos = found.getBeginRun(); runPos <= found.getEndRun(); runPos++) {
                            String text = runs.get(runPos).getText(0);
                            if (text != null) { // a run without text returns null
                                b.append(text);
                            }
                        }
                        String connectedRuns = b.toString();
                        String replaced = connectedRuns.replace(find, repl);
                        // The first run receives the replaced string of all connected runs,
                        // so the result keeps the first run's style
                        XWPFRun partOne = runs.get(found.getBeginRun());
                        partOne.setText(replaced, 0);
                        // Remove the text in the other runs
                        for (int runPos = found.getBeginRun() + 1; runPos <= found.getEndRun(); runPos++) {
                            XWPFRun partNext = runs.get(runPos);
                            partNext.setText("", 0);
                        }
                    }
                }
            }
        }
        return count;
    }
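To see why the merge step matters, it may help to model a paragraph as a plain list of strings, one per run. Word often splits a placeholder such as `${name}` across several runs, so no single run contains it. The sketch below imitates the approach above without POI: it locates the match in the joined text, writes the replaced text into the first affected run, and empties the rest. `RunMergeDemo` and `replaceAcrossRuns` are hypothetical names used only for this illustration, and `find` is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the run-merging idea; each String stands in for one run's text.
class RunMergeDemo {
    /**
     * Locates the first occurrence of find in the joined run text, puts the replaced
     * text into the first affected run, and empties the other affected runs.
     * Returns false when find does not occur.
     */
    static boolean replaceAcrossRuns(List<String> runs, String find, String repl) {
        StringBuilder all = new StringBuilder();
        for (String r : runs) {
            all.append(r);
        }
        int start = all.indexOf(find);
        if (start < 0) {
            return false;
        }
        int end = start + find.length() - 1; // inclusive index of the last matched char

        // Map the character range [start, end] back to run indexes.
        int beginRun = -1, endRun = -1, offset = 0;
        for (int i = 0; i < runs.size(); i++) {
            int len = runs.get(i).length();
            if (len > 0) {
                if (beginRun < 0 && start < offset + len) beginRun = i;
                if (endRun < 0 && end < offset + len) endRun = i;
            }
            offset += len;
        }

        // Join the affected runs, replace, and keep the result in the first one.
        StringBuilder joined = new StringBuilder();
        for (int i = beginRun; i <= endRun; i++) {
            joined.append(runs.get(i));
        }
        runs.set(beginRun, joined.toString().replace(find, repl));
        for (int i = beginRun + 1; i <= endRun; i++) {
            runs.set(i, "");
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> runs = new ArrayList<>(Arrays.asList("Hello ", "${na", "me}", "!"));
        replaceAcrossRuns(runs, "${name}", "Alice");
        System.out.println(runs);
    }
}
```

Because only text moves between runs, the runs themselves (and in POI, their formatting) stay in place. The trade-off is that the replaced text takes on the style of the first affected run.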
 

 

posted @ 2022-06-16 10:55  大头就是我