How to handle CDATA gracefully with JAXB

 

Preface

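  JAXB is a powerful tool for mapping between XML and Java objects, but it has a few pitfalls when it comes to CDATA blocks. Drawing on material found online, this article presents one way to produce CDATA output cleanly.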

The conventional approach

  The approach most often found online combines XmlAdapter with CharacterEscapeHandler (an internal Sun API).
  First, define a CDataAdapter class that converts the field value.

public class CDataAdapter extends XmlAdapter<String, String> {
 
    @Override
    public String unmarshal(String v) throws Exception {
        return v;
    }
 
    @Override
    public String marshal(String v) throws Exception {
        // Pass null through so a missing value is not emitted as the text "null"
        return v == null ? null : "<![CDATA[" + v + "]]>";
    }
 
}

  It is applied to a field through the XmlJavaTypeAdapter annotation, as in the following class:

@XmlRootElement(name="root")
public static class TNode {
         
     @XmlJavaTypeAdapter(value=CDataAdapter.class)
     @XmlElement(name="text", required = true)
     private String text;
         
}

  When a Marshaller converts the object to XML text, however, the result is:

<root>
    <text>&lt;![CDATA[李雷爱韩梅梅]]&gt;</text>
</root>

  That is not what we want. The expected output is:

<root>
    <text><![CDATA[李雷爱韩梅梅]]></text>
</root>

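  The root cause is that JAXB escapes the characters '<' and '>' by default. This is where CharacterEscapeHandler comes in: the handler below writes every character through unchanged, which turns escaping off for all text, not only for the CDATA fields.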

import com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler;
 
marshaller.setProperty(
    "com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler",
    new CharacterEscapeHandler() {
        @Override
        public void escape(char[] ch, int start, int length, boolean isAttVal, Writer writer)
                throws IOException {
            writer.write(ch, start, length);
        }
    }
);

  In testing, this solves the problem. It brings a slightly awkward issue of its own, though: compiling and packaging with Maven fails with the following error:

[ERROR] Compilation failure
[ERROR] package com.sun.xml.internal.bind.marshaller does not exist

  In Java projects, calling internal APIs directly (those whose packages start with com.sun) is generally discouraged.

An improved approach

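  The blog posts I consulted all follow roughly the same idea: wrap XMLStreamWriter and change its behavior. More precisely, override the writeCharacters method so that text enclosed in CDATA markers (<![CDATA[]]>) is passed to writeCData instead. The following code illustrates the idea: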

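public abstract class CDataXMLStreamWriter implements XMLStreamWriter {

    // The real writer that every call is ultimately forwarded to
    private final XMLStreamWriter delegate;

    protected CDataXMLStreamWriter(XMLStreamWriter delegate) {
        this.delegate = delegate;
    }

    // *) Override writeCharacters: text wrapped in CDATA markers goes to writeCData instead
    @Override
    public void writeCharacters(String text) throws XMLStreamException {
        if ( text.startsWith("<![CDATA[") && text.endsWith("]]>") ) {
            delegate.writeCData(text.substring(9, text.length() - 3));
        } else {
            // Forward to the delegate; calling writeCharacters on this object would recurse forever
            delegate.writeCharacters(text);
        }
    }
    // *) For illustration only; the remaining XMLStreamWriter methods are omitted
}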
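
  To see why routing the marked text to writeCData matters, here is a minimal sketch that uses only the JDK's StAX classes, with no JAXB involved. The class name CDataWriteDemo and its write method are made up for illustration. It writes the same kind of element twice: once through writeCharacters, where the markers are escaped like any other text, and once through writeCData.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of the original article's code.
public class CDataWriteDemo {

    // Writes <text>...</text>, either as plain character data or as a CDATA section.
    public static String write(String content, boolean asCData) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("text");
            if (asCData) {
                w.writeCData(content);       // emitted inside <![CDATA[ ... ]]>, not escaped
            } else {
                w.writeCharacters(content);  // markup characters such as '<' are escaped
            }
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Markers passed as ordinary text: they come out escaped.
        System.out.println(write("<![CDATA[a < b]]>", false));
        // Content passed to writeCData: a real CDATA section is produced.
        System.out.println(write("a < b", true));
    }
}
```

  Running main should print the first element with escaped markers and the second with a real CDATA section. Exactly which characters writeCharacters escapes depends on the StAX implementation in use.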

  In practice you would not implement the whole XMLStreamWriter interface; the proxy pattern is a better fit. A dynamic proxy is used here.

private static class CDataHandler implements InvocationHandler {
    // *) Intercept only the writeCharacters(String) method
    private static Method gWriteCharactersMethod = null;
    static {
        try {
            gWriteCharactersMethod = XMLStreamWriter.class
                    .getDeclaredMethod("writeCharacters", String.class);
        } catch (NoSuchMethodException e) {
            e.printStackTrace();
        }
    }
 
    private XMLStreamWriter writer;
 
    public CDataHandler(XMLStreamWriter writer) {
        this.writer = writer;
    }
 
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ( gWriteCharactersMethod.equals(method) ) {
            String text = (String)args[0];
            // *) When the text carries CDATA markers, call writeCData instead
            if ( text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>") ) {
                writer.writeCData(text.substring(9, text.length() - 3));
                return null;
            }
        }
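        try {
            return method.invoke(writer, args);
        } catch (java.lang.reflect.InvocationTargetException e) {
            // Rethrow the writer's own exception instead of the reflection wrapper
            throw e.getCause();
        }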
    }
 
}
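
  The handler can also be tried out without JAXB. The sketch below is a hypothetical, self-contained variant (CDataProxyDemo, wrap and render are illustrative names, not part of the original code). It proxies only the XMLStreamWriter interface, matches the method by name rather than through a cached Method object, and unwraps InvocationTargetException so that the writer's own exception reaches the caller. The render method stands in for the marshaller by calling writeCharacters directly.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of the original article's code.
public class CDataProxyDemo {

    static final String OPEN = "<![CDATA[";
    static final String CLOSE = "]]>";

    // Wraps a writer so that writeCharacters(String) on marked text becomes writeCData.
    public static XMLStreamWriter wrap(XMLStreamWriter target) {
        return (XMLStreamWriter) Proxy.newProxyInstance(
                CDataProxyDemo.class.getClassLoader(),
                new Class<?>[] { XMLStreamWriter.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("writeCharacters")
                            && args.length == 1 && args[0] instanceof String) {
                        String text = (String) args[0];
                        if (text.startsWith(OPEN) && text.endsWith(CLOSE)) {
                            target.writeCData(text.substring(
                                    OPEN.length(), text.length() - CLOSE.length()));
                            return null;
                        }
                    }
                    try {
                        // Every other call is forwarded to the real writer unchanged.
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // surface the writer's own exception
                    }
                });
    }

    // Simulates the marshaller: each value is written with a plain writeCharacters call.
    public static String render(String key, String text) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = wrap(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            w.writeStartElement("root");
            w.writeStartElement("key");
            w.writeCharacters(key);                 // ordinary text, still escaped
            w.writeEndElement();
            w.writeStartElement("text");
            w.writeCharacters(OPEN + text + CLOSE); // what CDataAdapter.marshal would return
            w.writeEndElement();
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("k<1>", "a < b"));
    }
}
```

  Passing XMLStreamWriter.class explicitly, instead of the implementation's getClass().getInterfaces(), keeps the proxy independent of whatever extra interfaces a particular StAX implementation happens to implement.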

  The Marshaller code looks like this:

public static <T> String mapToXmlWithCData(T obj) {
 
    try {
 
        StringWriter writer = new StringWriter();
        XMLStreamWriter streamWriter = XMLOutputFactory.newInstance()
                .createXMLStreamWriter(writer);
        // *) Wrap streamWriter in a dynamic proxy to adjust its behavior
        XMLStreamWriter cdataStreamWriter = (XMLStreamWriter) Proxy.newProxyInstance(
                streamWriter.getClass().getClassLoader(),
                streamWriter.getClass().getInterfaces(),
                new CDataHandler(streamWriter)
        );
 
        JAXBContext jc = JAXBContext.newInstance(obj.getClass());
        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
        marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
 
        marshaller.marshal(obj, cdataStreamWriter);
        return writer.toString();
 
    } catch (JAXBException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
    return null;
 
}

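  In testing, this solves the CDATA problem (it works, and it avoids the Sun API). One small flaw remains: indentation. The JAXB_FORMATTED_OUTPUT setting does not take effect when marshalling to an XMLStreamWriter, so this code has no control over the layout of the output.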

Fixing the indentation

  This is done with the Transformer class; the idea is to reformat the final XML text.

// *) Re-indent the XML text
public static String indentFormat(String xml) {
    try {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
 
        StringWriter formattedStringWriter = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(formattedStringWriter));
        return formattedStringWriter.toString();
    } catch (TransformerException e) {
        // Report the failure instead of swallowing it silently
        e.printStackTrace();
    }
    return null;
}

  

The complete solution

  Here is all of the code above in one place:

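import java.io.StringReader;
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.adapters.XmlAdapter;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;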
 
// *) XmlAdapter applied to a field so that CDATA markers are added automatically
public static class CDataAdapter extends XmlAdapter<String, String> {
    @Override
    public String unmarshal(String v) throws Exception {
        return v;
    }
 
    @Override
    public String marshal(String v) throws Exception {
        // Pass null through so a missing value is not emitted as the text "null"
        return v == null ? null : "<![CDATA[" + v + "]]>";
    }
}
 
// *) Dynamic proxy handler
private static class CDataHandler implements InvocationHandler {
 
    private static Method gWriteCharactersMethod = null;
    static {
        try {
            gWriteCharactersMethod = XMLStreamWriter.class
                    .getDeclaredMethod("writeCharacters", String.class);
        } catch (NoSuchMethodException e) {
            e.printStackTrace();
        }
    }
 
    private XMLStreamWriter writer;
 
    public CDataHandler(XMLStreamWriter writer) {
        this.writer = writer;
    }
 
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ( gWriteCharactersMethod.equals(method) ) {
            String text = (String)args[0];
            if ( text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>") ) {
                writer.writeCData(text.substring(9, text.length() - 3));
                return null;
            }
        }
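        try {
            return method.invoke(writer, args);
        } catch (java.lang.reflect.InvocationTargetException e) {
            // Rethrow the writer's own exception instead of the reflection wrapper
            throw e.getCause();
        }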
    }
 
}
 
// *) Generate the XML
public static <T> String mapToXmlWithCData(T obj, boolean formatted) {
 
    try {
 
        StringWriter writer = new StringWriter();
        XMLStreamWriter streamWriter = XMLOutputFactory.newInstance()
                .createXMLStreamWriter(writer);
        // *) Wrap streamWriter in a dynamic proxy to adjust its behavior
        XMLStreamWriter cdataStreamWriter = (XMLStreamWriter) Proxy.newProxyInstance(
                streamWriter.getClass().getClassLoader(),
                streamWriter.getClass().getInterfaces(),
                new CDataHandler(streamWriter)
        );
 
        JAXBContext jc = JAXBContext.newInstance(obj.getClass());
        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
 
        marshaller.marshal(obj, cdataStreamWriter);
        // *) Apply indentation if requested
        if ( formatted ) {
            return indentFormat(writer.toString());
        } else {
            return writer.toString();
        }
 
    } catch (JAXBException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
    return null;
 
}
 
// *) Indent the XML text
public static String indentFormat(String xml) {
    try {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer();
        // *) Turn indentation on
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // *) Omit the XML declaration (it is prepended manually below)
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
 
        StringWriter formattedStringWriter = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(formattedStringWriter));
 
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + formattedStringWriter.toString();
    } catch (TransformerException e) {
        // Report the failure instead of swallowing it silently
        e.printStackTrace();
    }
    return null;
}

  A concrete test case:

@NoArgsConstructor
@AllArgsConstructor
@XmlRootElement(name="root")
public static class TNode {
    @XmlElement(name="key", required = true)
    private String key;
 
    @XmlJavaTypeAdapter(value=CDataAdapter.class)
    @XmlElement(name="text", required = true)
    private String text;
}
 
public static void main(String[] args) {
    TNode node = new TNode("key", "李雷爱韩梅梅");
    String xml = mapToXmlWithCData(node, true);
    System.out.println(xml);
}

  The test output is:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <key>key</key>
    <text><![CDATA[李雷爱韩梅梅]]></text>
</root>

 

Summary

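  Overall, the improved approach sidesteps the compile-time restriction on the Sun API while still delivering the required functionality. Worth a small pat on the back, ^_^.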

 

Posted by mumuxinfei
