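How to Handle CDATA Gracefully with JAXB
Preface
JAXB is a handy tool for mapping between XML and Java objects, but it has a few pitfalls when it comes to CDATA blocks. Drawing on material found online, this post presents one way to produce CDATA output cleanly.
The conventional approach
The approach most commonly found online combines XmlAdapter with CharacterEscapeHandler (a Sun-internal API).
First define a CDataAdapter class to convert the field value.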
    public class CDataAdapter extends XmlAdapter<String, String> {

        @Override
        public String unmarshal(String v) throws Exception {
            return v;
        }

        @Override
        public String marshal(String v) throws Exception {
            // Wrap the value in CDATA markers; the marshaller still sees plain text
            return "<![CDATA[" + v + "]]>";
        }
    }
It is applied to a field through the XmlJavaTypeAdapter annotation, as in the following class:
    @XmlRootElement(name = "root")
    public static class TNode {

        @XmlJavaTypeAdapter(value = CDataAdapter.class)
        @XmlElement(name = "text", required = true)
        private String text;
    }
When a Marshaller converts the object to XML text, however, the result is:
    <root>
        <text>&lt;![CDATA[李雷爱韩梅梅]]&gt;</text>
    </root>
That is not what we expected. What we actually want is:
    <root>
        <text><![CDATA[李雷爱韩梅梅]]></text>
    </root>
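The root cause is that JAXB escapes the characters '<' and '>' by default, so the CDATA marker added by the adapter is emitted as ordinary escaped text. This is where CharacterEscapeHandler comes in: the handler below writes the characters through unchanged.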
    import com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler;

    marshaller.setProperty(
        "com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler",
        new CharacterEscapeHandler() {
            @Override
            public void escape(char[] ch, int start, int length,
                    boolean isAttVal, Writer writer) throws IOException {
                // Write the characters through without any escaping
                writer.write(ch, start, length);
            }
        }
    );
Testing shows that this solves the problem. A somewhat awkward issue follows, though: compiling and packaging with Maven fails with the following error:
    [ERROR] Compilation failure
    [ERROR] package com.sun.xml.internal.bind.marshaller does not exist
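In Java projects it is generally not recommended to call internal APIs (those starting with com.sun) directly.
Improved approach:
Many blog posts follow roughly the same idea: customize the behavior of XMLStreamWriter. More precisely, override the writeCharacters method so that text wrapped in the CDATA marker (<![CDATA[...]]>) is passed to writeCData instead. The following code illustrates the idea: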
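    // Demo only: the class is abstract because the remaining XMLStreamWriter methods are omitted
    public abstract class CDataXMLStreamWriter implements XMLStreamWriter {

        // The underlying writer that produces the actual output
        private final XMLStreamWriter delegate;

        protected CDataXMLStreamWriter(XMLStreamWriter delegate) {
            this.delegate = delegate;
        }

        // *) Override writeCharacters: text wrapped in the CDATA marker goes to writeCData instead
        @Override
        public void writeCharacters(String text) throws XMLStreamException {
            if (text.startsWith("<![CDATA[") && text.endsWith("]]>")) {
                delegate.writeCData(text.substring(9, text.length() - 3));
            } else {
                // Delegate here; calling this.writeCharacters(text) would recurse forever
                delegate.writeCharacters(text);
            }
        }
    }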
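To make the difference between the two methods concrete, here is a minimal sketch that uses only the JDK's StAX API, without JAXB. The class name StaxCDataDemo and the helper render are made up for this illustration. It writes the same element once through writeCharacters and once through writeCData, assuming the default XMLOutputFactory implementation that ships with the JDK.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of the original solution
public class StaxCDataDemo {

    // Writes <text>...</text>, choosing between writeCData and writeCharacters
    static String render(String content, boolean asCData) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartElement("text");
        if (asCData) {
            // Emits a real CDATA section around the content
            w.writeCData(content);
        } else {
            // Emits character data; markup characters are escaped by the writer
            w.writeCharacters(content);
        }
        w.writeEndElement();
        w.flush();
        w.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // The marker is treated as text, so its '<' is escaped
        System.out.println(render("<![CDATA[李雷爱韩梅梅]]>", false));
        // The content is wrapped in a genuine CDATA section
        System.out.println(render("李雷爱韩梅梅", true));
    }
}
```

The first call shows why a marker added by an adapter cannot survive writeCharacters, and the second shows what the interception is meant to produce.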
In practice, rather than implementing the whole XMLStreamWriter interface, we use the proxy pattern, here in the form of a dynamic proxy.
    private static class CDataHandler implements InvocationHandler {

        // *) Only writeCharacters(String) is intercepted
        private static final Method gWriteCharactersMethod;

        static {
            try {
                gWriteCharactersMethod = XMLStreamWriter.class
                        .getDeclaredMethod("writeCharacters", String.class);
            } catch (NoSuchMethodException e) {
                // Fail fast instead of leaving the field null
                throw new ExceptionInInitializerError(e);
            }
        }

        private final XMLStreamWriter writer;

        public CDataHandler(XMLStreamWriter writer) {
            this.writer = writer;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (gWriteCharactersMethod.equals(method)) {
                String text = (String) args[0];
                // *) Text wrapped in the CDATA marker goes to writeCData instead
                if (text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>")) {
                    writer.writeCData(text.substring(9, text.length() - 3));
                    return null;
                }
            }
            try {
                return method.invoke(writer, args);
            } catch (InvocationTargetException e) {
                // Rethrow the writer's own exception (e.g. XMLStreamException)
                throw e.getCause();
            }
        }
    }
The Marshaller code looks like this:
    public static <T> String mapToXmlWithCData(T obj) {
        try {
            StringWriter writer = new StringWriter();
            XMLStreamWriter streamWriter = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(writer);

            // *) Use a dynamic proxy to adjust the behavior of streamWriter
            XMLStreamWriter cdataStreamWriter = (XMLStreamWriter) Proxy.newProxyInstance(
                    streamWriter.getClass().getClassLoader(),
                    streamWriter.getClass().getInterfaces(),
                    new CDataHandler(streamWriter)
            );

            JAXBContext jc = JAXBContext.newInstance(obj.getClass());
            Marshaller marshaller = jc.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
            marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
            marshaller.marshal(obj, cdataStreamWriter);

            return writer.toString();
        } catch (JAXBException e) {
            e.printStackTrace();
        } catch (XMLStreamException e) {
            e.printStackTrace();
        }
        return null;
    }
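The proxy can also be tried in isolation, without JAXB on the classpath. The sketch below is a simplified variant, not the code above: the names CDataProxyDemo, wrap, and render are invented for the example, the method is matched by name and argument type rather than by a cached Method object, and the proxy is created against XMLStreamWriter.class directly instead of the writer's own interfaces.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class showing the interception on its own
public class CDataProxyDemo {

    static final String OPEN = "<![CDATA[";
    static final String CLOSE = "]]>";

    // Returns a proxy that redirects marker-wrapped text to writeCData
    static XMLStreamWriter wrap(final XMLStreamWriter target) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                boolean isStringWrite = "writeCharacters".equals(method.getName())
                        && args != null && args.length == 1 && args[0] instanceof String;
                if (isStringWrite) {
                    String text = (String) args[0];
                    if (text.startsWith(OPEN) && text.endsWith(CLOSE)) {
                        // Strip the marker and emit a real CDATA section
                        target.writeCData(
                                text.substring(OPEN.length(), text.length() - CLOSE.length()));
                        return null;
                    }
                }
                try {
                    // Every other call is forwarded unchanged
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause();
                }
            }
        };
        return (XMLStreamWriter) Proxy.newProxyInstance(
                CDataProxyDemo.class.getClassLoader(),
                new Class<?>[] { XMLStreamWriter.class },
                handler);
    }

    // Writes <text>...</text> through the proxied writer
    static String render(String text) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = wrap(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
        w.writeStartElement("text");
        w.writeCharacters(text);
        w.writeEndElement();
        w.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(render("<![CDATA[李雷爱韩梅梅]]>"));
        System.out.println(render("plain text"));
    }
}
```

Only text that both starts and ends with the marker is redirected; anything else takes the normal writeCharacters path and is escaped as usual.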
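Testing shows that this solves the CDATA problem (it works and avoids the Sun-internal API). One small flaw remains: indentation. This code cannot control the indentation of the output.
Indentation improvement
This relies on the Transformer class. The idea is to format the final XML text after it has been generated.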
    // *) Re-format the XML text with indentation
    public static String indentFormat(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
            StringWriter formattedStringWriter = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(formattedStringWriter));
            return formattedStringWriter.toString();
        } catch (TransformerException e) {
            // Report the failure instead of swallowing it silently
            e.printStackTrace();
        }
        return null;
    }
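As a quick way to try the formatting step on its own, here is a small sketch that feeds a compact XML string through an identity Transformer. The class name IndentDemo is made up for the example. The indent-amount key is an implementation-specific output property that the JDK's built-in transformer understands; another TransformerFactory implementation may not honor it, and the exact layout around CDATA sections may differ between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class for the indentation step
public class IndentDemo {

    // Runs the XML through an identity transform with indentation switched on
    static String indent(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific property controlling the indent width
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String compact = "<root><key>key</key><text><![CDATA[hello]]></text></root>";
        System.out.println(indent(compact));
    }
}
```

Because the text is parsed and serialized a second time, this step costs an extra pass over the document, which is usually acceptable for small payloads.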
Complete solution
Here is all of the code above in one place:
    import java.io.StringReader;
    import java.io.StringWriter;
    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.JAXBException;
    import javax.xml.bind.Marshaller;
    import javax.xml.bind.annotation.adapters.XmlAdapter;
    import javax.xml.stream.XMLOutputFactory;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamWriter;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    // Note: the members below are meant to live inside one enclosing class.

    // *) XmlAdapter applied to a field so the CDATA marker is added automatically
    public static class CDataAdapter extends XmlAdapter<String, String> {

        @Override
        public String unmarshal(String v) throws Exception {
            return v;
        }

        @Override
        public String marshal(String v) throws Exception {
            return "<![CDATA[" + v + "]]>";
        }
    }

    // *) Dynamic proxy handler
    private static class CDataHandler implements InvocationHandler {

        private static final Method gWriteCharactersMethod;

        static {
            try {
                gWriteCharactersMethod = XMLStreamWriter.class
                        .getDeclaredMethod("writeCharacters", String.class);
            } catch (NoSuchMethodException e) {
                // Fail fast instead of leaving the field null
                throw new ExceptionInInitializerError(e);
            }
        }

        private final XMLStreamWriter writer;

        public CDataHandler(XMLStreamWriter writer) {
            this.writer = writer;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (gWriteCharactersMethod.equals(method)) {
                String text = (String) args[0];
                if (text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>")) {
                    writer.writeCData(text.substring(9, text.length() - 3));
                    return null;
                }
            }
            try {
                return method.invoke(writer, args);
            } catch (InvocationTargetException e) {
                // Rethrow the writer's own exception (e.g. XMLStreamException)
                throw e.getCause();
            }
        }
    }

    // *) Generate the XML
    public static <T> String mapToXmlWithCData(T obj, boolean formatted) {
        try {
            StringWriter writer = new StringWriter();
            XMLStreamWriter streamWriter = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(writer);

            // *) Use a dynamic proxy to adjust the behavior of streamWriter
            XMLStreamWriter cdataStreamWriter = (XMLStreamWriter) Proxy.newProxyInstance(
                    streamWriter.getClass().getClassLoader(),
                    streamWriter.getClass().getInterfaces(),
                    new CDataHandler(streamWriter)
            );

            JAXBContext jc = JAXBContext.newInstance(obj.getClass());
            Marshaller marshaller = jc.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
            marshaller.marshal(obj, cdataStreamWriter);

            // *) Indent only when requested
            if (formatted) {
                return indentFormat(writer.toString());
            } else {
                return writer.toString();
            }
        } catch (JAXBException e) {
            e.printStackTrace();
        } catch (XMLStreamException e) {
            e.printStackTrace();
        }
        return null;
    }

    // *) Indent the XML text
    public static String indentFormat(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer();
            // *) Turn indentation on
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // *) Omit the XML declaration here; it is prepended manually below
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
            StringWriter formattedStringWriter = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(formattedStringWriter));
            return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                    + formattedStringWriter.toString();
        } catch (TransformerException e) {
            // Report the failure instead of swallowing it silently
            e.printStackTrace();
        }
        return null;
    }
A concrete test case:
    // @NoArgsConstructor and @AllArgsConstructor come from Lombok
    @NoArgsConstructor
    @AllArgsConstructor
    @XmlRootElement(name = "root")
    public static class TNode {

        @XmlElement(name = "key", required = true)
        private String key;

        @XmlJavaTypeAdapter(value = CDataAdapter.class)
        @XmlElement(name = "text", required = true)
        private String text;
    }

    public static void main(String[] args) {
        TNode node = new TNode("key", "李雷爱韩梅梅");
        String xml = mapToXmlWithCData(node, true);
        System.out.println(xml);
    }
The test output is:
    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <key>key</key>
        <text><![CDATA[李雷爱韩梅梅]]></text>
    </root>
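Summary
Overall, the improved approach avoids the compilation restriction on the Sun-internal API while still meeting the original functional requirements, which deserves a small pat on the back ^_^.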
posted on 2018-06-01 15:16 by mumuxinfei