XML-SAX
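The two most commonly used XML parsing techniques are DOM and SAX. DOM (Document Object Model) parses the whole XML file into a data structure in memory, which can then be manipulated (create, read, update, delete). SAX (Simple API for XML) reads the XML document as a stream and handles nodes through event callbacks; it can only query the document and cannot update it.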
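To make the contrast concrete, here is a minimal sketch that uses only the JDK's standard JAXP entry points (`DocumentBuilderFactory` and `SAXParserFactory`). The class name `DomVsSax` and the sample `<books>` document are made up for illustration. The DOM half loads the tree and then modifies it; the SAX half can only observe elements as the events stream by.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSax {
    // Hypothetical sample document, not taken from the article.
    static final String XML = "<books><book id=\"1\"/><book id=\"2\"/></books>";

    private static ByteArrayInputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    /** DOM: the whole tree is in memory, so it can be queried and modified. */
    static int domCountAfterAppend() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(input());
            // Update the in-memory tree: append a third <book> element.
            doc.getDocumentElement().appendChild(doc.createElement("book"));
            return doc.getElementsByTagName("book").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX: elements are only seen as callbacks fire; no tree is kept. */
    static int saxCount() {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(input(), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    if ("book".equals(qName)) {
                        count[0]++;
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("DOM count after append: " + domCountAfterAppend());
        System.out.println("SAX count: " + saxCount());
    }
}
```

The DOM version can report three `<book>` elements because it added one to the tree; the SAX version sees only the two that are actually in the input.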
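DOM4J builds its DOM objects with the help of SAX. The DOM bundled with the JDK (com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl) exposes the W3C standard API, whereas DOM4J has its own API. Their internals also differ slightly: the JDK's DOM needs the whole XML read into memory before it is parsed, while DOM4J parses it in a streaming, event-driven fashion via SAX.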
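The idea of "building a tree from SAX events" can be sketched with the JDK alone. This is not DOM4J's code: DOM4J produces its own document type, while this hypothetical `SaxTreeBuilder` fills a W3C `org.w3c.dom.Document` so that no extra library is needed. It assumes a parser that is not namespace-aware and ignores comments and processing instructions.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTreeBuilder extends DefaultHandler {
    private final Document doc;
    // Stack of currently open nodes; the document itself is the bottom entry.
    private final Deque<Node> open = new ArrayDeque<>();

    SaxTreeBuilder(Document emptyDoc) {
        this.doc = emptyDoc;
        open.push(emptyDoc);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Create the element, copy its attributes, attach it to its parent.
        Element element = doc.createElement(qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            element.setAttribute(attrs.getQName(i), attrs.getValue(i));
        }
        open.peek().appendChild(element);
        open.push(element);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text always belongs to the innermost open element.
        open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    /** Parses the XML with SAX and returns the tree assembled from the events. */
    static Document build(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new SaxTreeBuilder(doc));
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = build("<a x=\"1\"><b>hi</b></a>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName() + " x=" + root.getAttribute("x")
                + " text=" + root.getTextContent());
    }
}
```

The stack mirrors the nesting of the document: each start event pushes a node, each end event pops it, so the parser never has to look back at earlier input.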
    <?xml version="1.0" encoding="UTF-8"?>
    <DocumentElement param="value">
        <FirstElement>
            ¶ Some Text
        </FirstElement>
        <?some_pi some_attr="some_value"?>
        <SecondElement param2="something">
            Pre-Text <Inline>Inlined text</Inline> Post-text.
        </SecondElement>
    </DocumentElement>
A SAX parser processes the XML above as the following sequence of events:
- XML Element start, named DocumentElement, with an attribute param equal to "value"
- XML Element start, named FirstElement
- XML Text node, with data equal to "¶ Some Text" (note: certain white spaces can be changed)
- XML Element end, named FirstElement
- Processing Instruction event, with the target some_pi and data some_attr="some_value" (the content after the target is just text; however, it is very common to imitate the syntax of XML attributes, as in this example)
- XML Element start, named SecondElement, with an attribute param2 equal to "something"
- XML Text node, with data equal to "Pre-Text"
- XML Element start, named Inline
- XML Text node, with data equal to "Inlined text"
- XML Element end, named Inline
- XML Text node, with data equal to "Post-text."
- XML Element end, named SecondElement
- XML Element end, named DocumentElement
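The event sequence listed above can be reproduced with a small handler. The sketch below is an illustration and not part of the original example: the class name `SaxEventTrace` and the event labels are invented, the pilcrow is written as the character reference `&#xb6;`, and whitespace-only text between tags is dropped so that the trace lines up with the list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {
    // The sample document shown earlier in this article.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
          + "<DocumentElement param=\"value\">\n"
          + "  <FirstElement>\n"
          + "    &#xb6; Some Text\n"
          + "  </FirstElement>\n"
          + "  <?some_pi some_attr=\"some_value\"?>\n"
          + "  <SecondElement param2=\"something\">\n"
          + "    Pre-Text <Inline>Inlined text</Inline> Post-text.\n"
          + "  </SecondElement>\n"
          + "</DocumentElement>\n";

    /** Returns one line per SAX event, in the order the parser reports them. */
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // characters() may deliver one text run in several chunks, so the
            // chunks are buffered and emitted at the next non-text event.
            private void flushText() {
                String t = text.toString().trim();
                if (!t.isEmpty()) {
                    events.add("text " + t);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                flushText();
                StringBuilder sb = new StringBuilder("start " + qName);
                for (int i = 0; i < attrs.getLength(); i++) {
                    sb.append(' ').append(attrs.getQName(i))
                      .append('=').append(attrs.getValue(i));
                }
                events.add(sb.toString());
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void processingInstruction(String target, String data) {
                flushText();
                events.add("pi " + target + " " + data);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace(XML).forEach(System.out::println);
    }
}
```

Without the whitespace filter, the parser would also report the line breaks and indentation between tags as text events, which is what the note about white space in the list hints at.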
Below, SAX is used to do a simple parse of a Spring configuration file.
Spring configuration file:
    <spring:beans xmlns:spring="http://www.springframework.org/schema/beans"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
        <spring:bean id="myBean" class="com.zyong.spring.beanfactory.MyBean"/>
    </spring:beans>
Test method:
    // Locate the configuration file on the classpath.
    String file = ClassLoader.getSystemResource("spring/spring-test.xml").getFile();
    // Note: this SAXParser is a JDK-internal Xerces class, not a public API.
    com.sun.org.apache.xerces.internal.parsers.SAXParser saxParser =
            new com.sun.org.apache.xerces.internal.parsers.SAXParser();
    saxParser.setContentHandler(new MyHandler());
    saxParser.parse(new InputSource(file));
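The parser class used in the test method lives in a `com.sun...internal` package, so it may not be accessible on newer JDKs. A more portable route is the standard `SAXParserFactory`. The sketch below is an assumption about how the same test could be set up, not the article's code: the class name `JaxpSaxDemo` is invented, the configuration is inlined as a string, and a tiny handler stands in for `MyHandler`. Note that JAXP factories are not namespace-aware by default, so `setNamespaceAware(true)` is needed for `localName` to be filled in.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class JaxpSaxDemo {
    // The Spring configuration from the article, inlined to keep the sketch self-contained.
    static final String XML =
            "<spring:beans xmlns:spring=\"http://www.springframework.org/schema/beans\"\n"
          + "    xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n"
          + "    xsi:schemaLocation=\"http://www.springframework.org/schema/beans "
          + "http://www.springframework.org/schema/beans/spring-beans.xsd\">\n"
          + "  <spring:bean id=\"myBean\" class=\"com.zyong.spring.beanfactory.MyBean\"/>\n"
          + "</spring:beans>\n";

    /** Returns "localName|qName" for every start-element event. */
    static List<String> startedElements(boolean namespaceAware) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Off by default in JAXP; without it localName is not populated.
            factory.setNamespaceAware(namespaceAware);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(XML)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    names.add(localName + "|" + qName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware:     " + startedElements(true));
        System.out.println("not namespace-aware: " + startedElements(false));
    }
}
```

To parse the file from the classpath instead of a string, `parser.parse(...)` also accepts a `java.io.File` or an `InputStream`.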
Event handler:
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    // Nested inside the test class; prints every SAX event it receives.
    private class MyHandler extends DefaultHandler {

        @Override
        public void startDocument() throws SAXException {
            System.out.println("start document -> parse begin");
        }

        @Override
        public void endDocument() throws SAXException {
            System.out.println("end document -> parse finished");
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) throws SAXException {
            System.out.println("start element-----------");
            System.out.println(" localName: " + localName);
            System.out.println(" qName: " + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            System.out.println("characters-----------");
            // Prints the array's identity (e.g. [C@4cc77c2e), not the text;
            // new String(ch, start, length) would give the actual characters.
            System.out.println(" ch: " + ch);
            System.out.println(" start: " + start);
            System.out.println(" length: " + length);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            System.out.println("end element-----------");
            System.out.println(" localName: " + localName);
            System.out.println(" qName: " + qName);
        }
    }
Output:
    start document -> parse begin
    start element-----------
     localName: beans
     qName: spring:beans
    characters-----------
     ch: [C@4cc77c2e
     start: 272
     length: 5
    start element-----------
     localName: bean
     qName: spring:bean
    end element-----------
     localName: bean
     qName: spring:bean
    characters-----------
     ch: [C@4cc77c2e
     start: 348
     length: 5
    start element-----------
     localName: bean
     qName: spring:bean
    end element-----------
     localName: bean
     qName: spring:bean
    characters-----------
     ch: [C@4cc77c2e
     start: 436
     length: 1
    end element-----------
     localName: beans
     qName: spring:beans
    end document -> parse finished
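The `ch: [C@4cc77c2e` lines show up because concatenating a `char[]` with a string prints the array's identity rather than its contents. The same value on every event suggests that `ch` is a buffer the parser reuses, and only the slice from `start` to `start + length` belongs to the current event. In this file those slices are presumably just the line breaks and indentation between tags. A minimal sketch of reading the text properly follows; the class name `TextCollector` and the `<person>` sample are hypothetical, and the approach only handles elements that contain plain text, not mixed content.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final Map<String, String> textByElement = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // A new element starts: discard whitespace collected between tags.
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Copy only this event's slice; the rest of ch is not ours to read.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            textByElement.put(qName, text);
        }
        buffer.setLength(0);
    }

    /** Parses the XML and returns the trimmed text of each text-only element. */
    static Map<String, String> collect(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.textByElement;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not from the article.
        System.out.println(collect("<person><name>Ada</name><city> London </city></person>"));
    }
}
```

Buffering until `endElement` matters because a parser is allowed to split one run of text across several `characters` calls.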
References: http://blog.csdn.net/u011179993/article/details/47415603
http://www.cnblogs.com/mengdd/archive/2013/06/02/3114177.html