Steps:
Import the jar
Get the Document object
Get the matching tag as an Element object
Get the data you need
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>
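import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.File;
import java.io.IOException;

public class JsoupDemo1 {
    public static void main(String[] args) throws IOException {
        // Locate student.xml on the classpath
        String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
        // Parse the file into a Document (the in-memory DOM tree)
        Document document = Jsoup.parse(new File(path), "utf-8");
        // Collect every <name> element
        Elements name = document.getElementsByTag("name");
        System.out.println(name.size());
        // Take the first <name> element and print its text
        Element element = name.get(0);
        String text = element.text();
        System.out.println(text);
    }
}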
<?xml version="1.0" encoding="UTF-8" ?>
<students>
    <student number="heima_0001">
        <name>zhangsan</name>
        <age>11</age>
        <sex>male</sex>
    </student>
    <student number="heima_0002">
        <name>wangwi</name>
        <age>14</age>
        <sex>female</sex>
    </student>
</students>
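The four steps can also be sketched end to end against the student data shown above. This is only an illustration under a few assumptions: the class name StudentList and the helper describe are hypothetical, the XML is inlined as a string so no file is needed, and Parser.xmlParser() is swapped in for the default HTML parsing that the demo code relies on.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
public class StudentList {
    // Inline copy of the student data so the sketch runs without a file on the classpath.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
            + "<students>"
            + "<student number=\"heima_0001\"><name>zhangsan</name><age>11</age><sex>male</sex></student>"
            + "<student number=\"heima_0002\"><name>wangwi</name><age>14</age><sex>female</sex></student>"
            + "</students>";

    // Returns one "number: name" line per <student> element.
    static List<String> describe(String xml) {
        // Step 2: get the Document (assumption: parsed with jsoup's XML parser, empty base URI).
        Document document = Jsoup.parse(xml, "", Parser.xmlParser());
        List<String> lines = new ArrayList<>();
        // Step 3: get the Element objects for the <student> tags.
        for (Element student : document.getElementsByTag("student")) {
            // Step 4: read the data, here an attribute value and the text of a child element.
            String number = student.attr("number");
            String name = student.getElementsByTag("name").first().text();
            lines.add(number + ": " + name);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(XML).forEach(System.out::println);
    }
}
```

Reading the number attribute with attr and the name with text covers the two most common kinds of data in such a file: attribute values and element content.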
xml_parsing_jsoup_jsoup objects
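Jsoup: utility class that parses an HTML or XML document and returns a Document
    parse: parses an HTML or XML document and returns a Document
        parse(File in, String charsetName): parses an XML or HTML file
        parse(String html): parses an XML or HTML string
        parse(URL url, int timeoutMillis): fetches the HTML or XML document at a network URL and returns its Document object
Document: the document object; represents the DOM tree in memory
Elements: a collection of Element objects; can be used like an ArrayList<Element>
Element: an element object
Node: a node object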
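How these types relate can be sketched in a few lines. The example below is a rough illustration, not part of the original notes: the class name JsoupTypesSketch, its helper methods, and the inline XML string are hypothetical, and Parser.xmlParser() is an assumption chosen so the string is treated as XML. It relies on two facts about jsoup: Elements extends ArrayList<Element>, and Element (like Document) is a subclass of Node.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
public class JsoupTypesSketch {
    // Shortened inline version of the student data.
    static final String XML =
            "<students>"
            + "<student number=\"heima_0001\"><name>zhangsan</name></student>"
            + "<student number=\"heima_0002\"><name>wangwi</name></student>"
            + "</students>";

    static Document parse() {
        return Jsoup.parse(XML, "", Parser.xmlParser());
    }

    // Treats Elements as a plain List<Element>: size() and get(int) come from ArrayList.
    static List<String> nameTexts(Document document) {
        Elements names = document.getElementsByTag("name");
        List<Element> asList = names; // compiles because Elements extends ArrayList<Element>
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < asList.size(); i++) {
            texts.add(asList.get(i).text());
        }
        return texts;
    }

    // Treats an Element as a Node: nodeName() is declared on Node.
    static String firstStudentNodeName(Document document) {
        Node node = document.getElementsByTag("student").get(0);
        return node.nodeName();
    }

    public static void main(String[] args) {
        Document document = parse();
        System.out.println(nameTexts(document));
        System.out.println(firstStudentNodeName(document));
    }
}
```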
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.File;
import java.io.IOException;
import java.net.URL;

public class JsoupDemo2 {
    public static void main(String[] args) throws IOException {
        // Locate student.xml on the classpath (used only by the commented-out file example)
        String path = JsoupDemo2.class.getClassLoader().getResource("student.xml").getPath();
        // Parse an XML or HTML file
        // Document document = Jsoup.parse(new File(path), "utf-8");
        // System.out.println(document);

        // Parse an XML or HTML string
        /*String str = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n" +
                "<students>\n" +
                "    <student number=\"heima_0001\">\n" +
                "        <name>zhangsan</name>\n" +
                "        <age>11</age>\n" +
                "        <sex>male</sex>\n" +
                "    </student>\n" +
                "    <student number=\"heima_0002\">\n" +
                "        <name>wangwi</name>\n" +
                "        <age>14</age>\n" +
                "        <sex>female</sex>\n" +
                "    </student>\n" +
                "</students>";
        Document parse = Jsoup.parse(str);
        System.out.println(parse);*/

        // Fetch the specified HTML or XML document from a network URL as a Document object
        URL url = new URL("https://baike.baidu.com/item/jsoup/9012509?fr=aladdin");
        Document parse = Jsoup.parse(url, 10000); // timeout in milliseconds
        System.out.println(parse);
    }
}