Steps:

   Import the jar

   Get the Document object

   Get the Element object for the target tag

   Get the data from that element

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>
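import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.File;
import java.io.IOException;

public class JsoupDemo1 {
    public static void main(String[] args) throws IOException {
        // Locate student.xml on the classpath and parse it into a Document
        String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
        Document document = Jsoup.parse(new File(path), "utf-8");
        // Collect every <name> element and print how many were found
        Elements name = document.getElementsByTag("name");
        System.out.println(name.size());
        // Print the text of the first <name> element
        Element element = name.get(0);
        String text = element.text();
        System.out.println(text);
    }
}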
<?xml version="1.0" encoding="UTF-8" ?>
<students>
    <student number="heima_0001">
        <name>zhangsan</name>
        <age>11</age>
        <sex>male</sex>
    </student>
    <student number="heima_0002">
        <name>wangwi</name>
        <age>14</age>
        <sex>female</sex>
    </student>
</students>

XML parsing with jsoup: the jsoup objects

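Jsoup: a utility class that parses HTML or XML documents and returns a Document

  parse: parses an HTML or XML document and returns a Document

  parse(File in, String charsetName): parses an XML or HTML file

  parse(String html): parses an XML or HTML string

  parse(URL url, int timeoutMillis): fetches the HTML or XML document at a network URL and returns its Document object

Document: the document object. Represents the DOM tree in memory

Elements: a collection of Element objects. Can be used like an ArrayList<Element>

Element: an element object

Node: a node object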
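
Since Elements can be treated like an ArrayList<Element>, a for-each loop over it works directly. The following is only a minimal sketch: the class name ElementsDemo, the method describeStudents and the shortened SAMPLE string are made up for illustration, and the string is parsed with jsoup's XML parser rather than the default HTML parser.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class ElementsDemo {
    // Hypothetical sample data, shortened from student.xml
    static final String SAMPLE =
            "<students>"
          + "<student number=\"heima_0001\"><name>zhangsan</name></student>"
          + "<student number=\"heima_0002\"><name>wangwi</name></student>"
          + "</students>";

    // Builds one "number:name" string per <student> element
    static List<String> describeStudents(String xml) {
        Document document = Jsoup.parse(xml, "", Parser.xmlParser());
        Elements students = document.getElementsByTag("student");
        List<String> result = new ArrayList<>();
        // Elements is iterable just like a List<Element>
        for (Element student : students) {
            String number = student.attr("number");
            String name = student.getElementsByTag("name").text();
            result.add(number + ":" + name);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describeStudents(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Here attr reads the number attribute of each student, and text returns the text inside the nested name tag.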

 

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.io.File;
import java.io.IOException;
import java.net.URL;

public class JsoupDemo2 {
    public static void main(String[] args) throws IOException {

        // 1. Parse an XML or HTML file (uncomment to try)
        String path = JsoupDemo2.class.getClassLoader().getResource("student.xml").getPath();
//        Document document = Jsoup.parse(new File(path), "utf-8");
//        System.out.println(document);

        // 2. Parse an XML or HTML string (uncomment to try)
        /*String str = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n" +
                "<students>\n" +
                "    <student number=\"heima_0001\">\n" +
                "        <name>zhangsan</name>\n" +
                "        <age>11</age>\n" +
                "        <sex>male</sex>\n" +
                "    </student>\n" +
                "    <student number=\"heima_0002\">\n" +
                "        <name>wangwi</name>\n" +
                "        <age>14</age>\n" +
                "        <sex>female</sex>\n" +
                "    </student>\n" +
                "</students>";
        Document parse = Jsoup.parse(str);
        System.out.println(parse);*/

        // 3. Get the Document object of an HTML or XML file from a network URL (10-second timeout)
        URL url = new URL("https://baike.baidu.com/item/jsoup/9012509?fr=aladdin");
        Document parse = Jsoup.parse(url, 10000);
        System.out.println(parse);
    }
}
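
One thing worth knowing about parse(String html): it uses jsoup's HTML parser, so the resulting Document gets html, head and body elements wrapped around the content. If a plain XML tree is wanted, jsoup also provides Parser.xmlParser(). The sketch below compares the two. The class name ParserChoiceDemo, its helper methods and the shortened XML string are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class ParserChoiceDemo {
    // Hypothetical sample data, shortened from student.xml
    static final String XML =
            "<students><student number=\"heima_0001\"><name>zhangsan</name></student></students>";

    // Default: the HTML parser, which adds <html>, <head> and <body>
    static Document parseAsHtml(String source) {
        return Jsoup.parse(source);
    }

    // XML parser: keeps the tree as written, with no HTML wrapper
    static Document parseAsXml(String source) {
        return Jsoup.parse(source, "", Parser.xmlParser());
    }

    // Counts <body> elements to show whether the HTML wrapper was added
    static int bodyCount(Document document) {
        return document.getElementsByTag("body").size();
    }

    public static void main(String[] args) {
        System.out.println(bodyCount(parseAsHtml(XML)));
        System.out.println(bodyCount(parseAsXml(XML)));
        // Both documents still expose the <name> element
        System.out.println(parseAsXml(XML).getElementsByTag("name").text());
    }
}
```

Either way getElementsByTag("name") finds the element, so the demos above work with the default parser. The difference only shows up when the document is printed or when the root structure matters.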

 

posted on 2022-08-07 10:29 by 淤泥不染