Parsing XML files with Python

https://www.cnblogs.com/handsome1013/p/10058838.html
ET.Parser usage
https://www.cnblogs.com/yezuhui/p/6853323.html

https://blog.csdn.net/gz153016/article/details/90216737

Introduction to xml.etree.ElementTree, the Python 3 XML parsing module

https://blog.csdn.net/asty9000/article/details/93627226?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase

Removing duplicate XML nodes

https://blog.csdn.net/u014203484/article/details/74332815

import xml.etree.ElementTree as ET   # import the XML module

tree = ET.parse('GHO.xml')           # parse the given XML file
root = tree.getroot()                # get the root (top-level) element
data = root.find('Data')             # find the 'Data' element under the root
for obs in data:                     # iterate over every child of 'Data'
    for item in obs:                 # iterate over every child of each of those elements
        key = item.attrib            # attrib is a dict of the element's attributes (an attribute, not a method)
        print(list(key))             # print the attribute names
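
To try the traversal above without the GHO.xml file, here is a minimal sketch that parses an inline string instead. The sample XML (the Observation and Dim element names and their attributes) is made up for illustration and is only an assumption about what such a file might look like; the helper name collect_attribute_keys is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real GHO.xml layout may differ.
SAMPLE_XML = """
<GHO>
  <Data>
    <Observation>
      <Dim Category="COUNTRY" Code="CHN"/>
      <Dim Category="YEAR" Code="2015"/>
    </Observation>
    <Observation>
      <Dim Category="COUNTRY" Code="USA"/>
    </Observation>
  </Data>
</GHO>
"""


def collect_attribute_keys(xml_text):
    """Return the sorted attribute names of every grandchild of <Data>."""
    root = ET.fromstring(xml_text)      # fromstring returns the root element directly
    data = root.find("Data")
    keys = []
    if data is None:                    # find() returns None when nothing matches
        return keys
    for obs in data:                    # children of <Data>
        for item in obs:                # children of each observation
            keys.append(sorted(item.attrib))
    return keys


attribute_keys = collect_attribute_keys(SAMPLE_XML)
print(attribute_keys)
```

Checking the result of find() for None before looping avoids a TypeError when the expected element is absent.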

How to read attributes and node content.

How do you get id, name, and their values out of data?

Explanation

There are two ways (shown here with the Java DOM API):
1. Get the node first
String strID = node.getAttributes().getNamedItem("id").getNodeValue();
String strName = node.getAttributes().getNamedItem("name").getNodeValue();
2. Get the element first
String strID = element.getAttribute("id");
String strName = element.getAttribute("name");
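
In Python's xml.etree.ElementTree, the same two styles look roughly like the sketch below: reading from the attrib dictionary, or calling get() on the element. The sample document, with a data element carrying id and name attributes, is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a <data> element with id and name attributes.
SAMPLE_XML = '<root><data id="42" name="demo">payload</data></root>'

root = ET.fromstring(SAMPLE_XML)
data = root.find("data")

# Way 1: go through the attribute mapping (a plain dict).
id_from_dict = data.attrib["id"]          # raises KeyError if the attribute is missing
name_from_dict = data.attrib["name"]

# Way 2: ask the element directly; get() returns None (or a default) if missing.
id_from_get = data.get("id")
name_from_get = data.get("name")
missing = data.get("no-such-attribute", "fallback")

# The node content (text before the first child element) is in .text
content = data.text

print(id_from_get, name_from_get, content)
```

get() is usually the safer choice when an attribute is optional, since it does not raise on a missing key.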

A small exercise

#!/usr/bin/env python
import xml.etree.ElementTree as ET

tree = ET.parse('abcdefg.xml')
root = tree.getroot()

# './/*' matches every descendant of the root, at any depth
all_elems = root.findall('.//*')
print(len(all_elems))

context = []  # collects the text of every <source> element found
for element in all_elems:
    # skip elements that have no text of their own
    if element.text is None:
        continue
    # look for a direct <source> child
    src_elem = element.find("source")
    if src_elem is None:
        continue
    context.append(src_elem.text)

    print("attrib: %s" % src_elem.attrib)
    print("tag: %s" % src_elem.tag)

    # for item in src_elem:
    #     print(item.text)

Removing duplicate nodes:

import xml.etree.ElementTree as ET
path = 'in.xml'
tree = ET.parse(path)
root = tree.getroot()
prev = None
 
def elements_equal(e1, e2):
    # prev starts out as None, so a type mismatch means "not equal"
    if type(e1) != type(e2):
        return False
    if e1.tag != e2.tag: return False
    if e1.text != e2.text: return False
    if e1.tail != e2.tail: return False
    if e1.attrib != e2.attrib: return False
    if len(e1) != len(e2): return False
    # compare children pairwise, recursively
    return all(elements_equal(c1, c2) for c1, c2 in zip(e1, e2))
 
for page in root:                     # iterate over pages
    elems_to_remove = []
    for elem in page:
        if elements_equal(elem, prev):
            print("found duplicate: %s" % elem.text)   # equal function works well
            elems_to_remove.append(elem)
            continue
        prev = elem
    for elem_to_remove in elems_to_remove:
        page.remove(elem_to_remove)
tree.write("out.xml")
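
The loop above compares each element only with the previous one, so it catches consecutive duplicates. As a variation, here is a minimal sketch that also catches non-adjacent duplicates among siblings by remembering a fingerprint of each element already seen. It is a different technique from the code above, not a drop-in replacement: it ignores tail text and surrounding whitespace, and it starts fresh for each page. The sample XML and the helper names element_key and remove_duplicate_children are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: pages containing repeated <line> elements.
SAMPLE_XML = """
<book>
  <page><line>a</line><line>b</line><line>a</line></page>
  <page><line>c</line><line>c</line></page>
</book>
"""


def element_key(elem):
    """Build a hashable fingerprint from tag, attributes, stripped text, and children."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(element_key(child) for child in elem),
    )


def remove_duplicate_children(parent):
    """Remove children that repeat an earlier sibling; return how many were removed."""
    seen = set()
    removed = 0
    for child in list(parent):          # iterate over a copy so removal is safe
        key = element_key(child)
        if key in seen:
            parent.remove(child)        # remove() matches by identity, not by content
            removed += 1
        else:
            seen.add(key)
    return removed


root = ET.fromstring(SAMPLE_XML)
removed_total = sum(remove_duplicate_children(page) for page in root)
remaining = [[line.text for line in page] for page in root]
print(removed_total, remaining)
```

Using a set of fingerprints keeps the check close to linear in the number of siblings, at the cost of building one tuple per element.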

  

 


Recommended blog posts on using the RapidXml library:
https://blog.csdn.net/wqvbjhc/article/details/7662931
https://www.cnblogs.com/kanego/articles/2247602.html
http://www.oschina.net/question/873634_81784
http://blog.sina.com.cn/s/blog_a459dcf501019393.html

 

 

 

Posted by 七星望