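The Python xml module
XML is a data format used to exchange data between different languages or programs. It serves much the same purpose as JSON, although JSON is simpler to use.
An XML document looks like this. The data structure is expressed through nested <> nodes: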
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
XML is supported in virtually every language. In Python it can be handled with the xml.etree.ElementTree module, as follows.
Opening an XML file:
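import xml.etree.ElementTree as ET

tree = ET.parse("xml_test.xml")  # parse the file xml_test.xml
root = tree.getroot()            # get the root element
print(root)      # prints the Element object (its tag and memory address)
print(root.tag)  # tag name of the root

# Walk the whole document
for child in root:
    print(child.tag, child.attrib)  # second-level tags and their attributes
    for i in child:
        print(i.tag, i.text)        # third-level tags and their text

# Iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag, node.text)      # tag and text of each matching node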
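Besides iterating, elements can be looked up directly. The following is a minimal sketch, not part of the original walkthrough: it embeds a shortened copy of the sample document as a string (so no file is needed) and uses the standard `findall()`, `findtext()` and `get()` methods. The helper name `country_summary` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document, embedded so the sketch needs no file.
XML_TEXT = """<data>
  <country name="Liechtenstein">
    <rank updated="yes">2</rank>
    <year>2008</year>
    <neighbor name="Austria" direction="E"/>
    <neighbor name="Switzerland" direction="W"/>
  </country>
  <country name="Panama">
    <rank updated="yes">69</rank>
    <year>2011</year>
    <neighbor name="Costa Rica" direction="W"/>
  </country>
</data>"""

root = ET.fromstring(XML_TEXT)  # fromstring() returns the root Element directly


def country_summary(root):
    """Hypothetical helper: (name, rank, neighbor names) per country."""
    summary = []
    for country in root.findall("country"):   # direct children named country
        name = country.get("name")            # attribute lookup
        rank = int(country.findtext("rank"))  # text of the first rank child
        neighbors = [n.get("name") for n in country.findall("neighbor")]
        summary.append((name, rank, neighbors))
    return summary


for row in country_summary(root):
    print(row)

# ElementTree supports a limited XPath subset, such as attribute filters.
west = [n.get("name") for n in root.findall(".//neighbor[@direction='W']")]
print(west)
```

Note that `findall("country")` only matches direct children of the element it is called on, while a path starting with `.//` searches the whole subtree.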
Modifying an XML file:
import xml.etree.ElementTree as ET

tree = ET.parse("xml_test.xml")  # parse the file
root = tree.getroot()            # get the root element

# Modify
for node in root.iter('year'):       # find every year tag
    new_year = int(node.text) + 1    # add one to each year
    node.text = str(new_year)
    node.set("updated", "yes")       # add the attribute updated="yes"
tree.write("xml_test.xml")           # write back to the original file

# Delete nodes
for country in root.findall('country'):    # find every country tag
    rank = int(country.find('rank').text)  # read its rank
    if rank > 50:
        root.remove(country)               # remove countries that match
tree.write('output.xml')                   # write the result to output.xml
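The same modify-and-delete steps can be tried without touching any file. This is a rough sketch under that assumption: the XML is a trimmed inline copy of the sample, and the function name `bump_years_and_prune` and its `max_rank` parameter are invented for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of the sample data (assumption: no file on disk).
XML_TEXT = """<data>
  <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
  <country name="Singapore"><rank>5</rank><year>2011</year></country>
  <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>"""


def bump_years_and_prune(xml_text, max_rank=50):
    """Hypothetical helper: add one to every year, drop rank > max_rank."""
    root = ET.fromstring(xml_text)
    for node in root.iter("year"):
        node.text = str(int(node.text) + 1)
        node.set("updated", "yes")
    # findall() returns a list, so removing children while looping over it
    # is safe; looping over root itself while removing can skip elements.
    for country in root.findall("country"):
        if int(country.findtext("rank")) > max_rank:
            root.remove(country)
    return root


root = bump_years_and_prune(XML_TEXT)
print(ET.tostring(root, encoding="unicode"))  # serialize to a str, not a file
```

`remove()` only works on direct children, which is why it is called on `root` (the parent of each country) rather than on the country itself.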
Creating an XML file:
import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")  # root node: namelist
person_info = ET.SubElement(new_xml, "person_info", attrib={"enrolled": "yes"})  # child of namelist: person_info
name = ET.SubElement(person_info, "name")  # child of person_info: name
age = ET.SubElement(person_info, "age", attrib={"checked": "no"})  # child of person_info: age, with checked="no"
sex = ET.SubElement(person_info, "sex")  # child of person_info: sex
name.text = 'dbf-'  # text of name is dbf-
age.text = '18'     # text of age is 18

person_info2 = ET.SubElement(new_xml, "person_info", attrib={"enrolled": "no"})  # second person_info child of namelist
age = ET.SubElement(person_info2, "age")  # child of the second person_info: age
age.text = '19'     # text of age is 19

et = ET.ElementTree(new_xml)  # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True)  # write it to test.xml
ET.dump(new_xml)  # print the generated XML
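By default the generated XML is written on a single line. As an optional extra, the sketch below indents a smaller version of the tree with `ET.indent()`, which only exists in Python 3.9 and later, hence the `hasattr` guard. The function name `build_namelist` is made up for this example.

```python
import xml.etree.ElementTree as ET


def build_namelist():
    """Hypothetical helper: build a smaller version of the tree above."""
    root = ET.Element("namelist")
    person = ET.SubElement(root, "person_info", attrib={"enrolled": "yes"})
    ET.SubElement(person, "name").text = "dbf-"
    ET.SubElement(person, "age", attrib={"checked": "no"}).text = "18"
    return root


root = build_namelist()
if hasattr(ET, "indent"):        # ET.indent() was added in Python 3.9
    ET.indent(root, space="  ")  # adds line breaks and indentation in place
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Indentation changes only the whitespace between elements, so the tags, attributes and text of the elements built here stay the same.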