lxml.etree Tutorial 6: Tree iteration
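When you want to traverse a tree recursively and do something with each of its elements, tree iteration is a convenient solution. Elements provide a tree iterator for this purpose. It yields elements in document order, i.e. in the order their tags would appear if you serialised the tree to XML: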
>>> from lxml import etree
>>> root = etree.Element("root")
>>> etree.SubElement(root, "child").text = "Child 1"
>>> etree.SubElement(root, "child").text = "Child 2"
>>> etree.SubElement(root, "another").text = "Child 3"

>>> # encoding="unicode" returns a str, so this also prints cleanly on Python 3
>>> print(etree.tostring(root, pretty_print=True, encoding="unicode"))
<root>
  <child>Child 1</child>
  <child>Child 2</child>
  <another>Child 3</another>
</root>

>>> for element in root.iter():
...     print("%s - %s" % (element.tag, element.text))
root - None
child - Child 1
child - Child 2
another - Child 3
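To make "document order" concrete, the sketch below compares iter() with a hand-written depth-first traversal. The walk() helper and the nested sample tree are illustrative assumptions and are not part of lxml; the comparison only suggests that, for a tree like this one, both approaches visit elements in the same order.

```python
from lxml import etree


def walk(element):
    """Hypothetical helper: yield the element, then its descendants depth-first."""
    yield element
    for child in element:
        yield from walk(child)


# Small sample tree with one nested level (made up for this illustration).
root = etree.Element("root")
child = etree.SubElement(root, "child")
etree.SubElement(child, "grandchild")
etree.SubElement(root, "another")

manual_order = [el.tag for el in walk(root)]
iter_order = [el.tag for el in root.iter()]

print(manual_order)                 # ['root', 'child', 'grandchild', 'another']
print(manual_order == iter_order)   # True
```

In practice iter() is the simpler choice, since it avoids writing the recursion yourself.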
If you are only interested in a single tag, you can pass its name to iter() and it will filter for you. Starting with lxml 3.0, you can also pass more than one tag to match any of them during iteration.
>>> for element in root.iter("child"):
...     print("%s - %s" % (element.tag, element.text))
child - Child 1
child - Child 2

>>> for element in root.iter("another", "child"):
...     print("%s - %s" % (element.tag, element.text))
child - Child 1
child - Child 2
another - Child 3
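As a further illustration, here is a minimal sketch showing that the tag filter applies to the whole subtree, not just direct children, and that results still come back in document order regardless of the order of the tag arguments. The sample document is made up for this example, and passing several tags assumes lxml 3.0 or later.

```python
from lxml import etree

# Sample document (an assumption for this sketch) with one nested <child>.
root = etree.fromstring(
    "<root>"
    "<child>Child 1</child>"
    "<group><child>Nested</child></group>"
    "<another>Child 3</another>"
    "</root>"
)

# A single tag name matches at any depth below (and including) root.
child_texts = [el.text for el in root.iter("child")]

# Several tag names: matches are yielded in document order,
# not in the order the names were passed.
tags = [el.tag for el in root.iter("another", "child")]

print(child_texts)  # ['Child 1', 'Nested']
print(tags)         # ['child', 'child', 'another']
```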
By default, iteration yields all nodes in the tree, including ProcessingInstructions, Comments and Entity instances. If you want to make sure only Element objects are returned, you can pass the Element factory as the tag parameter:
>>> root.append(etree.Entity("#234"))
>>> root.append(etree.Comment("some comment"))

>>> for element in root.iter():
...     # Real elements have a string tag; use basestring instead of str on Python 2
...     if isinstance(element.tag, str):
...         print("%s - %s" % (element.tag, element.text))
...     else:
...         print("SPECIAL: %s - %s" % (element, element.text))
root - None
child - Child 1
child - Child 2
another - Child 3
SPECIAL: &#234; - &#234;
SPECIAL: <!--some comment--> - some comment

>>> for element in root.iter(tag=etree.Element):
...     print("%s - %s" % (element.tag, element.text))
root - None
child - Child 1
child - Child 2
another - Child 3

>>> for element in root.iter(tag=etree.Entity):
...     print(element.text)
&#234;
Note that passing a wildcard "*" tag name will also yield all Element nodes (and only elements).
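The three variants can be compared side by side. This is a small sketch on a made-up tree that holds one comment and one processing instruction; the variable names are only for illustration.

```python
from lxml import etree

root = etree.Element("root")
etree.SubElement(root, "child").text = "Child 1"
root.append(etree.Comment("some comment"))
root.append(etree.ProcessingInstruction("target", "data"))

# No filter: elements, comments and processing instructions are all yielded.
all_nodes = list(root.iter())

# Wildcard tag name: only real elements.
wildcard_tags = [el.tag for el in root.iter("*")]

# Element factory as the tag: also only real elements.
element_tags = [el.tag for el in root.iter(tag=etree.Element)]

print(len(all_nodes))   # 4
print(wildcard_tags)    # ['root', 'child']
print(element_tags)     # ['root', 'child']
```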
In lxml.etree, elements provide further iterators for all directions in the tree: children, parents (or rather ancestors) and siblings.
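A brief sketch of these directional iterators, using the lxml methods iterchildren(), iterancestors() and itersiblings() on a small sample tree invented for this example:

```python
from lxml import etree

# Sample tree (an assumption for this sketch): root has children a, b, c; a contains x.
root = etree.fromstring("<root><a><x/></a><b/><c/></root>")
a, b, c = root
x = a[0]

# Direct children only, in document order.
children = [el.tag for el in root.iterchildren()]

# Ancestors, starting at the parent and moving up to the root.
ancestors = [el.tag for el in x.iterancestors()]

# Following siblings by default.
following = [el.tag for el in a.itersiblings()]

# preceding=True walks backwards, starting right before the element.
preceding = [el.tag for el in c.itersiblings(preceding=True)]

print(children)   # ['a', 'b', 'c']
print(ancestors)  # ['a', 'root']
print(following)  # ['b', 'c']
print(preceding)  # ['b', 'a']
```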
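Source: http://bluescorpio.cnblogs.com
Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be placed prominently on the page; otherwise the right to pursue legal liability is reserved.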