Blog archive for June 10, 2013 - 小楼

June 10, 2013

Summary: Elements provide a tree iterator for this purpose. It yields elements in document order, i.e. in the order their tags would appear if you serialised the tree to XML:

>>> root = etree.Element("root")
>>> etree.SubElement(root, "child").text = "Child 1"

Read more
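The summary above is cut off after the first child is created, so here is a minimal sketch of how the tree iterator could be used. The second and third children and the variable names `tags_and_text` and `child_texts` are assumptions added for illustration, and the fallback to the standard library is only there so the sketch runs without lxml installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree offers the same iter() API
    import xml.etree.ElementTree as etree

root = etree.Element("root")
etree.SubElement(root, "child").text = "Child 1"
# The remaining children are illustrative, not taken from the summary
etree.SubElement(root, "child").text = "Child 2"
etree.SubElement(root, "another").text = "Child 3"

# iter() walks the tree in document order, starting with root itself
tags_and_text = [(el.tag, el.text) for el in root.iter()]

# Passing a tag name restricts the iteration to elements with that tag
child_texts = [el.text for el in root.iter("child")]

print(tags_and_text)
print(child_texts)
```

Because the iterator starts at the element it is called on, `root` appears first in the result, with `None` as its text since none was set.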

posted @ 2013-06-10 20:37 小楼 Views (968) Comments (0) Recommended (0)

lxml.etree Tutorial 5: Using XPath to find text

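Summary: Another way to get at the text content of a tree is XPath, which can also extract the separate text chunks into a list.

>>> print(html.xpath("string()")) # lxml.etree only!
TEXTTAIL
>>> print(html.xpath("//text()")) # lxml.etree only!
['TEXT', 'TAIL']

If you use this approach frequently, you can wrap it in a function.

>>> build_text_list = etree.XPath("//tex

Read more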
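The summary does not show how `html` was built and breaks off in the middle of the `etree.XPath` call, so the following is only a sketch under assumptions: the document string is chosen to reproduce the outputs quoted above, and the `"//text()"` expression passed to `etree.XPath` is assumed to match the earlier `xpath()` call. This one requires lxml, since the standard library's ElementTree does not support these XPath expressions.

```python
from lxml import etree

# Assumed input: text inside <body> plus tail text after <br/>
html = etree.fromstring("<html><body>TEXT<br/>TAIL</body></html>")

# string() concatenates all text below the context node into one string
whole_text = html.xpath("string()")

# //text() returns every text chunk as a separate list item
pieces = html.xpath("//text()")

# A precompiled expression can be reused like a function
build_text_list = etree.XPath("//text()")
texts = build_text_list(html)

# lxml returns "smart strings" that remember the element they came from
parent_tags = [t.getparent().tag for t in texts]

print(whole_text)
print(pieces)
print(parent_tags)
```

Precompiling with `etree.XPath` avoids reparsing the expression on every call, which is the usual reason for wrapping it this way.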

posted @ 2013-06-10 20:34 小楼 Views (5057) Comments (0) Recommended (0)

lxml.etree Tutorial 4: Elements contain text

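Summary: Elements can contain text:

>>> root = etree.Element("root")
>>> root.text = "TEXT"
>>> print(root.text)
TEXT
>>> etree.tostring(root)
b'<root>TEXT</root>'

In many XML documents (data-centric documents), this is the only place where text can be found. It is encapsulated by a leaf tag at the very bottom of the tree hierarchy. However, if XML is used for tagged text documents such as (X)HTML, text can also appear between different elements.

Read more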
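The summary stops before showing text that sits between elements. In the ElementTree API that lxml follows, such text is stored in the `.tail` attribute of the preceding element. The sketch below illustrates this; the `html`/`body`/`br` structure and the name `all_text` are assumptions for illustration, and the standard-library fallback is only there so the sketch runs without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree exposes the same text/tail attributes
    import xml.etree.ElementTree as etree

html = etree.Element("html")
body = etree.SubElement(html, "body")
body.text = "TEXT"            # text directly inside <body>, before any child

br = etree.SubElement(body, "br")
br.tail = "TAIL"              # text that follows <br/> but is still inside <body>

# itertext() yields text and tail chunks in document order
all_text = "".join(html.itertext())

print(br.text, br.tail)
print(all_text)
```

Here `br.text` stays `None` because `<br/>` itself is empty; the trailing text belongs to its `tail`.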

posted @ 2013-06-10 20:28 小楼 Views (1914) Comments (0) Recommended (1)

lxml.etree Tutorial 3: Elements carry attributes as a dict

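Summary: XML elements support attributes. You can create them directly in the Element factory:

>>> root = etree.Element("root", interesting="totally")
>>> etree.tostring(root)
b'<root interesting="totally"/>'

Attributes are just unordered name-value pairs, so a convenient way of dealing with them is through the dictionary-like interface of Elements:

>>> print(root.get("interesting"))

Read more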
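The summary is cut off right after the first `get()` call, so here is a short sketch of the dictionary-like attribute interface. The `hello` attribute and its value are hypothetical names added for illustration, and the standard-library fallback is only there so the sketch runs without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree exposes the same attribute interface
    import xml.etree.ElementTree as etree

root = etree.Element("root", interesting="totally")

value = root.get("interesting")      # look up an existing attribute
missing = root.get("hello")          # returns None when the attribute is absent

root.set("hello", "Huhu")            # add or overwrite an attribute

# items() gives (name, value) pairs; sorted for a stable order
items = sorted(root.items())

# attrib is a live dict-like view; dict() takes an independent snapshot
attributes = dict(root.attrib)

print(value, missing)
print(items)
```

Copying with `dict(root.attrib)` is useful when the attributes should be kept after the element changes, since `root.attrib` itself reflects later modifications.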

posted @ 2013-06-10 20:17 小楼 Views (916) Comments (0) Recommended (0)
