.contents and .children

A tag's .contents attribute returns the tag's direct child nodes as a list:

head_tag = soup.head
head_tag
# <head><title>The Dormouse's story</title></head>

head_tag.contents
# [<title>The Dormouse's story</title>]

title_tag = head_tag.contents[0]
title_tag
# <title>The Dormouse's story</title>
title_tag.contents
# ["The Dormouse's story"]
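The snippets above assume that a `soup` object already exists. The following is a minimal, self-contained sketch of the same steps, assuming the `bs4` package is installed and using Python's built-in `html.parser`; the `html_doc` string is a shortened stand-in, not the tutorial's full document.

```python
from bs4 import BeautifulSoup

# Shortened stand-in document (an assumption, not the tutorial's full HTML).
html_doc = (
    "<html><head><title>The Dormouse's story</title></head>"
    "<body><p>Once upon a time</p></body></html>"
)
soup = BeautifulSoup(html_doc, "html.parser")

head_tag = soup.head
head_children = head_tag.contents      # list of the direct children of <head>
title_tag = head_children[0]           # the <title> tag
title_children = title_tag.contents    # list holding the title's single string

print(head_children)
print(title_children)
```

Because .contents is an ordinary list, indexing, slicing, and len() all work on it.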

The BeautifulSoup object itself also has children. In this case, the <html> tag is a child of the BeautifulSoup object:

len(soup.contents)
# 1
soup.contents[0].name
# 'html'
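How many children the BeautifulSoup object has depends on the input: a doctype declaration or whitespace around the <html> tag also appear in .contents. The sketch below illustrates this with two small made-up documents parsed with `html.parser`; `top_level_tags` is a hypothetical helper, not part of the library.

```python
from bs4 import BeautifulSoup, Tag

# Two made-up documents: one bare, one with a doctype and a newline in front.
plain = BeautifulSoup("<html><head></head><body></body></html>", "html.parser")
with_doctype = BeautifulSoup(
    "<!DOCTYPE html>\n<html><head></head><body></body></html>", "html.parser"
)

def top_level_tags(soup):
    """Hypothetical helper: keep only the Tag children of the soup object."""
    return [node for node in soup.contents if isinstance(node, Tag)]

print(len(plain.contents))
print(len(with_doctype.contents))
print([tag.name for tag in top_level_tags(with_doctype)])
```

Filtering on `Tag` is one way to reach the <html> element without relying on its position in the list.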

A string has no .contents attribute, because a string cannot contain child nodes:

text = title_tag.contents[0]
text.contents
# AttributeError: 'NavigableString' object has no attribute 'contents'
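When code walks a tree that mixes tags and strings, one way to avoid this error is to check the node type before touching .contents. The following is only a sketch; `child_count` is a hypothetical helper introduced for illustration.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

soup = BeautifulSoup("<title>The Dormouse's story</title>", "html.parser")
title_tag = soup.title
text = title_tag.contents[0]   # a NavigableString, not a Tag

def child_count(node):
    """Hypothetical helper: number of direct children, 0 for strings."""
    if isinstance(node, Tag):
        return len(node.contents)
    return 0

print(child_count(title_tag))
print(child_count(text))
```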

Instead of getting the children as a list, you can loop over a tag's direct children with .children:

for child in title_tag.children:
    print(child)
    # The Dormouse's story
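The loop above visits a single string. The sketch below, using a made-up <body> fragment parsed with `html.parser`, shows that .children yields only direct children, tags and strings alike, in the same order as .contents.

```python
from bs4 import BeautifulSoup, Tag

# Made-up fragment: two <p> tags followed by a bare string.
soup = BeautifulSoup("<body><p>one</p><p>two</p>tail text</body>", "html.parser")
body = soup.body

# Separate the direct children into tags and strings while iterating.
tag_names = [child.name for child in body.children if isinstance(child, Tag)]
strings = [str(child) for child in body.children if not isinstance(child, Tag)]

print(tag_names)
print(strings)
```

The strings nested inside the <p> tags are not visited, because they are children of those tags rather than of <body>.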
posted @ 2021-05-14 07:57 by 大雄的脑袋