BeautifulSoup学习笔记
2011-12-01 23:08 Rollen Holt 阅读(980) 评论(0) 编辑 收藏 举报from BeautifulSoup import BeautifulSoup import re doc = ['<html><head><title>Page title</title></head>', '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.', '<p id="secondpara" align="blah">This is paragraph <b>two</b>.', '</html>'] soup = BeautifulSoup(''.join(doc)) print soup.prettify()
运行结果为:
print soup.contents[0].name # print soup.contents[0].contents[0].name for i in range(len(soup.contents[0])): print soup.contents[0].contents[i].name
titleTag = soup.html.head.title titleTag # <title>Page title</title> titleTag.string # u'Page title' len(soup('p')) # 2 soup.findAll('p', align="center") # [<p id="firstpara" align="center">This is paragraph <b>one</b>. </p>] soup.find('p', align="center") # <p id="firstpara" align="center">This is paragraph <b>one</b>. </p> soup('p', align="center")[0]['id'] # u'firstpara' soup.find('p', align=re.compile('^b.*'))['id'] # u'secondpara' soup.find('p').b.string # u'one' soup('p')[1].b.string # u'two'
==============================================================================
本博客已经废弃,不在维护。新博客地址:http://wenchao.ren
我喜欢程序员,他们单纯、固执、容易体会到成就感;面对压力,能够挑灯夜战不眠不休;面对困难,能够迎难而上挑战自我。他
们也会感到困惑与傍徨,但每个程序员的心中都有一个比尔盖茨或是乔布斯的梦想“用智慧开创属于自己的事业”。我想说的是,其
实我是一个程序员
==============================================================================