Comparing lxml and BeautifulSoup for parsing HTML in Python

In my experience, lxml is faster than BeautifulSoup, and it is also more tolerant of malformed markup and more capable at processing it.
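A rough way to check the speed claim on your own machine is to time both libraries on the same input. The sketch below is only an illustration: the synthetic page, the helper names (`links_with_lxml`, `links_with_bs4`, `compare`), and the choice of BeautifulSoup's `html.parser` backend are all assumptions, and the numbers will vary with library versions and with the parser BeautifulSoup is configured to use.

```python
import timeit

from bs4 import BeautifulSoup
from lxml import etree

# Synthetic test page (an assumption for illustration, not real Google Play markup).
ROWS = "".join(
    '<p><a class="doc-header-link" href="/dev/%d">Developer %d</a></p>' % (i, i)
    for i in range(500)
)
HTML = "<html><body>" + ROWS + "</body></html>"


def links_with_lxml(html):
    """Return the link texts using lxml and XPath."""
    doc = etree.HTML(html)
    return [a.text for a in doc.xpath('//a[@class="doc-header-link"]')]


def links_with_bs4(html):
    """Return the link texts using BeautifulSoup with the stdlib parser."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text() for a in soup.find_all("a", class_="doc-header-link")]


def compare(repeat=20):
    """Time both extractors on the same input; returns seconds for each."""
    return {
        "lxml": timeit.timeit(lambda: links_with_lxml(HTML), number=repeat),
        "bs4": timeit.timeit(lambda: links_with_bs4(HTML), number=repeat),
    }


if __name__ == "__main__":
    print(compare())
```

Both helpers return the same list of link texts, so the timing difference reflects parsing and lookup cost rather than a difference in output.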

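An lxml example:

    from lxml import etree

    def getGooglePlayAppInfo(self):
        # Method excerpt from a larger class; self.open_url is a helper that is
        # not shown here and is expected to return a file-like response or None.
        pageUrl = 'https://play.google.com/store/apps/details?id=com.taobao.taobao'
        pageUrl_openHandle = self.open_url(pageUrl)
        if pageUrl_openHandle:
            pageUrlHtmlSource = pageUrl_openHandle.read().decode("utf-8")
            # Parse the page with lxml's lenient HTML parser.
            doc = etree.HTML(pageUrlHtmlSource)
            # Select every <a> whose class attribute is exactly "doc-header-link".
            hrefs = doc.xpath('//a[@class="doc-header-link"]')
            for href in hrefs:
                print(href.text)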
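To try the same XPath query without a network request, here is a minimal offline sketch. The sample markup and the function name `header_link_texts` are made up for illustration and are not real Google Play HTML. Several tags are deliberately left unclosed to show that `etree.HTML` still builds a usable tree from malformed input.

```python
from lxml import etree

# Deliberately malformed sample markup (hypothetical): the <p>, <div>, <body>,
# and <html> tags are never closed.
BROKEN_HTML = """
<html><body>
<div><a class="doc-header-link" href="/dev/1">Taobao</a>
<p><a class="other" href="/x">skip me</a>
<div><a class="doc-header-link" href="/dev/2">Alibaba</a>
"""


def header_link_texts(html):
    """Return the text of every <a class="doc-header-link"> in the document."""
    doc = etree.HTML(html)
    texts = []
    for a in doc.xpath('//a[@class="doc-header-link"]'):
        # a.text is None for an empty element, so guard before stripping.
        texts.append((a.text or "").strip())
    return texts


if __name__ == "__main__":
    print(header_link_texts(BROKEN_HTML))
```

Note that `@class="doc-header-link"` matches only when the attribute is exactly that string; an element with several classes would need something like `contains(@class, "doc-header-link")` instead.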

posted @ 2013-05-13 10:37 chaoboma
