Fetching Web Page Content with Python

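import re  # not used yet; needed later for pattern-matching the HTML tags
import urllib.request

def getHtml(url):
    # Download the page and return its raw bytes.
    with urllib.request.urlopen(url) as page:
        return page.read()

html = getHtml("http://tieba.baidu.com/p/2460150866")
print('Size is:', len(html))

# read() returns bytes, so write the file in binary mode.
with open('a.html', 'wb') as f:
    f.write(html)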

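Python's urllib module is quite handy. The script above also writes the fetched page to a.html; from there you can pattern-match the individual HTML tags and pull out whatever you need.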
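
As a rough illustration of the pattern-matching step, here is a minimal Python 3 sketch that pulls the src attribute out of every <img> tag with the re module. The helper name extract_image_urls, the pattern, and the sample markup are all made up for this example and are not from the original post; the pattern also assumes double-quoted attributes, which real pages do not always use.

```python
import re

# Hypothetical helper, not from the original post: collect the src attribute
# of every <img> tag using a regular expression.
# Assumption: attribute values are wrapped in double quotes.
IMG_SRC_PATTERN = re.compile(r'<img[^>]+src="([^"]+)"', re.IGNORECASE)


def extract_image_urls(html):
    """Return the src values of all <img> tags found in html (a str)."""
    return IMG_SRC_PATTERN.findall(html)


# Made-up sample markup standing in for the contents of a.html.
sample_html = (
    '<div><img class="pic" src="http://example.com/a.jpg">'
    '<p>text</p><img src="http://example.com/b.png" width="10"></div>'
)

image_urls = extract_image_urls(sample_html)
print(image_urls)
```

To run this against the saved page, read a.html into a string first and pass it to the helper. Regular expressions are fine for quick jobs like this, but for messy markup a real parser such as the standard library's html.parser is likely to be more robust.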

Posted 2014-08-11 17:22 by 竹花小米
