Python BeautifulSoup

==============================================Find the links in a page's <a> tags
from bs4 import BeautifulSoup

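# beautifulSoup_test.html is a saved copy of the page source, in the same directory as this script
with open('beautifulSoup_test.html', 'r', encoding='utf-8') as f:
    # Name the parser explicitly; omitting it makes bs4 guess and emit a warning
    bs = BeautifulSoup(f.read(), 'html.parser')

a_lst = bs.find_all('a')
for a in a_lst:
    # Skip anchors with no visible text or no href attribute
    if a.text.strip() != '' and a.has_attr('href'):
        print(a.text.strip(), a['href'])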
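
To try the same idea without a saved file, here is a minimal sketch that parses an inline string instead. The sample markup, the SAMPLE_HTML name and the extract_links helper are all made up for illustration and are not part of the original snippet. It uses find_all('a', href=True) so that <a> tags without an href are filtered out by BeautifulSoup itself.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for beautifulSoup_test.html
SAMPLE_HTML = """
<html><body>
  <a href="https://example.com/a">First link</a>
  <a href="/relative/b">  Second link  </a>
  <a href="/empty"></a>
  <a name="anchor-only">No href here</a>
</body></html>
"""


def extract_links(html):
    """Return (text, href) pairs for every <a> that has both text and an href."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    # href=True keeps only <a> tags that actually carry an href attribute
    for a in soup.find_all('a', href=True):
        text = a.get_text(strip=True)
        if text:  # skip anchors with no visible text
            links.append((text, a['href']))
    return links


if __name__ == '__main__':
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

Returning a list of pairs rather than printing inside the loop makes the result easy to reuse, for example to write the links to a file or to resolve relative hrefs later.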
posted @ 2017-12-29 14:38  Justice-V