BeautifulSoup

Installing BeautifulSoup

1. On Linux:

  sudo apt-get install python3-bs4

2. On macOS:

  pip install beautifulsoup4

3. On Windows:

  pip install beautifulsoup4
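
After installing, it can help to confirm that Python actually sees the package. Note that the package is installed as beautifulsoup4 but imported as bs4. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    # find_spec returns None when no such top-level module is found.
    return importlib.util.find_spec(module_name) is not None


# The import name is "bs4", not "beautifulsoup4".
print("bs4 available:", is_installed("bs4"))
```

If this prints False, the install most likely went to a different Python interpreter than the one running the script.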

 

html = urlopen("http://www.baidu.com")

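This line of code can raise two kinds of exceptions:

1. The page does not exist on the server

2. The server does not exist

The first case raises an HTTPError.

The second case raises a URLError.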
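
The two cases can be told apart by catching the exceptions separately. The sketch below is only an illustration: classify_fetch and its opener parameter are hypothetical names, and the fake openers exist so that the example runs without a network connection. Because HTTPError is a subclass of URLError, it has to be caught first.

```python
import io
from email.message import Message
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def classify_fetch(url, opener=urlopen):
    """Return 'ok', 'http-error <code>', or 'url-error' for a URL."""
    try:
        response = opener(url)
    except HTTPError as e:
        # The server answered, but with an error status such as 404.
        return f"http-error {e.code}"
    except URLError:
        # The server could not be reached at all.
        return "url-error"
    response.close()
    return "ok"


# Fake openers for an offline demonstration (not part of urllib).
def fake_missing_page(url):
    raise HTTPError(url, 404, "Not Found", Message(), io.BytesIO(b""))


def fake_missing_server(url):
    raise URLError("server not found")


print(classify_fetch("http://example.com/nope", fake_missing_page))
print(classify_fetch("http://no-such-host.invalid", fake_missing_server))
```

In real use, call classify_fetch(url) without the second argument so that urlopen does the fetching.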

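If you access a tag that does not exist, BeautifulSoup returns None; accessing a further tag on that None object raises an AttributeError.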
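
A small sketch of this behavior, assuming bs4 is installed and using Python's built-in "html.parser". The function nested_tag_text and the sample HTML strings are invented for illustration. It checks for None at each step instead of relying on the exception.

```python
from bs4 import BeautifulSoup


def nested_tag_text(html):
    """Return the text of <nav><a>, or None if either tag is missing."""
    soup = BeautifulSoup(html, "html.parser")
    nav = soup.nav          # None when the document has no <nav>
    if nav is None:
        return None
    link = nav.a            # None when <nav> contains no <a>
    if link is None:
        return None
    return link.get_text()


soup = BeautifulSoup("<p>no nav here</p>", "html.parser")
print(soup.nav)             # prints None
try:
    soup.nav.a              # attribute access on None
except AttributeError as e:
    print("AttributeError:", e)
```

Checking for None explicitly makes it clear which tag was missing, whereas a single except AttributeError only says that something in the chain was absent.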

 

A wrapper function that returns the page title

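from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup


def getTitle(url):
    # Fetch the page; give up if the page or the server cannot be reached.
    try:
        html = urlopen(url)
    except (HTTPError, URLError):
        return None
    # Parse the page. If there is no <body>, bsObj.body is None and
    # bsObj.body.h1 raises AttributeError.
    try:
        bsObj = BeautifulSoup(html.read(), "html.parser")
        title = bsObj.body.h1
    except AttributeError:
        return None
    return title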


title = getTitle("https://www.douban.com")
if title is None:
    print("Title could not be found")
else:
    print(title)

 

posted on 2019-06-10 15:36  Little_Raccoon