[Repost] Python Web Scraper Example

Scraper project: scraping news articles from Autohome (汽车之家)

# Scrape Autohome news with requests + BeautifulSoup

import requests
from bs4 import BeautifulSoup

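# Fetch the news index page; fail fast on timeouts and HTTP errors
response=requests.get('https://www.autohome.com.cn/news/',timeout=10)
response.raise_for_status()
# The original sets GBK explicitly so that response.text decodes the page correctly
response.encoding='gbk'

# Save a local UTF-8 copy of the page for inspection
with open('a.html','w',encoding='utf-8') as f:
    f.write(response.text)

soup=BeautifulSoup(response.text,'lxml')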


# Every <a> inside the article list container is treated as one news entry
news=soup.find(id='auto-channel-lazyload-article').select('ul li a')


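for tag in news:
    # Skip anchors that are not full article entries (e.g. ads); indexing them would raise
    if tag.find('h3') is None or not tag.select('.article-pic img'):
        continue
    link=tag.attrs['href']
    imag=tag.select('.article-pic img')[0].attrs['src']
    title=tag.find('h3').get_text()
    sub_time=tag.find(class_='fn-left').get_text()
    browsing_num=tag.select('.fn-right em')[0].get_text()
    comment=tag.find('p').get_text()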
    msg='''
    ======================================
    Link: http:%s
    Image: http:%s
    Title: %s
    Published: %s
    Views: %s
    Summary: %s
    ''' %(link,imag,title,sub_time,browsing_num,comment)

    print(msg)
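
The loop above mixes fetching, parsing and printing, and it builds URLs by prefixing "http:" to each href and src, which only works if those values are protocol-relative ("//host/path"). A minimal sketch of one way to separate the parsing step so it can be checked offline is shown below. The function name parse_news, the constant BASE_URL and the sample markup SAMPLE_HTML are hypothetical and not part of the original post; the sample only mimics the structure that the selectors above expect and is not copied from the live site. The sketch uses the standard library's urljoin instead of string prefixing and the built-in 'html.parser' instead of lxml.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: the page URL used in the script above serves as the base for relative links.
BASE_URL = 'https://www.autohome.com.cn/news/'

# Hypothetical sample markup; it only mimics the structure the selectors expect.
SAMPLE_HTML = """
<div id="auto-channel-lazyload-article">
  <ul>
    <li>
      <a href="//www.example.com/news/1.html">
        <div class="article-pic"><img src="//img.example.com/1.jpg"></div>
        <h3>First headline</h3>
        <div class="article-bar">
          <span class="fn-left">2017-11-06</span>
          <span class="fn-right"><em>1234</em></span>
        </div>
        <p>Short summary.</p>
      </a>
    </li>
    <li><a href="//www.example.com/ad.html">Entry without article markup</a></li>
  </ul>
</div>
"""


def parse_news(html, base_url=BASE_URL):
    """Return one dict per complete article entry found in html."""
    soup = BeautifulSoup(html, 'html.parser')
    container = soup.find(id='auto-channel-lazyload-article')
    if container is None:
        return []

    articles = []
    for tag in container.select('ul li a'):
        href = tag.get('href')
        image = tag.select_one('.article-pic img')
        src = image.get('src') if image is not None else None
        title = tag.find('h3')
        published = tag.find(class_='fn-left')
        views = tag.select_one('.fn-right em')
        summary = tag.find('p')

        # Skip entries that lack any expected field instead of raising.
        fields = (href, src, title, published, views, summary)
        if any(field is None for field in fields):
            continue

        articles.append({
            # urljoin resolves protocol-relative, root-relative and absolute URLs alike.
            'link': urljoin(base_url, href),
            'image': urljoin(base_url, src),
            'title': title.get_text(strip=True),
            'published': published.get_text(strip=True),
            'views': views.get_text(strip=True),
            'summary': summary.get_text(strip=True),
        })
    return articles


for article in parse_news(SAMPLE_HTML):
    print(article['title'], article['link'])
```

To use it with the live page, response.text from the script above could be passed in place of SAMPLE_HTML, on the assumption that the site still uses the same element id and class names.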

posted @ 2017-11-06 09:20 hedeyong11
