12.Selenium+Python案例 -- 今日头条(获取科技栏目的所有新闻标题)

一:具体代码实现

# -*- coding: utf-8 -*-
# @Time : 2018/7/26 16:33
# @Author : Nancy
# @Email : NancyWangDL@163.com
# @File : Demo4.py
# @Software: PyCharm

from selenium import webdriver
import time
from pyquery import PyQuery as pq
from lxml import etree

driver = webdriver.Ie()
driver.maximize_window() #浏览器窗口最大化
driver.get("https://www.toutiao.com/")
driver.implicitly_wait(10)


driver.find_element_by_link_text("科技").click()
driver.implicitly_wait(10)

time.sleep(5)
page = driver.page_source #page_source方法可以直接返回页面源码
doc = pq(page)
doc = etree.HTML(str(doc))
contents = doc.xpath('//div[@class="wcommonFeed"]/ul/li')

for x in contents:
title = x.xpath('div/div[1]/div/div[1]/a/text()')
if title:
title = title[0]
print(title)
else:
pass


二:实现效果

 

posted @ 2018-07-26 17:05  廖丹  阅读(427)  评论(0编辑  收藏  举报