36kr Technology Channel: Scraping Asynchronously Loaded Content
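import random
import time

from lxml import etree
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = 'https://36kr.com/information/technology/'
driver.get(url)
shown = 0  # number of articles already printed

for page in range(1, 11):
    time.sleep(random.randint(3, 5))  # give the current batch time to render
    print(f'******************** Page {page} ******************')
    # Parse the page as it stands now, including everything loaded so far.
    tree = etree.HTML(driver.page_source)
    name = tree.xpath('//div[@class="article-item-info clearfloat"]/p/a//text()')
    detail = tree.xpath('//div[@class="article-item-info clearfloat"]/a//text()')
    froms = tree.xpath('//div[@class="kr-flow-bar"]/a//text()')
    times = tree.xpath('//span[@class="kr-flow-bar-time"]//text()')
    # The four lists can differ in length, so stop at the shortest one.
    total = min(len(name), len(detail), len(froms), len(times))
    # Print only the entries that were not printed on an earlier pass.
    for i in range(shown, total):
        print(f'Title: {name[i]}\nSummary: {detail[i]}\nSource: {froms[i]}\nPublished: {times[i]}\n')
    shown = max(shown, total)
    # Scroll down to trigger lazy loading, then click "load more" if it is shown.
    driver.execute_script('window.scrollBy(0,2200)')
    try:
        # Selenium 4 removed find_element_by_xpath; use find_element(By.XPATH, ...).
        driver.find_element(By.XPATH, '//div[@class="kr-loading-more-button show"]').click()
        print('Clicked "load more"')
    except WebDriverException:
        pass  # button absent or not clickable yet; keep scrolling

driver.quit()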
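
The script above pauses for a fixed 3 to 5 seconds and hopes the next batch has arrived. As a rough alternative, the sketch below polls until the number of articles grows or a timeout expires. The helper name wait_for_growth and its parameters are hypothetical and not part of the original script; get_count stands for any callable that returns the current article count.

```python
import time


def wait_for_growth(get_count, previous, timeout=10.0, interval=0.5):
    """Poll get_count() until it exceeds `previous` or `timeout` seconds pass.

    Returns the last observed count. If it is still <= `previous`,
    nothing new loaded within the timeout.
    """
    deadline = time.monotonic() + timeout
    count = get_count()
    while count <= previous and time.monotonic() < deadline:
        time.sleep(interval)  # short pause between checks
        count = get_count()
    return count


if __name__ == '__main__':
    # Stand-in for a page whose article count grows on the third check.
    fake_counts = iter([20, 20, 40])
    print(wait_for_growth(lambda: next(fake_counts), 20, interval=0.01))
```

With Selenium, get_count could plausibly be a lambda that counts the elements matching //div[@class="article-item-info clearfloat"] on the live page; that wiring is an assumption and has not been checked against the site.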
Author: 布都御魂
Article link: https://www.cnblogs.com/wolvies/p/16016630.html
Copyright notice: This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 China Mainland license.