
布都御魂


Fetching the URLs Baidu Has Indexed for a Site

import random
import time
from selenium import webdriver
import requests
from lxml import etree
from selenium.webdriver.common.by import By


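def request_zy(url):
    """Follow a Baidu redirect link and return the final landing URL."""
    # requests follows redirects by default; the timeout keeps a dead link from hanging the crawl.
    response = requests.get(url=url, timeout=10)
    return response.url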


driver = webdriver.Chrome()
# Baidu paginates with pn = 10 * (page - 1), so pn=390 below is result page 40.
url='https://www.baidu.com/s?wd=site%3Awww.china-mcc.com&pn=390&oq=site%3Awww.china-mcc.com&ct=2097152&tn=baiduhome_pg&ie=utf-8&si=www.china-mcc.com&rsv_idx=2&rsv_pq=c26cd77200029a68&rsv_t=6fc2pzBPE28JcSUnYCQz9FF7YYjjDpQxjasgt73llb9LQ3Ilj%2FNJZ2f8m1c9Z3iZgzZK&gpc=stf%3D1684080000%2C1685462400%7Cstftype%3D2&tfflag=1'
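driver.maximize_window()
driver.get(url)  # load the starting page once; later pages are reached by clicking "next"
# Append resolved URLs to one file; the with-block closes it even if the crawl fails midway.
with open('有色技术.txt', 'a', encoding='utf-8') as out:
    for i in range(40, 77):  # result pages 40..76, matching pn=390 in the start URL
        # Long random pauses keep the request rate low.
        time.sleep(random.randint(15, 18))
        # Scroll to the bottom so the whole results list is rendered.
        driver.execute_script("document.documentElement.scrollTop = 10000")
        time.sleep(random.randint(5, 8))
        print(f'Page {i} ----------')
        print(driver.current_url)
        print('\n')
        tree = etree.HTML(driver.page_source)
        # Each result title links to a Baidu redirect URL, not to the target site itself.
        second_list = tree.xpath('//div[@id="content_left"]//h3//a/@href')
        for second in second_list:
            zy = request_zy(second)
            out.write(zy + '\n')
            print(f'Wrote {zy}')
        print('Page written')
        # Pager links carry class "n": only "next" on page 1, "previous" then "next" afterwards.
        # The original's "i == 1" branch could never run because the loop starts at 40,
        # so take the last such link, which is assumed to be "next".
        nav_links = driver.find_elements(by=By.XPATH, value='//a[@class="n"]')
        if not nav_links:
            break  # no pager found, nothing more to crawl
        nav_links[-1].click()
        time.sleep(random.randint(5, 8))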

driver.quit()  # quit() closes every window and ends the driver session, so close() is not needed
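
The script hardcodes a start URL with pn=390 and relies on the rule pn = 10 * (page - 1) noted in its comment. As a minimal sketch of that rule, the helpers below build a site: search URL for any page number using only the standard library. The names page_to_pn and build_site_search_url are hypothetical, and it is an assumption that Baidu accepts a URL carrying only wd, pn and ie; the original URL also carries session tokens and a gpc date filter that this sketch drops.

```python
from urllib.parse import urlencode

BAIDU_SEARCH = "https://www.baidu.com/s"


def page_to_pn(page: int, per_page: int = 10) -> int:
    """Convert a 1-based result page number to Baidu's pn offset."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return per_page * (page - 1)


def build_site_search_url(domain: str, page: int) -> str:
    """Build a site: search URL for the given page (hypothetical helper)."""
    # Assumption: wd, pn and ie are enough; the original URL has more parameters.
    params = {"wd": f"site:{domain}", "pn": page_to_pn(page), "ie": "utf-8"}
    return f"{BAIDU_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Page 40 corresponds to pn=390, the offset used in the script's start URL.
    print(build_site_search_url("www.china-mcc.com", 40))
```

With helpers like these, the crawl could start from any page by changing a single number rather than editing the pn value inside a long URL by hand.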

 

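Author: 布都御魂

Link to this post: https://www.cnblogs.com/wolvies/p/17445395.html

Copyright notice: This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 China Mainland License.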
