Scraping Beijing vegetable prices with Python
from concurrent.futures import ThreadPoolExecutor
import requests
from lxml import etree
import csv
# newline='' keeps the csv module from writing blank lines between rows on Windows
f = open('newdata5.csv', mode='w', encoding='utf-8', newline='')
csvwrite = csv.writer(f)
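def one_page(url):
    # Download one listing page and write its table rows to the CSV file.
    response = requests.get(url)
    response.encoding = 'utf-8'
    demo = response.text
    html = etree.HTML(demo)
    # The price table is located by its absolute position in the page.
    table = html.xpath('/html/body/div[2]/div[4]/div[1]/table')[0]
    trs = table.xpath('./tr')[1:]  # skip the first row (presumably the header)
    for tr in trs:
        tu = tr.xpath('./td/text()')  # cell texts of one row
        csvwrite.writerow(tu)
        print(tu)
    print(url, 'saved')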
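if __name__ == '__main__':
    with ThreadPoolExecutor(50) as t:  # number of worker threads
        for i in range(50):  # number of pages to scrape
            t.submit(one_page, "http://www.xinfadi.com.cn/marketanalysis/0/list/{}.shtml".format(i))
            # submit() returns immediately, so this only reports that the page was queued
            print(str(i) + " submitted.........")
    f.close()  # the with block has already waited for every thread to finish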
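In the script above every worker thread writes to the same csv writer, so rows from different pages can end up interleaved in the file. One possible alternative, sketched here under assumptions, is to have each worker return its rows and let the main thread do all the writing. The names fetch_rows and scrape_to_csv are hypothetical and not part of the original script, and fetch_rows returns placeholder rows instead of downloading anything so that the sketch runs offline.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def fetch_rows(page):
    # Hypothetical stand-in for one_page(): a real version would download and
    # parse the page, then return its table rows instead of writing them.
    return [[f"item-{page}-{n}", str(page), str(n)] for n in range(2)]


def scrape_to_csv(pages, out, max_workers=8):
    """Fetch pages concurrently, but write every row from the calling thread."""
    writer = csv.writer(out)
    total = 0
    with ThreadPoolExecutor(max_workers) as pool:
        # map() yields results in the same order as `pages`,
        # so the rows of one page always stay together in the output.
        for rows in pool.map(fetch_rows, pages):
            writer.writerows(rows)
            total += len(rows)
    return total


# An in-memory buffer stands in for the CSV file here.
buffer = io.StringIO()
count = scrape_to_csv(range(3), buffer)
```

With this layout only one thread ever touches the writer, so no lock is needed, and the file ends up in page order no matter which download finishes first.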
Author: 小魏同学呀
Link: https://www.cnblogs.com/weitongxue/p/14854744.html
License: This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 China Mainland License.