Case study: asynchronously scraping a web novel
Target novel:
```python
# Catalog endpoint:
#   https://dushu.baidu.com/api/pc/getCatalog?data={"book_id":"4306063500"}
# Chapter-content endpoint:
#   https://dushu.baidu.com/api/pc/getChapterContent?data={"book_id":"4306063500","cid":"4306063500|1569782244","need_bookinfo":1}
import asyncio
import json

import aiofiles
import aiohttp
import requests


async def aiodownload(cid, book_id, title):
    data = {
        "book_id": book_id,
        "cid": f"{book_id}|{cid}",
        "need_bookinfo": 1
    }
    data = json.dumps(data)  # serialize the dict to a JSON string for the query parameter
    url = f'https://dushu.baidu.com/api/pc/getChapterContent?data={data}'
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            dic = await resp.json()
            # write the chapter text to a file named after the chapter title
            async with aiofiles.open(title, mode='w', encoding='utf-8') as f:
                await f.write(dic['data']['novel']['content'])


async def getCatalog(url):
    # fetch the catalog; a single blocking request is acceptable here
    resp = requests.get(url)
    dic = resp.json()
    # build one download task per chapter
    tasks = []
    for item in dic['data']['novel']['items']:
        title = item['title']
        cid = item['cid']
        tasks.append(asyncio.create_task(aiodownload(cid, book_id, title)))
    # asyncio.wait() no longer accepts bare coroutines; gather the tasks instead
    await asyncio.gather(*tasks)


if __name__ == '__main__':
    book_id = '4306063500'
    # splice the book_id into the catalog URL
    url = 'https://dushu.baidu.com/api/pc/getCatalog?data={"book_id":"' + book_id + '"}'
    asyncio.run(getCatalog(url))
```
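One refinement worth considering: the script above launches every chapter download at once, which can trip a site's rate limits. A minimal sketch (not part of the original post) of capping concurrency with `asyncio.Semaphore`; `fake_download` is a hypothetical stand-in for `aiodownload`, with the network call replaced by `asyncio.sleep(0)` so the sketch runs offline:

```python
import asyncio

async def fake_download(cid: int, sem: asyncio.Semaphore) -> int:
    async with sem:             # at most 10 coroutines enter this block at once
        await asyncio.sleep(0)  # placeholder for the real aiohttp request
        return cid

async def main() -> list:
    sem = asyncio.Semaphore(10)  # allow at most 10 concurrent chapter fetches
    tasks = [asyncio.create_task(fake_download(cid, sem)) for cid in range(30)]
    return await asyncio.gather(*tasks)  # results keep task order

results = asyncio.run(main())
print(len(results))  # 30
```

In the real scraper, the semaphore would be created once in `getCatalog` and passed to each `aiodownload` call; `gather` still returns results in submission order regardless of completion order.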