Python Async Requests: Limiting Concurrency

 

Limiting concurrency to a fixed amount: even though the `TCPConnector` below allows up to 64 connections, the `asyncio.Semaphore` caps how many requests are actually in flight at the same time.

import asyncio
import aiohttp
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s: %(message)s')

CONCURRENCY = 5
URL = 'https://www.baidu.com'
semaphore = asyncio.Semaphore(CONCURRENCY)
session = None
index = 0

async def scrape_api():
  # At most CONCURRENCY coroutines can be inside this block at once;
  # the rest wait here until a slot is released.
  async with semaphore:
    global index
    index += 1
    logging.info('scraping %d %s', index, URL)
    async with session.get(URL) as response:
      await asyncio.sleep(1)
      return await response.text()

async def main():
  global session
  session = aiohttp.ClientSession(connector=aiohttp.TCPConnector(limit=64, ssl=False))
  try:
    scrape_index_tasks = [asyncio.ensure_future(scrape_api()) for _ in range(10)]
    await asyncio.gather(*scrape_index_tasks)
  finally:
    await session.close()  # close the session so its sockets are released

if __name__ == '__main__':
  asyncio.run(main())  # preferred over the deprecated get_event_loop() pattern

 

 

The output is as follows:

2021-08-11 22:17:29,772 - INFO: scraping 1 https://www.baidu.com
2021-08-11 22:17:29,774 - INFO: scraping 2 https://www.baidu.com
2021-08-11 22:17:29,775 - INFO: scraping 3 https://www.baidu.com
2021-08-11 22:17:29,775 - INFO: scraping 4 https://www.baidu.com
2021-08-11 22:17:29,776 - INFO: scraping 5 https://www.baidu.com
2021-08-11 22:17:30,901 - INFO: scraping 6 https://www.baidu.com
2021-08-11 22:17:30,914 - INFO: scraping 7 https://www.baidu.com
2021-08-11 22:17:30,917 - INFO: scraping 8 https://www.baidu.com
2021-08-11 22:17:30,920 - INFO: scraping 9 https://www.baidu.com
2021-08-11 22:17:30,923 - INFO: scraping 10 https://www.baidu.com
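The log shows the semaphore at work: the first five requests start immediately, and the next five only begin about a second later, once slots free up. The same capping behavior can be checked without any network traffic. The sketch below (the worker, the counters, and the sleep duration are illustrative, not from the original post) tracks how many tasks are inside the semaphore at once:

```python
import asyncio

CONCURRENCY = 5

async def run_demo(total_tasks=10):
  semaphore = asyncio.Semaphore(CONCURRENCY)
  running = 0  # workers currently inside the semaphore
  peak = 0     # highest value `running` ever reached

  async def worker():
    nonlocal running, peak
    async with semaphore:
      running += 1
      peak = max(peak, running)
      await asyncio.sleep(0.01)  # stand-in for the HTTP request
      running -= 1

  await asyncio.gather(*(worker() for _ in range(total_tasks)))
  return peak

peak = asyncio.run(run_demo())
print('peak concurrency:', peak)  # never exceeds CONCURRENCY
```

With 10 tasks and a limit of 5, the peak concurrency observed is 5: the semaphore, not the number of tasks, decides how many run simultaneously.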

 

From the LaGou Education (拉勾教育) course "52讲轻松搞定网络爬虫" (52 Lectures to Easily Master Web Crawling)

 

posted @ 2021-08-11 21:45 by 宝山方圆