Parallelism between coroutines in Python async frameworks

There are quite a few async coroutine frameworks in Python, such as tornado, gevent, asyncio, and twisted. What coroutines buy you is cheap concurrency: while waiting on an IO event, a coroutine can hand control to other coroutines, and that is what underpins their concurrency. But concurrency alone is not enough. High concurrency does not guarantee low latency, because one business flow may involve several async IO requests, and if those requests are issued one after another, the server's throughput can stay high while the latency of each individual request grows. Each framework addresses this in its own way, so below we look at how each of them manages the parallel execution of unrelated coroutines.

tornado

Python 2.7 and above

The Tornado version of the code is quite short: you simply yield a list of coroutines:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import time

import requests
from tornado import gen
from tornado.ioloop import IOLoop


@gen.coroutine
def get_url(url):
    # NOTE: requests.get is a blocking call, so the whole thread is
    # occupied while this HTTP request is in flight -- see the note
    # after the output below
    r = requests.get(url, timeout=3)
    print url, r.status_code
    resp = r.text
    print type(resp)
    raise gen.Return((url, r.status_code))

@gen.coroutine
def process_once_everything_ready():
    before = time.time()
    coroutines = [get_url(url) for url in ['https://www.python.org/', 'https://github.com/', 'https://www.yahoo.com/']]
    # yielding a list of coroutines hands them all to the scheduler,
    # which resumes us once every result is available
    result = yield coroutines
    after = time.time()
    print(result)
    print('total time: {} seconds'.format(after - before))

if __name__ == '__main__':
    IOLoop.current().run_sync(process_once_everything_ready)

Output:

https://www.python.org/ 200
<type 'unicode'>
https://github.com/ 200
<type 'unicode'>
https://www.yahoo.com/ 200
<type 'unicode'>
[('https://www.python.org/', 200), ('https://github.com/', 200), ('https://www.yahoo.com/', 200)]
total time: 4.64905309677 seconds
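
Note that the total here (about 4.6 seconds) is roughly the *sum* of the three request times: requests.get is a blocking call, so even though the coroutines were yielded as a list, the HTTP requests still run one after another. To actually overlap them, the fetch itself has to be non-blocking. Below is a minimal sketch using Tornado's own AsyncHTTPClient (this variant is ours, not from the original post):

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import time

from tornado import gen
from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop


@gen.coroutine
def get_url(url):
    # fetch() returns a Future; yielding it suspends this coroutine
    # instead of blocking the thread the way requests.get does
    response = yield AsyncHTTPClient().fetch(url, request_timeout=3)
    raise gen.Return((url, response.code))


@gen.coroutine
def process_once_everything_ready():
    before = time.time()
    coroutines = [get_url(url) for url in ['https://www.python.org/', 'https://github.com/', 'https://www.yahoo.com/']]
    result = yield coroutines
    print(result)
    print('total time: {} seconds'.format(time.time() - before))


if __name__ == '__main__':
    IOLoop.current().run_sync(process_once_everything_ready)

With the fetches overlapped, the total time drops to roughly that of the slowest single request, which is exactly what the next example demonstrates with gen.sleep: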

import random
import time
from tornado import gen
from tornado.ioloop import IOLoop


@gen.coroutine
def get_url(url):
    wait_time = random.randint(1, 4)
    # gen.sleep registers a timeout on the IOLoop and hands control
    # to other coroutines instead of blocking the thread
    yield gen.sleep(wait_time)
    print('URL {} took {}s to get!'.format(url, wait_time))
    raise gen.Return((url, wait_time))


@gen.coroutine
def process_once_everything_ready():
    before = time.time()
    coroutines = [get_url(url) for url in ['URL1', 'URL2', 'URL3']]
    result = yield coroutines
    after = time.time()
    print(result)
    print('total time: {} seconds'.format(after - before))

if __name__ == '__main__':
    IOLoop.current().run_sync(process_once_everything_ready)
$ python3 tornado_test.py
URL URL2 took 1s to get!
URL URL3 took 1s to get!
URL URL1 took 4s to get!
[('URL1', 4), ('URL2', 1), ('URL3', 1)]
total time: 4.000649929046631 seconds

Here the total running time equals the running time of the longest coroutine (4 s), not the sum: while one coroutine waits in gen.sleep, the others make progress.

Since Tornado now integrates with the asyncio and twisted modules, you can also drive it through either of them. We won't go into detail here beyond the minimal sketch below.
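
As a pointer, here is a minimal sketch of running a Tornado coroutine on the asyncio event loop, assuming the tornado.platform.asyncio bridge that ships with Tornado 4.x (this example is ours, not from the original post):

import asyncio

from tornado import gen
from tornado.platform.asyncio import AsyncIOMainLoop, to_asyncio_future

AsyncIOMainLoop().install()  # use the asyncio event loop as Tornado's IOLoop


@gen.coroutine
def hello():
    yield gen.sleep(1)
    raise gen.Return('done')


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    # to_asyncio_future wraps Tornado's Future so asyncio can await it
    print(loop.run_until_complete(to_asyncio_future(hello())))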

asyncio

Python 3.4 and above

http://xidui.github.io/2015/11/11/python%E5%BC%82%E6%AD%A5%E6%A1%86%E6%9E%B6%E5%8D%8F%E7%A8%8B%E4%B9%8B%E9%97%B4%E7%9A%84%E5%B9%B6%E8%A1%8C/

My blog has a translated article about the asyncio library, and its last section describes how asyncio manages unrelated coroutines. We reuse that example here, with timing added, to show more clearly how the coroutines run in parallel:

import asyncio
import random
import time


@asyncio.coroutine
def get_url(url):
    wait_time = random.randint(1, 4)
    yield from asyncio.sleep(wait_time)
    print('URL {} took {}s to get!'.format(url, wait_time))
    return url, wait_time


@asyncio.coroutine
def process_as_results_come_in():
    before = time.time()
    coroutines = [get_url(url) for url in ['URL1', 'URL2', 'URL3']]
    # as_completed yields the futures in the order they finish,
    # so each result can be handled as soon as it is ready
    for coroutine in asyncio.as_completed(coroutines):
        url, wait_time = yield from coroutine
        print('Coroutine for {} is done'.format(url))
    after = time.time()
    print('total time: {} seconds'.format(after - before))


@asyncio.coroutine
def process_once_everything_ready():
    before = time.time()
    coroutines = [get_url(url) for url in ['URL1', 'URL2', 'URL3']]
    # gather runs the coroutines concurrently and returns their
    # results in the original order once all of them are done
    results = yield from asyncio.gather(*coroutines)
    print(results)
    after = time.time()
    print('total time: {} seconds'.format(after - before))


def main():
    loop = asyncio.get_event_loop()
    print("First, process results as they come in:")
    loop.run_until_complete(process_as_results_come_in())
    print("\nNow, process results once they are all ready:")
    loop.run_until_complete(process_once_everything_ready())


if __name__ == '__main__':
    main()
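
On Python 3.5 and above the same thing can be written with native async/await syntax instead of @asyncio.coroutine and yield from; a minimal sketch of the gather variant (ours, not from the original post):

import asyncio
import random
import time


async def get_url(url):
    wait_time = random.randint(1, 4)
    await asyncio.sleep(wait_time)   # non-blocking sleep
    return url, wait_time


async def process_once_everything_ready():
    before = time.time()
    coroutines = [get_url(url) for url in ['URL1', 'URL2', 'URL3']]
    results = await asyncio.gather(*coroutines)   # run them concurrently
    print(results)
    print('total time: {} seconds'.format(time.time() - before))


if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(process_once_everything_ready())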

Summary

  • Inside a coroutine framework you can no longer sleep with the original time module: time.sleep would block the whole thread, and all coroutines run on that same thread. That is why both frameworks wrap sleep as gen.sleep() and asyncio.sleep(); internally each registers a timer on the event loop and hands CPU control to other coroutines (see the sketch after this list).
  • This style of parallelism is also easy to understand at the level of how coroutines are implemented: both frameworks yield a list of generator objects out to the scheduler, which runs each of them and registers callbacks, and that is what makes the parallel execution possible.
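
To make the first point concrete, here is a minimal demonstration (ours, not from the original post): two coroutines each "sleeping" for one second. With time.sleep the event-loop thread is blocked and the pair takes about 2 s; with asyncio.sleep the waits overlap and the pair takes about 1 s.

import asyncio
import time


@asyncio.coroutine
def blocking():
    time.sleep(1)                 # blocks the whole event-loop thread


@asyncio.coroutine
def non_blocking():
    yield from asyncio.sleep(1)   # registers a timer, yields control


loop = asyncio.get_event_loop()

before = time.time()
loop.run_until_complete(asyncio.gather(blocking(), blocking()))
print('time.sleep:    {:.1f}s'.format(time.time() - before))   # ~2.0

before = time.time()
loop.run_until_complete(asyncio.gather(non_blocking(), non_blocking()))
print('asyncio.sleep: {:.1f}s'.format(time.time() - before))   # ~1.0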
