asyncio Basic Usage

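  • Python introduced coroutine support through asyncio in Python 3.4. Writing up these notes also helped me understand how to use the module more thoroughly.

What is asyncio for?

  • asyncio is a standard library module introduced in Python 3.4 that provides built-in support for asynchronous I/O.

  • Asynchronous network operations

  • Concurrency

  • Coroutines

Key asyncio terms:

  • event_loop (event loop): the program starts an endless loop and registers functions on it; when the event a function is waiting for occurs, the loop calls the corresponding coroutine function.
  • coroutine: a coroutine object. Calling a function defined with the async keyword does not run the function body immediately; it returns a coroutine object, which must be registered on the event loop and is then run by the loop.
  • task: a coroutine object is natively a function that can be suspended; a task wraps a coroutine further and tracks the task's various states.
  • future: represents the result of a task that has either run or has not run yet. A task is a subclass of future, so there is no essential difference between the two.
  • async/await keywords: introduced in Python 3.5 for defining coroutines. async defines a coroutine, and await suspends the current coroutine while it waits on a blocking asynchronous call.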
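To make the terms above concrete, here is a minimal sketch. The names greet and main are made up for illustration, and asyncio.run assumes Python 3.7 or later. It shows that calling an async def function only creates a coroutine object, and that a task is a Future that wraps the coroutine and tracks its state.

```python
import asyncio
import inspect


async def greet(name):
    # Hypothetical coroutine function used only for this illustration.
    await asyncio.sleep(0.01)
    return f"hello {name}"


async def main():
    coro = greet("asyncio")               # nothing has run yet
    is_coro = inspect.iscoroutine(coro)

    task = asyncio.ensure_future(coro)    # wrap the coroutine in a Task
    is_future = isinstance(task, asyncio.Future)
    done_before = task.done()             # the task is still pending here

    result = await task                   # suspend main() until the task finishes
    return is_coro, is_future, done_before, task.done(), result


summary = asyncio.run(main())
print(summary)
```

The tuple records, in order: whether the call produced a coroutine object, whether the task is a Future, the task's done() state before and after awaiting it, and the returned value.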

Python 3.4 asyncio usage

import asyncio
import threading

# Legacy generator-based style: @asyncio.coroutine was deprecated in Python 3.8 and removed in Python 3.11.
@asyncio.coroutine
def hello():
    print("Hello world!", threading.current_thread())
    # Asynchronous call to asyncio.sleep(2):
    yield from asyncio.sleep(2)
    # time.sleep(2)
    print("Hello world!", threading.current_thread())


# Get the EventLoop:
loop = asyncio.get_event_loop()
# Run the coroutines
tasks = [hello(), hello()]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()

  • Python 3.5 added async/await, which directly replace @asyncio.coroutine and yield from
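As a rough sketch of that replacement, the hello example could look like this with async/await. The tag parameter and the main wrapper are additions for illustration, the sleep is shortened, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio
import threading


async def hello(tag):
    # `async def` takes the place of @asyncio.coroutine,
    # and `await` takes the place of `yield from`.
    print("Hello world!", tag, threading.current_thread().name)
    await asyncio.sleep(0.1)
    print("Hello again!", tag, threading.current_thread().name)
    return tag


async def main():
    # Run two hello() coroutines concurrently on the same thread.
    return await asyncio.gather(hello("a"), hello("b"))


results = asyncio.run(main())
print(results)
```

Both coroutines print the same thread name, which shows that the concurrency here comes from the event loop and not from extra threads.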

Basic usage:

import asyncio
import time

now = time.time


async def do_some_work(x):
    print(f"Coroutine running: {x}")
    await asyncio.sleep(x)

    return "done after {}".format(x)

def callback(future):
    print("Callback received return value: ", future.result())


start = now()
# This only creates a coroutine object; do_some_work has not run yet
coroutine = do_some_work(2)
# print(coroutine)  # <coroutine object do_some_work at 0x000001A2AAA9FCA8>
# Create the loop explicitly (relying on get_event_loop() to create one implicitly is deprecated)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

# Create a task object
# task = loop.create_task(coroutine)
# Second way: create it through asyncio
task = asyncio.ensure_future(coroutine)

# Bind a callback that runs when the task finishes. The last argument passed to the
# callback is the future object, which gives access to the coroutine's return value.
task.add_done_callback(callback)
print("Task before running (pending): ", task)
loop.run_until_complete(task)
print("Task after running (finished): ", task)

print("Elapsed time: ", now() - start)

  • Output
Task before running (pending):  <Task pending coro=<do_some_work() running at F:/爬虫/爬虫项目使用 pycharm/PyppeteerDemo/asyncioDemo.py:7> cb=[callback() at F:/爬虫/爬虫项目使用 pycharm/PyppeteerDemo/asyncioDemo.py:14]>
Coroutine running: 2
Callback received return value:  done after 2
Task after running (finished):  <Task finished coro=<do_some_work() done, defined at F:/爬虫/爬虫项目使用 pycharm/PyppeteerDemo/asyncioDemo.py:7> result='done after 2'>
Elapsed time:  2.0016515254974365
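On Python 3.7 and later, the same flow can be written without managing the loop by hand. The sketch below is one possible version, not the only one: it assumes asyncio.run and asyncio.create_task are available, shortens the sleep, and collects the callback's value in a list named callback_results (a name made up here) so it can be inspected afterwards.

```python
import asyncio

callback_results = []


async def do_some_work(x):
    await asyncio.sleep(x)
    return f"done after {x}"


def callback(future):
    # Invoked by the event loop once the task is done.
    callback_results.append(future.result())


async def main():
    task = asyncio.create_task(do_some_work(0.1))
    task.add_done_callback(callback)
    was_done_before = task.done()   # still pending at this point
    result = await task
    return was_done_before, task.done(), result


states = asyncio.run(main())
print(states, callback_results)
```

asyncio.run creates the event loop, runs main() to completion, and closes the loop, so no explicit loop object is needed.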

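Concurrency and parallelism

  • Concurrency means a system has several activities in progress at the same time.

    Parallelism means using concurrency to make a system run faster. Parallelism can be applied at several levels of abstraction in an operating system.

    So concurrency usually means several tasks need to make progress over the same period, while parallelism means several tasks are executing at the same instant.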
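A small timing sketch can illustrate concurrency without parallelism. Everything runs on one thread, yet overlapping the waits shortens the total time. The helper names pause, sequential, and concurrent are made up for this example, and the delays are arbitrary.

```python
import asyncio
import time


async def pause(seconds):
    await asyncio.sleep(seconds)
    return seconds


async def sequential():
    # Await one coroutine after the other: the waits add up.
    start = time.perf_counter()
    await pause(0.1)
    await pause(0.2)
    return time.perf_counter() - start


async def concurrent():
    # Let both waits overlap on the same thread.
    start = time.perf_counter()
    await asyncio.gather(pause(0.1), pause(0.2))
    return time.perf_counter() - start


sequential_time = asyncio.run(sequential())
concurrent_time = asyncio.run(concurrent())
print(sequential_time, concurrent_time)
```

The sequential version takes roughly the sum of the delays, while the concurrent version takes roughly the longest single delay.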

Concurrency with asyncio

import asyncio
import time

now = time.time


async def do_some_work(x):
    print("Coroutine running", x)
    await asyncio.sleep(x)
    return f"done after {x}"


start = now()

# Create the loop first so ensure_future() has a current loop to attach the tasks to
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

coroutine1 = do_some_work(2)
coroutine2 = do_some_work(3)
coroutine3 = do_some_work(4)

tasks = [
    asyncio.ensure_future(coroutine1),
    asyncio.ensure_future(coroutine2),
    asyncio.ensure_future(coroutine3),
]

# Run all the tasks: asyncio.wait takes a list of tasks
# loop.run_until_complete(asyncio.wait(tasks))
# Second form: asyncio.gather takes the tasks as separate arguments
loop.run_until_complete(asyncio.gather(*tasks))

for item in tasks:
    print(item.result())
print("Elapsed time: ", now() - start)

  • Output
Coroutine running 2
Coroutine running 3
Coroutine running 4
done after 2
done after 3
done after 4
Elapsed time:  4.00330114364624
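The two calls shown in the example return different things, which the following sketch tries to make visible. asyncio.gather returns the results in the order the awaitables were passed in, while asyncio.wait returns two sets of tasks (done and pending) with no ordering. The delays and variable names are chosen only for illustration.

```python
import asyncio


async def do_some_work(x):
    await asyncio.sleep(x)
    return f"done after {x}"


async def main():
    # Deliberately list the slowest job first.
    delays = [0.3, 0.1, 0.2]

    # gather: a list of results, in the same order as the inputs.
    gathered = await asyncio.gather(*(do_some_work(d) for d in delays))

    # wait: a (done, pending) pair of sets holding Task objects.
    tasks = [asyncio.ensure_future(do_some_work(d)) for d in delays]
    done, pending = await asyncio.wait(tasks)
    waited = {task.result() for task in done}
    return gathered, waited, len(pending)


gathered, waited, pending_count = asyncio.run(main())
print(gathered, waited, pending_count)
```

With gather the slowest job still comes first in the result list, because ordering follows the arguments and not the completion time.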

Nested coroutines

  • To wrap up more of the I/O steps, one coroutine can await another, chaining them together. This gives nested coroutines.
import asyncio
import time

now = time.time

start = now()


async def do_some_work(x):
    print("Coroutine running: ", x)
    await asyncio.sleep(x)
    return f"done after  {x}"


async def main():
    coroutine1 = do_some_work(2)
    coroutine2 = do_some_work(3)
    coroutine3 = do_some_work(4)

    tasks = [
        asyncio.ensure_future(coroutine1),
        asyncio.ensure_future(coroutine2),
        asyncio.ensure_future(coroutine3),
    ]
    # Option 1
    # done: the finished task objects; pending: the tasks still waiting
    # done, pending = await asyncio.wait(tasks)
    # print(pending)
    # for task in done:
    #     print(task.result())
    # or return it directly
    # return await asyncio.wait(tasks)

    # Option 2
    # Use asyncio.gather to get the list of results directly
    # results = await asyncio.gather(*tasks)
    # print(results)
    # for result in results:
    #     print(result)
    # or return it directly
    # return results

    # Option 3
    # asyncio.as_completed(tasks) returns an iterator (a generator on the Python version used here)
    # print( asyncio.as_completed(tasks))  # <generator object as_completed at 0x000001F0DB4767D8>

    # Each item is an awaitable that yields the next finished result, not the original task
    for task in asyncio.as_completed(tasks):
        result = await task
        print(task)  # <generator object as_completed.<locals>._wait_for_one at 0x000001557DA46830>
        print(result)  # done after  2


loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

# Return value for option 1
# done, pending = loop.run_until_complete(main())
# print(pending)
# for task in done:
#     print(task.result())

# Return value for option 2
# results = loop.run_until_complete(main())
# for result in results:
#     print("Returned: ", result)

# Option 3
loop.run_until_complete(main())

print("Elapsed time: ", now() - start)
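To see what the third option buys, here is a small sketch with the slowest job submitted first. asyncio.as_completed hands results back in completion order, regardless of the order in which the tasks were created. The delays are shortened and the return value is simplified to the delay itself.

```python
import asyncio


async def do_some_work(x):
    await asyncio.sleep(x)
    return x


async def main():
    # Submission order: slowest first.
    tasks = [asyncio.ensure_future(do_some_work(d)) for d in (0.3, 0.1, 0.2)]

    order = []
    for next_done in asyncio.as_completed(tasks):
        # Each awaitable resolves to the result of whichever task finishes next.
        order.append(await next_done)
    return order


completion_order = asyncio.run(main())
print(completion_order)
```

This is useful when each result should be handled as soon as it is ready, without waiting for the whole batch.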

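Stopping coroutines

  • A future object goes through several states:

    • Pending
    • Running
    • Done
    • Cancelled

    When a future is created, the task is pending; while the event loop is executing it, it is running; once it has finished, it is done. To stop the event loop, first cancel its tasks. You can get the loop's tasks with asyncio.all_tasks() (older code used asyncio.Task.all_tasks(), which was removed in Python 3.9).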

import asyncio
import time

now = time.time

start = now()


async def do_some_work(x):
    print("Coroutine running: {}".format(x))
    await asyncio.sleep(x)
    return "done after  {}".format(x)


loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

coroutine1 = do_some_work(2)
coroutine2 = do_some_work(3)
coroutine3 = do_some_work(4)

tasks = [
    asyncio.ensure_future(coroutine1),
    asyncio.ensure_future(coroutine2),
    asyncio.ensure_future(coroutine3),
]

try:
    results = loop.run_until_complete(asyncio.wait(tasks))
except KeyboardInterrupt as k:
    # print(k)

    # asyncio.all_tasks(loop) replaces asyncio.Task.all_tasks(), which was removed in Python 3.9
    print(asyncio.all_tasks(loop))
    for task in asyncio.all_tasks(loop):
        print(task.cancel())  # loop over the tasks and cancel them one by one
    loop.stop()  # after stop(), run the loop once more so the cancellations are processed
    loop.run_forever()

finally:
    loop.close()  # close last, otherwise an exception is still raised

print(now() - start)

  • Result: run the script from a terminal and press Ctrl + C; run_until_complete raises KeyboardInterrupt (the output below is from a Python 3.6 run)
Coroutine running: 2
Coroutine running: 3
Coroutine running: 4

{<Task pending coro=<do_some_work() running at asyncioDemo.py:158> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x0000025F84BF3DF8>()]> cb=[_wait.<locals>._on_completion() at c:\python36\Lib\asyncio\tasks.py:380]>,
 <Task pending coro=<do_some_work() running at asyncioDemo.py:158> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x0000025F84B64C18>()]> cb=[_wait.<locals>._on_completion() at c:\python36\Lib\asyncio\tasks.py:380]>,
 <Task pending coro=<wait() running at c:\python36\Lib\asyncio\tasks.py:313> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x0000025F84BF3F18>()]>>,
 <Task pending coro=<do_some_work() running at asyncioDemo.py:158> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x0000025F84BF3D98>()]> cb=[_wait.<locals>._on_completion() at c:\python36\Lib\asyncio\tasks.py:380]>}

True
True
True
True

2.00264573097229
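Cancellation can also be tried without pressing Ctrl + C. The following is a minimal sketch, assuming Python 3.7 or later for asyncio.run, in which one coroutine cancels a task it started and then inspects the task's state. The try/except inside do_some_work is optional and only marks where cleanup code would go.

```python
import asyncio


async def do_some_work(x):
    try:
        await asyncio.sleep(x)
        return f"done after {x}"
    except asyncio.CancelledError:
        # Clean up here if needed, then re-raise so the task ends up cancelled.
        raise


async def main():
    task = asyncio.ensure_future(do_some_work(10))
    await asyncio.sleep(0.05)      # let the task start and reach its await
    requested = task.cancel()      # True means cancellation was requested
    try:
        await task
    except asyncio.CancelledError:
        pass                       # awaiting a cancelled task raises CancelledError
    return requested, task.cancelled(), task.done()


outcome = asyncio.run(main())
print(outcome)
```

task.cancel() only requests cancellation. The task actually becomes cancelled once the event loop gets a chance to throw CancelledError into the coroutine, which is why the example awaits the task afterwards.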

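Event loops in different threads

  • An event loop is where coroutines are registered, but some coroutines need to be added to the loop dynamically. A simple way to do this is with multiple threads: the current thread creates an event loop, then starts a new thread and runs the event loop in it. The current thread is not blocked.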

import asyncio
from threading import Thread
import time

now = time.time


def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()


async def do_some_work(x):
    print('Waiting {}'.format(x))
    await asyncio.sleep(x)
    print('Done after {}s'.format(x))


def work(x):
    print("start", x)
    time.sleep(x)
    print("end", x)


start = now()

new_loop = asyncio.new_event_loop()

t = Thread(target=start_loop, args=(new_loop,))
t.start()

new_loop.call_soon_threadsafe(work, 6)
new_loop.call_soon_threadsafe(work, 3)
print(now() - start)
"""
start 6
0.002008199691772461
end 6
start 3
end 3

"""
'''
After this code starts, the current thread is not blocked. The new thread runs the work functions registered with call_soon_threadsafe in order. Because time.sleep is a synchronous, blocking call, running both work calls takes roughly 6 + 3 seconds.
'''
#
# asyncio.run_coroutine_threadsafe(do_some_work(6), new_loop)
# asyncio.run_coroutine_threadsafe(do_some_work(3), new_loop)
# print(now() - start)
"""
Waiting 6
Waiting 3
0.0009968280792236328
Done after 3s
Done after 6s

"""
'''
In this second example, the main thread creates new_loop and a child thread runs it as an endless event loop. The main thread registers new coroutine objects with run_coroutine_threadsafe, so the child thread runs them concurrently on its event loop while the main thread is not blocked. The total run time is about 6 seconds.
'''
# Note: the loop thread runs forever; call new_loop.call_soon_threadsafe(new_loop.stop) when the work is done.
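The examples above never collect a return value or shut the loop thread down. The sketch below is one way to do both, under these assumptions: do_some_work is changed to return a string, the delays are shortened, and a 5 second timeout is chosen arbitrarily. asyncio.run_coroutine_threadsafe returns a concurrent.futures.Future, so the main thread can block on result() to fetch the value.

```python
import asyncio
from threading import Thread


def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()


async def do_some_work(x):
    await asyncio.sleep(x)
    return f"done after {x}"


new_loop = asyncio.new_event_loop()
t = Thread(target=start_loop, args=(new_loop,))
t.start()

# Submit coroutines from the main thread; each call returns a concurrent.futures.Future.
future_a = asyncio.run_coroutine_threadsafe(do_some_work(0.2), new_loop)
future_b = asyncio.run_coroutine_threadsafe(do_some_work(0.1), new_loop)

# result() blocks the calling thread until the coroutine has finished on the loop thread.
results = [future_a.result(timeout=5), future_b.result(timeout=5)]

# Ask the loop to stop from outside its thread, then wait for the thread to exit.
new_loop.call_soon_threadsafe(new_loop.stop)
t.join()
new_loop.close()
print(results)
```

Stopping the loop through call_soon_threadsafe matters because loop methods are generally not safe to call directly from another thread.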
  • Reference: Liao Xuefeng's tutorial
import asyncio


async def wget(host):
    print('wget %s...' % host)

    reader, writer = await asyncio.open_connection(host, 80)

    header = 'GET / HTTP/1.0\r\nHost: %s\r\n\r\n' % host
    writer.write(header.encode('utf-8'))
    await writer.drain()
    while True:
        line = await reader.readline()
        if line == b'\r\n':
            break
        print('%s header > %s' % (host, line.decode('utf-8').rstrip()))
    # Ignore the body, close the socket
    writer.close()
    await writer.wait_closed()


async def main():
    hosts = ['www.sina.com.cn', 'www.sohu.com', 'www.163.com']
    # asyncio.wait() no longer accepts bare coroutines (removed in Python 3.11); gather does
    await asyncio.gather(*(wget(host) for host in hosts))


loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()

  • Output
wget www.sina.com.cn...
wget www.sohu.com...
wget www.163.com...
www.sina.com.cn header > HTTP/1.1 302 Moved Temporarily
www.sina.com.cn header > Server: nginx
www.sina.com.cn header > Date: Sat, 27 Apr 2019 14:14:29 GMT
www.sina.com.cn header > Content-Type: text/html
www.sina.com.cn header > Content-Length: 154
www.sina.com.cn header > Connection: close
www.sina.com.cn header > Location: https://www.sina.com.cn/
www.sina.com.cn header > X-Via-CDN: f=edge,s=cmcc.hebei.ha2ts4.140.nb.sinaedge.com,c=183.197.88.253;
www.sina.com.cn header > X-Via-Edge: 1556374469876fd58c5b798403e6f6e7f94d2
www.sohu.com header > HTTP/1.1 200 OK
www.sohu.com header > Content-Type: text/html;charset=UTF-8
www.sohu.com header > Connection: close
www.sohu.com header > Server: nginx
www.sohu.com header > Date: Sat, 27 Apr 2019 14:13:44 GMT
www.sohu.com header > Cache-Control: max-age=60
www.sohu.com header > X-From-Sohu: X-SRC-Cached
www.sohu.com header > Content-Encoding: gzip
www.sohu.com header > FSS-Cache: HIT from 4742539.7953813.5615036
www.sohu.com header > FSS-Proxy: Powered by 3628410.5725572.4500890
www.163.com header > HTTP/1.0 302 Moved Temporarily
www.163.com header > Server: Cdn Cache Server V2.0
www.163.com header > Date: Sat, 27 Apr 2019 14:14:29 GMT
www.163.com header > Content-Length: 0
www.163.com header > Location: http://www.163.com/special/0077jt/error_isp.html
www.163.com header > X-Via: 1.0 xiyidong136:1 (Cdn Cache Server V2.0)
www.163.com header > Connection: close

posted @ 2019-04-27 22:16  拐弯