Coroutines: Automatic Switching on I/O
Reference blog: http://www.cnblogs.com/alex3714/articles/5248247.html
1. Preface
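gevent is a third-party library that makes it easy to write concurrent code in either a synchronous or an asynchronous style. The main building block in gevent is the Greenlet, a lightweight coroutine made available to Python as a C extension module. All greenlets run inside the operating-system process of the main program, but they are scheduled cooperatively: a greenlet keeps running until it gives up control of its own accord.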
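To make "cooperatively scheduled" concrete, here is a minimal sketch that uses only plain Python generators. It is not how gevent or greenlet is implemented; the names `task` and `run_all` are made up for this illustration. Each task runs until it voluntarily yields, and a small round-robin loop decides which one resumes next.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: do one step of work, then hand control back."""
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield  # voluntary switch point, similar in spirit to gevent.sleep(0)


def run_all(tasks):
    """Round-robin scheduler: resume each task until it yields or finishes."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # the task finished, so drop it
        ready.append(current)      # not finished yet, queue it again


log = []
run_all([task("A", 2, log), task("B", 3, log)])
print(log)
```

Nothing interrupts a task from the outside: if a task never yields, the others never run. The same holds for greenlets, which is why blocking calls need to become switch points.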
2. Gevent
import gevent


def func1():
    print('running in func1...')
    gevent.sleep(2)  # simulate a 2-second I/O wait; control passes to another greenlet
    print('Explicit context switch to func1 again...')


def func2():
    print('Explicit context func2...')
    gevent.sleep(1)
    print('Implicit context switch to func2 again...')


gevent.joinall([
    gevent.spawn(func1),
    gevent.spawn(func2),
])
import gevent


def func1():
    print('running in func1...')
    gevent.sleep(2)
    print('Explicit context switch to func1 again...')


def func2():
    print('Explicit context func2...')
    gevent.sleep(1)
    print('Implicit context switch to func2 again...')


def func3():
    print('running in func3...')
    gevent.sleep(0)  # yield control without waiting for any time
    print('running func3 again...')


gevent.joinall([
    gevent.spawn(func1),
    gevent.spawn(func2),
    gevent.spawn(func3),
])
3. Automatically Switching Tasks on Blocking I/O
A simple concurrent web fetch with gevent
# -*- coding: UTF-8 -*-
# gevent cannot detect urllib's blocking I/O by itself; monkey.patch_all()
# replaces the blocking standard-library calls with cooperative versions.
# Patch before importing urllib so the modules it relies on are already patched.
from gevent import monkey
monkey.patch_all()

import time
from urllib import request

import gevent


def f(url):
    print('Get: %s' % url)
    resp = request.urlopen(url)
    data = resp.read()
    print('%d bytes received from %s.' % (len(data), url))


urls = [
    'https://www.python.org/',
    'https://www.yahoo.com/',
    'https://github.com/',
]

# Synchronous: fetch the URLs one after another
time_start = time.time()
for url in urls:
    f(url)
print('Synchronous run took %s seconds' % (time.time() - time_start))

# Asynchronous: one greenlet per URL
async_time = time.time()
gevent.joinall([gevent.spawn(f, url) for url in urls])
print('Asynchronous run took %s seconds' % (time.time() - async_time))
Output:
Get: https://www.python.org/
48956 bytes received from https://www.python.org/.
Get: https://www.yahoo.com/
517389 bytes received from https://www.yahoo.com/.
Get: https://github.com/
51473 bytes received from https://github.com/.
Synchronous run took 4.6533966064453125 seconds
Get: https://www.python.org/
Get: https://www.yahoo.com/
Get: https://github.com/
48956 bytes received from https://www.python.org/.
521617 bytes received from https://www.yahoo.com/.
51473 bytes received from https://github.com/.
Asynchronous run took 1.916191816329956 seconds
# As you can see, the difference in elapsed time between the two runs is substantial
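Without monkey.patch_all() the program still runs correctly, but the synchronous and asynchronous timings are practically the same. gevent is unaware of urllib's blocking I/O, so it never switches, and the greenlets run one after another (serially) even though gevent is in use. monkey.patch_all() replaces the blocking standard-library modules that urllib relies on (such as socket) with gevent's cooperative versions. When a greenlet then waits on I/O, gevent automatically switches to another greenlet that is not waiting.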
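The idea behind monkey patching can be sketched with the standard library alone. This is only an illustration of the mechanism, not gevent's actual code: `fetch` and `cooperative_sleep` are hypothetical names, and `time.sleep` stands in for any blocking call. Because library code looks the function up through its module at call time, swapping the module attribute changes the library's behavior without editing the library.

```python
import time


def fetch(delay):
    """Stand-in for library code that blocks; it looks up time.sleep when called."""
    time.sleep(delay)
    return "done"


calls = []


def cooperative_sleep(seconds):
    # Hypothetical replacement: record the request instead of blocking.
    # A real cooperative version would switch to another task here.
    calls.append(seconds)


original_sleep = time.sleep
time.sleep = cooperative_sleep       # the "monkey patch"
try:
    result = fetch(5)                # returns immediately, no 5-second wait
finally:
    time.sleep = original_sleep      # always restore the original function

print(result, calls)
```

This also suggests why patching early matters: code that has already saved its own reference to the original function (for example via `from time import sleep`) would not see the replacement.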
4. Multi-Socket Concurrency in a Single Thread with gevent
server side
import gevent
from gevent import socket, monkey

monkey.patch_all()


def server(port):
    s = socket.socket()
    s.bind(('0.0.0.0', port))
    s.listen(500)
    while True:
        cli, addr = s.accept()
        # Handle each client in its own greenlet
        gevent.spawn(handle_request, cli)


def handle_request(conn):
    try:
        while True:
            data = conn.recv(1024)
            print("recv:", data)
            if not data:
                # Empty data means the client closed the connection: stop echoing
                conn.shutdown(socket.SHUT_WR)
                break
            conn.sendall(data)
    except Exception as ex:
        print(ex)
    finally:
        conn.close()


if __name__ == '__main__':
    server(8001)
client side
import socket

HOST = 'localhost'  # The remote host
PORT = 8001         # The same port as used by the server

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
while True:
    msg = bytes(input(">>:"), encoding="utf8")
    if not msg:
        # Nothing to send; an empty message would leave recv() waiting forever
        continue
    s.sendall(msg)
    data = s.recv(1024)
    print('Received', repr(data))
s.close()
# 100 concurrent socket connections
import socket
import threading


def sock_conn():
    client = socket.socket()
    client.connect(("localhost", 8001))
    count = 0
    while True:
        # msg = input(">>:").strip()
        # if len(msg) == 0: continue
        client.send(("hello %s" % count).encode("utf-8"))
        data = client.recv(1024)
        print("[%s]recv from server:" % threading.get_ident(), data.decode())  # result
        count += 1
    client.close()


for i in range(100):
    t = threading.Thread(target=sock_conn)
    t.start()