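Multiprocessing in Python
Because of CPython's GIL, multithreading is not necessarily a good choice for CPU-bound programs.
Multiprocessing runs code in completely independent process environments and can make full use of multiple processors.
The isolation between processes has its own cost, though: data is not shared, and processes are heavier than threads.
1. multiprocessing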
import multiprocessing
import datetime


def calc(i):
    # CPU-bound busy loop: count to ten million
    total = 0
    for _ in range(10000000):
        total += 1
    print(i, total)


if __name__ == '__main__':
    start = datetime.datetime.now()
    ps = []
    for i in range(5):
        # One process per task; args must be a tuple
        p = multiprocessing.Process(target=calc, args=(i,), name='calc-{}'.format(i))
        ps.append(p)
        p.start()
    for p in ps:
        p.join()  # wait for every child before measuring the elapsed time
    delta = (datetime.datetime.now() - start).total_seconds()
    print(delta)
    print('end==')
0 10000000
1 10000000
2 10000000
3 10000000
4 10000000
2.738819
end==
With multiple processes, the work really does run in parallel.
Name | Description
pid | Process ID
exitcode | Exit status code of the process
terminate() | Terminates the process
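To make the three attributes concrete, here is a minimal sketch that starts a child, reads its pid, terminates it, and then reads its exitcode. The `sleeper` and `inspect_process` functions are hypothetical names used only for illustration.

```python
import multiprocessing
import time


def sleeper(seconds):
    # Hypothetical worker: sleeps so the parent has time to inspect it.
    time.sleep(seconds)


def inspect_process():
    p = multiprocessing.Process(target=sleeper, args=(30,), name='sleeper-0')
    before = p.exitcode      # None: the process has not finished (or started)
    p.start()
    pid = p.pid              # an integer once the process has been started
    p.terminate()            # ask the operating system to stop the child
    p.join()                 # wait until it has actually exited
    return before, pid, p.exitcode


if __name__ == '__main__':
    before, pid, exitcode = inspect_process()
    print(before, pid, exitcode)
```

After terminate() and join(), exitcode is non-zero because the child did not exit normally.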
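Inter-process synchronization: multiprocessing provides the same synchronization classes as threading. They are used the same way and behave similarly.
Synchronizing processes costs more than synchronizing threads, however, and the underlying implementation is different. Python simply hides these differences so that multiprocessing stays easy to use.
multiprocessing also provides shared memory and server processes for sharing data, plus Queue and Pipe for inter-process communication.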
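As a rough illustration of shared memory combined with a lock, the sketch below lets several processes increment one shared counter. `add_many` and `count_in_parallel` are hypothetical helper names; `multiprocessing.Value` and its `get_lock()` method are the standard-library pieces being shown.

```python
import multiprocessing


def add_many(counter, times):
    for _ in range(times):
        # += on shared memory is a read-modify-write, so guard it
        # with the lock that Value carries.
        with counter.get_lock():
            counter.value += 1


def count_in_parallel(workers=4, times=1000):
    counter = multiprocessing.Value('i', 0)   # a C int in shared memory
    ps = [multiprocessing.Process(target=add_many, args=(counter, times))
          for _ in range(workers)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    return counter.value


if __name__ == '__main__':
    print(count_in_parallel())   # 4 workers x 1000 increments -> 4000
```

Without the lock, increments from different processes could overwrite each other and the total might come out lower.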
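How this differs from threads:
Multiprocessing means starting several interpreter processes, so data passed between processes must be serialized and deserialized.
Thread safety of data: as long as each process does not run multiple threads of its own, the GIL has practically no effect.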
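The serialization step can be seen with a Queue. In this sketch (the `producer` and `collect` names are assumptions for illustration), every object put on the queue is pickled in the child and rebuilt as a copy in the parent.

```python
import multiprocessing


def producer(q, n):
    # Each put() pickles the object before it is sent to the other process.
    for i in range(n):
        q.put({'id': i, 'square': i * i})
    q.put(None)  # sentinel: nothing more is coming


def collect(n=3):
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=producer, args=(q, n))
    p.start()
    items = []
    while True:
        item = q.get()       # get() unpickles a copy in this process
        if item is None:
            break
        items.append(item)
    p.join()                 # join only after the queue has been drained
    return items


if __name__ == '__main__':
    print(collect())
    # [{'id': 0, 'square': 0}, {'id': 1, 'square': 1}, {'id': 2, 'square': 4}]
```

Only picklable objects can travel this way, which is one reason passing data between processes costs more than sharing it between threads.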
2. Process pool
multiprocessing.Pool is the process pool class.
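Name | Description
apply(self, func, args=(), kwds={}) | Blocking call: the main process waits for each task, so the tasks run one after another
apply_async(self, func, args=(), kwds={}, callback=None, error_callback=None) | Used the same way as apply, but non-blocking; the callback runs once the result is available
close() | Closes the pool; it accepts no new tasks
terminate() | Stops the worker processes; tasks not yet processed are dropped
join() | Blocks the main process until the workers exit; must be called after close() or terminate()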
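The methods in the table fit together roughly as follows. This is a minimal sketch; `square` and `run_pool` are hypothetical names, and the pool size of 2 is arbitrary.

```python
import multiprocessing


def square(x):
    return x * x


def run_pool():
    collected = []
    pool = multiprocessing.Pool(processes=2)
    # apply() blocks until this single call has finished.
    first = pool.apply(square, args=(3,))
    # apply_async() returns at once; the callback runs in the parent
    # process each time a result arrives.
    handles = [pool.apply_async(square, args=(i,), callback=collected.append)
               for i in range(5)]
    pool.close()     # no more tasks may be submitted
    pool.join()      # wait for the workers to exit
    return first, sorted(collected), [h.get() for h in handles]


if __name__ == '__main__':
    print(run_pool())   # (9, [0, 1, 4, 9, 16], [0, 1, 4, 9, 16])
```

The callback list is sorted before returning because results can arrive in any order.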
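3. Choosing between multiprocessing and multithreading
CPU-bound: CPython uses the GIL, so threads contend for the lock and the advantage of multiple cores is lost. Multiple processes are more efficient here.
I/O-bound: well suited to multithreading. It avoids the serialization overhead of passing data between processes, and while one thread waits on I/O another thread can run, so efficiency is good.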
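One way to check this on your own machine is to time the same tasks under a thread pool and a process pool. The sketch below uses concurrent.futures for brevity; `cpu_task`, `io_task` and `timed` are hypothetical helpers, and the numbers will vary with hardware, so no timings are quoted here.

```python
from concurrent import futures
import time


def cpu_task(n):
    # Pure computation: holds the GIL for the whole loop.
    total = 0
    for i in range(n):
        total += i
    return total


def io_task(delay):
    time.sleep(delay)   # stands in for a network or disk wait
    return delay


def timed(executor_cls, fn, args, workers=4):
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as ex:
        results = list(ex.map(fn, args))
    return results, time.perf_counter() - start


if __name__ == '__main__':
    _, t_threads = timed(futures.ThreadPoolExecutor, cpu_task, [1000000] * 4)
    _, t_procs = timed(futures.ProcessPoolExecutor, cpu_task, [1000000] * 4)
    print('cpu-bound: threads={:.2f}s processes={:.2f}s'.format(t_threads, t_procs))
    _, t_io = timed(futures.ThreadPoolExecutor, io_task, [0.5] * 4)
    print('io-bound:  threads={:.2f}s'.format(t_io))
```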
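4. Typical use cases
Request/response model: a common processing model in web applications.
The master starts several worker processes, usually as many as there are CPUs, to take advantage of multiple cores.
Inside each worker process the work is mostly network and disk I/O, so the worker starts multiple threads to handle more requests concurrently. A worker handling a user request often has to wait for data, and after processing it still has to send the response back over the network. This master/worker layout is the process model nginx uses, although nginx workers rely on an event loop rather than threads for I/O.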
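A toy version of this layout might look like the sketch below: a master splits requests across worker processes, and each worker fans its share out to threads. `handle_request`, `worker_main` and `master` are hypothetical names, and the handler only formats a string where a real one would wait on I/O.

```python
import multiprocessing
import os
from concurrent.futures import ThreadPoolExecutor


def handle_request(req):
    # Hypothetical handler; a real one would wait on network or disk I/O.
    return 'pid={} handled {}'.format(os.getpid(), req)


def worker_main(requests, threads=4):
    # Runs inside one worker process: serve its requests with a thread pool.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(handle_request, requests))


def master(all_requests, workers=None):
    workers = workers or multiprocessing.cpu_count()
    # One batch of requests per worker process.
    batches = [all_requests[i::workers] for i in range(workers)]
    with multiprocessing.Pool(processes=workers) as pool:
        per_worker = pool.map(worker_main, batches)
    return [line for batch in per_worker for line in batch]


if __name__ == '__main__':
    for line in master(list(range(8)), workers=2):
        print(line)
```

A real server would keep the workers alive and feed them requests continuously; this sketch processes one fixed batch to keep the structure visible.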
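The concurrent package
1. concurrent.futures
A module for asynchronous parallel tasks. It provides a convenient high-level interface for executing callables asynchronously.
It provides two pool executors:
ThreadPoolExecutor: an executor that makes asynchronous calls using a thread pool.
ProcessPoolExecutor: an executor that makes asynchronous calls using a process pool.
2. The ThreadPoolExecutor object
First create a pool executor object, which is an instance of a subclass of Executor.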
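Method | Meaning
ThreadPoolExecutor(max_workers=1) | Creates a pool of at most max_workers threads that execute calls asynchronously; returns an Executor instance
submit(fn, *args, **kwargs) | Submits a function and its arguments for execution; returns a Future instance
shutdown(wait=True) | Cleans up the pool; with wait=True it blocks until pending tasks have finished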
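The Future class
Method | Meaning
done() | Returns True if the call was successfully cancelled or has finished running
cancelled() | Returns True if the call was successfully cancelled
running() | Returns True if the call is currently running and cannot be cancelled
cancel() | Attempts to cancel the call; returns False if it is already running or finished and cannot be cancelled, otherwise True
result(timeout=None) | Returns the result of the call; with timeout=None it waits indefinitely, and when a timeout expires it raises concurrent.futures.TimeoutError
exception(timeout=None) | Returns the exception raised by the call, or None if it completed normally; timeout behaves as for result()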
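The Future methods are easier to follow when each state is forced deliberately. The sketch below is one way to do that under stated assumptions: a single-worker pool, plus a `threading.Event` named `gate` (a helper introduced here, not part of the original notes) that keeps the first task running until the main thread releases it.

```python
from concurrent import futures
import threading


def future_demo():
    gate = threading.Event()

    def wait_then_double(x):
        gate.wait()          # block until the main thread opens the gate
        return x * 2

    def fail():
        raise ValueError('boom')

    info = {}
    with futures.ThreadPoolExecutor(max_workers=1) as executor:
        running = executor.submit(wait_then_double, 21)
        queued = executor.submit(wait_then_double, 1)  # stuck behind `running`
        info['cancelled'] = queued.cancel()    # still pending, so this succeeds
        try:
            running.result(timeout=0.1)        # the task is blocked on the gate
        except futures.TimeoutError:
            info['timed_out'] = True
        gate.set()
        info['value'] = running.result()       # waits, then returns 42
        failed = executor.submit(fail)
        info['error'] = failed.exception()     # returned to us, not raised
        info['done'] = running.done() and queued.done() and failed.done()
    return info


info = future_demo()
print(info['cancelled'], info['timed_out'], info['value'], info['done'])
print(type(info['error']).__name__)
```

Note that done() is True for the cancelled future as well, which matches the table: a call counts as done whether it finished or was cancelled.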
import threading
from concurrent import futures
import logging
import time

FORMAT = '%(asctime)s %(threadName)s %(thread)d %(message)s'
logging.basicConfig(format=FORMAT, level=logging.INFO)


def worker(n):
    logging.info('start to work{}'.format(n))
    time.sleep(4)
    logging.info('stop{}'.format(n))


executor = futures.ThreadPoolExecutor(max_workers=3)
fs = []
for i in range(3):
    # Use a separate name so the `futures` module is not overwritten
    future = executor.submit(worker, i)
    fs.append(future)
for i in range(3, 6):
    future = executor.submit(worker, i)
    fs.append(future)

while True:
    time.sleep(5)
    logging.info(threading.enumerate())
    flag = True
    for f in fs:
        logging.info(f.done())
        flag = flag and f.done()
    print('-' * 30)
    if flag:
        # Every task is done, so release the worker threads
        executor.shutdown()
        logging.info(threading.enumerate())
        break
------------------------------
2018-06-13 09:57:35,049 MainThread 8376 [<_MainThread(MainThread, started 8376)>, <Thread(Thread-1, started daemon 7480)>, <Thread(Thread-3, started daemon 7368)>, <Thread(Thread-2, started daemon 7464)>]
2018-06-13 09:57:35,049 MainThread 8376 True
2018-06-13 09:57:35,050 MainThread 8376 True
2018-06-13 09:57:35,050 MainThread 8376 True
2018-06-13 09:57:35,050 MainThread 8376 True
2018-06-13 09:57:35,050 MainThread 8376 True
2018-06-13 09:57:35,051 MainThread 8376 True
2018-06-13 09:57:35,051 MainThread 8376 [<_MainThread(MainThread, started 8376)>]
3. The ProcessPoolExecutor object
import threading
from concurrent import futures
import logging
import time

FORMAT = '%(asctime)s %(threadName)s %(thread)d %(message)s'
logging.basicConfig(format=FORMAT, level=logging.INFO)


def worker(n):
    logging.info('start to work{}'.format(n))
    time.sleep(4)
    logging.info('stop{}'.format(n))


if __name__ == '__main__':
    # As written this still creates a thread pool, which is what the output
    # below reflects; use futures.ProcessPoolExecutor(max_workers=3) here
    # to run the tasks in a process pool.
    executor = futures.ThreadPoolExecutor(max_workers=3)
    fs = []
    for i in range(3):
        future = executor.submit(worker, i)
        fs.append(future)
    for i in range(3, 6):
        future = executor.submit(worker, i)
        fs.append(future)

    while True:
        time.sleep(5)
        logging.info(threading.enumerate())
        flag = True
        for f in fs:
            logging.info(f.done())
            flag = flag and f.done()
        print('-' * 30)
        if flag:
            executor.shutdown()
            logging.info(threading.enumerate())
            break
------------------------------
2018-06-13 10:01:18,076 MainThread 6436 [<Thread(Thread-3, started daemon 4188)>, <Thread(Thread-1, started daemon 7284)>, <_MainThread(MainThread, started 6436)>, <Thread(Thread-2, started daemon 6164)>]
2018-06-13 10:01:18,076 MainThread 6436 True
2018-06-13 10:01:18,076 MainThread 6436 True
2018-06-13 10:01:18,077 MainThread 6436 True
2018-06-13 10:01:18,077 MainThread 6436 True
2018-06-13 10:01:18,077 MainThread 6436 True
2018-06-13 10:01:18,077 MainThread 6436 True
2018-06-13 10:01:18,078 MainThread 6436 [<_MainThread(MainThread, started 6436)>]
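Code that starts processes must be guarded with if __name__ == '__main__'. On platforms that start children by spawning a fresh interpreter, such as Windows, each child re-imports the main module, and without the guard it would try to start processes again.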
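For comparison, a minimal sketch that really does run its tasks in a process pool is shown here. The `run` helper is a hypothetical name; `worker` returns its own pid so the parent can confirm the work happened in other processes.

```python
from concurrent import futures
import os


def worker(n):
    # Runs in a child process, so it reports a different pid than the parent.
    return n + 100, os.getpid()


def run(count=6, max_workers=3):
    executor = futures.ProcessPoolExecutor(max_workers=max_workers)
    fs = [executor.submit(worker, i) for i in range(count)]
    results = [f.result() for f in fs]   # blocks until each task is done
    executor.shutdown()
    return results


if __name__ == '__main__':
    parent = os.getpid()
    for value, pid in run():
        print(value, pid != parent)   # 100 True, 101 True, ... 105 True
```

Apart from the class name, the calls are the same as in the thread pool example, which is the point of the shared Executor interface.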
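4. Using an executor as a context manager
concurrent.futures.ProcessPoolExecutor inherits from concurrent.futures._base.Executor. The parent class defines __enter__ and __exit__, so it supports context management and can be used in a with statement.
__exit__ essentially calls shutdown(wait=True), so it blocks until all running tasks have finished.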
import threading
from concurrent import futures
import logging
import time

FORMAT = '%(asctime)s %(threadName)s %(thread)d %(message)s'
logging.basicConfig(format=FORMAT, level=logging.INFO)


def worker(n):
    logging.info('start to work{}'.format(n))
    time.sleep(5)
    logging.info('stop{}'.format(n))
    return n + 100


if __name__ == '__main__':
    executor = futures.ProcessPoolExecutor(max_workers=3)
    with executor:
        fs = []
        for i in range(3):
            # Use a separate name so the `futures` module is not overwritten
            future = executor.submit(worker, i)
            fs.append(future)
        for i in range(3, 6):
            future = executor.submit(worker, i)
            fs.append(future)
        while True:
            time.sleep(2)
            logging.info(threading.enumerate())
            flag = True
            for f in fs:
                logging.info(f.done())
                flag = flag and f.done()
                if f.done():
                    logging.info('result={}'.format(f.result()))
            print('-' * 30)
            if flag:
                break
    # Leaving the with block has called shutdown(wait=True)
    logging.info('-------end--------')
    logging.info(threading.enumerate())
2018-06-13 10:18:35,744 MainThread 5468 start to work0
2018-06-13 10:18:35,751 MainThread 7936 start to work1
2018-06-13 10:18:35,763 MainThread 7020 start to work2
2018-06-13 10:18:37,528 MainThread 7976 [<_MainThread(MainThread, started 7976)>, <Thread(QueueFeederThread, started daemon 8136)>, <Thread(Thread-1, started daemon 3932)>]
2018-06-13 10:18:37,528 MainThread 7976 False
2018-06-13 10:18:37,529 MainThread 7976 False
2018-06-13 10:18:37,529 MainThread 7976 False
2018-06-13 10:18:37,529 MainThread 7976 False
2018-06-13 10:18:37,529 MainThread 7976 False
2018-06-13 10:18:37,529 MainThread 7976 False
------------------------------
------------------------------
2018-06-13 10:18:39,530 MainThread 7976 [<_MainThread(MainThread, started 7976)>, <Thread(QueueFeederThread, started daemon 8136)>, <Thread(Thread-1, started daemon 3932)>]
2018-06-13 10:18:39,530 MainThread 7976 False
2018-06-13 10:18:39,531 MainThread 7976 False
2018-06-13 10:18:39,531 MainThread 7976 False
2018-06-13 10:18:39,531 MainThread 7976 False
2018-06-13 10:18:39,532 MainThread 7976 False
2018-06-13 10:18:39,532 MainThread 7976 False
2018-06-13 10:18:40,744 MainThread 5468 stop0
2018-06-13 10:18:40,745 MainThread 5468 start to work3
2018-06-13 10:18:40,751 MainThread 7936 stop1
2018-06-13 10:18:40,753 MainThread 7936 start to work4
2018-06-13 10:18:40,764 MainThread 7020 stop2
2018-06-13 10:18:40,764 MainThread 7020 start to work5
2018-06-13 10:18:41,533 MainThread 7976 [<_MainThread(MainThread, started 7976)>, <Thread(QueueFeederThread, started daemon 8136)>, <Thread(Thread-1, started daemon 3932)>]
2018-06-13 10:18:41,533 MainThread 7976 True
2018-06-13 10:18:41,534 MainThread 7976 result=100
2018-06-13 10:18:41,534 MainThread 7976 True
2018-06-13 10:18:41,535 MainThread 7976 result=101
2018-06-13 10:18:41,535 MainThread 7976 True
2018-06-13 10:18:41,536 MainThread 7976 result=102
2018-06-13 10:18:41,536 MainThread 7976 False
2018-06-13 10:18:41,536 MainThread 7976 False
2018-06-13 10:18:41,537 MainThread 7976 False
------------------------------
------------------------------
2018-06-13 10:18:43,537 MainThread 7976 [<_MainThread(MainThread, started 7976)>, <Thread(QueueFeederThread, started daemon 8136)>, <Thread(Thread-1, started daemon 3932)>]
2018-06-13 10:18:43,537 MainThread 7976 True
2018-06-13 10:18:43,537 MainThread 7976 result=100
2018-06-13 10:18:43,538 MainThread 7976 True
2018-06-13 10:18:43,538 MainThread 7976 result=101
2018-06-13 10:18:43,538 MainThread 7976 True
2018-06-13 10:18:43,538 MainThread 7976 result=102
2018-06-13 10:18:43,538 MainThread 7976 False
2018-06-13 10:18:43,539 MainThread 7976 False
2018-06-13 10:18:43,539 MainThread 7976 False
------------------------------
2018-06-13 10:18:45,540 MainThread 7976 [<_MainThread(MainThread, started 7976)>, <Thread(QueueFeederThread, started daemon 8136)>, <Thread(Thread-1, started daemon 3932)>]
2018-06-13 10:18:45,540 MainThread 7976 True
2018-06-13 10:18:45,540 MainThread 7976 result=100
2018-06-13 10:18:45,541 MainThread 7976 True
2018-06-13 10:18:45,541 MainThread 7976 result=101
2018-06-13 10:18:45,541 MainThread 7976 True
2018-06-13 10:18:45,541 MainThread 7976 result=102
2018-06-13 10:18:45,542 MainThread 7976 False
2018-06-13 10:18:45,542 MainThread 7976 False
2018-06-13 10:18:45,542 MainThread 7976 False
2018-06-13 10:18:45,746 MainThread 5468 stop3
2018-06-13 10:18:45,754 MainThread 7936 stop4
2018-06-13 10:18:45,765 MainThread 7020 stop5
------------------------------
2018-06-13 10:18:47,542 MainThread 7976 [<_MainThread(MainThread, started 7976)>, <Thread(QueueFeederThread, started daemon 8136)>, <Thread(Thread-1, started daemon 3932)>]
2018-06-13 10:18:47,542 MainThread 7976 True
2018-06-13 10:18:47,542 MainThread 7976 result=100
2018-06-13 10:18:47,543 MainThread 7976 True
2018-06-13 10:18:47,543 MainThread 7976 result=101
2018-06-13 10:18:47,544 MainThread 7976 True
2018-06-13 10:18:47,544 MainThread 7976 result=102
2018-06-13 10:18:47,544 MainThread 7976 True
2018-06-13 10:18:47,544 MainThread 7976 result=103
2018-06-13 10:18:47,544 MainThread 7976 True
2018-06-13 10:18:47,545 MainThread 7976 result=104
2018-06-13 10:18:47,545 MainThread 7976 True
2018-06-13 10:18:47,545 MainThread 7976 result=105
2018-06-13 10:18:47,587 MainThread 7976 -------end--------
2018-06-13 10:18:47,587 MainThread 7976 [<_MainThread(MainThread, started 7976)>]
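In summary, concurrent.futures unifies how thread pools and process pools are called, which simplifies programming.
One limitation: individual thread names cannot be set (since Python 3.6, ThreadPoolExecutor does accept a thread_name_prefix argument).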
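On Python 3.6 or later, a partial workaround for thread naming is the thread_name_prefix argument of ThreadPoolExecutor. The sketch below assumes that version; the prefix 'worker' and the `current_name` helper are arbitrary choices for illustration.

```python
from concurrent import futures
import threading


def current_name(_):
    # Report the name of the pool thread that ran this task.
    return threading.current_thread().name


with futures.ThreadPoolExecutor(max_workers=2,
                                thread_name_prefix='worker') as executor:
    names = set(executor.map(current_name, range(4)))

print(sorted(names))
```

Every name starts with the given prefix followed by a numeric suffix chosen by the pool, so the individual names are still not under your control.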