Python Multiprocessing Overview
multiprocessing
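Because of the global interpreter lock (GIL) in CPython, Python threads do not run Python code on several cores at once, so making full use of a multi-core CPU usually means using multiple processes. Python ships a very convenient package for this, multiprocessing: in the simplest case you define a function and the package takes care of running it in a separate process. With it, moving from a single process to concurrent execution is straightforward. multiprocessing supports child processes, communication and data sharing, and several forms of synchronization, and it provides components such as Process, Queue, Pipe, and Lock.
multiprocessing is Python's process-management package. Much like threading.Thread, a multiprocessing.Process object creates a process, and that process can run a function written inside the Python program. A Process object is used the same way as a Thread object and also has start(), run(), and join() methods. The package also contains Lock, Event, Semaphore, and Condition classes for synchronizing processes (as with threads, these objects can be passed to each process as arguments), and they are used the same way as the classes of the same name in threading. A large part of multiprocessing therefore shares one API with threading, only moved to a multi-process setting.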
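To make the shared API concrete, here is a minimal sketch that drives threading.Thread and multiprocessing.Process through exactly the same calls. The helper names greet and run_with are made up for this illustration and are not part of either package.

```python
import multiprocessing
import threading


def greet(name, out):
    # Put a message on the queue so the caller can collect it.
    out.put("hello from " + name)


def run_with(worker_cls, out):
    # worker_cls is either threading.Thread or multiprocessing.Process;
    # both accept target/args and expose start() and join().
    w = worker_cls(target=greet, args=(worker_cls.__name__, out))
    w.start()
    message = out.get(timeout=30)  # read the result before joining
    w.join()
    return message


if __name__ == "__main__":
    q = multiprocessing.Queue()
    print(run_with(threading.Thread, q))
    print(run_with(multiprocessing.Process, q))
```

Only the class handed to run_with changes; the construction, start(), and join() calls are identical in both cases.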
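When using this shared API, keep the following points in mind:
- On UNIX platforms, a process that has terminated must be waited on by its parent (wait); otherwise it remains as a zombie process. It is therefore necessary to call join() on every Process object (which in effect performs the wait). Threads need no such step, because there is only one process.
- multiprocessing provides IPC mechanisms that the threading package lacks, such as Pipe and Queue, and they are more efficient. Prefer Pipe and Queue, and avoid synchronizing through Lock/Event/Semaphore/Condition where possible (because the resources they occupy do not belong to the user process).
- Multiple processes should avoid sharing resources. With threads, sharing is fairly easy, for example through global variables or by passing arguments. With processes, each one has its own independent memory space, so those approaches do not work. Resources can instead be shared through shared memory or a Manager, but doing so makes the program more complex, and the synchronization it requires lowers efficiency.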
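The points above can be sketched roughly as follows: each child sends its result back through a Queue instead of touching shared state, and the parent calls join() on every child. The function names square and collect_squares are hypothetical and chosen only for this example.

```python
import multiprocessing


def square(n, results):
    # Each child reports its result through the queue rather than
    # writing to a variable shared with the parent.
    results.put((n, n * n))


def collect_squares(numbers):
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=square, args=(n, results))
             for n in numbers]
    for p in procs:
        p.start()
    # Read one item per child before joining, so no child is left
    # waiting to flush its data into the queue.
    pairs = [results.get(timeout=30) for _ in procs]
    for p in procs:
        p.join()  # wait on every child so none lingers as a zombie
    return dict(pairs)


if __name__ == "__main__":
    print(collect_squares([1, 2, 3]))
```

Results arrive in whatever order the children finish, which is why the sketch keys them by input value rather than relying on position.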
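Process.pid holds the process ID; if the process has not been start()ed yet, pid is None.
On Windows, note that starting a child process requires the if __name__ == "__main__": guard, and the process-related code must be placed under it.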
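A small sketch of both remarks, assuming hypothetical helpers report_pid and pid_before_and_after: pid is None until start() is called, and the code that launches the process sits under the __main__ guard.

```python
import multiprocessing
import os


def report_pid(out):
    # Runs in the child: send back the PID the child sees for itself.
    out.put(os.getpid())


def pid_before_and_after():
    out = multiprocessing.Queue()
    p = multiprocessing.Process(target=report_pid, args=(out,))
    before = p.pid                    # None: no OS process exists yet
    p.start()
    after = p.pid                     # now an integer PID
    child_view = out.get(timeout=30)  # the same PID, seen from the child
    p.join()
    return before, after, child_view


if __name__ == "__main__":
    # Without this guard, a platform that starts children by re-importing
    # the main module (as Windows does) would run this code again.
    before, after, child_view = pid_before_and_after()
    print(before)
    print(after == child_view)
```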
1. Process
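The class that creates a process:
Process([group [, target [, name [, args [, kwargs]]]]]). target is the callable to invoke, args is the tuple of positional arguments for that callable, and kwargs is the dictionary of keyword arguments for it. name is an alias for the process. group is effectively unused.
Methods: is_alive(), join([timeout]), run(), start(), terminate(). A process is launched by calling start().
Attributes:
authkey, daemon (must be set before start() is called)
exitcode (None while the process is running; a value of -N means the process was terminated by signal N)
name, pid. A daemon process is terminated automatically when its parent process exits and cannot create child processes of its own; the flag must be set before start().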
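The methods and the exitcode attribute can be tried out with a sketch like the one below. The names sleep_for and lifecycle are invented for the example, and the 0.2 second timeout is an arbitrary choice.

```python
import multiprocessing
import time


def sleep_for(seconds):
    time.sleep(seconds)


def lifecycle():
    seen = {}
    p = multiprocessing.Process(target=sleep_for, args=(30,))
    seen["before_start"] = (p.is_alive(), p.exitcode)

    p.start()
    p.join(0.2)  # returns after the timeout; the child is still sleeping
    seen["running"] = (p.is_alive(), p.exitcode)

    p.terminate()  # ask the operating system to stop the child
    p.join()
    seen["terminated"] = (p.is_alive(), p.exitcode)

    quick = multiprocessing.Process(target=sleep_for, args=(0,))
    quick.start()
    quick.join()
    seen["finished"] = (quick.is_alive(), quick.exitcode)
    return seen


if __name__ == "__main__":
    for stage, state in lifecycle().items():
        print(stage, state)
```

exitcode stays None before start() and while the child runs, becomes 0 after a normal return, and is negative after terminate() (on UNIX, the negated number of the signal that stopped the child).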
```python
import multiprocessing
import time

def worker(interval):
    # interval is accepted but unused; the loop sleeps one second per pass
    n = 5
    while n > 0:
        print("The time is {0}".format(time.ctime()))
        time.sleep(1)
        n -= 1

if __name__ == '__main__':
    p = multiprocessing.Process(target=worker, args=(3,))
    p.start()
    print("p.pid:", p.pid)
    print("p.name:", p.name)
    print("p.is_alive:", p.is_alive())
    print(time.ctime())
```
```
p.pid: 2208
p.name: Process-1
p.is_alive: True
Thu Feb 23 15:31:02 2017
The time is Thu Feb 23 15:31:02 2017
The time is Thu Feb 23 15:31:03 2017
The time is Thu Feb 23 15:31:04 2017
The time is Thu Feb 23 15:31:05 2017
The time is Thu Feb 23 15:31:06 2017
```
```python
import multiprocessing
import time

def work1(interval):
    print('work1')
    time.sleep(interval)
    print('end_work1')

def work2(interval):
    print('work2')
    time.sleep(interval)
    print('end_work2')

def work3(interval):
    print('work3')
    time.sleep(interval)
    print('end_work3')

if __name__ == "__main__":
    p1 = multiprocessing.Process(target=work1, args=(2,))
    p2 = multiprocessing.Process(target=work2, args=(2,))
    p3 = multiprocessing.Process(target=work3, args=(2,))

    p1.start()
    p2.start()
    p3.start()

    print("The number of CPU is:" + str(multiprocessing.cpu_count()))
    # active_children() lists the child processes that are still alive
    for p in multiprocessing.active_children():
        print("Child p.name:" + p.name + "\tp.id:" + str(p.pid))

    print("END!!!")
```
```
The number of CPU is:4
Child p.name:Process-1 p.id:7884
Child p.name:Process-3 p.id:5948
Child p.name:Process-2 p.id:4288
END!!!
work1
work2
work3
end_work1
end_work2
end_work3
```
```python
# Define the process as a class

import multiprocessing
import time

class ClockProcess(multiprocessing.Process):
    def __init__(self, interval):
        super().__init__()
        self.interval = interval

    def run(self):
        # Executed in the child process once start() has been called
        n = 5
        while n > 0:
            print("The time is {0}".format(time.ctime()))
            time.sleep(self.interval)
            n -= 1

if __name__ == "__main__":
    p = ClockProcess(3)
    p.start()
```
Note: when start() is called on process p, run() is invoked automatically.
```
The time is Thu Feb 23 16:28:56 2017
The time is Thu Feb 23 16:28:59 2017
The time is Thu Feb 23 16:29:02 2017
The time is Thu Feb 23 16:29:05 2017
The time is Thu Feb 23 16:29:08 2017
```
Comparing program results with and without daemon
Without daemon:
```python
import multiprocessing
import time

def work(interval):
    print("work start:{0}".format(time.ctime()))
    time.sleep(interval)
    print("word end:{0}".format(time.ctime()))

if __name__ == "__main__":
    p = multiprocessing.Process(target=work, args=(2,))
    p.start()
    print("End!!")
```
```
End!!
work start:Thu Feb 23 16:36:58 2017
word end:Thu Feb 23 16:37:00 2017
```
```python
import multiprocessing
import time

def work(interval):
    print("work start:{0}".format(time.ctime()))
    time.sleep(interval)
    print("word end:{0}".format(time.ctime()))

if __name__ == "__main__":
    p = multiprocessing.Process(target=work, args=(2,))
    # Set the daemon attribute (before start())
    p.daemon = True
    p.start()
    print("End!!")
```
```
End!!
```
Note: because the child process has the daemon attribute set, it ends as soon as the main process ends.
```python
import multiprocessing
import time

def work(interval):
    print("work start:{0}".format(time.ctime()))
    time.sleep(interval)
    print("word end:{0}".format(time.ctime()))

if __name__ == "__main__":
    p = multiprocessing.Process(target=work, args=(2,))
    # Set the daemon attribute (before start())
    p.daemon = True
    p.start()
    p.join()  # wait for the daemon process to finish before the main process goes on
    print("End!!")
```
```
work start:Thu Feb 23 16:38:47 2017
word end:Thu Feb 23 16:38:49 2017
End!!
```
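The other daemon restriction mentioned earlier, that a daemon process cannot create processes of its own, can be checked with a sketch along these lines. The helper names noop, try_to_spawn, and daemon_spawn_result are hypothetical. The sketch also assumes CPython's current behavior, where start() enforces the rule with an assert, so the outcome could differ if Python runs with assertions disabled.

```python
import multiprocessing


def noop():
    pass


def try_to_spawn(out):
    # Runs inside the daemon child and reports what happened.
    try:
        grandchild = multiprocessing.Process(target=noop)
        grandchild.start()
        grandchild.join()
        out.put("started")
    except AssertionError as exc:
        # Assumption: CPython raises AssertionError from start() here.
        out.put("refused: " + str(exc))


def daemon_spawn_result():
    out = multiprocessing.Queue()
    p = multiprocessing.Process(target=try_to_spawn, args=(out,))
    p.daemon = True  # must be set before start()
    p.start()
    result = out.get(timeout=30)
    p.join()
    return result


if __name__ == "__main__":
    print(daemon_spawn_result())
```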
2. Lock
When several processes need to access a shared resource, a Lock can be used to avoid conflicting access.
```python
import multiprocessing

def Work_With(lock, f):
    # Acquire the lock through a with statement; it is released on exit
    with lock:
        with open(f, 'a+') as fs:
            n = 10
            while n > 1:
                fs.write("Lock acquired via with\n")
                n -= 1

def Worker_No_With(lock, f):
    # Acquire and release the lock explicitly
    lock.acquire()
    try:
        with open(f, 'a+') as fs:
            n = 10
            while n > 1:
                fs.write("Lock acquired directly\n")
                n -= 1
    finally:
        lock.release()

if __name__ == "__main__":
    lock = multiprocessing.Lock()
    f = "filetmp.txt"
    pw = multiprocessing.Process(target=Work_With, args=(lock, f))
    pnw = multiprocessing.Process(target=Worker_No_With, args=(lock, f))
    pw.start()
    pnw.start()
    print("End!!")
```
```
End!!
```
Contents of filetmp.txt:
```
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
```
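The same idea applies to shared memory. Below is a minimal sketch, assuming a shared integer created with multiprocessing.Value and hypothetical helpers add_many and count_with_lock, in which a Lock protects a counter that several processes increment.

```python
import multiprocessing


def add_many(counter, lock, times):
    for _ in range(times):
        # counter.value += 1 is a read followed by a write, so the lock
        # keeps two processes from interleaving between the two steps.
        with lock:
            counter.value += 1


def count_with_lock(workers=4, times=1000):
    lock = multiprocessing.Lock()
    # lock=False yields a raw shared integer with no built-in lock,
    # so the explicit Lock above is the only protection.
    counter = multiprocessing.Value("i", 0, lock=False)
    procs = [multiprocessing.Process(target=add_many,
                                     args=(counter, lock, times))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(count_with_lock())  # 4000
```

With the lock in place every increment is applied, so four workers doing 1000 increments each always end at 4000; without it, some updates could be lost.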