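The Road to Mastering Python: Threads and Processes
1. Processes and threads
(1). Processes
Code in a .py file runs from top to bottom. The interpreter (usually CPython) compiles it to bytecode and then executes that bytecode, so the work ultimately lands on the CPU. The hardware (CPU, memory, disk) sits below the operating system: the .py file is handed to the CPython interpreter, the interpreter runs on the operating system, and the CPU does the actual work. A running .py file is therefore also a process.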
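As a rough illustration of the two points above (source is compiled to bytecode, and a running script is an operating-system process), the following sketch compiles a one-line snippet and prints the current process ID. The snippet text and the `<demo>` file name are made up for the example.

```python
import dis
import os

# A made-up one-line program, compiled the way CPython compiles a .py file
source = "print('ok')"
code = compile(source, "<demo>", "exec")

# co_code holds the raw bytecode as a bytes object
print(type(code.co_code))

# dis prints the bytecode in readable form (exact opcodes vary by Python version)
dis.dis(code)

# The interpreter running this script is itself a process with its own PID
pid = os.getpid()
print("running in process", pid)
```

The PID printed here is the same one that tools such as ps or top would show for the running script.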
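(2). Threads
Description: a thread is the smallest unit that the operating system can schedule. It lives inside a process and is the unit that actually carries out the process's work. A thread is a single sequential flow of control within a process; one process can run several threads concurrently, each carrying out a different task.
Analysis: the smallest unit the operating system can run is a thread. The code below runs from top to bottom, and when it runs it is ultimately handed to the CPU. Several threads can exist at the same time.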
Example 1
[root@node2 threading]# cat test.py
#!/usr/local/python3/bin/python3
print('ok')
print('ok1')
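Even a script this small runs inside a thread: the interpreter starts one thread, the main thread, and runs the file in it. A minimal sketch to check this, using only the standard `threading` module:

```python
import threading

# The thread that is executing this line
current = threading.current_thread()

# The thread the interpreter was started in
main = threading.main_thread()

print(current.name)
print(current is main)
```

Run directly as a script, the name printed is normally MainThread and the comparison is True, because no other thread has been created.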
Example 2
[root@node2 threading]# cat test.py
#!/usr/local/python3/bin/python3
# When this runs, the code below can be viewed as a single thread
import time
begin=time.time()
def foo():
    print('foo')
    time.sleep(1)   # during sleep the CPU is idle, yet nothing below can run either: this is all one thread
def bar():
    print('bar')
    time.sleep(2)
foo()
bar()
end=time.time()
print(end-begin)
[root@node2 threading]# python3 test.py
foo
bar
3.004286050796509
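Example 3: multiple threads
[root@node2 threading]# cat test.py
#!/usr/local/python3/bin/python3
import time   # from here down to sleep(2) is the main thread, which also runs its work from top to bottom
import threading
begin=time.time()
def foo(n):
    print('foo%s'%n)
    time.sleep(1)
def bar(n):
    print('bar%s'%n)
    time.sleep(2)
# Child threads: a thread is created to carry out a task, i.e. to run some function.
# Each child thread has its own task and runs alongside the main thread, preemptively scheduled.
t1=threading.Thread(target=foo,args=(1,))   # equivalent to calling foo(1): target is the function to run, args the arguments to pass; this only creates a thread object
t2=threading.Thread(target=bar,args=(2,))
t1.start()   ## run it
t2.start()
print('......in the main......')
end=time.time()
print(end-begin)
[root@node2 threading]# python3 test.py
foo1   ## child thread 1
bar2   ## child thread 2
......in the main......   ## main thread; the three threads run in no fixed order, and the program is not finished yet: foo stops after 1 second, bar after 2
0.0025224685668945312   ## elapsed time, but only the main thread's: creating two thread objects, starting them, and printing one line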
Extension: measuring how long the two child threads run
[root@node2 threading]# cat test.py
#!/usr/local/python3/bin/python3
import time
import threading
begin=time.time()
def foo(n):
    print('foo%s'%n)
    time.sleep(1)
def bar(n):
    print('bar%s'%n)
    time.sleep(2)
t1=threading.Thread(target=foo,args=(1,))
t2=threading.Thread(target=bar,args=(2,))
t1.start()
t2.start()
print('......in the main......')
# join() makes the main thread wait until t1 and t2 have finished before it continues
t1.join()
t2.join()
end=time.time()
print(end-begin)
[root@node2 threading]# python3 test.py
foo1
bar2
......in the main......
2.0047287940979004   ### the whole program takes 2 seconds
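The transcript above reports only the total. To see each child thread's own running time as well, one option is to have every thread record its elapsed time. This is a sketch, not the original author's code: the `timed` helper and the `durations` dict are hypothetical names, and the sleeps are shortened to 0.1 s and 0.2 s so it finishes quickly.

```python
import threading
import time

durations = {}  # task name -> seconds spent in that task (hypothetical helper structure)

def timed(name, seconds):
    start = time.time()
    time.sleep(seconds)                     # stands in for the blocking work in foo/bar
    durations[name] = time.time() - start   # each thread writes only its own key

begin = time.time()
t1 = threading.Thread(target=timed, args=("foo", 0.1))
t2 = threading.Thread(target=timed, args=("bar", 0.2))
t1.start()
t2.start()
t1.join()
t2.join()
total = time.time() - begin

for name, spent in durations.items():
    print(name, round(spent, 2))
print("total", round(total, 2))
```

The total should sit close to the longer of the two durations rather than their sum, which matches the 2-second result in the transcript.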
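(3). Summary: the threads above are not truly running in parallel. sleep does not occupy the CPU, so on a single CPU the system switches to other work in the meantime. The CPU works like someone reading a book: when you stop to rest on a page, you remember the page, the line, and the word, and after the break you resume from that position. The CPU likewise saves this context when it switches, so that it can return to a task and carry on correctly.
Parallelism: two events genuinely executing at the same instant
Concurrency: for example, on a single CPU a computer plays a movie and music together, but at any instant it can do only one thing, so it just switches back and forth
Conditions under which the CPU switches: a. Each task is given a time slice; when a task has used up its slice, the CPU switches to another task
b. Blocking I/O: calls such as accept(), read(), or sleep() block for a length of time that is not known in advance, so the CPU switches away while the task waits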
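For Python threads specifically, CPython has its own counterpart of the time slice in point a.: the interval after which a running thread is asked to let another thread run. It can be read with `sys.getswitchinterval()`. The sketch below only reads the value; the documented default is 5 milliseconds, but treat the printed number as whatever your interpreter reports.

```python
import sys

# Seconds a thread may run before the interpreter asks it to yield to another thread
interval = sys.getswitchinterval()
print("switch interval in seconds:", interval)
```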
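Process: processes do not share resources with each other (threads within one process do)
Description: a process is a collection of resources for a group of threads; a process has one or more threads
In terms of execution: processes and threads run at the same speed; neither is inherently faster
When there is no sleep (no blocking), the task is compute-bound, and concurrent execution has to keep switching between threads, serial execution comes out ahead
I/O-bound: there are blocked states; the task does not use the CPU continuously and spends part of its time waiting
Compute-bound: there is no blocking, and with only one CPU available the computation is slow; this is a weak spot for Python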
Serial execution
[root@node2 threading]# cat compute.py
#!/usr/local/python3/bin/python3
import time
begin_time=time.time()
def add(n):
    total=0   # named total so the built-in sum() is not shadowed
    for i in range(n):
        total+=i
    print(total)
add(10000000)
add(20000000)
end_time=time.time()
print(end_time-begin_time)
[root@node2 threading]# python3 compute.py
49999995000000
199999990000000
7.737548828125
Concurrent execution
[root@node2 threading]# cat compute.py
#!/usr/local/python3/bin/python3
import time
import threading
begin_time=time.time()
def add(n):
    total=0   # named total so the built-in sum() is not shadowed
    for i in range(n):
        total+=i
    print(total)
t1=threading.Thread(target=add,args=(10000000,))
t1.start()
t2=threading.Thread(target=add,args=(20000000,))
t2.start()
t1.join()
t2.join()
end_time=time.time()
print(end_time-begin_time)
[root@node2 threading]# python3 compute.py
49999995000000
199999990000000
7.809268951416016
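2. Python's GIL
(1). Problem: a machine has several CPUs, yet Python cannot get the full benefit of multithreading from them
Analysis: this is an interpreter issue, specifically in CPython, which adds a lock called the GIL (global interpreter lock). Its effect is that only one thread can execute in the interpreter at any given moment, so even on a physical server with a multi-core CPU, threads cannot use the extra cores to full effect. The lock exists mainly for historical reasons and has proven hard to remove.
Workarounds: a. Use multiple processes: assign different tasks to different processes so that they run at the same time. The catch is that threads can share data directly, whereas processes cannot; they are independent of one another
b. Coroutines: no preemption, still on a single CPU; combine them with multiple processes
summary: a. For I/O-bound tasks, multithreading works well
b. For compute-bound tasks, implementing the work in C is the better choice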
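A minimal sketch of workaround a., assuming the same add function as in the compute-bound example, with smaller inputs and a return value instead of a print. The original text does not name a specific API; `concurrent.futures.ProcessPoolExecutor` from the standard library is an assumption made here for illustration.

```python
from concurrent.futures import ProcessPoolExecutor

def add(n):
    """Sum the integers 0..n-1, as in the compute-bound example."""
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == '__main__':
    # Each call runs in a separate process with its own interpreter and its own GIL
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(add, [1000000, 2000000]))
    print(results)
```

The `if __name__ == '__main__':` guard matters on platforms that start worker processes by re-importing the script. Whether this beats the serial version depends on the number of cores and on the size of the work, since starting processes has its own cost.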
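(2). Differences between threads and processes
Process: a running instance of a program. It can create child processes, which are made by copying the whole parent process, so starting a process costs more than starting a thread. Processes are independent and do not affect one another
Differences:
- Threads share the memory space of their process; each process has its own address space
- Threads inside a process can share data directly
- Threads can communicate with one another directly
- Threads are cheap to create; a process is created by copying its parent
- Threads can operate on other threads of the same process
- Changes to the main thread can affect child threads; changes to a parent process do not affect its child processes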
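The first two points of the list can be seen in a few lines: every thread in a process sees the same objects. In this sketch the `shared` list and the `worker` function are hypothetical names chosen for the example.

```python
import threading

shared = []  # one list object, visible to every thread in this process

def worker(n):
    shared.append(n)   # each thread appends to the same list

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))  # prints [0, 1, 2]
```

The result is sorted before printing because the order in which the threads append is not guaranteed. Separate processes running the same code would each end up with their own copy of the list.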
(3). Example: serial execution
[root@node2 threading]# cat music.sh
#!/usr/local/python3/bin/python3
from time import ctime,sleep
def music(func):
    for i in range(2):
        print("i was listening %s. %s" %(func,ctime()))
        sleep(1)
def movie(func):
    for i in range(2):
        print('i was at the %s! %s' %(func,ctime()))
        sleep(5)
if __name__=='__main__':
    music('lady ga ga')
    movie('backstreet')
[root@node2 threading]# python3 music.sh
i was listening lady ga ga. Thu May 17 07:56:13 2018
i was listening lady ga ga. Thu May 17 07:56:14 2018
i was at the backstreet! Thu May 17 07:56:15 2018
i was at the backstreet! Thu May 17 07:56:20 2018
Optimization
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)   # if this is changed to 4, the gain depends on the sleep times; the longer task decides, so 8 seconds are saved
        # Analysis: run serially, music's two iterations take 4*2=8 and movie's two take 10, 18 seconds in all.
        # Run concurrently, the first takes 8 and the second 10; they overlap, so the first 8 seconds are shared
        # and the longer task decides: 10 seconds in total. Serial minus concurrent = 18-10 = 8
        print("end %s"%ctime())
def movie(func):
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    for t in threads:
        t.start()
        print("all over %s" %ctime())   # inside the loop, so it prints once per started thread
[root@node2 threading]# python3 music-a.sh
Start listening lady ga ga. Thu May 17 08:18:42 2018
all over Thu May 17 08:18:42 2018
Start watching backstreet! Thu May 17 08:18:42 2018
all over Thu May 17 08:18:42 2018
end Thu May 17 08:18:43 2018   # after 1 second
Start listening lady ga ga. Thu May 17 08:18:43 2018   # after 1 second, second iteration
end Thu May 17 08:18:44 2018
end Thu May 17 08:18:47 2018   # after 5 seconds
Start watching backstreet! Thu May 17 08:18:47 2018   # after 5 seconds, second iteration
end Thu May 17 08:18:52 2018
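The arithmetic in the code comment can be checked with a short sketch. It assumes music sleeps 4 seconds per iteration (the modified value the comment mentions) and movie sleeps 5, with two iterations each, and it ignores thread start-up overhead.

```python
music_total = 2 * 4   # two iterations, 4 s each
movie_total = 2 * 5   # two iterations, 5 s each

serial = music_total + movie_total           # one task after the other
concurrent = max(music_total, movie_total)   # overlapping: the longer task decides
saved = serial - concurrent

print(serial, concurrent, saved)  # prints: 18 10 8
```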
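(4). Simple thread example 2
join(): until the child thread finishes, the thread that called join() on it stays blocked; whichever thread makes the call is the one that blocks
Analysis: in example 1, print("all over %s" %ctime()) prints almost immediately, which shows that the main thread runs alongside the child threads. That line is part of the main thread, though, and should ideally appear last, after the child threads have finished; join() makes that possible.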
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    for t in threads:
        t.start()
        t.join()
        # t.join() inside the loop (as here): t iterates over threads, which holds the two thread objects
        # t1 and t2. Creating them did not start them; they only run once start() is called. On the first
        # pass t is t1: t1.start() runs, then t1.join() blocks, so nothing below runs until t1 ends.
        # music takes 2 seconds, then the second pass starts t2 and joins it for movie's 10 seconds, and
        # only then is the last all over printed. The whole run is effectively serial, so this placement is pointless.
        # t.join() after the loop: functions, classes and modules open a scope in Python, but if and for
        # do not, so t is still usable after the loop and no error is raised. It keeps the last value
        # assigned: t=t1 on the first pass, t=t2 on the second, so the call becomes t2.join(). t1 and t2
        # run together; t1 finishes sooner, but the main thread is blocked on t2 and prints only when t2 ends.
        # t1.join() after the loop: blocks on t1 only, so all over is printed after 2 seconds.
        print("all over %s" %ctime())
[root@node2 threading]# python3 music-a.sh
Start listening lady ga ga. Sat May 19 06:28:03 2018
end Sat May 19 06:28:04 2018
Start listening lady ga ga. Sat May 19 06:28:04 2018
end Sat May 19 06:28:05 2018
all over Sat May 19 06:28:05 2018
Start watching backstreet! Sat May 19 06:28:05 2018
end Sat May 19 06:28:10 2018
Start watching backstreet! Sat May 19 06:28:10 2018
end Sat May 19 06:28:15 2018
all over Sat May 19 06:28:15 2018
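A common way to get both concurrency and a final message that really comes last is to start every thread in one loop and join every thread in a second loop. The sketch below is one such arrangement, not the original author's code: `task` is a hypothetical stand-in for music and movie, with sleeps shortened to 0.1 s and 0.3 s.

```python
import threading
import time

def task(name, seconds):
    time.sleep(seconds)   # stands in for the work done by music/movie

threads = [
    threading.Thread(target=task, args=("music", 0.1)),
    threading.Thread(target=task, args=("movie", 0.3)),
]

begin = time.time()
for t in threads:
    t.start()     # start everything first, so the tasks overlap
for t in threads:
    t.join()      # then wait for each one in turn
elapsed = time.time() - begin

print("all over after", round(elapsed, 2), "seconds")
```

Because both threads are already running when the first join() is called, the total wait is roughly the duration of the longest task rather than the sum of all of them.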
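(5). Daemon threads
Case 1
Analysis: normally, when the main thread finishes, the program still waits for the child threads to complete their work. With daemon set as below, it does not wait: once the main thread ends, the child threads are cut off before they finish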
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    for t in threads:
        t.setDaemon(True)   ########### must be set before start(); since Python 3.10 the preferred spelling is t.daemon = True
        t.start()
        print("all over %s" %ctime())
[root@node2 threading]# python3 music-a.sh
Start listening lady ga ga. Sat May 19 08:12:29 2018
all over Sat May 19 08:12:29 2018
Start watching backstreet! Sat May 19 08:12:29 2018
all over Sat May 19 08:12:29 2018
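Case 2
Analysis: set t1 and t2 as the daemon one at a time. The program does not wait for whichever thread is the daemon: once every non-daemon thread has finished, it exits, whether or not the daemon is done. This is mainly useful in situations such as the main thread hitting a problem and having to exit, where the child threads should end along with it
With t1 as the daemon: t2 runs longer, so by the time t2 finishes, t1 has already finished as well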
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    t1.setDaemon(True)   # do not wait for t1
    for t in threads:
        t.start()
        print("all over %s" %ctime())
[root@node2 threading]# python3 music-a.sh
Start listening lady ga ga. Sat May 19 08:23:52 2018
all over Sat May 19 08:23:52 2018
Start watching backstreet! Sat May 19 08:23:52 2018
all over Sat May 19 08:23:52 2018
end Sat May 19 08:23:53 2018
Start listening lady ga ga. Sat May 19 08:23:53 2018
end Sat May 19 08:23:54 2018
end Sat May 19 08:23:57 2018
Start watching backstreet! Sat May 19 08:23:57 2018
end Sat May 19 08:24:02 2018
With t2 as the daemon: t1 runs for a shorter time, so as soon as t1 is done the program exits and t2 never finishes
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    t2.setDaemon(True)   ## do not wait for t2
    for t in threads:
        t.start()
        print("all over %s" %ctime())
[root@node2 threading]# python3 music-a.sh
Start listening lady ga ga. Sat May 19 08:27:36 2018
all over Sat May 19 08:27:36 2018
Start watching backstreet! Sat May 19 08:27:36 2018
all over Sat May 19 08:27:36 2018
end Sat May 19 08:27:37 2018
Start listening lady ga ga. Sat May 19 08:27:37 2018
end Sat May 19 08:27:38 2018
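The daemon flag can also be passed when the thread is created. The sketch below uses the `daemon=True` keyword argument of `threading.Thread`, which has the same effect as setting the flag before start(); `background` is a hypothetical stand-in for long-running work. Note that `setDaemon()` has been deprecated since Python 3.10 in favour of the `daemon` attribute.

```python
import threading
import time

def background():
    time.sleep(0.5)   # stands in for long-running work

# daemon=True at construction time; equivalent to setting t.daemon = True before start()
t = threading.Thread(target=background, daemon=True)
t.start()

print(t.daemon)
print(t.is_alive())
# No join() here: when the main thread ends, the interpreter exits without waiting for t
```

Setting the flag after start() raises a RuntimeError, so it has to be decided before the thread runs.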
Printing the main thread and the child threads
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    print(threading.current_thread())   #########
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    print(threading.current_thread())   #########
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    for t in threads:
        t.start()
        print("all over %s" %ctime())
    print(threading.current_thread())   #########
[root@node2 threading]# python3 music-a.sh
<Thread(Thread-1, started 139857303627520)>   #########
Start listening lady ga ga. Sat May 19 08:35:05 2018
all over Sat May 19 08:35:05 2018
<Thread(Thread-2, started 139857221383936)>   #########
Start watching backstreet! Sat May 19 08:35:05 2018
all over Sat May 19 08:35:05 2018
<_MainThread(MainThread, started 139857428322112)>   #########
end Sat May 19 08:35:06 2018
Start listening lady ga ga. Sat May 19 08:35:06 2018
end Sat May 19 08:35:07 2018
end Sat May 19 08:35:10 2018
Start watching backstreet! Sat May 19 08:35:10 2018
end Sat May 19 08:35:15 2018
Printing the number of active threads
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    print(threading.current_thread())
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    print(threading.current_thread())
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    for t in threads:
        t.start()
        print("all over %s" %ctime())
    print(threading.current_thread())
    print(threading.active_count())   #################
[root@node2 threading]# python3 music-a.sh
<Thread(Thread-1, started 139627204196096)>
Start listening lady ga ga. Sat May 19 08:37:59 2018
all over Sat May 19 08:37:59 2018
<Thread(Thread-2, started 139627195803392)>
Start watching backstreet! Sat May 19 08:37:59 2018
all over Sat May 19 08:37:59 2018
<_MainThread(MainThread, started 139627328890688)>
3   ###############
end Sat May 19 08:38:00 2018
Start listening lady ga ga. Sat May 19 08:38:00 2018
end Sat May 19 08:38:01 2018
end Sat May 19 08:38:04 2018
Start watching backstreet! Sat May 19 08:38:04 2018
end Sat May 19 08:38:09 2018
[root@node2 threading]# cat music-a.sh
#!/usr/local/python3/bin/python3
import threading
from time import ctime,sleep
import time
def music(func):
    print(threading.current_thread())
    for i in range(2):
        print("Start listening %s. %s" %(func,ctime()))
        sleep(1)
        print("end %s"%ctime())
def movie(func):
    print(threading.current_thread())
    for i in range(2):
        print("Start watching %s! %s" %(func,ctime()))
        sleep(5)
        print('end %s'%ctime())
threads = []
t1 = threading.Thread(target=music,args=('lady ga ga',))
threads.append(t1)
t2 = threading.Thread(target=movie,args=('backstreet',))
threads.append(t2)
if __name__=='__main__':
    for t in threads:
        t.start()
    t2.join()   #########
    print("all over %s" %ctime())
    print(threading.current_thread())
    print(threading.active_count())
    # With t2.join() added: t2 is the longer-running thread, and by the time it finishes both t1 and t2
    # are done, so only the main thread is left and the count is 1
[root@node2 threading]# python3 music-a.sh
<Thread(Thread-1, started 139890295301888)>
Start listening lady ga ga. Sat May 19 08:41:04 2018
<Thread(Thread-2, started 139890286909184)>
Start watching backstreet! Sat May 19 08:41:04 2018
end Sat May 19 08:41:05 2018
Start listening lady ga ga. Sat May 19 08:41:05 2018
end Sat May 19 08:41:06 2018
end Sat May 19 08:41:09 2018
Start watching backstreet! Sat May 19 08:41:09 2018
end Sat May 19 08:41:14 2018
all over Sat May 19 08:41:14 2018
<_MainThread(MainThread, started 139890419996480)>
1
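To see which threads are behind the count, `threading.enumerate()` returns the live thread objects themselves. A minimal sketch, with a hypothetical `task` function and made-up thread names; the count is compared against a baseline taken before any thread is started, in case the environment already runs extra threads.

```python
import threading
import time

baseline = threading.active_count()   # threads alive before we start any

def task(seconds):
    time.sleep(seconds)

workers = [
    threading.Thread(target=task, args=(0.2,), name="worker-%d" % i)
    for i in range(2)
]
for t in workers:
    t.start()

during = threading.active_count()                 # includes the two workers
print([t.name for t in threading.enumerate()])    # names of all live threads

for t in workers:
    t.join()
after = threading.active_count()                  # workers are gone again

print(during - baseline, after - baseline)
```

As in the transcript, the count drops back once the joined threads have finished.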