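Multithreading with threading
The threading module provides thread-related operations. A thread is the smallest unit of execution that is scheduled within a program. The current version of Python's threading library does not implement priorities or thread groups, and threads cannot be stopped, suspended, resumed, or interrupted.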
1. Classes provided by the threading module:
Thread, Lock, RLock, Condition, [Bounded]Semaphore, Event, Timer, local.
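2. Commonly used functions in the threading module:
threading.current_thread(): returns the Thread object of the calling thread. (currentThread() is the older camelCase alias, deprecated since Python 3.10.)
threading.enumerate(): returns a list of all threads that are currently alive, meaning started and not yet finished. Threads that have not been started or have already terminated are excluded.
threading.active_count(): returns the number of alive threads, the same value as len(threading.enumerate()). (activeCount() is the older alias.)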
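A minimal sketch of the three functions above. The helper names (worker, snapshot_running_threads) and the thread names are made up for illustration; an Event keeps the workers alive while they are counted.

```python
import threading

def worker(stop_event):
    # Stay alive until the inspecting thread says we can finish.
    stop_event.wait()

def snapshot_running_threads(n_workers=3):
    """Start n_workers threads and report what the module-level helpers see."""
    stop_event = threading.Event()
    workers = [
        threading.Thread(target=worker, args=(stop_event,), name=f"worker-{i}")
        for i in range(n_workers)
    ]
    before = threading.active_count()          # alive threads before starting ours
    for t in workers:
        t.start()
    names = [t.name for t in threading.enumerate()]
    during = threading.active_count()          # now includes the started workers
    stop_event.set()                           # let the workers finish
    for t in workers:
        t.join()
    return before, during, names

if __name__ == "__main__":
    print(threading.current_thread().name)
    print(snapshot_running_threads())
```

Threads that were created but not yet started, or that have already been joined, do not show up in either enumerate() or active_count().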
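3. Constants provided by the threading module:
threading.TIMEOUT_MAX: the maximum value allowed for the timeout parameter of blocking calls such as Lock.acquire(), RLock.acquire() and Condition.wait(). It is not a global timeout setting; passing a timeout larger than this value raises OverflowError.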
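To make the role of the timeout parameter concrete, here is a small sketch. The function names are hypothetical; the OverflowError branch relies on the behavior described in the Python documentation for TIMEOUT_MAX.

```python
import threading

def timed_acquire_demo():
    lock = threading.Lock()
    lock.acquire()                      # the lock is now held
    # A plain Lock is not reentrant, so a second acquire cannot succeed here.
    # With a timeout it gives up and returns False instead of blocking forever.
    got_it = lock.acquire(timeout=0.1)
    lock.release()
    return got_it

def oversized_timeout_rejected():
    lock = threading.Lock()
    try:
        # The lock is free, so without the range check this would just succeed.
        lock.acquire(timeout=threading.TIMEOUT_MAX * 2)
    except OverflowError:
        return True
    lock.release()
    return False

if __name__ == "__main__":
    print("TIMEOUT_MAX =", threading.TIMEOUT_MAX)
    print("second acquire succeeded:", timed_acquire_demo())
    print("oversized timeout rejected:", oversized_timeout_rejected())
```

The exact value of TIMEOUT_MAX depends on the platform, so the sketch prints it rather than assuming a number.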
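4. The Thread class
is_alive(): returns whether the thread is running, meaning after start() and before it terminates. (The old isAlive() spelling was removed in Python 3.9.)
name, or getName()/setName(name): gets or sets the thread name. The getter/setter methods are deprecated since Python 3.10 in favor of the name attribute.
daemon, or isDaemon()/setDaemon(bool): gets or sets whether this is a daemon (background) thread. A thread started from the main thread is a foreground thread (False) by default. Must be set before start().
If it is a daemon thread, it runs alongside the main thread, but once the main thread and all foreground threads have finished, the program exits and the daemon thread is stopped, whether or not it has completed its work.
If it is a foreground thread, it runs alongside the main thread, and after the main thread finishes the program waits for the foreground thread to finish before exiting.
start(): starts the thread. The thread becomes ready to run and waits to be scheduled on the CPU.
join([timeout]): blocks the calling thread until the thread whose join() was called terminates, or until the optional timeout (in seconds) elapses.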
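The attributes and methods listed above can be observed in one short run. This is only an illustrative sketch: slow_job, inspect_thread and the thread name "slow-job" are invented for the example.

```python
import threading
import time

def slow_job():
    time.sleep(0.3)

def inspect_thread():
    t = threading.Thread(target=slow_job, name="slow-job")
    t.daemon = True                      # must be set before start()
    states = {"before_start": t.is_alive()}
    t.start()
    states["after_start"] = t.is_alive()
    t.join(timeout=0.05)                 # returns after 0.05 s; thread still running
    states["after_short_join"] = t.is_alive()
    t.join()                             # no timeout: block until the thread ends
    states["after_full_join"] = t.is_alive()
    return t.name, t.daemon, states

if __name__ == "__main__":
    print(inspect_thread())
```

Because join(timeout) returns None either way, is_alive() is the way to tell whether the join ended because the thread finished or because the timeout elapsed.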
import threading

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job)  # create the thread
    thread.start()  # let the thread start working

if __name__ == '__main__':
    main()
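# Foreground (non-daemon) thread
import threading

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job)  # create the thread
    thread.daemon = False  # same effect as the deprecated thread.setDaemon(False)
    thread.start()  # let the thread start working

if __name__ == '__main__':
    print("1")
    main()
    print("2")

'''output (thread name/id vary; the last two lines can swap):
1
2
This is a thread of <Thread(Thread-1, started 9424)>
'''

# Daemon (background) thread
import threading

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job)  # create the thread
    thread.daemon = True  # same effect as the deprecated thread.setDaemon(True)
    thread.start()  # let the thread start working

if __name__ == '__main__':
    print("1")
    main()
    print("2")

'''output (the thread's line appears only if it runs before the program exits):
1
2
'''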
# Daemon thread, with join()
import threading

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job)  # create the thread
    thread.daemon = True
    thread.start()  # let the thread start working
    thread.join()   # wait for it to finish

if __name__ == '__main__':
    print("1")
    main()
    print("2")

'''output:
1
This is a thread of <Thread(Thread-1, started daemon 5292)>
2
'''

# Foreground thread, with join()
import threading

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job)  # create the thread
    # thread.daemon = True
    thread.start()  # let the thread start working
    thread.join()   # wait for it to finish

if __name__ == '__main__':
    print("1")
    main()
    print("2")

'''output:
1
This is a thread of <Thread(Thread-1, started 9220)>
2
'''

# Foreground thread, without join()
import threading

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job)  # create the thread
    # thread.daemon = True
    thread.start()  # let the thread start working
    # thread.join()

if __name__ == '__main__':
    print("1")
    main()
    print("2")

'''output (the last two lines can swap):
1
2
This is a thread of <Thread(Thread-1, started 10032)>
'''
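(1) daemon defaults to False: whether or not the main thread has finished, the child threads must run to completion, and the program exits only after both the main thread and the child threads are done. (Execution does not necessarily follow the top-to-bottom order of the code, because start() is non-blocking.)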
import threading, time

def job1():
    global A
    for i in range(2):
        A += 1
        time.sleep(1)
        print('job1', A)

def job2():
    global A
    for i in range(2):
        A += 10
        time.sleep(2)
        print('job2', A)

if __name__ == '__main__':
    print("begin")
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    print("end")

'''output (both threads wake at the 2-second mark, so the middle lines depend on scheduling):
begin
end
job1 11
job2 12
job1 22
job2 22
'''
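(2) When daemon is set to True (this is not the default): the program exits as soon as the main thread finishes, whether or not the child threads have finished. (Execution does not necessarily follow the top-to-bottom order of the code, because start() is non-blocking.)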
import threading, time

def job1():
    global A
    for i in range(2):
        A += 1
        time.sleep(1)
        print('job1', A)

def job2():
    global A
    for i in range(2):
        A += 10
        time.sleep(2)
        print('job2', A)

if __name__ == '__main__':
    print("begin")
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.daemon = True  # must be set before start()
    t2.daemon = True
    t1.start()
    t2.start()
    print("end")

'''output:
begin
end
'''
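(3) join() makes execution follow the order of the code: the main thread blocks at join() and runs the lines after it only once the joined child thread has finished (blocking).
The difference between calling join() right after start() on the same thread object and calling it later: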
import threading, time

def job1():
    print("start the 1st threading")
    time.sleep(0.5)
    print("1st threading ends")

def job2():
    print("start the 2nd threading")
    time.sleep(2)
    print("2nd threading ends")

def job3():
    print("start the 3rd threading")
    time.sleep(1)
    print("3rd threading ends")

if __name__ == '__main__':
    print("begins")
    first = threading.Thread(target=job1)
    second = threading.Thread(target=job2)
    third = threading.Thread(target=job3)
    # join() directly after each start(): the threads run one after another
    first.start()
    first.join()
    second.start()
    second.join()
    third.start()
    third.join()
    print("ends")

'''output:
begins
start the 1st threading
1st threading ends
start the 2nd threading
2nd threading ends
start the 3rd threading
3rd threading ends
ends
'''
import threading, time

def job1():
    print("start the 1st threading")
    time.sleep(0.5)
    print("1st threading ends")

def job2():
    print("start the 2nd threading")
    time.sleep(2)
    print("2nd threading ends")

def job3():
    print("start the 3rd threading")
    time.sleep(1)
    print("3rd threading ends")

if __name__ == '__main__':
    print("begins")
    first = threading.Thread(target=job1)
    second = threading.Thread(target=job2)
    third = threading.Thread(target=job3)
    # start everything first, then join: the threads overlap
    first.start()
    second.start()
    third.start()
    first.join()
    second.join()
    third.join()
    print("ends")

'''output:
begins
start the 1st threading
start the 2nd threading
start the 3rd threading
1st threading ends
3rd threading ends
2nd threading ends
ends
'''
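Summary: in my view it is usually better not to call join() immediately after start() on the same thread object. Doing so makes the threads run one after another (roughly 0.5 + 2 + 1 = 3.5 seconds in the first example), whereas starting all threads first and joining them afterwards lets them overlap (roughly 2 seconds, the length of the longest job).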
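The timing difference between the two patterns can be measured directly. The sketch below uses shorter sleeps than the examples above so it finishes quickly; nap, run_serial and run_concurrent are hypothetical helper names.

```python
import threading
import time

def nap(seconds):
    time.sleep(seconds)

def run_serial(durations):
    """start() immediately followed by join(): one thread at a time."""
    begin = time.perf_counter()
    for d in durations:
        t = threading.Thread(target=nap, args=(d,))
        t.start()
        t.join()
    return time.perf_counter() - begin

def run_concurrent(durations):
    """Start every thread first, then join them all: the sleeps overlap."""
    begin = time.perf_counter()
    threads = [threading.Thread(target=nap, args=(d,)) for d in durations]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - begin

if __name__ == "__main__":
    durations = [0.1, 0.2, 0.1]
    print("serial:     %.2f s" % run_serial(durations))
    print("concurrent: %.2f s" % run_concurrent(durations))
```

The serial version should take about the sum of the durations and the concurrent one about the longest single duration; exact figures vary with the machine.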
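5. Queue
Create a Queue and pass it to the thread function to hold the results in place of return, since the return value of a thread's target function is not handed back to the caller. Then build a list of threads and initialize a nested list of data for them to process.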
import threading
from queue import Queue

def job(l, q):
    # square every element in place, then hand the list back through the queue
    for i in range(len(l)):
        l[i] = l[i] ** 2
    q.put(l)

def multithreading():
    q = Queue()
    threads = []
    data = [[1, 2, 3], [3, 4, 5], [4, 4, 4], [5, 5, 5]]
    for i in range(4):
        t = threading.Thread(target=job, args=(data[i], q))
        t.start()
        threads.append(t)
    for thread in threads:
        thread.join()
    results = []
    for _ in range(4):
        results.append(q.get())  # results arrive in completion order
    print(results)

if __name__ == '__main__':
    multithreading()
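One thing the Queue approach does not guarantee is that results come out in the same order as the input rows, because each thread calls put() whenever it happens to finish. A possible variation, sketched here with hypothetical names (square_all, squares_in_input_order), tags every result with its row index so the order can be restored:

```python
import threading
from queue import Queue

def square_all(index, values, q):
    # Put (index, result) so the consumer knows which input row this belongs to.
    q.put((index, [v ** 2 for v in values]))

def squares_in_input_order(data):
    q = Queue()
    threads = [
        threading.Thread(target=square_all, args=(i, row, q))
        for i, row in enumerate(data)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    results = [None] * len(data)
    for _ in data:
        index, squared = q.get()
        results[index] = squared      # place each result at its original position
    return results

if __name__ == "__main__":
    print(squares_in_input_order([[1, 2, 3], [3, 4, 5], [4, 4, 4], [5, 5, 5]]))
```

Unlike the example above, this version builds new lists instead of modifying the input rows in place.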
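6. Thread locks
When different threads use the same shared memory, a lock keeps them from interfering with one another. To use it, each thread calls lock.acquire() before it modifies the shared memory, which locks the shared memory so that no other thread that uses the same lock can touch it while the current thread is working. When the work is done, the thread calls lock.release() to open the lock so that other threads can use the shared memory.
When several threads call lock.acquire() at the same time, only one of them gets the lock and continues; the others wait until they can get it.
The thread that holds the lock must release it when it is done. Otherwise the threads waiting for the lock will wait forever and never make progress. That is why try...finally is used to make sure the lock is always released.
The benefit of a lock is that a critical section of code is executed from start to finish by only one thread at a time. There are drawbacks too. First, it prevents concurrent execution: the code inside the lock effectively runs in single-threaded mode, which lowers efficiency considerably. Second, because there can be several locks, threads that each hold one lock and try to acquire a lock held by another can deadlock. All the threads involved hang, able neither to continue nor to finish, and the process has to be killed by the operating system.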
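The try...finally pattern mentioned above, and the equivalent with statement, might look like the following sketch. The counter, the lock name and the helper functions are illustrative only.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_with_try_finally(n):
    global counter
    for _ in range(n):
        counter_lock.acquire()
        try:
            counter += 1              # critical section
        finally:
            counter_lock.release()    # runs even if the critical section raises

def add_with_statement(n):
    global counter
    for _ in range(n):
        with counter_lock:            # acquire on entry, release on exit
            counter += 1

def run(n_per_thread=10000):
    global counter
    counter = 0
    threads = [
        threading.Thread(target=add_with_try_finally, args=(n_per_thread,)),
        threading.Thread(target=add_with_statement, args=(n_per_thread,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run())
```

Both forms release the lock on every path out of the critical section; the with form is shorter and harder to get wrong.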
import threading, time

def job1():
    global A
    for i in range(2):
        print("get A_value of job1", A)
        A += 1
        print("A + 1 = %d" % A)
        time.sleep(1)
    print("get A_value of job1", A)

def job2():
    global A
    for i in range(2):
        print("get A_value of job2", A)
        A += 10
        print("A + 10 = %d" % A)
        time.sleep(3)
    print("get A_value of job2", A)

if __name__ == '__main__':
    print("begin")
    lock = threading.Lock()  # created but never acquired in this version
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print("end")

'''output (without the lock, the two jobs interleave):
begin
get A_value of job1 0
A + 1 = 1
get A_value of job2 1
A + 10 = 11
get A_value of job1 11
A + 1 = 12
get A_value of job1 12
get A_value of job2 12
A + 10 = 22
get A_value of job2 22
end
'''
import threading, time

def job1():
    global A
    with lock:  # same as lock.acquire() ... finally: lock.release()
        for i in range(2):
            print("get A_value of job1", A)
            A += 1
            print("A + 1 = %d" % A)
            time.sleep(1)
        print("get A_value of job1", A)

def job2():
    global A
    with lock:
        for i in range(2):
            print("get A_value of job2", A)
            A += 10
            print("A + 10 = %d" % A)
            time.sleep(3)
        print("get A_value of job2", A)

if __name__ == '__main__':
    print("begin")
    lock = threading.Lock()
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print("end")

'''output (job1 holds the lock for its whole run, then job2 runs):
begin
get A_value of job1 0
A + 1 = 1
get A_value of job1 1
A + 1 = 2
get A_value of job1 2
get A_value of job2 2
A + 10 = 12
get A_value of job2 12
A + 10 = 22
get A_value of job2 22
end
'''
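7. Deadlock
An example of deadlock: shared resources A and B; locks lock1 and lock2; two threads, threading1 and threading2.
Personally I think of it as a lock inside a lock:
threading1 holds lock1 (which has locked A) and, before releasing it, tries to acquire lock2 to get resource B (that is, to lock B), so it waits for B to be released;
threading2 holds lock2 (which has locked B) and, before releasing it, tries to acquire lock1 to get resource A (that is, to lock A), so it waits for A to be released;
Each thread is waiting for the other to release a resource, so neither can proceed, and the result is a deadlock.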
import threading, time

def job1():
    global A, B
    lock1.acquire()
    for i in range(2):
        print("job1 reads A = %d" % A)
        A += 1
        print("A + 1 = %d" % A)
        time.sleep(1)
    print("job1 reads A = %d" % A)
    lock2.acquire()  # blocks forever: job2 holds lock2 and is waiting for lock1
    for i in range(2):
        print("job1 reads B = %d" % B)
        B += 10
        print("B + 10 = %d" % B)
        time.sleep(3)
    print("job1 reads B = %d" % B)
    lock2.release()
    lock1.release()

def job2():
    global A, B
    lock2.acquire()
    for i in range(2):
        print("job2 reads B = %d" % B)
        B += 1
        print("B + 1 = %d" % B)
        time.sleep(3)
    print("job2 reads B = %d" % B)
    lock1.acquire()  # blocks forever: job1 holds lock1 and is waiting for lock2
    for i in range(2):
        print("job2 reads A = %d" % A)
        A += 10
        print("A + 10 = %d" % A)
        time.sleep(1)
    print("job2 reads A = %d" % A)
    lock1.release()
    lock2.release()

if __name__ == '__main__':
    print("begin")
    lock1 = threading.Lock()
    lock2 = threading.Lock()
    A = B = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print("end")  # never reached
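Summary: my impression is that deadlocks come from a lock inside a lock, that is, calling acquire() again while a lock is already held and not yet released. Whether A and B are the same resource makes no difference. If lock1 and lock2 are the same Lock object, a thread blocks forever on its own second acquire(), because Lock is not reentrant. If they are two different locks, deadlock occurs when the threads acquire them in opposite orders, as in the example above.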
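One common way to keep nested locks from deadlocking is to make every thread acquire them in the same order, so a circular wait cannot form. The sketch below assumes two hypothetical locks, lock_a and lock_b, and a made-up totals dictionary; it is one possible fix, not the only one.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
totals = {"A": 0, "B": 0}

def update_both(delta_a, delta_b):
    # Every thread takes lock_a first and lock_b second.
    with lock_a:
        totals["A"] += delta_a
        with lock_b:
            totals["B"] += delta_b

def run_ordered():
    totals["A"] = totals["B"] = 0
    t1 = threading.Thread(target=update_both, args=(1, 10))
    t2 = threading.Thread(target=update_both, args=(10, 1))
    for t in (t1, t2):
        t.start()
    for t in (t1, t2):
        t.join(timeout=5)             # a deadlock would leave a thread alive
    finished = not (t1.is_alive() or t2.is_alive())
    return finished, dict(totals)

if __name__ == "__main__":
    print(run_ordered())
```

The locks are still nested here, yet both threads finish, because a thread that holds lock_b is never waiting for lock_a.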
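8. Reentrant lock (RLock):
An RLock keeps track of its owning thread and a counter. Each acquire() by the owning thread increments the counter; while the counter is greater than 0, other threads cannot acquire the lock. Each release() decrements the counter, and only when the counter is back to 0 can another thread acquire the lock and run.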
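The counter behavior can be checked with a small probe. In this sketch (function names are invented for illustration), the main thread acquires the RLock twice and, after each step, a second thread tries a non-blocking acquire to see whether the lock is free.

```python
import threading

rlock = threading.RLock()

def other_thread_can_acquire():
    """Return True if a different thread can take rlock right now."""
    result = []

    def probe():
        got = rlock.acquire(blocking=False)   # do not wait, just try
        if got:
            rlock.release()
        result.append(got)

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return result[0]

def demo():
    observations = []
    rlock.acquire()       # counter 1
    rlock.acquire()       # counter 2: the owning thread does not block
    observations.append(other_thread_can_acquire())
    rlock.release()       # counter 1: still owned
    observations.append(other_thread_can_acquire())
    rlock.release()       # counter 0: free again
    observations.append(other_thread_can_acquire())
    return observations

if __name__ == "__main__":
    print(demo())
```

The lock becomes available to other threads only after release() has been called as many times as acquire().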
import threading, time

def job1():
    global A, B
    lock1.acquire()
    for i in range(2):
        print("job1 reads A = %d" % A)
        A += 1
        print("A + 1 = %d" % A)
        time.sleep(1)
    print("job1 reads A = %d" % A)
    lock2.acquire()  # same RLock, same thread: the counter goes to 2
    for i in range(2):
        print("job1 reads B = %d" % B)
        B += 10
        print("B + 10 = %d" % B)
        time.sleep(3)
    print("job1 reads B = %d" % B)
    lock2.release()
    lock1.release()

def job2():
    global A, B
    lock2.acquire()
    for i in range(2):
        print("job2 reads B = %d" % B)
        B += 1
        print("B + 1 = %d" % B)
        time.sleep(3)
    print("job2 reads B = %d" % B)
    lock1.acquire()
    for i in range(2):
        print("job2 reads A = %d" % A)
        A += 10
        print("A + 10 = %d" % A)
        time.sleep(1)
    print("job2 reads A = %d" % A)
    lock1.release()
    lock2.release()

if __name__ == '__main__':
    print("begin")
    lock1 = lock2 = threading.RLock()  # both names refer to one RLock object
    A = B = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print("end")

'''output:
begin
job1 reads A = 0
A + 1 = 1
job1 reads A = 1
A + 1 = 2
job1 reads A = 2
job1 reads B = 0
B + 10 = 10
job1 reads B = 10
B + 10 = 20
job1 reads B = 20
job2 reads B = 20
B + 1 = 21
job2 reads B = 21
B + 1 = 22
job2 reads B = 22
job2 reads A = 2
A + 10 = 12
job2 reads A = 12
A + 10 = 22
job2 reads A = 22
end
'''
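An RLock only helps when every thread job nests on one and the same RLock object. With two different RLock objects acquired in opposite orders, a deadlock still occurs, as shown here: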
import threading, time

def job1():
    global A, B
    lock1.acquire()
    for i in range(2):
        print("job1 reads A = %d" % A)
        A += 1
        print("A + 1 = %d" % A)
        time.sleep(1)
    print("job1 reads A = %d" % A)
    lock2.acquire()  # blocks forever: lock2 is owned by job2's thread
    for i in range(2):
        print("job1 reads B = %d" % B)
        B += 10
        print("B + 10 = %d" % B)
        time.sleep(3)
    print("job1 reads B = %d" % B)
    lock2.release()
    lock1.release()

def job2():
    global A, B
    lock2.acquire()
    for i in range(2):
        print("job2 reads B = %d" % B)
        B += 1
        print("B + 1 = %d" % B)
        time.sleep(3)
    print("job2 reads B = %d" % B)
    lock1.acquire()  # blocks forever: lock1 is owned by job1's thread
    for i in range(2):
        print("job2 reads A = %d" % A)
        A += 10
        print("A + 10 = %d" % A)
        time.sleep(1)
    print("job2 reads A = %d" % A)
    lock1.release()
    lock2.release()

if __name__ == '__main__':
    print("begin")
    lock1 = threading.RLock()  # two separate RLock objects
    lock2 = threading.RLock()
    A = B = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print("end")  # never reached
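9. GIL
The GIL (global interpreter lock) is not a feature of the Python language. It is a concept introduced by one implementation of the Python interpreter, CPython, to make access to shared resources by different threads mutually exclusive.
In the CPython interpreter, among the threads started within one process only one can execute at any given moment, so multithreading cannot take advantage of multiple cores.
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.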
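The last point, that threads still suit I/O-bound work, can be illustrated with a rough sketch. Here time.sleep() stands in for a blocking I/O call (an assumption made for the example; like blocking I/O, it lets other threads run while it waits), and fake_io and timed_io are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(seconds):
    time.sleep(seconds)       # stand-in for a blocking network or disk call
    return seconds

def timed_io(n_tasks=4, seconds=0.2):
    begin = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(fake_io, [seconds] * n_tasks))
    return results, time.perf_counter() - begin

if __name__ == "__main__":
    results, elapsed = timed_io()
    print(results, "in %.2f s" % elapsed)
```

Run one after another, four such tasks would need about 0.8 seconds; with threads the waits overlap, so the total stays close to a single wait. CPU-bound functions would not show this gain under the GIL.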