Python Concurrency

GIL

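The GIL (global interpreter lock) ensures that only one thread executes Python bytecode at any moment, so the threads of one process cannot be spread across multiple CPU cores.
CPython releases the GIL after a thread has run for a certain number of bytecode instructions (Python 2) or a time slice (Python 3.2 and later), and a thread gives it up voluntarily when it blocks on I/O.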
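The time-slice part of this rule is visible from Python code. The lines below are a minimal sketch, assuming CPython 3.2 or later, where the slice is called the switch interval:

```python
import sys

# How long (in seconds) a thread may keep the GIL before CPython asks it
# to hand the lock over to another waiting thread.
interval = sys.getswitchinterval()
print(f"switch interval: {interval} s")

# The interval can be tuned. Here it is set back to the value just read,
# so the interpreter's behavior does not actually change.
sys.setswitchinterval(interval)
```

A shorter interval makes threads switch more often at the cost of more overhead; most programs never need to touch it.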

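For I/O-bound work, multithreading and multiprocessing perform about the same, because a thread releases the GIL while it waits on I/O.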
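A rough way to see that threads overlap their I/O waits is sketched below. `fake_io` is a hypothetical stand-in that uses `time.sleep` in place of a real network or disk call, so the numbers are only illustrative:

```python
import threading
import time

def fake_io(seconds):
    # time.sleep stands in for a blocking I/O call; the GIL is released
    # while the thread waits.
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The five waits overlap, so the total stays far below 5 * 0.2 s.
print(f"elapsed: {elapsed:.2f} s")
```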

Multithreading

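1. Instantiate the class directly: threading.Thread(target=fun, args=('arg',)). This suits starting threads dynamically and is more flexible.

2. Subclass threading.Thread and override run(). This suits more complex logic.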

import threading
import time

class GetHtml(threading.Thread):
    def __init__(self, name):
        super().__init__(name=name)

    def run(self):
        # Executed in the new thread once start() is called
        time.sleep(4)
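Both ways of creating a thread are started and waited on the same way, with start() and join(). A small sketch follows; `fetch`, the `url` parameter, and the page names are made up for illustration:

```python
import threading
import time

results = []

def fetch(url):
    # Hypothetical worker: pretend to download, then record the URL.
    time.sleep(0.1)
    results.append(url)

class GetHtml(threading.Thread):
    def __init__(self, name, url):
        super().__init__(name=name)
        self.url = url

    def run(self):
        # run() executes in the new thread once start() is called.
        fetch(self.url)

t1 = threading.Thread(target=fetch, args=("page-1",))  # way 1: instantiate
t2 = GetHtml(name="get_html", url="page-2")            # way 2: subclass
t1.start()
t2.start()
t1.join()   # wait for both threads to finish
t2.join()
print(sorted(results))
```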

Thread Communication

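When multiple threads share data through variables, the access must be protected by a lock. queue.Queue is the recommended alternative: its get and put methods block when the queue is empty or full.
get_nowait and put_nowait are the non-blocking variants: they return immediately and raise queue.Empty or queue.Full instead of waiting.
queue.join() blocks until task_done() has been called once for every item that was put into the queue, typically by the consumer threads.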
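The queue methods mentioned here fit together roughly as follows. This is a sketch with one consumer thread; using None as a stop sentinel is an assumption of this example, not something queue.Queue requires:

```python
import queue
import threading

q = queue.Queue(maxsize=2)   # put() blocks while 2 items are already waiting
processed = []

def consumer():
    while True:
        item = q.get()       # blocks while the queue is empty
        if item is None:     # sentinel chosen for this example: stop
            q.task_done()
            break
        processed.append(item * 2)
        q.task_done()        # one task_done() per item taken

worker = threading.Thread(target=consumer)
worker.start()
for n in range(5):
    q.put(n)
q.put(None)
q.join()                     # returns once every put() has a matching task_done()
worker.join()

try:
    q.get_nowait()           # non-blocking: raises instead of waiting
except queue.Empty:
    print("queue is empty")
print(processed)
```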

Thread Synchronization

For example, incrementing a variable by one corresponds to four steps:

load a
load 1
+
store the result back into a

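The interpreter may switch threads after any of these steps. A lock guarantees that the steps are not interleaved with another thread's update of the same variable.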
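The steps can be inspected with the standard dis module. The exact opcode names differ between CPython versions, so the sketch below only prints them; `increment` is a hypothetical function written for this example:

```python
import dis

a = 0

def increment():
    global a
    a += 1

# Collect the names of the bytecode instructions that make up `a += 1`.
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)
```

The list contains a load of the global, an add, and a store back to the global as separate instructions, which is why the increment is not atomic.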

import threading

num = 0
lock = threading.Lock()

def add(lock):
    global num
    lock.acquire()      # only one thread at a time gets past this point
    num += 1
    lock.release()      # let the next waiting thread in
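The same idea with several threads, as a sketch: `add_many` and the loop counts are assumptions of this example. It uses `with lock:`, which acquires the lock and releases it even if the body raises:

```python
import threading

num = 0
lock = threading.Lock()

def add_many(lock, times):
    global num
    for _ in range(times):
        with lock:      # acquire() on entry, release() on exit
            num += 1

threads = [threading.Thread(target=add_many, args=(lock, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(num)
```

Because every increment happens under the lock, the final value equals the total number of increments (4 threads times 10,000).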

Locks hurt performance.
Locks can also cause deadlock.

A Lock can be acquired only once before it is released. RLock is a reentrant lock.

def outer():
    lock.acquire()
    inner(lock)
    lock.release()

def inner(lock):
    lock.acquire()      # acquiring a plain Lock again in the same thread deadlocks
    # ...
    lock.release()

In this situation use threading.RLock: one thread may acquire it multiple times, but must release it the same number of times.
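A sketch of the difference, written so that it cannot hang: the plain Lock is probed with a non-blocking acquire instead of a real second acquire. The names `plain`, `second_try`, and `calls` are made up for this example:

```python
import threading

# A plain Lock cannot be taken twice, even by the same thread.
plain = threading.Lock()
plain.acquire()
second_try = plain.acquire(blocking=False)  # would deadlock if it blocked
plain.release()
print("second acquire on Lock succeeded:", second_try)

# An RLock can be re-acquired by the thread that already holds it.
lock = threading.RLock()
calls = []

def inner():
    with lock:          # second acquire by the same thread
        calls.append("inner")

def outer():
    with lock:          # first acquire
        calls.append("outer")
        inner()

t = threading.Thread(target=outer)
t.start()
t.join()
print(calls)
```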

Condition

condition.wait()
condition.notify()
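Useful for problems such as producer-consumer and the dining philosophers.
wait() and notify() may only be called after condition.acquire(); it is best to use a with condition: block.
A Condition has two levels of locks internally. The underlying lock is released when a thread calls wait(). On top of it, each call to wait() allocates a new lock and puts it into the condition's waiting queue, where it stays until notify() wakes the waiter.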
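A minimal producer-consumer sketch with threading.Condition follows. The shared `items` deque and the function names are assumptions of this example. The consumer re-checks its condition in a while loop, which is the usual pattern around wait():

```python
import threading
from collections import deque

items = deque()
cond = threading.Condition()
consumed = []

def consumer(count):
    for _ in range(count):
        with cond:                  # acquires the underlying lock
            while not items:        # re-check after every wake-up
                cond.wait()         # releases the lock while waiting
            consumed.append(items.popleft())

def producer(count):
    for n in range(count):
        with cond:
            items.append(n)
            cond.notify()           # wake one waiting thread, if any

c = threading.Thread(target=consumer, args=(5,))
p = threading.Thread(target=producer, args=(5,))
c.start()
p.start()
p.join()
c.join()
print(consumed)
```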

Semaphore

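A lock that limits how many threads may enter a section at the same time.
Example: reading and writing a file. Writing is usually allowed for only one thread, while reading can be allowed for several threads.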

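import threading

def fun(sem):
    # do something
    sem.release()       # free a slot so the main loop can start another thread

sem = threading.Semaphore(3)
for i in range(20):
    sem.acquire()       # blocks while 3 threads are still running
    th = threading.Thread(target=fun, args=(sem,))
    th.start()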

This limits the number of threads running at the same time to 3.
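The limit can be checked by counting how many workers are inside the guarded section at once. In this sketch each worker acquires the semaphore itself with `with sem:`, a variation on the code above; the `active` and `peak` counters are assumptions added for the measurement:

```python
import threading
import time

sem = threading.Semaphore(3)
state_lock = threading.Lock()
active = 0
peak = 0

def worker():
    global active, peak
    with sem:                       # at most 3 threads get past this line
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)            # stand-in for real work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrency:", peak)
```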

Thread Pool

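Advantages:
The main thread can query the state of a thread or of an individual task, and get its return value.
The main thread knows immediately when a task has finished.
concurrent.futures keeps the coding interface the same for multithreading and multiprocessing.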

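from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
task1 = executor.submit(fun, 'arg')   # arguments are passed positionally, not as a tuple
task2 = executor.submit(fun, 'arg')
# The returned Future gives access to the task's state; submit returns immediately without blocking
task1.done()    # returns a bool: whether the task has finished
task1.result()  # returns the task's return value; blocks until it is available
task2.cancel()  # returns a bool: whether the task was cancelled. A task that has started cannot be cancelled; one still waiting can.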
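The Future methods can be tried out with a single-worker pool, so that the second task is guaranteed to still be waiting when it is cancelled. `slow_square` is a hypothetical task used only for this sketch:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    time.sleep(0.3)
    return x * x

executor = ThreadPoolExecutor(max_workers=1)
task1 = executor.submit(slow_square, 3)   # picked up by the only worker
task2 = executor.submit(slow_square, 4)   # queued behind task1

cancelled = task2.cancel()       # still waiting, so it can be cancelled
first_result = task1.result()    # blocks until task1 has finished
print(task1.done(), first_result, cancelled)
executor.shutdown()
```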

Getting the results of completed tasks

as_completed is a generator function that yields each future as soon as its task has finished.

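from concurrent.futures import as_completed

all_tasks = [executor.submit(fun, url) for url in urls]
for future in as_completed(all_tasks):
    data = future.result()

# executor.map yields results in the order of arg_list
for data in executor.map(fun, arg_list):
    print(data) # data is the function's return value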
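The two approaches differ in ordering: as_completed follows completion order, while executor.map follows input order. A sketch with a hypothetical `wait_and_return` task that sleeps for a given delay:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def wait_and_return(delay):
    time.sleep(delay)
    return delay

delays = [0.3, 0.1, 0.2]
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(wait_and_return, d) for d in delays]
    # Futures arrive as they finish, so shorter delays usually come first.
    completion_order = [f.result() for f in as_completed(futures)]
    # map returns results in the same order as the inputs.
    input_order = list(executor.map(wait_and_return, delays))

print(completion_order)
print(input_order)
```

Use as_completed when each result should be handled as early as possible, and map when the results must line up with the inputs.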

Multiprocessing

from multiprocessing import Process
import os

# Code to be executed by the child process
def run_proc(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))

if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()
    p.join()
    print('Child process end.')

from multiprocessing import Pool
import os, time, random

def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))

if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p = Pool(4)
    for i in range(5):
        p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')

Posted 2021-01-21 17:13 by 某某人8265