python之线程和进程(并发编程)

python的GIL

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

上面的核心意思就是，无论你启多少个线程，你有多少个cpu, Python在执行的时候会淡定的在同一时刻只允许一个线程运行

线程

1同步锁

2死锁, 递归锁

3:信号量和同步对象(了解)

4队列------生产者消费者模型

5进程

线程的基本调用

<python的线程与threading模块>

import threading  # 线程
import time


def Hi(num):
    print('hello %d' % num)
    time.sleep(3)

if __name__ == '__main__':
    # 第一个参数是要执行的函数名,第二个是函数的参数(必须是可迭代对象)
    t1 = threading.Thread(target=Hi, args=(10, ))  # 创建一个线程对象
    t1.start()  # 开启线程
    t2 = threading.Thread(target=Hi, args=(9, ))  # 创建一个线程对象
    t2.start()  # 开启线程
    print('ending....')
#   这就是并发的现象

#   并行:指的是两个或者多个事件在同一时刻发生(同时调用多核)
#   并发:指的是两个或者多个事件在同一时间间隔内发生(在一个核内快速的切换)

第二种调用方式

import threading
import time


class MyThread(threading.Thread):
    def __init__(self,num):
        threading.Thread.__init__(self)
        self.num = num

    def run(self):#定义每个线程要运行的函数

        print("running on number:%s" %self.num)

        time.sleep(3)

if __name__ == '__main__':

    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()
    
    print("ending......")

join和setDaemon

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-22 下午8:45
# @Author  : LK
# @File    : lesson2.py
# @Software: PyCharm

import threading
import time

def music():
    print('begin listen music %s'%time.ctime())
    # t = input('请输入内容>>>')
    # time.sleep(3)
    print('stop listen music %s'%time.ctime())
    # print(t)


def game():
    print('begin to game %s'%time.ctime())
    time.sleep(5)
    print('stop to game %s'%time.ctime())

if __name__ == '__main__':
    t1 = threading.Thread(target=music)
    t2 = threading.Thread(target=game)

    t1.start()
    t2.start()

    # t1.join()   #   join就是等待的意思,让该线程执行完毕后,在执行主线程
    # t2.join()   # 注意如果注释这一句,结果是什么

    print('ending....')

    print(t1.getName())  #  获取线程名,

join

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-22 下午9:26
# @Author  : LK
# @File    : 守护线程.py
# @Software: PyCharm

#   守护线程 就是:和主线程一起退出,如果主线程结束了,那么不管守护的线程,有没有执行完毕,都退出
import threading
import time


def music():
    print('begin listen music %s' % time.ctime())
    time.sleep(3)
    print('stop listen music %s' % time.ctime())


def game():
    print('begin to game %s' % time.ctime())
    time.sleep(5)
    print('stop to game %s' % time.ctime())


t1 = threading.Thread(target=music)
t2 = threading.Thread(target=game)
threads = []
threads.append(t1)
threads.append(t2)
if __name__ == '__main__':

    # t1.setDaemon(True)
    t2.setDaemon(True)
    for t in threads:
        t.start()
        # t.setDaemon(True) # 守护线程, 就是和主线程一块结束
    print('ending....')

setDaemon

join()：在子线程完成运行之前，这个子线程的父线程将一直被阻塞。

setDaemon(True)：

将线程声明为守护线程，必须在start() 方法调用之前设置，如果不设置为守护线程程序会被无限挂起。这个方法基本和join是相反的。

当我们在程序运行中，执行一个主线程，如果主线程又创建一个子线程，主线程和子线程就分兵两路，分别运行，那么当主线程完成

想退出时，会检验子线程是否完成。如果子线程未完成，则主线程会等待子线程完成后再退出。但是有时候我们需要的是只要主线程

完成了，不管子线程是否完成，都要和主线程一起退出，这时就可以用setDaemon方法啦

其他方法

# run():  线程被cpu调度后自动执行线程对象的run方法
# start():启动线程活动。
# isAlive(): 返回线程是否活动的。
# getName(): 返回线程名。
# setName(): 设置线程名。

threading模块提供的一些方法：
# threading.currentThread(): 返回当前的线程变量。
# threading.enumerate(): 返回一个包含正在运行的线程的list。正在运行指线程启动后、结束前，不包括启动前和终止后的线程。
# threading.activeCount(): 返回正在运行的线程数量，与len(threading.enumerate())有相同的结果。

同步锁lock

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-23 下午4:42
# @Author  : LK
# @File    : 同步锁.py
# @Software: PyCharm
import threading, time

#   用多线程去调用一个每次减1的函数,里面让他停留一会就会切换cpu,模拟线程安全问题,引出同步锁(互斥锁)


def sub():
    global num
    #  这样坐是0,因为num-=1 这个执行的很快,没有到达时间片切换就执行完了,每次取的都是不同的值
    # num -= 1
    # 如果这里让他停一会,就会发生,前面的一些拿到的值为100,然后开始停下来,让其他线程取值
    # 取得值还可能是100, 再过几个线程,取的时候就是前面的线程执行完了(到达时间片切换了)
    #  取得是就可能是其他值(前面几个线程执行后的结果),最终结果就不在是0
    #   这就是线程安全问题,多个线程都在处理同一个数据,
    # 如果处理的太慢(sleep)就会发生多个线程处理的值都是原始的值(多个线程处理一个值)

    #   处理方法: 就是在执行这三条语句时,不让cpu切换,知道处理完成时,再切换(同步锁)
    '''
    定义一个锁:lock = threading.Lock()
    lock.acquire()  #   锁起来(获取)
    这里是要执行的语句,
    lock.release()   #释放
    这里的不执行完,不切换cpu
    '''
    lock.acquire()
    temp = num
    time.sleep(0.001)
    num = temp - 1
    lock.release()


num = 100
l = []
lock = threading.Lock()
for i in range(100):
    t = threading.Thread(target=sub)
    l.append(t)
    t.start()

for t in l:
    t.join()
print(num)

同步锁(互斥锁)

死锁(递归锁)

#   产生死锁:
'''
第一个线程执行doLockA, 然后lockB, 执行完actionA, 都释放了
开始执行actionB,对B上锁,不允许切换cpu,与此同时第二个线程开始执行actionA,对A上锁
不允许切换cpu,两个线程同时需要对方的资源释放但是都没有释放,所以死锁
(就像你问我要个香蕉, 我问你要个苹果, 你说我先给你,我说你先给我,于是就死锁了)
'''

'''
解决方法:
用threading.RLock() 去替换锁A和锁B
rlock就是在内部实现一个计数器,在同一个线程中,加锁就+1,释放就减一,只要锁数大于0,
其他线程就不能申请锁,
就是执行过程就是:当A函数执行完之后,所有线程开始抢占资源,谁抢到谁开始执行,
比如线程2抢到, 线程2开始执行A函数,A没有执行完之前其他线程不能够继续执行
'''

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-23 下午8:33
# @Author  : LK
# @File    : 递归锁_死锁.py
# @Software: PyCharm
import threading, time

#   产生死锁:
'''
第一个线程执行doLockA, 然后lockB, 执行完actionA, 都释放了
开始执行actionB,对B上锁,不允许切换cpu,与此同时第二个线程开始执行actionA,对A上锁
不允许切换cpu,两个线程同时需要对方的资源释放但是都没有释放,所以死锁
(就像你问我要个香蕉, 我问你要个苹果, 你说我先给你,我说你先给我,于是就死锁了)
'''


class myThread(threading.Thread):
    def actionA(self):
        lockA.acquire()
        print(self.name, 'doLockA', time.ctime())
        time.sleep(3)
        lockB.acquire()
        print(self.name, 'doLockB', time.ctime())
        lockB.release()
        lockA.release()

    def actionB(self):
        lockB.acquire()
        print(self.name, 'doLockB', time.ctime())
        time.sleep(1)
        lockA.acquire()
        print(self.name, 'doLockA', time.ctime())
        lockA.release()
        lockB.release()

    def run(self):
        self.actionA()
        time.sleep(0.5)
        self.actionB()


if __name__ == '__main__':
    lockA = threading.Lock()
    lockB = threading.Lock()
    r_lock = threading.RLock()
    '''
    解决方法:
    用threading.RLock() 去替换锁A和锁B
    rlock就是在内部实现一个计数器,在同一个线程中,加锁就+1,释放就减一,只要锁数大于0,
    其他线程就不能申请锁,
    就是执行过程就是:当A函数执行完之后,所有线程开始抢占资源,谁抢到谁开始执行,
    比如线程2抢到, 线程2开始执行A函数,A没有执行完之前其他线程不能够继续执行
    '''

    threads = []
    for i in range(5):
        '''创建5个线程对象'''
        threads.append(myThread())
    for t in threads:
        t.start()
    for i in threads:
        t.join()
    print('ending.....')

死锁(递归锁)

信号量与同步对象

信号量, 也是一种锁的机制, 在那个递归锁中的同时事件的安全问题中
里面是多个线程同时抢占资源, 但是这个是只允许指定的个数,去抢占

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-24 下午4:45
# @Author  : LK
# @File    : 信号量.py
# @Software: PyCharm

'''
信号量, 也是一种锁的机制, 在那个递归锁中的同时事件的安全问题中
里面是多个线程同时抢占资源, 但是这个是只允许指定的个数,去抢占
'''
import threading, time
class myThread(threading.Thread):
    def run(self):
        if semaphore.acquire():
            print(self.name)
            time.sleep(3)
            semaphore.release()

if __name__ == '__main__':

    #   信号量, 参数的意思是同时允许几个线程去执行
    semaphore = threading.Semaphore(5)

    threads = []
    for i in range(10):
        threads.append(myThread())

    for t in threads:
        t.start()
    for t in threads:
        t.join()

信号量

同步对象

event

'''
event 同步对象标志位,就是在一个线程中设定后,在其他线程中也能捕获到
需求:一个boss类,一个worker类,
当boss对象执行后,worker才执行,就需要设置一个同步的标志位,用来判断
该执行那个线程
'''

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-23 下午9:38
# @Author  : LK
# @File    : 同步对象.py
# @Software: PyCharm

'''
event 同步对象标志位,就是在一个线程中设定后,在其他线程中也能捕获到
需求:一个boss类,一个worker类,
当boss对象执行后,worker才执行,就需要设置一个同步的标志位,用来判断
该执行那个线程
'''
'''
执行过程:
刚开始有5个worker线程和1个boss线程执行,
当worker执行时,wait没有被设定,一直在阻塞,但是boss可以执行
于是设定event并开始等5秒, 同时5个worker中的标志位变了,可以继续执行了
1秒后,worker清除event标志,继续阻塞,又过了4秒boss又设置了,
同时worker又可以执行了
'''
import threading, time
class Boss(threading.Thread):
    def run(self):
        print(self.name,'boss今天大家要加班到22点')
        print(event.isSet())  # false,判断是否标志
        event.set()
        time.sleep(5)
        print(self.name,'boss: <22:00>可以下班了')
        print(event.isSet())
        event.set()


class Worker(threading.Thread):
    def run(self):
        event.wait()  # 一旦event被设定 就等同于pass, 不设定就阻塞等待被设定
        print(self.name,'worker:哎呀生活苦呀,')
        time.sleep(1)
        event.clear()
        event.wait()
        print(self.name,'worker: oyear下班了')


#   如果不用event的话,就可能导致worker先说话,但是需求是让boss先说话,
#   因为cpu是抢占的方式执行的,不会按照需求走,所以就要引入同步标志位event
if __name__ == '__main__':
    event = threading.Event()
    threads =[]

    #   创建5个worker线程,1个boss线程
    for i in range(5):
        threads.append(Worker())
    threads.append(Boss())

    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print('ending....')

同步对象

队列----多线程常用方法

队列的引入

'''
用多个线程同时对列表进行删除最后一个值,
会报错, 是因为可能会有多个线程同时取到最后一个值,都进行删除,
就会报错, 解决方法:可以用锁,将其锁住,(相当于串行的执行), 或者用队列
'''
'''
l = [1,2,3,4,5]
def fun():
    while 1:
        if not l:
            break
        a = l[-1]
        print(a)
        time.sleep(1)
        l.remove(a)
if __name__ == '__main__':
    t1 = threading.Thread(target=fun, args=())
    t2 = threading.Thread(target=fun, args=())
    t1.start()
    t2.start()
'''

队列的引入

Python Queue模块有三种队列及构造函数:
1、Python Queue模块的FIFO队列先进先出。   class queue.Queue(maxsize)
2、LIFO类似于堆，即先进后出。               class queue.LifoQueue(maxsize)
3、还有在一种是优先级队列级别越低越先出来。        class queue.PriorityQueue(maxsize)

import queue

#   创建一个队列,里面最多可以存放3个数据
q = queue.Queue(maxsize=3)  # FIFO,先进先出
# 将一个值入队
q.put(10)
q.put(9)
q.put(8)
#   如果这里在加上一个put, 就是有4个数据入队,但是空间不够,就会一直阻塞,直到,队列有空间时(从队列中取出get)
#   block参数默认是True, 如果改成False,当队列满的时候如果继续入队,就不会堵塞而是报错queue.Full
# q.put(9, block=False)
while 1:
    '''
    将一个值从队列中取出
    q.get()
    调用队列对象的get()
    方法从队头删除并返回一个项目。可选参数为block，默认为True。如果队列为空且block为True，
    get()
    就使调用线程暂停，直至有项目可用。如果队列为空且block为False，队列将引发Empty异常。
    '''
    # 取完之后将会阻塞, block=Fales,如果队列为空时,继续出队,就会报错,queue.Empty
    #   get类似与put
    data = q.get(block=False)
    print(data)

调用一

# 先进后出  LIFO 类似与栈
import queue
# 先进后出  LIFO 类似与栈
q = queue.LifoQueue(maxsize=4)
q.put('你好')
q.put(123)
q.put({"name":"lucky"})

while 1:
    data = q.get()
    print(data)

调用二

# 还有在一种是优先级队列级别越低越先出来1级最低。class queue.PriorityQueue(maxsize)
import queue
q = queue.PriorityQueue(maxsize=3)
q.put([2,'你好'])
q.put([3,123,'aa'])
q.put([1,{"name":"lucky"}])

while 1:
    data = q.get()
    # print(data)
    print(data[1])  #   只显示数据,

调用三

import queue
q = queue.Queue(maxsize=4)
q.put('a')
q.put('b')
q.put('c')
print(q.qsize())  # 打印当前队列的大小,现在队列中有几个值
print(q.full())  # 判断是否满
print(q.empty())  # 判断是否为空
# q.put_nowait(3)  #  相当于q.put(3, block=Flase)
# q.task_done() 在完成一项工作之后，q.task_done() 函数向任务已经完成的队列发送一个信号
# q.join() 实际上意味着等到队列为空，再执行别的操作
while 1:
    data = q.get()
    print(data)
    print(q.qsize())  # 打印当前队列的大小,现在队列中有几个值
    print(q.full())   # 判断是否满
    print(q.empty())  # 判断是否为空

队列的其他方法

进程的基本使用

`多进程模块 multiprocessing`

进程的使用和线程的使用和调用的方法基本一样

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-25 下午5:31
# @Author  : LK
# @File    : 测试进程和线程.py
# @Software: PyCharm
from threading import Thread
from multiprocessing import Process
import time

def fun():
    print(time.ctime())
if __name__ == '__main__':
    threads = []
    for i in range(1000):
        # t = Thread(target=fun)  # 线程
        t = Process(target=fun)   # 进程
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()

from  multiprocessing import Process
import time

def f(name):
    time.sleep(1)
    print('hello  %s'%name, time.ctime())

if __name__ == '__main__':
    p_list = []
    for t in range(3):
        p = Process(target=f, args=('luck',))
        p_list.append(p)

    #   三个进程时在同一时刻开启的
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()

    print('ending....')

进程的调用方法一

class myProcess(Process):
    def __init__(self):
        # 父类初始化, 不传参数不用写
        super(myProcess, self).__init__()
    def run(self):
        time.sleep(1)
        print(self.name,'hello ',time.ctime())

if __name__ == '__main__':
    p_list = []
    for i in range(3):
        p_list.append(myProcess())
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()

进程的调用方法二

查看进程号

#   查查看进程的id
#   os.getppid()获得当前进程的父进程,os.getpid()获得当前进程
#   os.getppid 是pycharm的进程号
'''
from multiprocessing import Process
import os
import time


def info(title):
    print("title:", title)
    print('父进程 id:', os.getppid())
    print('当前进程 id:', os.getpid())
    # time.sleep(3)


if __name__ == '__main__':
    info('主进程')
    time.sleep(1)
    print("------------------")
    p = Process(target=info, args=('lucky子进程',))
    p.start()
    p.join()
    p_list = []
    for i in range(3):
        p = Process(target=info, args=('子进程%d' % i,))
        p_list.append(p)
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()

查看进程号,进程之间的关系理解

Process 类
构造方法：

Process([group [, target [, name [, args [, kwargs]]]]])

　　group: 线程组，目前还没有实现，库引用中提示必须是None； 
　　target: 要执行的方法； 
　　name: 进程名； 
　　args/kwargs: 要传入方法的参数。

实例方法：

　　is_alive()：返回进程是否在运行。

　　join([timeout])：阻塞当前上下文环境的进程程，直到调用此方法的进程终止或到达指定的timeout（可选参数）。

　　start()：进程准备就绪，等待CPU调度

　　run()：strat()调用run方法，如果实例进程时未制定传入target，这star执行t默认run()方法。

　　terminate()：不管任务是否完成，立即停止工作进程

属性：

　　daemon：和线程的setDeamon功能一样

　　name：进程名字。


'''

from multiprocessing import Process
import time

class myProcess(Process):
    def __init__(self, args):
        super(myProcess,self).__init__()
        self.args = args
    def fun(self):
        time.sleep(1)
        print(self.name, self.is_alive(), self.args, self.pid)
        time.sleep(1)

    def run(self):
        self.fun()



if __name__ == '__main__':
    p_list = []
    for i in range(10):
        # p = Process(target=fun, args=('子进程%d' % i,))
        p_list.append(myProcess('子进程%d' % i))
    # p_list[-2].daemon = True
    # 因为这三个进程是同时执行的,所以这个属性不明显
    for p in p_list:
        p.start()
    for p in p_list:
        p.join()

进程模块的其他方法

进程间的通信

队列,这里的队列和线程里的队列调用方法不一样

#   什么是进程通信,为什么要进程通信
#   因为进程与进程之间是相互独立的,不像线程可以信息共享,要想让进程间相互信息共享,就要传参数(利用队列)

from multiprocessing import Process, Queue # 进程队列
import time
import queue   # 线程队列

#   主进程get数据,子进程put数据,
#   阻塞的原因是,这个队列的数据不共享,在子进程中队列和主进程的队列信息不共享,没有任何关系
#   结局方法,将队列作为参数传过去
#   注意在win上面不传参数不能运行,但是在linux下能运行
#  所以轻易不要用多进程,进行进程间通信时,会涉及到copy资源
# def fun(q):
# '''  队列
def fun():
    q.put(1)
    q.put(2)
    q.put(3)
    print(id(q))
    # print(q.empty())
# def fun2():
def fun2(q):
    while 1:
        # print(q.empty())
        data = q.get()
        print(data)
        print(id(q))

if __name__ == '__main__':
    # q = queue.Queue()  # 线程队列,
    q = Queue()          # 进程队列
    # p = Process(target=fun, args=(q,))  # 传参进行了copy占用资源
    p = Process(target=fun)
    p.start()
    p.join()
    p2 = Process(target=fun2)
    p2 = Process(target=fun2, args=(q,))
    p2.start()
    p2.join()

管道

from multiprocessing import Pipe
def fun(child_conn):
    child_conn.send([12, {"name":"yuan"}, 'hello'])
    print('主进程说:',child_conn.recv())
    print('子进程id',id(child_conn))

if __name__ == '__main__':
    prepare_conn, child_conn = Pipe()  # 双向管道
    print('id:',id(prepare_conn), id(child_conn))
    p = Process(target=fun, args=(child_conn,))
    p.start()
    print('子进程说:',prepare_conn.recv())
    prepare_conn.send('你好')
    print('主进程id',id(prepare_conn))
    p.join()

Managers

Queue和pipe只是实现了数据交互,并没有实现数据共享,即一个进程去更改另外一个进程的数据

Managers支持的数据类型
# A manager returned by Manager() will support types list, dict, Namespace,
# Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier,
#  Queue, Value and Array. For example：

from multiprocessing import Manager
# Queue和pipe只是实现了数据交互，并没实现数据共享，即一个进程去更改另一个进程的数据。
# 所以利用Managers 进行数据共享, 同样也需要传参

def fun(d, l, i):
    d[i]='字典'+ str(i)
    l.append(i)
    # pass
if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()  #  d = {} 创建一个字典
        l = manager.list(range(5)) # l = range(5)
        p_list = []
        for i in range(5):
            p = Process(target=fun,args=(d,l,i,))
            p.start()
            p_list.append(p)
        for p in p_list:
            p.join()

        print(d)
        print(l)

Managers模块

进程同步

#   进程锁, lock
from multiprocessing import Process, Lock
import time

def fun(i):
    lock.acquire()
    print('%d'%i)
    lock.release()
if __name__ == '__main__':
    lock = Lock()
    for i in range(10):
        p = Process(target=fun, args=(i,)).start()

进程同步,锁

进程池

进程池内部维护一个进程序列，当使用时，则去进程池中获取一个进程，
如果进程池序列中没有可供使用的进程，那么程序就会等待，直到进程池中有可用进程为止。
进程池中有两个方法：
apply   同步
apply_async   异步(常用)

在利用Python进行系统管理的时候，特别是同时操作多个文件目录，或者远程控制多台主机，并行操作可以节约大量的时间。多进程是实现并发的手段之一，需要注意的问题是：

很明显需要并发执行的任务通常要远大于核数
一个操作系统不可能无限开启进程，通常有几个核就开几个进程
进程开启过多，效率反而会下降（开启进程是需要占用系统资源的，而且开启多余核数目的进程也无法做到并行）

例如当被操作对象数目不大时，可以直接利用multiprocessing中的Process动态成生多个进程，十几个还好，但如果是上百个，上千个。。。手动的去限制进程数量却又太过繁琐，此时可以发挥进程池的功效。

我们就可以通过维护一个进程池来控制进程数目，比如httpd的进程模式，规定最小进程数和最大进程数...
ps：对于远程过程调用的高级应用程序而言，应该使用进程池，Pool可以提供指定数量的进程，供用户调用，当有新的请求提交到pool中时，如果池还没有满，那么就会创建一个新的进程用来执行该请求；但如果池中的进程数已经达到规定最大值，那么该请求就会等待，直到池中有进程结束，就重用进程池中的进程。

'''
进程池内部维护一个进程序列，当使用时，则去进程池中获取一个进程，
如果进程池序列中没有可供使用的进程，那么程序就会等待，直到进程池中有可用进程为止。
进程池中有两个方法：
apply   同步
apply_async   异步(常用)
'''
from multiprocessing import Process, Pool
import time, os

def foo(i):
    time.sleep(1)
    print('子进程:',os.getpid(),i)
    return 'Hello %s'%i
def bar(args):
    # print('',os.getpgid())
    print(args)
    print('主进程:',os.getpid())

if __name__ == '__main__':
    pool = Pool(5)   #  进程池中开5个进程, 有100个任务需要执行, 每次执行5个
    print('main :',os.getpid())
    for i in range(100):
        #   回调函数:某个动作或者 函数执行成功后,再去执行的函数
        # pool.apply_async(func=foo, args=(i,))
        #   bar作为回调函数, 每次执行一个进程后都会执行回调函数, 回调函数是主进程中调用的
        #   func中函数的返回值,传给回调函数做参数
        pool.apply_async(func=foo, args=(i,), callback=bar)
    #   close和join必须加,而且顺序固定
    pool.close()
    pool.join()

进程池

回调函数

　回掉函数：

需要回调函数的场景：进程池中任何一个任务一旦处理完了，就立即告知主进程：我好了额，你可以处理我的结果了。主进程则调用一个函数去处理该结果，该函数即回调函数

我们可以把耗时间（阻塞）的任务放到进程池中，然后指定回调函数（主进程负责执行），这样主进程在执行回调函数时就省去了I/O的过程，直接拿到的是任务的结果。

from multiprocessing import Pool
import requests
import json
import os

def get_page(url):
    print('<进程%s> get %s' %(os.getpid(),url))
    respone=requests.get(url)
    if respone.status_code == 200:
        return {'url':url,'text':respone.text}

def pasrse_page(res):
    print('<进程%s> parse %s' %(os.getpid(),res['url']))
    parse_res='url:<%s> size:[%s]\n' %(res['url'],len(res['text']))
    with open('db.txt','a') as f:
        f.write(parse_res)


if __name__ == '__main__':
    urls=[
        'https://www.baidu.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://help.github.com/',
        'http://www.sina.com.cn/'
    ]

    p=Pool(3)
    res_l=[]
    for url in urls:
        res=p.apply_async(get_page,args=(url,),callback=pasrse_page)
        res_l.append(res)

    p.close()
    p.join()
    print([res.get() for res in res_l]) #拿到的是get_page的结果,其实完全没必要拿该结果,该结果已经传给回调函数处理了

进程池应用

协程

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 18-5-26 上午10:19
# @Author  : LK
# @File    : 协程.py
# @Software: PyCharm
from greenlet import greenlet


#   优缺点
'''
def test1():
    print(12)
    g2.switch()  #  切换到g2 
    print(24)
    g2.switch()  #  切换到g2
def test2():
    print(56)
    g1.switch()  #  切换到g1
    print(65)
    g1.switch()

if __name__ ==  '__main__':
    g1 = greenlet(test1)
    g2 = greenlet(test2)
    g2.switch()
    
'''
'''
import gevent

import requests,time


start=time.time()

def f(url):
    print('GET: %s' % url)
    resp =requests.get(url)
    data = resp.text
    f = open('new.html', 'w', encoding='utf-8')
    f.write(data)
    f.close()
    print('%d bytes received from %s.' % (len(data), url))

#   这就相当于一个协程, 每次进行读写操作阻塞时,都会切换cpu
gevent.joinall([

        gevent.spawn(f, 'https://www.python.org/'),
        gevent.spawn(f, 'https://www.yahoo.com/'),
        gevent.spawn(f, 'https://www.baidu.com/'),
        gevent.spawn(f, 'https://www.sina.com.cn/'),

])

# f('https://www.python.org/')
#
# f('https://www.yahoo.com/')
#
# f('https://baidu.com/')
#
# f('https://www.sina.com.cn/')

print("cost time:",time.time()-start)

'''
# import threading, gevent
# def fun():
#     a = input('>>>')
#     print(a)
# def fun2():
#     print('\nhello')
# # gevent.joinall([
# #
# #         gevent.spawn(fun()),
# #         gevent.spawn(fun2()),
# # ])
# if __name__ == '__main__':
#     # t1 = threading.Thread(target=fun).start()
#     # t2 = threading.Thread(target=fun2).start()
#     gevent.joinall([
# 
#             gevent.spawn(fun),
#             gevent.spawn(fun2),
#     ])

未完整

协程参考:

http://www.cnblogs.com/linhaifeng/articles/7429894.html

协程,协作式----- 非抢占式的程序

自己调动

Yield(协程)

用户态的切换呢

关键点在哪一步切换

协程主要解决的是io操作

协程:本质上就是一个线程

协程的优势:

1:没有切换的消耗

2:没有锁的概念

但是不能用多核,所以用多进程+协程,是一个很好的解决并发的方案

驱动事件

@font-face{ font-family:"Times New Roman"; } @font-face{ font-family:"宋体"; } p.MsoNormal{ mso-style-name:正文; mso-style-parent:""; margin:0pt; margin-bottom:.0001pt; mso-pagination:none; text-align:left; font-family:'Times New Roman'; font-size:12.0000pt; mso-font-kerning:1.0000pt; } span.msoIns{ mso-style-type:export-only; mso-style-name:""; text-decoration:underline; text-underline:single; color:blue; } span.msoDel{ mso-style-type:export-only; mso-style-name:""; text-decoration:line-through; color:red; } @page{mso-page-border-surround-header:no; mso-page-border-surround-footer:no;}@page Section0{ } div.Section0{page:Section0;}

Io多路复用

Select 触发方式

1:水平处罚

2:边缘触发

3:IO多路复用优势:同时可以监听多个链接

Select 负责监听

IO多路复用:

Select 那个平台都有

Poll

Epoll 效率最高

posted @ 2018-05-27 16:33 Lucky& 阅读(1890) 评论(0) 编辑收藏举报

刷新页面返回顶部

Lucky&