A Study of the multiprocessing Module in the Python Standard Library (1)

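multiprocessing is a standard-library package for process-based concurrency whose API closely mirrors that of the threading module.
I had used this library a little before but never looked into it closely. With some free time on my hands I dug in this time, mostly to clear up my own questions.
Today I'll start with the apply_async and map methods. Supposedly these are the two methods that hand the processes in a pool out to the functions you submit, and I wanted to verify that.
Here is how the official documentation describes them: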
apply_async(func[, args[, kwds[, callback[, error_callback]]]])
A variant of the apply() method which returns a result object.

If callback is specified then it should be a callable which accepts a single argument. When the result becomes ready callback is applied to it, that is unless the call failed, in which case the error_callback is applied instead.

If error_callback is specified then it should be a callable which accepts a single argument. If the target function fails, then the error_callback is called with the exception instance.

Callbacks should complete immediately since otherwise the thread which handles the results will get blocked.
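
To make the callback behaviour described above concrete, here is a minimal sketch. The names square, always_fails and run_with_callbacks are made up for this illustration; only apply_async, callback and error_callback come from the documentation quoted above.

```python
from multiprocessing import Pool


def square(x):
    return x * x


def always_fails(x):
    # Hypothetical task that always raises, to trigger error_callback.
    raise ValueError("bad input: %r" % (x,))


def run_with_callbacks(pool):
    """Submit one succeeding and one failing task; return what the callbacks saw."""
    results, errors = [], []
    ok = pool.apply_async(square, (3,),
                          callback=results.append, error_callback=errors.append)
    bad = pool.apply_async(always_fails, (3,),
                           callback=results.append, error_callback=errors.append)
    ok.wait()   # block until the first task has finished
    bad.wait()  # block until the second task has finished (it failed)
    return results, errors


if __name__ == '__main__':
    pool = Pool(2)
    results, errors = run_with_callbacks(pool)
    pool.close()
    pool.join()
    print(results)                              # value passed to callback
    print([type(e).__name__ for e in errors])   # exception passed to error_callback
```

The callbacks run in the parent process, which is why plain lists are enough to collect what they receive.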

map(func, iterable[, chunksize])
A parallel equivalent of the map() built-in function (it supports only one iterable argument though). It blocks until the result is ready.

This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
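
As a rough picture of what "chops the iterable into a number of chunks" means, the sketch below uses a hypothetical helper, split_into_chunks, that I wrote for illustration. It is not the code Pool uses internally; it only shows the idea, next to a real map call with an explicit chunksize.

```python
from itertools import islice
from multiprocessing import Pool


def split_into_chunks(iterable, chunksize):
    """Yield successive lists of at most chunksize items (illustration only)."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


def double(x):
    return 2 * x


if __name__ == '__main__':
    # Ten items with chunksize 4 give three chunks: 4 + 4 + 2 items.
    print(list(split_into_chunks(range(10), 4)))
    with Pool(2) as pool:
        # Each chunk is submitted as one task; results come back in input order.
        print(pool.map(double, range(10), chunksize=4))
```

A larger chunksize means fewer, bigger tasks, which tends to help when each call is very cheap.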

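Pool provides a fixed number of worker processes. The workers are started when the Pool is created, and every submitted request goes into a task queue. If a worker is idle it picks the request up right away; if all workers are busy, the request waits in the queue until one of them finishes its current task. By default the same worker processes are reused for successive tasks, rather than a new process being created for each request.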
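
One way to check how requests map onto processes is to count the distinct process IDs that end up running the tasks. This is only a sketch; worker_pid and distinct_workers are names invented for the example.

```python
import os
from multiprocessing import Pool


def worker_pid(_):
    # Report which process executed this task.
    return os.getpid()


def distinct_workers(pool, n_tasks):
    """Run n_tasks trivial tasks and count how many different processes ran them."""
    pids = pool.map(worker_pid, range(n_tasks), chunksize=1)
    return len(set(pids))


if __name__ == '__main__':
    with Pool(3) as pool:
        # Twelve tasks on a pool of three: the count should not exceed 3.
        print(distinct_workers(pool, 12))
```

With the default settings the count stays at or below the pool size no matter how many tasks are submitted, since the workers are reused.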

Now for the test program:

from multiprocessing import Pool
import time
import os


def func(msg):
    print('msg: %s %s' % (msg, os.getpid()))
    time.sleep(3)
    print("end")


if __name__ == '__main__':
    pool = Pool(4)
    for i in range(4):
        msg = 'hello %d' % (i)
        pool.apply_async(func, (msg, ))
    # pool.map(func, range(4))
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()  # no more tasks can be submitted after close()
    pool.join()   # wait for the workers to exit; close() must come first, or join() raises ValueError
    print("Sub-process(es) done.")

Output:

Next, uncomment the map line and comment out the apply_async loop.

Now the program and its output when the pool has fewer processes than there are tasks:

from multiprocessing import Pool
import time
import os


def func(msg):
    print('msg: %s %s' % (msg, os.getpid()))
    time.sleep(3)
    print("end")


if __name__ == '__main__':
    pool = Pool(3)
    # for i in range(4):
    #     msg = 'hello %d' % (i)
    #     pool.apply_async(func, (msg, ))
    pool.map(func, range(4))
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()  # no more tasks can be submitted after close()
    pool.join()   # wait for the workers to exit; close() must come first, or join() raises ValueError
    print("Sub-process(es) done.")

Output:

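As the output shows, when the pool has at least as many processes as there are calls to run, everything goes smoothly: with tasks this long, every call starts at once in its own process, which is just what you would expect. When the pool has fewer processes than calls, the extra calls wait in the queue until a worker is free, so one worker ends up running two or more calls one after another. No process is actually blocked; the surplus calls simply start later.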
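
The effect of having more tasks than workers also shows up in the total running time. Below is a small sketch that measures it; rounds_needed, nap and timed_map are hypothetical helpers for this example, and the estimate assumes every task takes the same time.

```python
import math
import time
from multiprocessing import Pool


def rounds_needed(n_tasks, n_workers):
    """Number of back-to-back batches when all tasks take equally long."""
    return math.ceil(n_tasks / n_workers)


def nap(seconds):
    time.sleep(seconds)
    return seconds


def timed_map(pool, n_tasks, nap_time):
    """Run n_tasks naps on the pool and return the elapsed wall-clock time."""
    start = time.perf_counter()
    pool.map(nap, [nap_time] * n_tasks, chunksize=1)
    return time.perf_counter() - start


if __name__ == '__main__':
    n_tasks, n_workers, nap_time = 4, 3, 0.5
    with Pool(n_workers) as pool:
        elapsed = timed_map(pool, n_tasks, nap_time)
    # Four tasks on three workers need two rounds, so roughly 2 * nap_time.
    print('lower bound %.1f s, measured %.2f s'
          % (rounds_needed(n_tasks, n_workers) * nap_time, elapsed))
```

With four workers instead of three the same four tasks fit into a single round, so the measured time should drop to roughly one nap.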
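In addition, the second argument of apply_async is the tuple of arguments for a single call. Each apply_async call submits one task to the pool and returns a result object immediately (note that it is asynchronous), so running the function several times takes a for or while loop. The second argument of map is an iterable, and map submits one call per item, so no explicit loop is needed; unlike apply_async, map blocks until all the results are ready. Keep in mind that what both methods do is hand the submitted calls to the pool's worker processes in turn.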
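
The difference between the two calling styles can be put side by side. This is a sketch under my own naming (cube, with_apply_async, with_map); the test programs above only print, while this one collects return values to make the comparison easier.

```python
from multiprocessing import Pool


def cube(x):
    return x ** 3


def with_apply_async(pool, values):
    # One apply_async call per value; each returns a result object at once.
    pending = [pool.apply_async(cube, (v,)) for v in values]
    # get() blocks until that particular result is ready.
    return [p.get(timeout=30) for p in pending]


def with_map(pool, values):
    # A single call; map blocks and returns the results in input order.
    return pool.map(cube, values)


if __name__ == '__main__':
    with Pool(3) as pool:
        print(with_apply_async(pool, range(4)))
        print(with_map(pool, range(4)))
```

Both functions return the same list; the apply_async version just leaves the looping and the waiting to the caller.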

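On a side note, reading the all-English multiprocessing documentation left me confused and in pain, yet it was fun too, and I have to say it does wonders for my English.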

posted @ 2017-07-16 21:30 又见阿郎
