Using multiprocessing.pool with multiple arguments: 【map_async】 and 【apply_async】

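  1. Differences between 【apply】, 【map】, 【apply_async】 and 【map_async】
【map】 results come back in the same order as the inputs; the calling process blocks until all of them are ready
【map_async】 results come back in the same order as the inputs; the calling process does not block
【apply】 runs a single call in a worker; the calling process blocks until it returns
【apply_async】 the calling process does not block and the worker runs asynchronously; results from several calls may arrive in any order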
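
As a rough illustration of the four methods side by side, here is a minimal sketch. The names `square` and `compare_pool_methods` are made up for this example and are not from the original post.

```python
from multiprocessing import Pool


def square(x):
    return x * x


def compare_pool_methods(pool):
    """Run square() through the four Pool methods and collect what each returns."""
    results = {}
    # map: blocks until every result is ready; the list follows the input order.
    results["map"] = pool.map(square, range(5))
    # map_async: returns an AsyncResult immediately; get() blocks until done.
    pending_map = pool.map_async(square, range(5))
    results["map_async"] = pending_map.get()
    # apply: one call in one worker; blocks until that call returns.
    results["apply"] = pool.apply(square, (3,))
    # apply_async: one call, returns an AsyncResult immediately.
    pending_apply = pool.apply_async(square, (3,))
    results["apply_async"] = pending_apply.get()
    return results


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(compare_pool_methods(pool))
```

The two `*_async` variants differ from the blocking ones only in when the caller waits: the wait moves from the call itself to `get()` (or `wait()`).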



https://stackoverflow.com/questions/8533318/multiprocessing-pool-when-to-use-apply-apply-async-or-map


Notice, unlike pool.map, the order of the results may not correspond to the order in which the pool.apply_async calls were made.

So, if you need to run a function in a separate process, but want the current process to block until that function returns, use Pool.apply. 
Like Pool.apply, Pool.map blocks until the complete result is returned.

If you want the Pool of worker processes to perform many function calls asynchronously, use Pool.apply_async.
The order of the results is not guaranteed to be the same as the order of the calls to Pool.apply_async.

Notice also that you could call a number of different functions with Pool.apply_async (not all calls need to use the same function).

In contrast, Pool.map applies the same function to many arguments. However, unlike Pool.apply_async, the results are returned in an order corresponding to the order of the arguments.




import multiprocessing as mp
import time

def foo_pool(x):
    time.sleep(2)
    return x*x

result_list = []
def log_result(result):
    # This is called whenever foo_pool(i) returns a result.
    # result_list is modified only by the main process, not the pool workers.
    result_list.append(result)

def apply_async_with_callback():
    pool = mp.Pool()
    for i in range(10):
        pool.apply_async(foo_pool, args = (i, ), callback = log_result)
    pool.close()
    pool.join()
    print(result_list)

if __name__ == '__main__':
    apply_async_with_callback()

may yield a result such as
[1, 0, 4, 9, 25, 16, 49, 36, 81, 64]
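
If the results of several apply_async calls are needed in submission order, one option is to keep the AsyncResult objects and call get() on them in the order they were created, instead of using a callback. This is a minimal sketch; `ordered_apply_async` is a made-up helper name, and the sleep is dropped to keep the example fast.

```python
import multiprocessing as mp


def foo_pool(x):
    return x * x


def ordered_apply_async(pool, values):
    """Submit every call without blocking, then collect results in submission order."""
    # Each apply_async call returns an AsyncResult handle immediately.
    handles = [pool.apply_async(foo_pool, args=(v,)) for v in values]
    # get() blocks per handle; walking the list in order restores the input order,
    # no matter which worker finished first.
    return [h.get() for h in handles]


if __name__ == "__main__":
    with mp.Pool() as pool:
        print(ordered_apply_async(pool, range(10)))
```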
  1. Understanding multiple arguments
How *add is used
Without the *, the call fails with: TypeError: action() takes 1 positional argument but 3 were given

import threading
# Function run by the thread; *add collects any number of positional arguments
def action(*add):
    for arc in add:
        # current_thread().name is the name of the thread running this code
        # (getName() is deprecated since Python 3.10)
        print(threading.current_thread().name + " " + arc)
# Arguments passed to the thread function
my_tuple = ("http://c.biancheng.net/python/",
            "http://c.biancheng.net/shell/",
            "http://c.biancheng.net/java/")
# Create the thread
thread = threading.Thread(target=action, args=my_tuple)
# Start the thread
thread.start()
# Wait for the thread to finish before continuing
thread.join()
# The main thread then runs the following
for i in range(5):
    print(threading.current_thread().name)
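
The TypeError comes from how the args tuple is used: Thread (and likewise Pool.apply_async) calls the target as target(*args), so a three-element tuple becomes three positional arguments. The sketch below reproduces that without starting any thread; `action_star`, `action_plain` and `describe_call` are made-up names for illustration.

```python
def action_star(*add):
    # *add gathers all positional arguments into a tuple.
    return list(add)


def action_plain(add):
    # Accepts exactly one positional argument.
    return [add]


def describe_call(func, args):
    """Call func(*args) the way Thread does, returning the error text on failure."""
    try:
        return func(*args)
    except TypeError as exc:
        return str(exc)


urls = ("python", "shell", "java")
print(describe_call(action_star, urls))      # three arguments collected by *add
print(describe_call(action_plain, urls))     # TypeError text: 1 expected, 3 given
print(describe_call(action_plain, (urls,)))  # tuple wrapped so it arrives as one argument
```

Wrapping the tuple in a one-element tuple, as in the last call, is the alternative to changing the function signature.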

  1. map, map_async
# f takes two arguments, but multiprocessing.pool.map passes only a single argument to each call

from multiprocessing import Pool
import time

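# 1 This approach does not work (most likely because the nested inner function cannot be pickled for the workers), but the decorator idea is a good one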
# def my_function_helper(func):
#     def inner(tup):
#         print('start wrapper is {}'.format(tup))
#         r = func(*tup)
#         print('end wrapper')
#         return r
#     return inner
#
#
# @my_function_helper
# def f(x,y):
#     print (x*x)
#     print(y)


# 2 This approach works and leaves the original function unchanged
def f(x,y):
    print (x*x,y)


def my_function_helper(tup):
    return f(*tup)


if __name__ == '__main__':
    pool = Pool(processes=4)

    # Single-argument case
    # pool.map(f, [i for i in range(10)])
    # r = pool.map_async(f, [i for i in range(10)])

    # Two-argument case
    pool.map(my_function_helper, [(i,2) for i in range(10)])
    r = pool.map_async(my_function_helper, [(i,2) for i in range(10)])
    # DO STUFF
    print ('HERE')
    print ('MORE')
    r.wait()
    print ('DONE')
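
The decorator idea from approach 1 can probably be rescued with functools.wraps. Pool pickles functions by their module-level name, and a plain nested wrapper has no such name. With wraps, the wrapper takes over the name of the function it replaces, so looking that name up finds the wrapper itself. This is a sketch under that assumption, not the original author's code; `unpack_args` is a made-up decorator name, and f returns a value here instead of printing.

```python
import functools
from multiprocessing import Pool


def unpack_args(func):
    """Hypothetical decorator: lets func be called with one tuple of arguments."""
    @functools.wraps(func)  # copies __name__/__qualname__/__module__ onto inner
    def inner(tup):
        return func(*tup)
    return inner


@unpack_args
def f(x, y):
    return x * x + y


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # Each tuple is delivered as one argument and unpacked inside inner().
        print(pool.map(f, [(i, 2) for i in range(5)]))
```

The trade-off is that f can no longer be called as f(x, y) elsewhere in the module, which is why the separate helper function of approach 2 is often the simpler choice.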

  1. starmap (synchronous, like map; see the example)

from multiprocessing import Pool, cpu_count
import time


build_links = [1,2,3,4,5,6,7,8]
auth = 'auth'


def test(url, auth):
    time.sleep(2)
    with open('1.txt',mode='w') as f:
        f.write('test')
    print(url, auth)

def test2(url, auth):
    time.sleep(6)
    with open('2.txt',mode='w') as f:
        f.write('test2')
    print(url, auth)

if __name__ == '__main__':  # Required, otherwise the script fails; the guard prevents child processes from being created recursively.
    with Pool(processes=int(cpu_count() - 1) or 1) as pool:
        pool.starmap(test, [(link, auth) for link in build_links])
        print('This runs only after the starmap call above has finished!')
        pool.starmap(test2, [(link, auth) for link in build_links])
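
Pool also provides starmap_async, the non-blocking counterpart of starmap: it unpacks each tuple into positional arguments and returns an AsyncResult instead of waiting. A minimal sketch follows; `label` and `run_starmap_async` are made-up names, and the sleep and file writes are left out.

```python
from multiprocessing import Pool


def label(url, auth):
    return f"{url}:{auth}"


def run_starmap_async(pool, links, auth):
    """Submit all (link, auth) pairs without blocking, then wait for the results."""
    pending = pool.starmap_async(label, [(link, auth) for link in links])
    # The caller could do unrelated work here while the workers run.
    # get() blocks until all results are ready; they follow the input order.
    return pending.get(timeout=30)


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(run_starmap_async(pool, [1, 2, 3, 4], "auth"))
```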

  1. Asynchronous calls with multiple arguments

from multiprocessing import Pool, cpu_count
import time

build_links = [1,2,3,4,5,6,7,8]
auth = 'auth'


def test(url, auth):
    time.sleep(2)
    with open('1.txt',mode='w') as f:
        f.write('test')
    print(url, auth)

def test2(url, auth):
    time.sleep(6)
    with open('2.txt',mode='w') as f:
        f.write('test2')
    print(url, auth)


def my_test_helper(tup):
    return test(*tup)


def my_test2_helper(tup):
    return test2(*tup)


if __name__ == '__main__':  # Required, otherwise the script fails; the guard prevents child processes from being created recursively.
    with Pool(processes=int(cpu_count() - 1) or 1) as pool:
        print('1')
        r = pool.map_async(my_test_helper, [(link, auth) for link in build_links])
        r2 = pool.map_async(my_test2_helper, [(link, auth) for link in build_links])
        print('2')
        r.wait()
        print('MORE')
        r2.wait()
        print('DONE')
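
When one of the arguments is the same for every call, as auth is here, functools.partial may be a lighter alternative to a helper function per target: it binds the constant argument and leaves a one-argument callable that map_async accepts directly. This is a sketch under that assumption; `fetch` and `run_with_partial` are made-up names standing in for test and the main block.

```python
from functools import partial
from multiprocessing import Pool


def fetch(url, auth):
    return (url, auth)


def run_with_partial(pool, links, auth):
    """Bind the constant auth argument, then map over the links without blocking."""
    bound = partial(fetch, auth=auth)  # picklable because fetch is a module-level function
    pending = pool.map_async(bound, links)
    # wait() or get() can be called later; get() also returns the ordered results.
    return pending.get()


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(run_with_partial(pool, [1, 2, 3, 4], "auth"))
```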


posted @ 2020-12-28 14:04  该显示昵称已被使用了