Notes on using multiprocessing inside a class

Problem: calling multiprocessing.Pool from inside a class, with an instance method as the task, fails with a pickling error.

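Cause: every Pool method hands its tasks to the worker processes through a queue, so multiprocessing has to serialize (pickle) whatever it sends between processes. pickle stores a function as a reference to its name, so a function can only be serialized when it lives at the top level of a module. A method bound to a class instance cannot be serialized this way (this is the Python 2 behaviour the code below targets), which produces the exception above.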
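
The "pickled by name" rule can be checked without starting any processes. The snippet below is a minimal sketch, not the code path multiprocessing itself runs: `can_pickle` is a hypothetical helper, and a lambda stands in for "a callable with no importable name", which is the same category a bound method falls into under Python 2.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# math.sqrt is reachable as an attribute of an importable module,
# so pickle can store it as a reference to that name.
named_ok = can_pickle(math.sqrt)

# A lambda has no importable name, so pickle has nothing to refer to.
lambda_ok = can_pickle(lambda x: x * x)

print(named_ok)
print(lambda_ok)
```

The first check succeeds and the second fails, which is the same split that separates a top-level function from a bound method in the problem described above.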

Solutions:

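Method 1: use threads instead of processes
Threads share the parent's memory, so the task never has to be pickled. The trade-off is that CPU-bound tasks are still limited by the GIL.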

# coding: utf8
from multiprocessing.pool import ThreadPool as Pool


class MyTask(object):
    def task(self, x):
        return x*x

    def run(self):
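        pool = Pool(3)  # thread pool: workers share memory, nothing is pickled

        a = [1, 2, 3]
        ret = pool.map(self.task, a)
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker threads to exit
        print(ret)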


if __name__ == '__main__':
    t = MyTask()
    t.run()
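
Why the thread version avoids the error can be shown with shared state. This is a small sketch under stated assumptions: `Collector` and its `seen` list are hypothetical names, and the example relies on `list.append` being safe to call from several threads in CPython. The worker threads mutate the very object the caller holds, so no copy (and therefore no pickling) is involved.

```python
from multiprocessing.pool import ThreadPool


class Collector(object):  # hypothetical class for illustration
    def __init__(self):
        self.seen = []  # shared with the worker threads, never copied

    def task(self, x):
        self.seen.append(x)  # mutates the same object the caller holds
        return x * x

    def run(self, values):
        pool = ThreadPool(3)
        try:
            # map() returns results in the order of the inputs.
            return pool.map(self.task, values)
        finally:
            pool.close()
            pool.join()


collector = Collector()
squares = collector.run([1, 2, 3])
print(squares)
print(sorted(collector.seen))
```

With a process pool, each worker would receive its own copy of the instance, so appends made in `task` would not be visible to the caller.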

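Method 2: use copy_reg to work around the exception above
This is Python 2 code: copy_reg was renamed copyreg in Python 3, and im_self, im_class and im_func are Python 2 attribute names.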

# coding: utf8
import multiprocessing
import types
import copy_reg


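def _pickle_method(m):
    # Reduce a method to (getattr, (owner, name)) so pickle can rebuild it by name.
    if m.im_self is None:
        # Unbound method: look the name up on the class.
        return getattr, (m.im_class, m.im_func.func_name)
    else:
        # Bound method: look the name up on the instance.
        return getattr, (m.im_self, m.im_func.func_name)


# Tell pickle to use _pickle_method for every method object.
copy_reg.pickle(types.MethodType, _pickle_method)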


class MyTask(object):
    def __init__(self):
        self.__result = []

    def task(self, x):
        return x * x

    def result_collector(self, result):
        self.__result.append(result)

    def run(self):
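        pool = multiprocessing.Pool(processes=3)

        a = [1, 2, 3]
        # self.task is pickled through the reducer registered with copy_reg
        ret = pool.map(self.task, a)
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit
        print(ret)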


if __name__ == '__main__':
    t = MyTask()
    t.run()
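
What the reducer in Method 2 hands to pickle can be inspected directly, without a pool. The following is a sketch, not the original author's code: `reduce_method` and `Task` are hypothetical names, only the bound-method branch is covered, and it uses `__self__` and `__func__`, the spellings of `im_self` and `im_func` that work on both Python 2.6+ and Python 3.

```python
import types


def reduce_method(method):
    """Reducer for bound methods: return (callable, args) as pickle expects."""
    # __self__ and __func__ are the portable spellings of im_self and im_func.
    return getattr, (method.__self__, method.__func__.__name__)


class Task(object):  # hypothetical class for illustration
    def square(self, x):
        return x * x


task = Task()
bound = task.square
assert isinstance(bound, types.MethodType)

# What pickle would store for the method: a callable plus its arguments.
rebuild, args = reduce_method(bound)

# What unpickling does with that pair: call rebuild(*args).
restored = rebuild(*args)
result = restored(3)
print(result)
```

In a real pool, pickle also serializes the instance held in `args`, so the class must be importable in the worker process, and the worker operates on a copy of the instance rather than the original.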

Method 3: switch libraries (dill or pathos)
dill or pathos.multiprocessing: use pathos.multiprocessing instead of multiprocessing. pathos.multiprocessing is a fork of multiprocessing that uses dill. dill can serialize almost anything in Python, so you can send far more kinds of objects between processes.
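
The claim about dill can be tried on a callable that the standard pickle module rejects. This is a minimal sketch that assumes dill is installed (it is a third-party package); `roundtrip` and `square` are hypothetical names, and the example only exercises dill's serialization, not a pathos pool.

```python
try:
    import dill  # third-party; assumed installed, e.g. via `pip install dill`
except ImportError:
    dill = None  # the demonstration is skipped when dill is unavailable


def roundtrip(func):
    """Serialize func with dill and load it back again."""
    return dill.loads(dill.dumps(func))


# A lambda is something the standard pickle module refuses to serialize.
square = lambda x: x * x

if dill is not None:
    restored = roundtrip(square)
    print(restored(4))
```

Because pathos.multiprocessing serializes with dill, the same property is what lets it accept tasks that the standard library pool rejects.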

posted @ 2022-03-25 14:37  27岁的太阳