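Notes on using multiprocessing inside a class
Problem: using multiprocessing.Pool inside a class (for example pool.map(self.task, ...)) fails with a pickling error.
Cause: the Pool methods hand each task to the worker processes through queues, and multiprocessing has to serialize (pickle) everything it passes between processes. In Python 2, pickle stores a function as a reference to its module and name, so only functions defined at the top level of a module can be serialized. A method bound to a class instance cannot, which raises the exception above.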
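The cause can be reproduced without a pool at all. The sketch below is only an illustration and assumes Python 3, where bound methods have since become picklable, so a nested function is used to trigger the same kind of failure. The names top_level, make_nested and can_pickle are made up for this example.

```python
import pickle


def top_level(x):
    # Defined at module level, so pickle can store it as "module + name".
    return x * x


def make_nested():
    def nested(x):
        # Defined inside another function: there is no module-level name
        # that pickle could use to find it again.
        return x * x
    return nested


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


top_ok = can_pickle(top_level)
nested_ok = can_pickle(make_nested())
print(top_ok, nested_ok)
```

Anything handed to a process pool goes through the same serialization step, which is why the error shows up as soon as the callable cannot be found by name.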
Solutions:
Method 1: use threads instead of processes
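# coding: utf8
# ThreadPool offers the same API as Pool but runs the tasks in threads
# of the current process, so nothing has to be pickled.
from multiprocessing.pool import ThreadPool as Pool


class MyTask(object):
    def task(self, x):
        return x * x

    def run(self):
        pool = Pool(3)
        a = [1, 2, 3]
        ret = pool.map(self.task, a)
        pool.close()
        pool.join()
        print(ret)


if __name__ == '__main__':
    t = MyTask()
    t.run()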
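As a rough check of why threads sidestep the problem, the sketch below (assuming Python 3; the run signature and the lambda are additions for illustration) passes both a bound method and a lambda to a ThreadPool. Neither has to be pickled, because the worker threads share the memory of the calling process.

```python
from multiprocessing.pool import ThreadPool


class MyTask(object):
    def task(self, x):
        return x * x

    def run(self, values):
        pool = ThreadPool(3)
        try:
            # A bound method works: no serialization is involved.
            squares = pool.map(self.task, values)
            # Even a lambda works, although pickle could not serialize it.
            doubled = pool.map(lambda x: 2 * x, values)
        finally:
            pool.close()
            pool.join()
        return squares, doubled


squares, doubled = MyTask().run([1, 2, 3])
print(squares, doubled)
```

The trade-off is that in CPython the global interpreter lock keeps threads from running pure-Python, CPU-bound code in parallel, so this method fits I/O-bound tasks best.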
Method 2: use copy_reg (Python 2) to work around the exception above.
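# coding: utf8
# Python 2 only: copy_reg, im_self, im_class and im_func do not exist in Python 3.
import multiprocessing
import types
import copy_reg


def _pickle_method(m):
    # Reduce a method to "getattr(owner, name)" so pickle can rebuild it.
    if m.im_self is None:
        # Unbound method: look the name up on the class.
        return getattr, (m.im_class, m.im_func.func_name)
    else:
        # Bound method: look the name up on the instance.
        return getattr, (m.im_self, m.im_func.func_name)


# Register the reducer for every method object.
copy_reg.pickle(types.MethodType, _pickle_method)


class MyTask(object):
    def __init__(self):
        self.__result = []

    def task(self, x):
        return x * x

    def result_collector(self, result):
        self.__result.append(result)

    def run(self):
        pool = multiprocessing.Pool(processes=3)
        a = [1, 2, 3]
        ret = pool.map(self.task, a)
        pool.close()
        pool.join()
        print(ret)


if __name__ == '__main__':
    t = MyTask()
    t.run()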
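To make the reducer less opaque, here is a small sketch of the same idea written with the Python 3 attribute names (__self__ and __func__ in place of im_self and im_func). The function reduce_method is a hypothetical name for this illustration. It also shows that Python 3's pickle can round-trip a bound method on its own, so the registration should not be needed there.

```python
import pickle


class MyTask(object):
    def task(self, x):
        return x * x


def reduce_method(m):
    # Same shape as _pickle_method: a callable plus the arguments that
    # rebuild the method, i.e. getattr(instance, "task").
    return getattr, (m.__self__, m.__func__.__name__)


t = MyTask()
func, (owner, name) = reduce_method(t.task)

# Rebuilding by hand is just a call to getattr on the instance.
rebuilt = func(owner, name)

# In Python 3 the standard pickle module handles the bound method itself.
clone = pickle.loads(pickle.dumps(t.task))

print(name, rebuilt(4), clone(4))
```

Note that the instance travels with the method, so everything stored on the instance has to be picklable as well.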
Method 3: switch libraries (dill or pathos)
dill or pathos.multiprocessing: use pathos.multiprocessing instead of multiprocessing. pathos.multiprocessing is a fork of multiprocessing that uses dill. dill can serialize almost anything in Python, so you are able to send a lot more around in parallel.
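A minimal sketch of what dill adds, assuming it has been installed separately (for example with pip install dill). The import is guarded so the snippet still runs where the package is missing. The lambda square is only an illustration of an object the standard pickle module rejects.

```python
try:
    import dill  # third-party package, assumed to be installed
except ImportError:
    dill = None

# A lambda has no importable name, so the standard pickle module refuses it.
square = lambda x: x * x

restored = None
if dill is not None:
    # dill serializes the function body itself, not just a reference to it.
    restored = dill.loads(dill.dumps(square))
    print(restored(5))
```

pathos builds its pools on this serializer. In its documentation the pool class is importable as pathos.multiprocessing.ProcessingPool and exposes a map method similar to the one on multiprocessing.Pool.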
This article comes from cnblogs (博客园), author: 27岁的太阳. When reposting, please credit the original link: https://www.cnblogs.com/isxjj/p/16054566.html