Cannot re-initialize CUDA in forked subprocess.
"Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/3-tmp/data_custom.py", line 106, in __getitem__
flag_same, type_classes, tra, rank_, img, bev_feature_x1 = self.get_from_idx(idx)
File "/3-tmp/data_custom.py", line 65, in get_from_idx
bev_feature_x1 = torch.load(path_bev_feature_x1) # [2, 64, 200, 96]
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/serialization.py", line 853, in _load
result = unpickler.load()
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/serialization.py", line 845, in persistent_load
load_tensor(data_type, size, key, _maybe_decode_ascii(location))
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/serialization.py", line 834, in load_tensor
loaded_storages[key] = restore_location(storage, location)
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/serialization.py", line 175, in default_restore_location
result = fn(storage, location)
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/serialization.py", line 157, in _cuda_deserialize
return obj.cuda(device)
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/_utils.py", line 71, in _cuda
with torch.cuda.device(device):
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/cuda/__init__.py", line 225, in __enter__
self.prev_idx = torch._C._cuda_getDevice()
File "/anconda_install/envs/pytorch1.7/lib/python3.7/site-packages/torch/cuda/__init__.py", line 164, in _lazy_init
"Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Solution 1:
Add the following at the top of the script:
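import torch
from torch.multiprocessing import set_start_method

try:
    # Must run before any DataLoader worker process is created.
    set_start_method('spawn')
except RuntimeError:
    # The start method has already been set; leave it unchanged.
    pass
torch.cuda.empty_cache()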
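
A note on why the try/except is needed: torch.multiprocessing is a wrapper around Python's standard multiprocessing module, where the start method can be fixed only once per process, and a second call raises RuntimeError. The sketch below reproduces that behaviour with the standard library alone. The helper name ensure_spawn is hypothetical and is not part of PyTorch or of the original fix.

```python
import multiprocessing as mp


def ensure_spawn() -> str:
    """Try to select the 'spawn' start method and return the active one."""
    try:
        mp.set_start_method("spawn")
    except RuntimeError:
        # The start method was already fixed earlier in this process,
        # so the existing choice is kept.
        pass
    return mp.get_start_method()


if __name__ == "__main__":
    # Calling the helper twice is safe: the second call hits the except branch.
    print(ensure_spawn())
    print(ensure_spawn())
```

If some other code has already chosen 'fork', the helper leaves that choice in place, so it is worth checking the returned value before creating any workers.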
Solution 2:
Pass multiprocessing_context='spawn' when creating the DataLoader:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=batch_size,
    shuffle=True,
    drop_last=True,
    num_workers=num_workers,
    multiprocessing_context='spawn',
)
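
Unlike Solution 1, this approach changes the start method for a single DataLoader only, not for the whole process. As a rough standard-library illustration, multiprocessing.get_context returns a context object bound to one start method without touching the global default. The variable name spawn_ctx is illustrative only. The DataLoader documentation lists both a string and a context object as accepted values for multiprocessing_context, so treat passing the object as an assumption to verify against your PyTorch version.

```python
import multiprocessing as mp

# A context object tied to 'spawn'; the process-wide default is not modified.
spawn_ctx = mp.get_context("spawn")

if __name__ == "__main__":
    print(spawn_ctx.get_start_method())
```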