Issues compiling PyTorch on Jetson TX1

1. c++: internal compiler error: Killed (program cc1plus)

Reason: the board ran out of memory during compilation and the kernel killed cc1plus. Add a swap file, then rerun the build.
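Before restarting a long build it can help to check how much RAM and swap the board actually has. The following is a minimal sketch, not part of the original notes: the helper names `parse_meminfo` and `swap_shortfall_kb` are hypothetical, and the 8 GiB RAM-plus-swap target is an assumed figure, so adjust it to your own build.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_shortfall_kb(info, wanted_total_kb):
    """Return how many extra kB of swap are needed to reach wanted_total_kb of RAM + swap."""
    have = info.get("MemTotal", 0) + info.get("SwapTotal", 0)
    return max(0, wanted_total_kb - have)


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    # Assumed target: 8 GiB of RAM + swap (not a figure from the original notes).
    need = swap_shortfall_kb(info, 8 * 1024 * 1024)
    print("MemTotal kB:", info.get("MemTotal"))
    print("SwapTotal kB:", info.get("SwapTotal"))
    print("Extra swap suggested (kB):", need)
```

If the script reports a SwapTotal of 0, no swap is active, which matches the situation in which cc1plus gets killed.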

2. NCCL issues

/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclAllReduce'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclGetErrorString'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclGroupEnd'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclGroupStart'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclBcast'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclCommDestroy'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclReduceScatter'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclCommInitAll'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclAllGather'
/home/ubuntu/Project/pytorch/build/lib/libcaffe2_gpu.so: undefined reference to `ncclReduce'
collect2: error: ld returned 1 exit status
caffe2/CMakeFiles/utility_ops_gpu_test.dir/build.make:107: recipe for target 'bin/utility_ops_gpu_test' failed
make[2]: *** [bin/utility_ops_gpu_test] Error 1
CMakeFiles/Makefile2:3094: recipe for target 'caffe2/CMakeFiles/utility_ops_gpu_test.dir/all' failed
make[1]: *** [caffe2/CMakeFiles/utility_ops_gpu_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

 

A related CMake developer warning also appears:

The dependency target "nccl_external" of target "gloo_cuda" does not exist.
Call Stack (most recent call first):
  CMakeLists.txt:236 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

Solution: https://devtalk.nvidia.com/default/topic/1042821/jetson-tx2/pytorch-install-with-python3-broken/post/5291480/#5291480

Disable NCCL in two places:

CMakeLists.txt: change the NCCL option to OFF
setup.py: add USE_NCCL = False right after the "Parameters parsed from environment" banner:

################################################################################
# Parameters parsed from environment
################################################################################
USE_NCCL = False
VERBOSE_SCRIPT = True
RUN_BUILD_DEPS = True

 

3. dpkg error: package is in a very bad inconsistent state

Fix: reinstall the affected package:

sudo apt-get -f --reinstall install <your package>

posted @ 2019-04-21 23:48  BlueOceans