PaddlePaddle: paddle.utils.run_check() reports "PaddlePaddle meets some problem with 8 GPUs"

WARNING:root:PaddlePaddle meets some problem with 8 GPUs. This may be caused by:
1. There is not enough GPUs visible on your system
2. Some GPUs are occupied by other process now
3. NVIDIA-NCCL2 is not installed correctly on your system. Please follow instruction on https://github.com/NVIDIA/nccl-tests
to test your NCCL, or reinstall it following https://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html
WARNING:root:
Original Error is: (External) NCCL error(2), unhandled system error.
[Hint: 'ncclSystemError'. A call to the system failed.] (at /paddle/paddle/fluid/platform/device/gpu/nccl_helper.h:155)

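Solution:

Add the --shm-size 8g option when creating the container. By default Docker gives a container only 64 MB of shared memory (/dev/shm), which is likely too little for NCCL's communication between the GPU processes.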

docker run --name paddle_docker_v2 --gpus all --shm-size 8g -it -v $PWD:/paddle paddlepaddle/paddle:2.3.1-gpu-cuda11.2-cudnn8 /bin/bash
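
To confirm that the new container actually received the larger shared-memory segment before rerunning paddle.utils.run_check(), a quick check from inside the container can help. The following is a minimal sketch using only the Python standard library; the helper names and the 8 GiB threshold are assumptions for illustration (the threshold simply mirrors the --shm-size 8g value above) and are not part of PaddlePaddle.

```python
import shutil

# Assumed threshold for illustration: matches the --shm-size 8g used above.
RECOMMENDED_SHM_BYTES = 8 * 1024**3


def shm_total_bytes(path="/dev/shm"):
    """Return the total size in bytes of the filesystem mounted at `path`."""
    return shutil.disk_usage(path).total


def shm_is_large_enough(total_bytes, required_bytes=RECOMMENDED_SHM_BYTES):
    """Return True if the shared-memory size meets the required size."""
    return total_bytes >= required_bytes


if __name__ == "__main__":
    total = shm_total_bytes()
    print(f"/dev/shm total: {total / 1024**2:.0f} MiB")
    if not shm_is_large_enough(total):
        # A small value here suggests the container was created without
        # a larger --shm-size and may need to be recreated.
        print("Shared memory looks small; consider recreating the "
              "container with a larger --shm-size.")
```

If the reported size is still around 64 MiB, the container was probably started without the --shm-size option and needs to be recreated, since the value is set at container creation time.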