[DevOps] Reinstalling the nvidia-docker driver on Ubuntu

Reinstall nvidia-docker2

apt-get remove docker docker-engine docker.io containerd runc
apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

apt-get update

apt-get install docker-ce docker-ce-cli containerd.io
docker run hello-world

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
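The distribution variable is built by sourcing /etc/os-release and concatenating ID and VERSION_ID, which yields a value such as ubuntu18.04 that selects the repository list. A minimal sketch of the same step as a function follows. The name distribution_id and its file argument are hypothetical, added only so the logic can be tried against any os-release style file.

```shell
# distribution_id FILE
# Hypothetical helper (not part of nvidia-docker): prints ID followed by
# VERSION_ID from an os-release style file.
distribution_id() {
    # Source the file in a subshell so ID and VERSION_ID do not leak
    # into the calling shell.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

It could be used as distribution=$(distribution_id /etc/os-release) in place of the inline command substitution.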

apt-get update

apt-get install -y nvidia-docker2
pkill -SIGHUP dockerd
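Installing nvidia-docker2 registers a runtime named nvidia in Docker's daemon configuration, and the SIGHUP asks dockerd to reread that configuration. One rough way to confirm the registration is sketched below. It assumes the configuration lives at /etc/docker/daemon.json. has_nvidia_runtime is a hypothetical helper that does a plain text match, not real JSON parsing.

```shell
# has_nvidia_runtime [FILE]
# Hypothetical helper: succeeds if FILE (default /etc/docker/daemon.json,
# an assumed location) contains a key named "nvidia".
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    # An unreadable or missing file counts as "not registered"
    [ -r "$config" ] || return 1
    # Crude text match for a JSON key: "nvidia" followed by a colon
    grep -q '"nvidia"[[:space:]]*:' "$config"
}
```

For example, has_nvidia_runtime && echo "nvidia runtime registered" prints the message only when the key is found.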

apt-get install nvidia-cuda-toolkit
nvcc --version

At this point, running nvidia-smi reports a driver failure.
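When nvidia-smi reports a driver failure, it can help to check whether the NVIDIA kernel module is loaded at all. The sketch below reads /proc/driver/nvidia/version, which is present only while the module is loaded. nvidia_driver_version is a hypothetical helper, and the sed pattern assumes the common first line of the form "NVRM version: NVIDIA UNIX x86_64 Kernel Module  470.63.01  ..."; other layouts would print nothing.

```shell
# nvidia_driver_version [FILE]
# Hypothetical helper: prints the kernel module version recorded in FILE
# (default /proc/driver/nvidia/version). Returns 1 if FILE is unreadable,
# which usually means the kernel module is not loaded.
nvidia_driver_version() {
    version_file="${1:-/proc/driver/nvidia/version}"
    [ -r "$version_file" ] || return 1
    # Assumed line layout: "NVRM version: ... Kernel Module  <version>  <date>"
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' "$version_file"
}
```

A failing return here is consistent with the driver needing to be reinstalled.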

Reinstall the NVIDIA Linux driver

cd /root/ && sh ./NVIDIA-Linux-x86_64-470.63.01.run -s -q

Verify nvidia-smi inside a container

When the container is started without the --gpus all flag, nvidia-smi fails inside it:

docker run --runtime=nvidia --rm nvidia/cuda:9.1-base nvidia-smi

When the container is started with the --gpus all flag, nvidia-smi succeeds inside it:

docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu18.04 nvidia-smi
cd /root/ && sh ./NVIDIA-Linux-x86_64-470.63.01.run -s -q
nvidia-smi
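The container check with --gpus all can be wrapped in a small function so it is easy to repeat after each driver or toolkit change. This is only a sketch: gpu_smoke_test and the DOCKER override variable are hypothetical names, and the default image is simply the one used in the command above.

```shell
# gpu_smoke_test [IMAGE]
# Hypothetical helper: runs nvidia-smi in a throwaway container with all
# GPUs exposed. Set DOCKER=echo (an assumed override, not a Docker feature)
# to print the command arguments instead of running them.
gpu_smoke_test() {
    image="${1:-nvidia/cuda:11.0.3-base-ubuntu18.04}"
    ${DOCKER:-docker} run --rm --gpus all "$image" nvidia-smi
}
```

Calling gpu_smoke_test with no argument runs the same check as above, and its exit status can be used in scripts to decide whether the driver needs to be reinstalled.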
posted @ 2024-03-27 14:26  ffl