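Offline installation of CUDA 11.7.1, cuDNN 8.9.3.28, NCCL 2.18.3 and TensorRT 8.6.1 on Ubuntu 22.04
I have been using PaddleOCR recently and need it to recognize a few special symbols. I only have two machines on hand: one with a single 1080 Ti (Windows 11) and one with dual 1080 Ti cards (Ubuntu 22.04). Out of habit I went for CUDA 11.7, the newest version PaddlePaddle supports. In practice CUDA 10 is enough for a 1080 Ti, and the later releases bring no noticeable performance gain on this card.
On Windows the installation needs no thought; on Linux it is more involved, so I am recording the steps here.
CUDA and cuDNN depend on the NVIDIA driver and the kernel. The minimum driver version for CUDA 11.7 is 450.80.02; for details see https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html#cudnn-versions-linux
Note: for an offline install you must register an NVIDIA developer account before you can download the packages.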
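The driver requirement can be checked with a short script rather than by eye. This is a minimal sketch: the `version_ge` helper and the `MIN_DRIVER` variable are names made up for illustration, the minimum value should be confirmed against the support matrix linked above, and it assumes GNU `sort -V` is available (it is on Ubuntu).

```shell
#!/usr/bin/env bash
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on GNU sort's version ordering (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum driver for CUDA 11.x; verify against the support matrix.
MIN_DRIVER="450.80.02"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask nvidia-smi for the driver version only; keep the first GPU's line.
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$MIN_DRIVER"; then
        echo "driver $installed is new enough (>= $MIN_DRIVER)"
    else
        echo "driver $installed is older than $MIN_DRIVER"
    fi
else
    echo "nvidia-smi not found; the driver is probably not installed yet"
fi
```

The same script can be run again after the driver update and reboot to confirm the result.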
-
Clean up any leftover NVIDIA drivers
sudo apt autoremove -y --purge 'nvidia*'
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt-get autoremove && sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*
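Before running the purge it may be worth listing what is actually installed, so nothing unexpected gets removed. The sketch below filters `dpkg -l` output; the function name `installed_nvidia_pkgs` and the choice of name prefixes (`nvidia`, `libnvidia`, `cuda`) are my own assumptions, not something the purge command itself uses.

```shell
#!/usr/bin/env bash
# Read `dpkg -l` output on stdin and print the names of fully installed
# packages (status "ii") whose name starts with nvidia, libnvidia or cuda.
installed_nvidia_pkgs() {
    awk '$1 == "ii" && $2 ~ /^(nvidia|libnvidia|cuda)/ { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
    # Prints nothing when no matching package is installed.
    dpkg -l | installed_nvidia_pkgs
else
    echo "dpkg not found; this check only applies to Debian/Ubuntu systems"
fi
```

An empty list after the cleanup suggests the old driver packages are gone.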
-
Update the graphics driver
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
sudo apt install -y nvidia-driver-525
sudo reboot
After rebooting, run
nvidia-smi
to check that the driver was installed correctly.
-
Install CUDA 11.7.1: https://developer.nvidia.com/cuda-toolkit-archive https://developer.nvidia.com/cuda-11-7-1-download-archive
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install cuda-11-7
sudo reboot
Check the CUDA version with
nvcc --version
If nvcc is not found, add /usr/local/cuda-11.7/bin to the PATH environment variable.
-
Install cuDNN 8.9.3 for CUDA 11: https://developer.nvidia.com/rdp/cudnn-download
wget -O cudnn-local-repo-ubuntu2204-8.9.3.28_1.0-1_amd64.deb https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.3/local_installers/11.x/cudnn-local-repo-ubuntu2204-8.9.3.28_1.0-1_amd64.deb/
sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.3.28_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.3.28/cudnn-local-7F7A158C-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install libcudnn8=8.9.3.28-1+cuda11.8 libcudnn8-dev=8.9.3.28-1+cuda11.8
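One way to confirm which cuDNN version ended up installed is to read the version macros from the header shipped by the dev package. This is a sketch under assumptions: the helper name `cudnn_version_from_header` is made up, and the header path `/usr/include/cudnn_version.h` is where I would expect the .deb install to expose it, so adjust it if your system differs.

```shell
#!/usr/bin/env bash
# Print MAJOR.MINOR.PATCH parsed from the #define lines of a
# cudnn_version.h file given as the first argument.
cudnn_version_from_header() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Assumed header location for the .deb install; change if needed.
HEADER="/usr/include/cudnn_version.h"

if [ -r "$HEADER" ]; then
    echo "cuDNN header reports: $(cudnn_version_from_header "$HEADER")"
else
    echo "no readable header at $HEADER; try: dpkg -l | grep libcudnn8"
fi
```

Reading the header only shows what the dev package provides; it does not prove the runtime library loads correctly.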
-
Install NCCL 2.18.3 for CUDA 11: https://developer.nvidia.com/nccl/nccl-download
wget -O nccl_2.18.3-1+cuda11.0_x86_64.txz https://developer.nvidia.com/downloads/compute/machine-learning/nccl/secure/2.18.3/agnostic/x64/nccl_2.18.3-1+cuda11.0_x86_64.txz/
tar xvf nccl_2.18.3-1+cuda11.0_x86_64.txz
sudo mv nccl_2.18.3-1+cuda11.0_x86_64 /usr/local/nccl_2.18.3
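Because NCCL is installed here by moving an extracted directory rather than through apt, a quick check that the libraries landed where expected can save confusion later. A minimal sketch, assuming the archive unpacks with a `lib/` subdirectory containing `libnccl.so*`; the helper name `has_nccl_lib` is made up for illustration.

```shell
#!/usr/bin/env bash
# Succeeds if at least one libnccl.so* file exists under "$1/lib".
has_nccl_lib() {
    for f in "$1"/lib/libnccl.so*; do
        # When the glob matches nothing it stays literal and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

NCCL_HOME="/usr/local/nccl_2.18.3"

if has_nccl_lib "$NCCL_HOME"; then
    echo "NCCL libraries found in $NCCL_HOME/lib"
else
    echo "no libnccl.so* under $NCCL_HOME/lib; check the extracted directory name"
fi
```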
-
Install TensorRT 8.6.1 for CUDA 11: https://developer.nvidia.com/nvidia-tensorrt-8x-download
wget https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/local_repos/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-11.8_1.0-1_amd64.deb
sudo dpkg -i nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-11.8_1.0-1_amd64.deb
sudo cp /var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-11.8/nv-tensorrt-local-0628887B-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install tensorrt=8.6.1.6-1+cuda11.8
-
Add the paths to the environment variables, or append them to
.bashrc
export PATH=/usr/local/cuda-11.7/bin:~/.local/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:/usr/local/nccl_2.18.3/lib:$LD_LIBRARY_PATH
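If you end up re-running the setup, appending these lines to .bashrc by hand tends to leave duplicates. The sketch below adds each line only when it is not already present. The `append_once` helper and the `RC_FILE` variable are names I made up; the script writes to `~/.bashrc` unless `RC_FILE` is set to something else.

```shell
#!/usr/bin/env bash
# Append line "$2" to file "$1" unless an identical line is already there.
append_once() {
    touch "$1"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Target file; override by setting RC_FILE before running (assumption).
RC_FILE="${RC_FILE:-$HOME/.bashrc}"

# Single quotes keep $PATH and $LD_LIBRARY_PATH unexpanded in the file,
# so they are evaluated each time the shell starts.
append_once "$RC_FILE" 'export PATH=/usr/local/cuda-11.7/bin:~/.local/bin:$PATH'
append_once "$RC_FILE" 'export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:/usr/local/nccl_2.18.3/lib:$LD_LIBRARY_PATH'
```

Open a new terminal or run `source ~/.bashrc` afterwards so the current shell picks up the change.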