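Installing CUDA on CentOS
0 Revision history
No. | Change | Date |
---|---|---|
1 | Initial version | 2021/1/20 |
1 Summary
This article walks through installing CUDA on CentOS 8.1 with the runfile installer.
2 Environment
(1) Operating system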
[root@ussuritest004 ~]# cat /etc/centos-release
CentOS Linux release 8.1.1911 (Core)
[root@ussuritest004 ~]#
(2) CUDA version
The installer used here is:
cuda_10.2.89_440.33.01_linux.run
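The runfile name carries two version numbers: the toolkit version and the bundled driver version (the installer log at the end of this article reports driver 440.33.01). A minimal sketch of splitting the name, assuming the cuda_<toolkit>_<driver>_linux.run layout seen here; parse_cuda_runfile is a made-up helper name, not an NVIDIA tool:

```shell
# parse_cuda_runfile is a hypothetical helper, not part of any NVIDIA tool.
# It assumes a name of the form cuda_<toolkit>_<driver>_linux.run.
parse_cuda_runfile() {
    name=${1##*/}            # drop any leading directory
    rest=${name#cuda_}       # strip the "cuda_" prefix
    toolkit=${rest%%_*}      # first field: toolkit version
    rest=${rest#*_}          # drop the first field
    driver=${rest%%_*}       # second field: bundled driver version
    printf 'toolkit=%s driver=%s\n' "$toolkit" "$driver"
}

parse_cuda_runfile cuda_10.2.89_440.33.01_linux.run
```

For the file used in this article this prints toolkit=10.2.89 driver=440.33.01.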
3 Procedure
(1) Preparation
3.1.1 Check that the machine has a CUDA-capable GPU
[root@ussuritest004 software]# lspci | grep -i nvidia
af:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
[root@ussuritest004 software]#
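The same check can be scripted. A minimal sketch, assuming the only requirement is that some PCI device line mentions NVIDIA; has_nvidia_gpu is a hypothetical helper name. It does not prove the card supports CUDA, which still has to be looked up for the reported model (Tesla T4 here).

```shell
# has_nvidia_gpu is a hypothetical helper: it reads `lspci` output on stdin
# and succeeds if any line mentions NVIDIA (case-insensitive).
has_nvidia_gpu() {
    grep -qi 'nvidia'
}

# Demonstrated on the line captured above instead of a live `lspci` call.
sample='af:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)'
if printf '%s\n' "$sample" | has_nvidia_gpu; then
    echo "NVIDIA GPU found"
else
    echo "no NVIDIA GPU found"
fi
```

On a real machine the call would be `lspci | has_nvidia_gpu`.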
3.1.2 Download
Omitted for now.
(2) Runfile installation
3.2.1 Install base dependencies
[root@ussuritest004 yum.repos.d]# yum install gcc
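The next command is optional. Note that these are Debian/Ubuntu package names, so yum on CentOS will not find them; the CentOS counterparts are probably mesa-libGLU-devel, libXi-devel, libXmu-devel and freeglut-devel.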
[root@ussuritest004 yum.repos.d]# yum install libglu1-mesa libxi-dev libxmu-dev libglu1-mesa-dev freeglut3-dev
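The driver build needs more than gcc: the installer log at the end of this article stops because make is missing. A minimal sketch of a pre-flight check, assuming gcc and make are the tools that matter; missing_tools is a hypothetical helper name:

```shell
# missing_tools is a hypothetical helper: it prints each named command that
# is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools gcc make)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing
else
    echo "gcc and make are available"
fi
```

Anything it reports can be installed with yum before starting the runfile.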
3.2.2 Disable the Nouveau driver
3.2.2.1 Check whether the nouveau driver is loaded
[root@ussuritest004 log]# lsmod | grep nouveau
nouveau 2215936 1
mxm_wmi 16384 1 nouveau
video 45056 1 nouveau
wmi 32768 2 mxm_wmi,nouveau
i2c_algo_bit 16384 2 ast,nouveau
drm_kms_helper 217088 2 ast,nouveau
ttm 110592 2 ast,nouveau
drm 524288 7 drm_kms_helper,ast,ttm,nouveau
[root@ussuritest004 log]#
Any output means the driver is loaded.
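For a script, matching on the first column is a little stricter than a plain grep, since nouveau also appears in the "used by" column of other modules. A minimal sketch; nouveau_loaded is a hypothetical helper name:

```shell
# nouveau_loaded is a hypothetical helper: it reads `lsmod` output on stdin
# and succeeds only if a module named exactly "nouveau" is listed.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Demonstrated on two lines from the output above.
if printf '%s\n' 'nouveau 2215936 1' 'drm 524288 7 drm_kms_helper,ast,ttm,nouveau' | nouveau_loaded; then
    echo "nouveau is loaded"
fi
```

On a real machine the call would be `lsmod | nouveau_loaded`.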
3.2.2.2 Disable the nouveau driver
3.2.2.2.1 Add a blacklist entry
To disable the Nouveau driver, create the file /usr/lib/modprobe.d/blacklist-nouveau.conf with the following content:
blacklist nouveau
options nouveau modeset=0
[root@ussuritest004 ~]# ll /usr/lib/modprobe.d/blacklist-nouveau.conf
ls: cannot access '/usr/lib/modprobe.d/blacklist-nouveau.conf': No such file or directory
[root@ussuritest004 ~]# vim /usr/lib/modprobe.d/blacklist-nouveau.conf
[root@ussuritest004 ~]#
[root@ussuritest004 ~]# cat /usr/lib/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
[root@ussuritest004 ~]#
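The file can also be written without opening an editor. A minimal sketch; write_nouveau_blacklist is a hypothetical helper name, and the target path is a parameter so it can be tried on a scratch file before touching /usr/lib/modprobe.d:

```shell
# write_nouveau_blacklist is a hypothetical helper. It overwrites the given
# file with the two blacklist lines shown above.
write_nouveau_blacklist() {
    target=$1
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Try it on a scratch file first.
scratch=$(mktemp)
write_nouveau_blacklist "$scratch"
cat "$scratch"
rm -f "$scratch"
```

As root, the real call would be `write_nouveau_blacklist /usr/lib/modprobe.d/blacklist-nouveau.conf`.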
3.2.2.2.2 Regenerate the kernel initramfs
Back up the current image first:
[root@ussuritest004 boot]# uname -r
4.18.0-147.el8.x86_64
[root@ussuritest004 boot]# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak.orig
[root@ussuritest004 boot]# ll
total 167724
-rw-------. 1 root root 3838259 Dec 5 2019 System.map-4.18.0-147.el8.x86_64
-rw-r--r--. 1 root root 184613 Dec 5 2019 config-4.18.0-147.el8.x86_64
drwxr-xr-x. 3 root root 4096 Jan 19 11:44 efi
drwx------. 4 root root 4096 Jan 19 15:02 grub2
-rw-------. 1 root root 71694380 Jan 19 11:49 initramfs-0-rescue-c7dcb861dc20453f8e275d6036842581.img
-rw-------. 1 root root 30310567 Jan 19 11:50 initramfs-4.18.0-147.el8.x86_64.img
-rw------- 1 root root 30310567 Jan 20 13:50 initramfs-4.18.0-147.el8.x86_64.img.bak.orig
-rw-------. 1 root root 19141009 Jan 19 11:57 initramfs-4.18.0-147.el8.x86_64kdump.img
drwxr-xr-x. 3 root root 4096 Jan 19 11:47 loader
drwx------. 2 root root 16384 Jan 19 11:28 lost+found
-rwxr-xr-x. 1 root root 8106744 Jan 19 11:48 vmlinuz-0-rescue-c7dcb861dc20453f8e275d6036842581
-rwxr-xr-x. 1 root root 8106744 Dec 5 2019 vmlinuz-4.18.0-147.el8.x86_64
[root@ussuritest004 boot]#
Then regenerate it:
[root@ussuritest004 boot]# dracut /boot/initramfs-$(uname -r).img --force
[root@ussuritest004 boot]# ll
total 166988
-rw-------. 1 root root 3838259 Dec 5 2019 System.map-4.18.0-147.el8.x86_64
-rw-r--r--. 1 root root 184613 Dec 5 2019 config-4.18.0-147.el8.x86_64
drwxr-xr-x. 3 root root 4096 Jan 19 11:44 efi
drwx------. 4 root root 4096 Jan 19 15:02 grub2
-rw-------. 1 root root 71694380 Jan 19 11:49 initramfs-0-rescue-c7dcb861dc20453f8e275d6036842581.img
-rw-------. 1 root root 29560525 Jan 20 13:53 initramfs-4.18.0-147.el8.x86_64.img
-rw------- 1 root root 30310567 Jan 20 13:50 initramfs-4.18.0-147.el8.x86_64.img.bak.orig
-rw-------. 1 root root 19141009 Jan 19 11:57 initramfs-4.18.0-147.el8.x86_64kdump.img
drwxr-xr-x. 3 root root 4096 Jan 19 11:47 loader
drwx------. 2 root root 16384 Jan 19 11:28 lost+found
-rwxr-xr-x. 1 root root 8106744 Jan 19 11:48 vmlinuz-0-rescue-c7dcb861dc20453f8e275d6036842581
-rwxr-xr-x. 1 root root 8106744 Dec 5 2019 vmlinuz-4.18.0-147.el8.x86_64
[root@ussuritest004 boot]#
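The size change alone does not show that the new image picked up the blacklist. A minimal sketch of a check, assuming dracut copies the modprobe.d configuration files into the image and that lsinitrd (shipped with dracut) is available to list it; initramfs_has_blacklist is a hypothetical helper name:

```shell
# initramfs_has_blacklist is a hypothetical helper: it reads an initramfs
# file listing on stdin and succeeds if blacklist-nouveau.conf is in it.
initramfs_has_blacklist() {
    grep -q 'modprobe\.d/blacklist-nouveau\.conf'
}

# Illustrative listing line, not captured from the machine above.
if printf '%s\n' 'usr/lib/modprobe.d/blacklist-nouveau.conf' | initramfs_has_blacklist; then
    echo "blacklist is inside the image"
fi
```

On a real machine the call would be `lsinitrd /boot/initramfs-$(uname -r).img | initramfs_has_blacklist`.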
3.2.2.3 Switch the default target to text mode
[root@ussuritest004 boot]# systemctl set-default multi-user.target
Removed /etc/systemd/system/default.target.
Created symlink /etc/systemd/system/default.target → /usr/lib/systemd/system/multi-user.target.
[root@ussuritest004 boot]#
Reboot the machine after making this change.
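After the reboot it is worth confirming that the setting stuck. A minimal sketch; is_text_mode_default is a hypothetical helper name that only compares a target name, so it can be tested without systemd:

```shell
# is_text_mode_default is a hypothetical helper: it succeeds when the given
# default target name is multi-user.target (text mode).
is_text_mode_default() {
    [ "$1" = "multi-user.target" ]
}

# Demonstrated on fixed strings instead of a live systemctl call.
for target in multi-user.target graphical.target; do
    if is_text_mode_default "$target"; then
        echo "$target: text mode"
    else
        echo "$target: not text mode"
    fi
done
```

On a real machine the call would be `is_text_mode_default "$(systemctl get-default)"`.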
3.2.3 Install cuda_10.2.89_440.33.01_linux.run
3.2.3.1 Step by step
[root@ussuritest004 software]# sh cuda_10.2.89_440.33.01_linux.run
The command takes a while before anything appears.
Type accept.
Select Install.
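The two prompts can probably be avoided with an unattended run. A minimal sketch, assuming this runfile accepts the --silent, --driver and --toolkit options that NVIDIA's installation guide describes for runfile installers; this was not tried in this article, so confirm the options against the installer's own help output first. build_silent_install_cmd is a hypothetical helper name and only prints the command:

```shell
# build_silent_install_cmd is a hypothetical helper. It only prints the
# command so it can be reviewed before being run as root.
# Assumption: the runfile supports --silent, --driver and --toolkit.
build_silent_install_cmd() {
    runfile=$1
    printf 'sh %s --silent --driver --toolkit\n' "$runfile"
}

build_silent_install_cmd cuda_10.2.89_440.33.01_linux.run
```

Printing instead of executing keeps the sketch safe to run on any machine.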
The installation failed.
Error log (the final errors show that make was not found in PATH):
[root@ussuritest004 log]# cat nvidia-installer.log
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Wed Jan 20 14:59:43 2021
installer version: 440.33.01
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
nvidia-installer command line:
./nvidia-installer
--ui=none
--no-questions
--accept-license
--disable-nouveau
--no-cc-version-check
--install-libglvnd
Using built-in stream user interface
-> Detected 48 CPUs online; setting concurrency level to 32.
-> Installing NVIDIA driver version 440.33.01.
WARNING: One or more modprobe configuration files to disable Nouveau are already present at: /usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf, /etc/modprobe.d/nvidia-installer-disable-nouveau.conf. Please be sure you have rebooted your system since these files were written. If you have rebooted, then Nouveau may be enabled for other reasons, such as being included in the system initial ramdisk or in your X configuration file. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
-> For some distributions, Nouveau can be disabled by adding a file in the modprobe configuration directory. Would you like nvidia-installer to attempt to create this modprobe file for you? (Answer: Yes)
-> One or more modprobe configuration files to disable Nouveau have been written. For some distributions, this may be sufficient to disable Nouveau; other distributions may require modification of the initial ramdisk. Please reboot your system and attempt NVIDIA driver installation again. Note if you later wish to reenable Nouveau, you will need to delete these files: /usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf, /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
ERROR: Unable to find the development tool `make` in your path; please make sure that you have the package 'make' installed. If make is installed on your system, then please check that `make` is in your PATH.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
[root@ussuritest004 log]#
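The log is long, and the lines that matter start with ERROR:. A minimal sketch for pulling them out; installer_errors is a hypothetical helper name, and the demo uses two lines copied from the log above rather than the real /var/log/nvidia-installer.log:

```shell
# installer_errors is a hypothetical helper: it prints every line of the
# given log file that starts with "ERROR:".
installer_errors() {
    grep '^ERROR:' "$1"
}

# Demo on two lines copied from the log above.
log=$(mktemp)
cat > "$log" <<'EOF'
-> Installing NVIDIA driver version 440.33.01.
ERROR: Unable to find the development tool `make` in your path; please make sure that you have the package 'make' installed. If make is installed on your system, then please check that `make` is in your PATH.
EOF
installer_errors "$log"
rm -f "$log"
```

Here the log itself names the missing package, so the likely next step is to install make with yum and run the installer again.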
Posted on 2021-01-20 13:58 by weiwei2021