Viewing GPU Information on CentOS
1. Install the command-line tools
# yum install pciutils lshw -y
2. Get display adapter information
# lspci | grep -E "VGA|NVIDIA"
03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04)
3b:00.0 3D controller: NVIDIA Corporation GA100 [GRID A100 PCIe 40GB] (rev a1)
d8:00.0 3D controller: NVIDIA Corporation GA100 [GRID A100 PCIe 40GB] (rev a1)
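When this check has to run in a script, it can help to turn the listing into a single number. The following is a minimal sketch, not a definitive method: `count_nvidia_gpus` is a hypothetical helper name, and the sample text is copied from the output above so the snippet runs on a machine without any GPU.

```shell
# count_nvidia_gpus is a hypothetical helper (not part of pciutils).
# It reads lspci-style text on stdin and prints how many display-class
# devices (VGA or 3D controller) mention NVIDIA.
count_nvidia_gpus() {
    grep -E "VGA compatible controller|3D controller" | grep -c "NVIDIA"
}

# Sample taken from the lspci output shown above.
sample='03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04)
3b:00.0 3D controller: NVIDIA Corporation GA100 [GRID A100 PCIe 40GB] (rev a1)
d8:00.0 3D controller: NVIDIA Corporation GA100 [GRID A100 PCIe 40GB] (rev a1)'

printf '%s\n' "$sample" | count_nvidia_gpus   # prints 2
```

On a live system the same helper would be fed directly, for example `lspci | count_nvidia_gpus`.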
# lshw -C display
  *-display
       description: VGA compatible controller
       product: Integrated Matrox G200eW3 Graphics Controller
       vendor: Matrox Electronics Systems Ltd.
       physical id: 0
       bus info: pci@0000:03:00.0
       version: 04
       width: 32 bits
       clock: 66MHz
       capabilities: pm vga_controller bus_master cap_list rom
       configuration: driver=mgag200 latency=0 maxlatency=32 mingnt=16
       resources: irq:16 memory:91000000-91ffffff memory:92808000-9280bfff memory:92000000-927fffff
  *-display
       description: 3D controller
       product: GA100 [GRID A100 PCIe 40GB]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:3b:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm bus_master cap_list
       configuration: driver=nvidia latency=0
       resources: iomemory:38a00-389ff iomemory:38b00-38aff irq:408 memory:ab000000-abffffff memory:38a000000000-38afffffffff memory:38b000000000-38b001ffffff
  *-display
       description: 3D controller
       product: GA100 [GRID A100 PCIe 40GB]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:d8:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm bus_master cap_list
       configuration: driver=nvidia latency=0
       resources: iomemory:39e00-39dff iomemory:39f00-39eff irq:409 memory:ef000000-efffffff memory:39e000000000-39efffffffff memory:39f000000000-39f001ffffff
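The lshw listing is long, and often the only question is which kernel driver is bound to which PCI address. The sketch below condenses it to one line per device. It assumes the "bus info:" and "configuration:" line layout seen above; `gpu_driver_map` is a hypothetical helper name, and the sample is a shortened copy of the output above.

```shell
# gpu_driver_map is a hypothetical helper (not part of lshw).
# For each device section in lshw-style text on stdin it prints
# "<bus info> <driver>", using the driver= entry of the configuration line.
gpu_driver_map() {
    awk '
        /bus info:/      { bus = $3 }
        /configuration:/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^driver=/) { sub(/^driver=/, "", $i); print bus, $i }
        }
    '
}

# Shortened sample based on the lshw output shown above.
sample='  *-display
       bus info: pci@0000:03:00.0
       configuration: driver=mgag200 latency=0 maxlatency=32 mingnt=16
  *-display
       bus info: pci@0000:3b:00.0
       configuration: driver=nvidia latency=0'

printf '%s\n' "$sample" | gpu_driver_map
```

On a live system this would be used as `lshw -C display | gpu_driver_map`. A device with no driver bound has no driver= entry and is simply not printed.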
3. Check driver information
# lshw -c video | grep configuration
       configuration: driver=mgag200 latency=0 maxlatency=32 mingnt=16
       configuration: driver=nvidia latency=0
       configuration: driver=nvidia latency=0
# modinfo nvidia
filename:       /lib/modules/3.10.0-1160.el7.x86_64/kernel/drivers/video/nvidia.ko
firmware:       nvidia/470.57.02/gsp.bin
alias:          char-major-195-*
version:        470.57.02
supported:      external
license:        NVIDIA
retpoline:      Y
rhelversion:    7.9
srcversion:     00F9E8DEACC0FB98727C03C
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        drm
vermagic:       3.10.0-1160.el7.x86_64 SMP mod_unload modversions
parm:           NvSwitchRegDwords:NvSwitch regkey (charp)
parm:           NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp)
parm:           NVreg_ResmanDebugLevel:int
parm:           NVreg_RmLogonRC:int
parm:           NVreg_ModifyDeviceFiles:int
parm:           NVreg_DeviceFileUID:int
parm:           NVreg_DeviceFileGID:int
parm:           NVreg_DeviceFileMode:int
parm:           NVreg_InitializeSystemMemoryAllocations:int
parm:           NVreg_UsePageAttributeTable:int
parm:           NVreg_RegisterForACPIEvents:int
parm:           NVreg_EnablePCIeGen3:int
parm:           NVreg_EnableMSI:int
parm:           NVreg_TCEBypassMode:int
parm:           NVreg_EnableStreamMemOPs:int
parm:           NVreg_RestrictProfilingToAdminUsers:int
parm:           NVreg_PreserveVideoMemoryAllocations:int
parm:           NVreg_EnableS0ixPowerManagement:int
parm:           NVreg_S0ixPowerManagementVideoMemoryThreshold:int
parm:           NVreg_DynamicPowerManagement:int
parm:           NVreg_DynamicPowerManagementVideoMemoryThreshold:int
parm:           NVreg_EnableGpuFirmware:int
parm:           NVreg_EnableUserNUMAManagement:int
parm:           NVreg_MemoryPoolSize:int
parm:           NVreg_KMallocHeapMaxSize:int
parm:           NVreg_VMallocHeapMaxSize:int
parm:           NVreg_IgnoreMMIOCheck:int
parm:           NVreg_NvLinkDisable:int
parm:           NVreg_EnablePCIERelaxedOrderingMode:int
parm:           NVreg_RegisterPCIDriver:int
parm:           NVreg_RegistryDwords:charp
parm:           NVreg_RegistryDwordsPerDevice:charp
parm:           NVreg_RmMsg:charp
parm:           NVreg_GpuBlacklist:charp
parm:           NVreg_TemporaryFilePath:charp
parm:           NVreg_ExcludedGpus:charp
parm:           rm_firmware_active:charp
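Most of the modinfo output is parameter listings, while the line that usually matters is the driver version. On a live system `modinfo -F version nvidia` prints just that field. The sketch below shows the same extraction on captured text, which is handy when working from a saved log; `modinfo_field` is a hypothetical helper name, and the sample lines are copied from the output above.

```shell
# modinfo_field is a hypothetical helper (not part of kmod).
# It reads modinfo-style "key: value" text on stdin and prints the value of
# the first line whose key is exactly $1.
modinfo_field() {
    sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

# Sample lines taken from the modinfo output shown above.
sample='filename:       /lib/modules/3.10.0-1160.el7.x86_64/kernel/drivers/video/nvidia.ko
rhelversion:    7.9
version:        470.57.02
srcversion:     00F9E8DEACC0FB98727C03C'

printf '%s\n' "$sample" | modinfo_field version   # prints 470.57.02
```

Anchoring the match at the start of the line keeps `rhelversion` and `srcversion` from being mistaken for `version`.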
Once the driver is installed, check the GPU status:
# nvidia-smi
Mon Oct 25 13:53:02 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   34C    P0    35W / 250W |      0MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-PCI...  Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   40C    P0    39W / 250W |      0MiB / 40536MiB |     29%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
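The table above is laid out for reading, not for scripts. nvidia-smi can also print selected fields as CSV, for example `nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits`. The sketch below assumes rows of the form "index, utilization" from that command and flags GPUs above a chosen threshold; `busy_gpus` is a hypothetical helper name, and the sample rows reuse the utilization values from the table above.

```shell
# busy_gpus is a hypothetical helper (not part of nvidia-smi).
# $1 is a utilization threshold in percent; stdin holds "index, utilization"
# rows. It prints the index of every GPU whose utilization exceeds $1.
busy_gpus() {
    awk -F', *' -v limit="$1" '$2 + 0 > limit + 0 { print $1 }'
}

# Sample rows using the utilization values from the table above.
sample='0, 0
1, 29'

printf '%s\n' "$sample" | busy_gpus 10   # prints 1
```

On a live system the query command would be piped straight into the helper. Any GPU it reports while the Processes section is empty deserves a closer look.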
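GPU 1 shows 29% utilization even though no process is running on it. After some searching, the fix is to put the driver into persistence mode so that it stays resident in memory. By default the driver is unloaded whenever nothing is using the GPU and reloaded on the next use.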
# nvidia-smi -pm 1
# nvidia-smi
Mon Oct 25 13:56:21 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  On   | 00000000:3B:00.0 Off |                    0 |
| N/A   26C    P0    31W / 250W |      0MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-PCI...  On   | 00000000:D8:00.0 Off |                    0 |
| N/A   30C    P0    31W / 250W |      0MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
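To confirm the setting without reading the Persistence-M column by eye, the mode can be queried as CSV with `nvidia-smi --query-gpu=index,persistence_mode --format=csv,noheader`. The sketch below assumes that command prints rows such as "0, Enabled"; the sample rows are illustrative rather than captured from this machine, and `gpus_without_persistence` is a hypothetical helper name.

```shell
# gpus_without_persistence is a hypothetical helper (not part of nvidia-smi).
# It reads "index, mode" rows on stdin and prints the index of every GPU
# whose mode is anything other than "Enabled".
gpus_without_persistence() {
    awk -F', *' '$2 != "Enabled" { print $1 }'
}

# Assumed sample rows, shaped like the CSV query output described above.
before='0, Disabled
1, Disabled'
after='0, Enabled
1, Enabled'

printf '%s\n' "$before" | gpus_without_persistence   # prints 0 and 1, one per line
printf '%s\n' "$after"  | gpus_without_persistence   # prints nothing
```

As far as I know, a mode set with `nvidia-smi -pm 1` does not survive a reboot, so a check like this is worth running again after a restart.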