Integrating a GPU into OpenStack Ussuri with PCI passthrough
0 Revision history
No. | Change | Date |
---|---|---|
1 | Initial version | 2021/1/21 |
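1 Summary
OpenStack currently offers two ways to integrate a GPU: PCI passthrough and vGPU. This article covers the PCI passthrough approach.
2 Environment
(1) Software versions
OpenStack Ussuri
Operating system: CentOS 8.1
(2) Hardware
|Vendor|Model|IP|Configuration|
|----|----|----|----|
|Inspur|SA5212M5|10.3.176.46|5118 * 2/240G * 2/4T * 2/32G * 24/Tesla T4 |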
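3 Implementation
(1) Basic configuration
3.1.1 Enable VT-x, VT-d, and Onboard VGA in the BIOS.
On the Inspur M5, VT-d is enabled under Processor > IIO Configuration > Intel VT for Directed I/O (VT-d).
The VGA option can be set to onboard.
I checked this M5, and these were already the default settings.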
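3.1.2 Create /etc/modules
Create the file /etc/modules. Note that CentOS loads modules at boot from /etc/modules-load.d/*.conf; /etc/modules is a Debian convention, so this file on its own may have no effect here (step 3.1.6 adds a modules-load.d entry for vfio-pci).
File contents: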
[root@ussuritest004 etc]# cat /etc/modules
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel
[root@ussuritest004 etc]#
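Whether the modules listed above are actually loaded can be checked after boot. The following is a minimal sketch, not part of the original procedure: the function name `missing_modules` and the sample text are hypothetical, and it only inspects text in the format of /proc/modules.

```python
# Sketch: report which of the modules listed in /etc/modules are not loaded.
# Assumption: modules compiled into the kernel never appear in /proc/modules,
# so a "missing" result is a hint to investigate, not proof of a problem.
REQUIRED = ["pci_stub", "vfio", "vfio_iommu_type1", "vfio_pci", "kvm", "kvm_intel"]


def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required modules that do not appear in /proc/modules text."""
    # Each /proc/modules line starts with the module name (underscore form).
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


# Hypothetical sample; on a real host pass open("/proc/modules").read() instead.
SAMPLE = (
    "vfio_pci 53248 0 - Live 0x0000000000000000\n"
    "vfio 36864 1 vfio_pci, Live 0x0000000000000000\n"
    "kvm_intel 294912 0 - Live 0x0000000000000000\n"
    "kvm 786432 1 kvm_intel, Live 0x0000000000000000\n"
)
print(missing_modules(SAMPLE))
```

An empty list means every listed module was found in the supplied text.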
3.1.3 Edit /etc/default/grub
First back up the original file:
[root@ussuritest004 default]# cp grub grub.bak.20210121
[root@ussuritest004 default]# vim grub
[root@ussuritest004 default]# pwd
/etc/default
[root@ussuritest004 default]#
For Intel CPUs:
add intel_iommu=on
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/cl-swap rd.lvm.lv=cl/root rd.lvm.lv=cl/swap net.ifnames=0 rhgb quiet intel_iommu=on"
For AMD CPUs:
add iommu=pt iommu=1
This host has an Intel CPU; I have not tried this on AMD.
Regenerate the GRUB configuration and reboot the machine:
grub2-mkconfig -o /boot/grub2/grub.cfg
[root@ussuritest004 default]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
done
[root@ussuritest004 default]#
[root@ussuritest004 default]# reboot
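After the reboot it is worth confirming that the running kernel really received the new argument. On UEFI-booted CentOS hosts the active grub.cfg typically lives under /boot/efi/EFI/centos/ instead of /boot/grub2/, so the regenerated file may not be the one the firmware reads. The sketch below is an illustration only: `has_kernel_arg` is a hypothetical helper, and the sample string is the GRUB_CMDLINE_LINUX value from this step, not a real /proc/cmdline.

```python
def has_kernel_arg(cmdline, arg):
    """True if `arg` is one of the whitespace-separated kernel arguments."""
    # Splitting avoids false positives such as "xintel_iommu=on".
    return arg in cmdline.split()


# Sample taken from the GRUB_CMDLINE_LINUX line above.
# On a real host pass open("/proc/cmdline").read() instead.
SAMPLE_CMDLINE = (
    "crashkernel=auto resume=/dev/mapper/cl-swap rd.lvm.lv=cl/root "
    "rd.lvm.lv=cl/swap net.ifnames=0 rhgb quiet intel_iommu=on"
)
print(has_kernel_arg(SAMPLE_CMDLINE, "intel_iommu=on"))
```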
3.1.4 Edit /etc/modprobe.d/blacklist.conf
Create /etc/modprobe.d/blacklist.conf:
[root@ussuritest004 modprobe.d]# ll
total 32
-rw-r--r--. 1 root root 158 Nov 9 2019 firewalld-sysctls.conf
-rw-r--r--. 1 root root 358 Nov 22 2019 kvm.conf
-rw-r--r--. 1 root root 747 Nov 9 2019 lockd.conf
-rw-r--r--. 1 root root 1004 Jun 3 2019 mlx4.conf
-rw-r--r--. 1 root root 101 May 11 2019 nvdimm-security.conf
-rw-r--r--. 1 root root 92 Nov 22 2019 truescale.conf
-rw-r--r--. 1 root root 674 Jun 27 2019 tuned.conf
-rw-r--r--. 1 root root 111 Nov 22 2019 vhost.conf
[root@ussuritest004 modprobe.d]# vim blacklist.conf
[root@ussuritest004 modprobe.d]# pwd
/etc/modprobe.d
[root@ussuritest004 modprobe.d]#
File contents:
[root@ussuritest004 modprobe.d]# cat blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
[root@ussuritest004 modprobe.d]#
3.1.5 Find the GPU's Vendor ID and Product ID
[root@ussuritest004 modprobe.d]# lspci -nn | grep NVIDIA
af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
[root@ussuritest004 modprobe.d]#
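Output value | Meaning | Details |
---|---|---|
af:00.0 | Identifies a PCI function in "bus:slot.func" format | Uniquely locates one function of a PCI device. It may be a single-function physical device or one function of a multi-function device, which can have up to 8 functions. Bus number (bus): selects one of the 256 buses in the system, 0-255. Device number (slot): selects one of the 32 devices on a given bus, 0-31. Function number (func): selects one function of a multi-function device, 0-7. The PCI specification requires function 0 to be implemented. |
0302 | PCI device class | The class of the PCI device; devices of the same kind from different vendors can share the same class code. |
10de | Vendor ID | Identifies the manufacturer of the device. Valid vendor IDs are assigned by the PCI SIG to guarantee uniqueness. Intel's ID is 0x8086 and NVIDIA's is 0x10de. |
1eb8 | Device ID | Identifies the specific device; the code is assigned by the vendor. Here it is the device ID of the GPU card. |
a1 | Revision ID | A device-specific revision code whose value is provided by the vendor. |
Explanation of the output fields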
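The field breakdown in the table can be reproduced programmatically. This is a rough sketch that assumes the single-line `lspci -nn` format shown above (no PCI domain prefix); the regular expression and the name `parse_lspci_line` are illustrative, not part of any lspci tooling.

```python
import re

# Assumed layout: "<bus:slot.func> <class name> [<class id>]: <description>
# [<vendor id>:<device id>] (rev <revision>)", with the revision optional.
LSPCI_RE = re.compile(
    r"^(?P<address>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"(?P<class_name>.+?) \[(?P<class_id>[0-9a-f]{4})\]: "
    r"(?P<description>.+) "
    r"\[(?P<vendor_id>[0-9a-f]{4}):(?P<device_id>[0-9a-f]{4})\]"
    r"(?: \(rev (?P<revision>[0-9a-f]{2})\))?$"
)


def parse_lspci_line(line):
    """Return a dict of the fields in one `lspci -nn` line, or None."""
    match = LSPCI_RE.match(line.strip())
    return match.groupdict() if match else None


LINE = (
    "af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL "
    "[Tesla T4] [10de:1eb8] (rev a1)"
)
fields = parse_lspci_line(LINE)
print(fields["vendor_id"], fields["device_id"])
```

The vendor and device IDs extracted this way are the values used in vfio.conf and nova.conf in the later steps.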
3.1.6 Edit /etc/modprobe.d/vfio.conf
Create the file /etc/modprobe.d/vfio.conf:
[root@ussuritest004 modprobe.d]# ll
total 36
-rw-r--r-- 1 root root 135 Jan 21 16:20 blacklist.conf
-rw-r--r--. 1 root root 158 Nov 9 2019 firewalld-sysctls.conf
-rw-r--r--. 1 root root 358 Nov 22 2019 kvm.conf
-rw-r--r--. 1 root root 747 Nov 9 2019 lockd.conf
-rw-r--r--. 1 root root 1004 Jun 3 2019 mlx4.conf
-rw-r--r--. 1 root root 101 May 11 2019 nvdimm-security.conf
-rw-r--r--. 1 root root 92 Nov 22 2019 truescale.conf
-rw-r--r--. 1 root root 674 Jun 27 2019 tuned.conf
-rw-r--r--. 1 root root 111 Nov 22 2019 vhost.conf
[root@ussuritest004 modprobe.d]# vim vfio.conf
[root@ussuritest004 modprobe.d]#
File contents:
[root@ussuritest004 modprobe.d]# cat vfio.conf
options vfio-pci ids=10de:1eb8
[root@ussuritest004 modprobe.d]#
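10de:1eb8 comes from the previous step; the IDs of several GPU models can be listed, separated by commas. Before the reboot the device is still bound to nouveau: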
[root@ussuritest004 modprobe.d]# lspci -nnk -d 10de:1eb8
af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a2]
Kernel driver in use: nouveau
Kernel modules: nouveau
[root@ussuritest004 modprobe.d]#
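When several GPU models are passed through, the comma-separated `ids=` value can be assembled from a list. A small sketch under stated assumptions: `vfio_options_line` is a hypothetical helper, and the second ID in the usage line is a placeholder, not a real device on this host.

```python
import re

# A PCI ID pair is two groups of four hex digits, e.g. 10de:1eb8.
ID_RE = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{4}$")


def vfio_options_line(ids):
    """Build the `options vfio-pci ids=...` line for vfio.conf."""
    cleaned = [device_id.strip().lower() for device_id in ids]
    for device_id in cleaned:
        if not ID_RE.match(device_id):
            raise ValueError("not a vendor:device ID: %r" % device_id)
    return "options vfio-pci ids=" + ",".join(cleaned)


print(vfio_options_line(["10de:1eb8"]))
# "10de:ffff" is a made-up ID used only to show the comma-separated form.
print(vfio_options_line(["10de:1eb8", "10de:ffff"]))
```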
Create /etc/modules-load.d/vfio-pci.conf:
[root@ussuritest004 modprobe.d]# cd /etc/modules-load.d/
[root@ussuritest004 modules-load.d]# ll
total 8
-rw-r--r-- 1 root root 30 Jan 19 17:56 ip6_tables.conf
-rw-r--r-- 1 root root 31 Jan 19 17:55 openvswitch.conf
[root@ussuritest004 modules-load.d]# echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf
[root@ussuritest004 modules-load.d]# ll
total 12
-rw-r--r-- 1 root root 30 Jan 19 17:56 ip6_tables.conf
-rw-r--r-- 1 root root 31 Jan 19 17:55 openvswitch.conf
-rw-r--r-- 1 root root 9 Jan 21 17:08 vfio-pci.conf
[root@ussuritest004 modules-load.d]#
Rebuild the initramfs, then reboot.
[root@ussuritest004 boot]# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak.20210121
[root@ussuritest004 boot]# dracut /boot/initramfs-$(uname -r).img --force
[root@ussuritest004 boot]# ll
total 195808
-rw-------. 1 root root 3838259 Dec 5 2019 System.map-4.18.0-147.el8.x86_64
-rw-r--r--. 1 root root 184613 Dec 5 2019 config-4.18.0-147.el8.x86_64
drwxr-xr-x. 3 root root 4096 Jan 19 11:44 efi
drwx------. 4 root root 4096 Jan 21 16:11 grub2
-rw-------. 1 root root 71694380 Jan 19 11:49 initramfs-0-rescue-c7dcb861dc20453f8e275d6036842581.img
-rw-------. 1 root root 29535971 Jan 21 16:55 initramfs-4.18.0-147.el8.x86_64.img
-rw------- 1 root root 29535918 Jan 21 16:49 initramfs-4.18.0-147.el8.x86_64.img.bak.20210121
-rw------- 1 root root 30310567 Jan 20 13:50 initramfs-4.18.0-147.el8.x86_64.img.bak.orig
-rw-------. 1 root root 19141009 Jan 19 11:57 initramfs-4.18.0-147.el8.x86_64kdump.img
drwxr-xr-x. 3 root root 4096 Jan 19 11:47 loader
drwx------. 2 root root 16384 Jan 19 11:28 lost+found
-rwxr-xr-x. 1 root root 8106744 Jan 19 11:48 vmlinuz-0-rescue-c7dcb861dc20453f8e275d6036842581
-rwxr-xr-x. 1 root root 8106744 Dec 5 2019 vmlinuz-4.18.0-147.el8.x86_64
[root@ussuritest004 boot]#
[root@ussuritest004 boot]# reboot
Verify the configuration:
[root@ussuritest004 ~]# lspci -nnk -d 10de:1eb8
af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a2]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
[root@ussuritest004 ~]#
The line "Kernel driver in use: vfio-pci" in the output shows that the host configuration succeeded. Next comes the OpenStack configuration.
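The same check can be scripted, for example when several GPU hosts have to be verified. This sketch only parses text in the `lspci -nnk` format shown above; `driver_in_use` is a hypothetical name, and both samples are copied from the transcripts in this article.

```python
def driver_in_use(lspci_nnk_output):
    """Return the value of the 'Kernel driver in use' line, or None."""
    for line in lspci_nnk_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "Kernel driver in use":
            return value.strip()
    return None


AFTER_REBOOT = """af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a2]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
"""
BEFORE_REBOOT = AFTER_REBOOT.replace("in use: vfio-pci", "in use: nouveau")

print(driver_in_use(BEFORE_REBOOT), driver_in_use(AFTER_REBOOT))
```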
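(2) OpenStack configuration
Backups like the one below are a personal habit: I keep one copy of the pristine configuration file, marked with .bak.orig. Feel free to skip this step.
cp nova.conf nova.conf.bak.orig
3.2.1 Configure nova-scheduler (controller nodes): edit /etc/kolla/nova-scheduler/nova.conf
I have three controller nodes. Back up the existing configuration first, then modify it.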
[root@ussuritest001 nova-scheduler]# cp nova.conf nova.conf.bak.20210122
[root@ussuritest001 nova-scheduler]# pwd
/etc/kolla/nova-scheduler
[root@ussuritest001 nova-scheduler]# vim nova.conf
[root@ussuritest001 nova-scheduler]#
Add the following configuration:
[filter_scheduler]
enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter,PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
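With three controllers to edit by hand, a quick check that PciPassthroughFilter made it into every copy can save a debugging round. The sketch below uses Python's configparser as an approximation of how nova reads the file (nova itself uses oslo.config); `enabled_filters` is a hypothetical helper, and the sample is the fragment shown above.

```python
import configparser


def enabled_filters(nova_conf_text):
    """Return the filter names listed in [filter_scheduler] enabled_filters."""
    # interpolation=None keeps '%' characters in other options from raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    raw = parser.get("filter_scheduler", "enabled_filters", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


SAMPLE_CONF = """[filter_scheduler]
enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter,PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
"""
filters = enabled_filters(SAMPLE_CONF)
print("PciPassthroughFilter" in filters)
```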
Restart the nova-scheduler service:
[root@ussuritest001 ~]# docker ps -a |grep nova-scheduler
7e7e309d3c3e registry.chouniu.fun/kolla/centos-source-nova-scheduler:ussuri "dumb-init --single-…" 2 weeks ago Up 46 hours nova_scheduler
[root@ussuritest001 ~]# docker restart 7e7e309d3c3e
7e7e309d3c3e
[root@ussuritest001 ~]# docker ps -a |grep nova-scheduler
7e7e309d3c3e registry.chouniu.fun/kolla/centos-source-nova-scheduler:ussuri "dumb-init --single-…" 2 weeks ago Up 3 seconds nova_scheduler
[root@ussuritest001 ~]#
Repeat on the other two controller nodes: back up, edit, then restart.
3.2.2 Configure nova-api (controller nodes): edit /etc/kolla/nova-api/nova.conf
[root@ussuritest001 ~]# cd /etc/kolla/nova-api
[root@ussuritest001 nova-api]# ll
total 8
-rw-rw---- 1 root root 393 Jan 5 16:33 config.json
-rw-rw---- 1 root root 3296 Jan 5 16:33 nova.conf
[root@ussuritest001 nova-api]# cp nova.conf nova.conf.bak.orig
[root@ussuritest001 nova-api]# cp nova.conf nova.conf.bak.20210122
[root@ussuritest001 nova-api]# vim nova.conf
[root@ussuritest001 nova-api]#
Added content:
[pci]
alias = { "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", "name":"a1", "numa_policy":"preferred" }
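The alias value is a JSON object, so a typo such as a missing quote can be caught before the service is restarted. The following is only a sketch: `check_alias` is a hypothetical helper, and its checks are a loose approximation of what nova validates (a name, a known device_type, four-hex-digit IDs), not nova's own schema.

```python
import json
import re

# Device types accepted in a PCI alias, as far as this sketch assumes.
DEVICE_TYPES = {"type-PCI", "type-PF", "type-VF"}
HEX4 = re.compile(r"[0-9a-fA-F]{4}")


def check_alias(alias_value):
    """Parse an alias value and return (alias_dict, list_of_problems)."""
    alias = json.loads(alias_value)  # raises ValueError on malformed JSON
    problems = []
    if not alias.get("name"):
        problems.append("missing name")
    device_type = alias.get("device_type")
    if device_type is not None and device_type not in DEVICE_TYPES:
        problems.append("unexpected device_type: %s" % device_type)
    for key in ("vendor_id", "product_id"):
        value = alias.get(key)
        if value is not None and not HEX4.fullmatch(value):
            problems.append("%s is not 4 hex digits: %s" % (key, value))
    return alias, problems


ALIAS = (
    '{ "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", '
    '"name":"a1", "numa_policy":"preferred" }'
)
alias, problems = check_alias(ALIAS)
print(alias["name"], problems)
```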
Restart the nova-api service:
[root@ussuritest001 nova-api]# docker ps -a | grep nova-api
8499a74f52d9 registry.chouniu.fun/kolla/centos-source-nova-api:ussuri "dumb-init --single-…" 2 weeks ago Up 47 hours nova_api
[root@ussuritest001 nova-api]# docker restart 8499a74f52d9
8499a74f52d9
[root@ussuritest001 nova-api]# docker ps -a | grep nova-api
8499a74f52d9 registry.chouniu.fun/kolla/centos-source-nova-api:ussuri "dumb-init --single-…" 2 weeks ago Up 2 seconds nova_api
[root@ussuritest001 nova-api]#
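Repeat on the other two controller nodes: back up, edit, then restart the service.
3.2.3 Configure nova-compute (compute node): edit /etc/kolla/nova-compute/nova.conf
I changed the configuration only on the compute node that has the GPU card; the other compute nodes were left untouched.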
[root@ussuritest004 nova-compute]# pwd
/etc/kolla/nova-compute
[root@ussuritest004 nova-compute]# ll
total 16
-rw-rw---- 1 root root 64 Jan 19 17:53 ceph.client.cinder.keyring
-rw-rw---- 1 root root 100 Jan 19 17:53 ceph.conf
-rw-rw---- 1 root root 1127 Jan 19 17:53 config.json
-rw-rw---- 1 root root 2364 Jan 19 17:54 nova.conf
[root@ussuritest004 nova-compute]# cp nova.conf nova.conf.bak.orig
[root@ussuritest004 nova-compute]# cp nova.conf nova.conf.bak.20210122
[root@ussuritest004 nova-compute]# vim nova.conf
[root@ussuritest004 nova-compute]#
Added content:
[pci]
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1eb8" }
alias = { "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", "name":"a1" }
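The whitelist entry has to match the IDs that lspci reported in step 3.1.5, otherwise nova-compute exposes no device to the scheduler. A minimal sketch of that comparison follows; `device_matches` is a hypothetical helper that looks only at vendor_id and product_id, whereas nova's whitelist also accepts other keys (such as address) that are ignored here.

```python
import json


def device_matches(whitelist_value, vendor_id, product_id):
    """True if a device's IDs equal those in a vendor/product whitelist entry."""
    entry = json.loads(whitelist_value)
    return (
        entry.get("vendor_id", "").lower() == vendor_id.lower()
        and entry.get("product_id", "").lower() == product_id.lower()
    )


WHITELIST = '{ "vendor_id": "10de", "product_id": "1eb8" }'
# IDs reported by `lspci -nn` for the Tesla T4 on this host.
print(device_matches(WHITELIST, "10de", "1eb8"))
```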
Restart the nova-compute service:
[root@ussuritest004 nova-compute]# docker ps -a | grep nova-compute
387f692c5b05 registry.kxdigit.com/kolla/centos-source-nova-compute:ussuri "dumb-init --single-…" 2 days ago Up 17 hours nova_compute
[root@ussuritest004 nova-compute]# docker restart 387f692c5b05
387f692c5b05
[root@ussuritest004 nova-compute]# docker ps -a | grep nova-compute
387f692c5b05 registry.kxdigit.com/kolla/centos-source-nova-compute:ussuri "dumb-init --single-…" 2 days ago Up 17 seconds nova_compute
[root@ussuritest004 nova-compute]#
4 Verification
(1) Verify with CirrOS
4.1.1 Create a flavor
[root@ussuritest001 ~]# openstack flavor create --public --ram 2048 --disk 20 --vcpus 2 m1.large.testgpu
[root@ussuritest001 ~]# openstack flavor set m1.large.testgpu --property pci_passthrough:alias='a1:1'
pci_passthrough:alias='a1:1'
a1: the name defined in the alias
1: the number of GPUs requested
[root@ussuritest001 ~]# source /etc/kolla/admin-openrc.sh
[root@ussuritest001 ~]# openstack flavor list
+--------------------------------------+-------------+-------+------+-----------+-------+-----------+
| ID | Name | RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+-------------+-------+------+-----------+-------+-----------+
| 1 | m1.tiny | 512 | 1 | 0 | 1 | True |
| 2 | m1.small | 2048 | 20 | 0 | 1 | True |
| 3 | m1.medium | 4096 | 40 | 0 | 2 | True |
| 4 | m1.large | 8192 | 80 | 0 | 4 | True |
| 5 | m1.xlarge | 16384 | 160 | 0 | 8 | True |
| 5b94a68e-9f34-44f7-ac5c-b0f0d83aaeed | 4CORE8G50G | 8192 | 50 | 0 | 4 | True |
| a5b481e6-92b2-4a4f-897a-1fe0f172705f | 4CORE8G100G | 8192 | 100 | 0 | 4 | True |
+--------------------------------------+-------------+-------+------+-----------+-------+-----------+
[root@ussuritest001 ~]# openstack flavor create --public --ram 2048 --disk 20 --vcpus 2 m1.large.testgpu
+----------------------------+--------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 20 |
| id | 33af7c87-6340-494d-827d-0afe38d52802 |
| name | m1.large.testgpu |
| os-flavor-access:is_public | True |
| properties | |
| ram | 2048 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 2 |
+----------------------------+--------------------------------------+
[root@ussuritest001 ~]# openstack flavor list
+--------------------------------------+------------------+-------+------+-----------+-------+-----------+
| ID | Name | RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+------------------+-------+------+-----------+-------+-----------+
| 1 | m1.tiny | 512 | 1 | 0 | 1 | True |
| 2 | m1.small | 2048 | 20 | 0 | 1 | True |
| 3 | m1.medium | 4096 | 40 | 0 | 2 | True |
| 33af7c87-6340-494d-827d-0afe38d52802 | m1.large.testgpu | 2048 | 20 | 0 | 2 | True |
| 4 | m1.large | 8192 | 80 | 0 | 4 | True |
| 5 | m1.xlarge | 16384 | 160 | 0 | 8 | True |
| 5b94a68e-9f34-44f7-ac5c-b0f0d83aaeed | 4CORE8G50G | 8192 | 50 | 0 | 4 | True |
| a5b481e6-92b2-4a4f-897a-1fe0f172705f | 4CORE8G100G | 8192 | 100 | 0 | 4 | True |
+--------------------------------------+------------------+-------+------+-----------+-------+-----------+
[root@ussuritest001 ~]# openstack flavor set m1.large.testgpu --property pci_passthrough:alias='a1:1'
[root@ussuritest001 ~]#
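The `pci_passthrough:alias` property set above follows the form alias:count, and several such requests can be joined with commas. The sketch below shows how such a value breaks down; `parse_alias_spec` is a hypothetical helper written for illustration, not nova's parser, and the alias name b2 in the second call is made up.

```python
def parse_alias_spec(spec):
    """Split 'name:count[,name:count...]' into a list of (name, count)."""
    requests = []
    for item in spec.split(","):
        name, sep, count = item.strip().partition(":")
        if not sep or not name or not count.isdigit() or int(count) < 1:
            raise ValueError("expected ALIAS:COUNT, got %r" % item)
        requests.append((name, int(count)))
    return requests


print(parse_alias_spec("a1:1"))
# "b2" is a hypothetical second alias, shown only for the comma-separated form.
print(parse_alias_spec("a1:2, b2:1"))
```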
4.1.2 Create an instance
5 References
https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
https://www.cnblogs.com/sammyliu/p/5179414.html
"libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2021-01-22T04:01:08.752319Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead\n2021-01-22T04:01:08.753610Z qemu-kvm: -device vfio-pci,host=0000:af:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:af:00.0: failed to setup container for group 63: No available IOMMU models\n", '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', ' File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 2200, in _do_build_and_run_instance\n filter_properties, request_spec, accel_uuids)\n', ' File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 2478, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', "nova.exception.RescheduledException: Build of instance 89719983-ccbd-4dde-b1aa-d23a3f20e36a was re-scheduled: internal error: qemu unexpectedly closed the monitor: 2021-01-22T04:01:08.752319Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead\n2021-01-22T04:01:08.753610Z qemu-kvm: -device vfio-pci,host=0000:af:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:af:00.0: failed to setup container for group 63: No available IOMMU models\n"
posted on 2021-01-21 14:37 weiwei2021