Integrating a GPU into OpenStack Ussuri with PCI passthrough

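0 Revision history

|No.|Change|Date|
|----|----|----|
|1|Created|2021/1/21|

1 Abstract

OpenStack currently offers two ways to integrate GPUs: PCI passthrough and vGPU. This article covers the PCI passthrough approach.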

2 Environment

(1) Software versions

  OpenStack Ussuri
  Operating system: CentOS 8.1

(2) Hardware

|Vendor|Model|IP|Configuration|
|----|----|----|----|
|Inspur|SA5212M5|10.3.176.46|5118 * 2/240G * 2/4T * 2/32G * 24/Tesla T4|

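3 Implementation

(1) Base configuration

3.1.1 Enable VT-x, VT-d, and Onboard VGA in the BIOS.

On the Inspur M5, VT-d is enabled under Processor > IIO Configuration > Intel VT for Directed I/O (VT-d).
Onboard VGA can be set in the BIOS as well.
I checked this M5, and these are its default settings.

3.1.2 Create /etc/modules

Create the file /etc/modules with the following content: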

[root@ussuritest004 etc]# cat /etc/modules
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel
[root@ussuritest004 etc]#

3.1.3 Edit /etc/default/grub

First, back up the original file:

[root@ussuritest004 default]# cp grub grub.bak.20210121
[root@ussuritest004 default]# vim grub
[root@ussuritest004 default]# pwd
/etc/default
[root@ussuritest004 default]#

For Intel CPUs, add intel_iommu=on:

GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/cl-swap rd.lvm.lv=cl/root rd.lvm.lv=cl/swap net.ifnames=0 rhgb quiet intel_iommu=on"

For AMD CPUs, add iommu=pt iommu=1.

This machine has an Intel CPU; I have not yet tried this on AMD.

Regenerate the GRUB configuration, then reboot the machine:
grub2-mkconfig -o /boot/grub2/grub.cfg

[root@ussuritest004 default]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
done
[root@ussuritest004 default]#
[root@ussuritest004 default]# reboot
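
After the reboot it can be worth confirming that the kernel really booted with the new flag. The following is a minimal sketch, not part of the original procedure: `has_kernel_arg` is a made-up helper name, and the optional second argument exists only so the check can be pointed at a file other than /proc/cmdline.

```shell
# has_kernel_arg is a hypothetical helper: it succeeds if the exact argument
# appears as a whole word on the kernel command line.
# Usage: has_kernel_arg <argument> [cmdline-file]   (default: /proc/cmdline)
has_kernel_arg() (
    arg=$1
    cmdline_file=${2:-/proc/cmdline}
    set -f    # disable globbing so words such as (hd0,gpt2) are not expanded
    for word in $(cat "$cmdline_file"); do
        [ "$word" = "$arg" ] && exit 0
    done
    exit 1
)
```

On the GPU host, `has_kernel_arg intel_iommu=on && echo "intel_iommu=on is active"` would print the message only when the flag took effect. If the flag is absent, the regenerated grub.cfg may not be the one the firmware actually boots: on UEFI installs of CentOS 8 the file normally lives under /boot/efi/EFI/centos/ rather than /boot/grub2/. The original does not say how this host boots, so treat that as something to check rather than a diagnosis.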

3.1.4 Edit /etc/modprobe.d/blacklist.conf

Create /etc/modprobe.d/blacklist.conf:

[root@ussuritest004 modprobe.d]# ll
total 32
-rw-r--r--. 1 root root  158 Nov  9  2019 firewalld-sysctls.conf
-rw-r--r--. 1 root root  358 Nov 22  2019 kvm.conf
-rw-r--r--. 1 root root  747 Nov  9  2019 lockd.conf
-rw-r--r--. 1 root root 1004 Jun  3  2019 mlx4.conf
-rw-r--r--. 1 root root  101 May 11  2019 nvdimm-security.conf
-rw-r--r--. 1 root root   92 Nov 22  2019 truescale.conf
-rw-r--r--. 1 root root  674 Jun 27  2019 tuned.conf
-rw-r--r--. 1 root root  111 Nov 22  2019 vhost.conf
[root@ussuritest004 modprobe.d]# vim blacklist.conf
[root@ussuritest004 modprobe.d]# pwd
/etc/modprobe.d
[root@ussuritest004 modprobe.d]#

File content:

[root@ussuritest004 modprobe.d]# cat blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
[root@ussuritest004 modprobe.d]#

3.1.5 Find the GPU's vendor ID and product ID

[root@ussuritest004 modprobe.d]# lspci -nn | grep NVIDIA
af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
[root@ussuritest004 modprobe.d]#

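|Output value|Meaning|Details|
|----|----|----|
|af:00.0|Uniquely identifies a PCI function in "bus:slot.func" format|Locates one function of a PCI device, which may be a single-function physical device or one function of a multi-function device; a multi-function device has at most 8 functions. Bus number (bus): selects one of the system's 256 buses, 0-255. Device number (slot): selects one of the 32 devices on the given bus, 0-31. Function number (func): selects one function of a multi-function device, 0-7. The PCI specification requires function 0 to be implemented.|
|0302|PCI device class|The class of the PCI device; devices of the same kind from different vendors can share the same class code.|
|10de|Vendor ID|Identifies the device manufacturer. Valid vendor IDs are assigned by the PCI SIG to guarantee uniqueness. Intel's ID is 0x8086 and NVIDIA's is 0x10de.|
|1eb8|Device ID|Identifies the specific device; the code is assigned by the vendor. Here it is the device ID of the GPU card.|
|a1|Revision ID|A device-specific revision code; its value is provided by the vendor.|

Explanation of the output fields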
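
To avoid copying the IDs by hand, the bracketed vendor:device pair can be pulled out of the `lspci -nn` output with a small filter. This is a sketch under the assumption that the ID pair is the last `[xxxx:xxxx]` group on the line, as in the output above; `pci_ids` is a made-up helper name.

```shell
# pci_ids is a hypothetical helper: read `lspci -nn` lines on stdin and print
# "<bus:slot.func> <vendor>:<device>" for every line that starts with a PCI
# address and contains a [xxxx:xxxx] ID pair. The class code ([0302]) and the
# marketing name ([Tesla T4]) are skipped because they contain no colon-separated
# pair of four hex digits.
pci_ids() {
    sed -n 's/^\([0-9a-f:.]\{1,\}\) .*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*$/\1 \2/p'
}
```

Typical use on the host would be `lspci -nn | grep NVIDIA | pci_ids`, which for the card in this article should yield the address af:00.0 followed by 10de:1eb8.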

3.1.6 Edit /etc/modprobe.d/vfio.conf

Create the file /etc/modprobe.d/vfio.conf:

[root@ussuritest004 modprobe.d]# ll
total 36
-rw-r--r--  1 root root  135 Jan 21 16:20 blacklist.conf
-rw-r--r--. 1 root root  158 Nov  9  2019 firewalld-sysctls.conf
-rw-r--r--. 1 root root  358 Nov 22  2019 kvm.conf
-rw-r--r--. 1 root root  747 Nov  9  2019 lockd.conf
-rw-r--r--. 1 root root 1004 Jun  3  2019 mlx4.conf
-rw-r--r--. 1 root root  101 May 11  2019 nvdimm-security.conf
-rw-r--r--. 1 root root   92 Nov 22  2019 truescale.conf
-rw-r--r--. 1 root root  674 Jun 27  2019 tuned.conf
-rw-r--r--. 1 root root  111 Nov 22  2019 vhost.conf
[root@ussuritest004 modprobe.d]# vim vfio.conf
[root@ussuritest004 modprobe.d]#

File content:

[root@ussuritest004 modprobe.d]# cat vfio.conf
options vfio-pci ids=10de:1eb8
[root@ussuritest004 modprobe.d]#

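10de:1eb8 comes from the previous step. For multiple GPUs, separate the vendor:device IDs with commas.

At this point, before the reboot, the GPU is still bound to nouveau: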

[root@ussuritest004 modprobe.d]# lspci -nnk -d 10de:1eb8
af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:12a2]
        Kernel driver in use: nouveau
        Kernel modules: nouveau
[root@ussuritest004 modprobe.d]#
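
For hosts with more than one GPU model, the comma-separated `ids=` value in vfio.conf can be generated rather than typed. A minimal sketch follows; `vfio_options_line` is a made-up helper, and the second ID used in the usage note is only a placeholder, not a device from this article.

```shell
# vfio_options_line is a hypothetical helper: validate each vendor:device ID
# (four lowercase hex digits, a colon, four lowercase hex digits) and join
# them with commas into one "options vfio-pci ids=..." line.
vfio_options_line() (
    for id in "$@"; do
        case $id in
            [0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f][0-9a-f][0-9a-f]) ;;
            *) echo "not a vendor:device ID: $id" >&2; exit 1 ;;
        esac
    done
    IFS=,    # "$*" joins the arguments with the first character of IFS
    printf 'options vfio-pci ids=%s\n' "$*"
)
```

For example, `vfio_options_line 10de:1eb8 10de:1db4 > /etc/modprobe.d/vfio.conf` would write a single line covering both IDs (10de:1db4 being the placeholder).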

Create /etc/modules-load.d/vfio-pci.conf:

[root@ussuritest004 modprobe.d]# cd /etc/modules-load.d/
[root@ussuritest004 modules-load.d]# ll
total 8
-rw-r--r-- 1 root root 30 Jan 19 17:56 ip6_tables.conf
-rw-r--r-- 1 root root 31 Jan 19 17:55 openvswitch.conf
[root@ussuritest004 modules-load.d]# echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf
[root@ussuritest004 modules-load.d]# ll
total 12
-rw-r--r-- 1 root root 30 Jan 19 17:56 ip6_tables.conf
-rw-r--r-- 1 root root 31 Jan 19 17:55 openvswitch.conf
-rw-r--r-- 1 root root  9 Jan 21 17:08 vfio-pci.conf
[root@ussuritest004 modules-load.d]#

Back up and rebuild the initramfs, then reboot.

[root@ussuritest004 boot]# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak.20210121
[root@ussuritest004 boot]# dracut  /boot/initramfs-$(uname -r).img --force
[root@ussuritest004 boot]# ll
total 195808
-rw-------. 1 root root  3838259 Dec  5  2019 System.map-4.18.0-147.el8.x86_64
-rw-r--r--. 1 root root   184613 Dec  5  2019 config-4.18.0-147.el8.x86_64
drwxr-xr-x. 3 root root     4096 Jan 19 11:44 efi
drwx------. 4 root root     4096 Jan 21 16:11 grub2
-rw-------. 1 root root 71694380 Jan 19 11:49 initramfs-0-rescue-c7dcb861dc20453f8e275d6036842581.img
-rw-------. 1 root root 29535971 Jan 21 16:55 initramfs-4.18.0-147.el8.x86_64.img
-rw-------  1 root root 29535918 Jan 21 16:49 initramfs-4.18.0-147.el8.x86_64.img.bak.20210121
-rw-------  1 root root 30310567 Jan 20 13:50 initramfs-4.18.0-147.el8.x86_64.img.bak.orig
-rw-------. 1 root root 19141009 Jan 19 11:57 initramfs-4.18.0-147.el8.x86_64kdump.img
drwxr-xr-x. 3 root root     4096 Jan 19 11:47 loader
drwx------. 2 root root    16384 Jan 19 11:28 lost+found
-rwxr-xr-x. 1 root root  8106744 Jan 19 11:48 vmlinuz-0-rescue-c7dcb861dc20453f8e275d6036842581
-rwxr-xr-x. 1 root root  8106744 Dec  5  2019 vmlinuz-4.18.0-147.el8.x86_64
[root@ussuritest004 boot]#

[root@ussuritest004 boot]# reboot

Verify the configuration:

[root@ussuritest004 ~]# lspci -nnk -d 10de:1eb8
af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:12a2]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
[root@ussuritest004 ~]#

"Kernel driver in use: vfio-pci" in the output shows that the host side is configured correctly. Next comes the OpenStack configuration.
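
The same check can be scripted so it does not rely on reading the output by eye. This is a sketch only: `driver_in_use` is a made-up helper that works on the text format `lspci -nnk` printed above.

```shell
# driver_in_use is a hypothetical helper: read `lspci -nnk` output on stdin
# and print the value of the first "Kernel driver in use:" line, if any.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p' | head -n 1
}
```

On the GPU host, `[ "$(lspci -nnk -d 10de:1eb8 | driver_in_use)" = vfio-pci ] && echo "GPU is bound to vfio-pci"` would print the message only when the binding succeeded.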

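(2) OpenStack configuration

Backups like the one below are just my personal habit: I keep a copy of the pristine configuration file, marked with .bak.orig. Feel free to skip this step.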
cp nova.conf nova.conf.bak.orig
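
The two-level backup used throughout this article (one pristine .bak.orig copy plus a dated copy per change) could be wrapped in a small function. This is an illustrative sketch; `backup_conf` is a made-up name, and the dated suffix simply mirrors the .bak.YYYYMMDD pattern seen in the transcripts.

```shell
# backup_conf is a hypothetical helper: before editing <file>, keep
#   <file>.bak.orig      a pristine copy, written once and never overwritten
#   <file>.bak.YYYYMMDD  a copy of the file as it is today
backup_conf() (
    file=$1
    [ -e "$file.bak.orig" ] || cp -p "$file" "$file.bak.orig" || exit 1
    cp -p "$file" "$file.bak.$(date +%Y%m%d)"
)
```

Running `backup_conf /etc/kolla/nova-api/nova.conf` before each edit would then replace the two separate cp commands.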

3.2.1 Configure nova-scheduler (controller nodes): edit /etc/kolla/nova-scheduler/nova.conf

I have three controller nodes here. Back up the original configuration first, then modify it.

[root@ussuritest001 nova-scheduler]# cp nova.conf nova.conf.bak.20210122
[root@ussuritest001 nova-scheduler]# pwd
/etc/kolla/nova-scheduler
[root@ussuritest001 nova-scheduler]# vim nova.conf
[root@ussuritest001 nova-scheduler]#

Add the following configuration:

[filter_scheduler]
enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter,PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
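
Because the same edit has to be repeated on three controllers, a quick check that PciPassthroughFilter really ended up in `enabled_filters` can catch a missed node. The sketch below is an assumption-laden convenience, not a Nova tool: `filter_enabled` is a made-up helper, and it looks only at the last `enabled_filters` line in the file without parsing INI sections.

```shell
# filter_enabled is a hypothetical helper: succeeds if <filter> is one of the
# comma-separated entries on the last enabled_filters line of <nova.conf>.
# Usage: filter_enabled <filter> <nova.conf>
filter_enabled() (
    filter=$1
    conf=$2
    sed -n 's/^[[:space:]]*enabled_filters[[:space:]]*=//p' "$conf" |
        tail -n 1 |
        tr ',' '\n' |                                  # one filter per line
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |   # trim surrounding blanks
        grep -qxF -- "$filter"                         # exact, literal match
)
```

On each controller, `filter_enabled PciPassthroughFilter /etc/kolla/nova-scheduler/nova.conf || echo "filter missing"` would flag a node that was skipped.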


Restart the nova-scheduler service:

[root@ussuritest001 ~]#  docker ps -a |grep nova-scheduler
7e7e309d3c3e        registry.chouniu.fun/kolla/centos-source-nova-scheduler:ussuri              "dumb-init --single-…"   2 weeks ago         Up 46 hours                                  nova_scheduler
[root@ussuritest001 ~]# docker restart 7e7e309d3c3e
7e7e309d3c3e
[root@ussuritest001 ~]#  docker ps -a |grep nova-scheduler
7e7e309d3c3e        registry.chouniu.fun/kolla/centos-source-nova-scheduler:ussuri              "dumb-init --single-…"   2 weeks ago         Up 3 seconds                                 nova_scheduler
[root@ussuritest001 ~]#

Do the same on the other two controller nodes: back up, modify, then restart.

3.2.2 Configure nova-api (controller nodes): edit /etc/kolla/nova-api/nova.conf

[root@ussuritest001 ~]# cd /etc/kolla/nova-api
[root@ussuritest001 nova-api]# ll
total 8
-rw-rw---- 1 root root  393 Jan  5 16:33 config.json
-rw-rw---- 1 root root 3296 Jan  5 16:33 nova.conf
[root@ussuritest001 nova-api]# cp nova.conf nova.conf.bak.orig
[root@ussuritest001 nova-api]# cp nova.conf nova.conf.bak.20210122
[root@ussuritest001 nova-api]# vim nova.conf
[root@ussuritest001 nova-api]#

Added content:

[pci]
alias = { "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", "name":"a1", "numa_policy":"preferred" }

Restart the nova-api service:

[root@ussuritest001 nova-api]# docker ps -a | grep nova-api
8499a74f52d9        registry.chouniu.fun/kolla/centos-source-nova-api:ussuri                    "dumb-init --single-…"   2 weeks ago         Up 47 hours                                            nova_api
[root@ussuritest001 nova-api]# docker restart 8499a74f52d9
8499a74f52d9
[root@ussuritest001 nova-api]# docker ps -a | grep nova-api
8499a74f52d9        registry.chouniu.fun/kolla/centos-source-nova-api:ussuri                    "dumb-init --single-…"   2 weeks ago         Up 2 seconds                                             nova_api
[root@ussuritest001 nova-api]#

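Do the same on the other two controller nodes: back up, modify, then restart the service.

3.2.3 Configure nova-compute (compute node): edit /etc/kolla/nova-compute/nova.conf

I only changed the configuration on the compute node that has the GPU; the other compute nodes were left unchanged.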

[root@ussuritest004 nova-compute]# pwd
/etc/kolla/nova-compute
[root@ussuritest004 nova-compute]# ll
total 16
-rw-rw---- 1 root root   64 Jan 19 17:53 ceph.client.cinder.keyring
-rw-rw---- 1 root root  100 Jan 19 17:53 ceph.conf
-rw-rw---- 1 root root 1127 Jan 19 17:53 config.json
-rw-rw---- 1 root root 2364 Jan 19 17:54 nova.conf
[root@ussuritest004 nova-compute]# cp nova.conf nova.conf.bak.orig
[root@ussuritest004 nova-compute]# cp nova.conf nova.conf.bak.20210122
[root@ussuritest004 nova-compute]# vim nova.conf
[root@ussuritest004 nova-compute]#

Added content:

[pci]
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1eb8" }
alias = { "vendor_id":"10de", "product_id":"1eb8", "device_type":"type-PF", "name":"a1" }
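
The flavor created later refers to the alias by its name, so the name has to match between the nova-api and nova-compute files. A small sketch for extracting it follows; `pci_alias_names` is a made-up helper that relies on the one-line JSON layout used in this article. The article's two alias lines agree on the name a1 but differ in `numa_policy`; the sketch compares names only.

```shell
# pci_alias_names is a hypothetical helper: print the "name" value of every
# single-line `alias = { ... }` entry in the given nova.conf.
pci_alias_names() {
    sed -n '/^[[:space:]]*alias[[:space:]]*=/s/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}
```

Comparing `pci_alias_names /etc/kolla/nova-api/nova.conf` on a controller with `pci_alias_names /etc/kolla/nova-compute/nova.conf` on the GPU node should show the same name on both sides.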


Restart the nova-compute service:

[root@ussuritest004 nova-compute]# docker ps -a | grep nova-compute
387f692c5b05        registry.kxdigit.com/kolla/centos-source-nova-compute:ussuri                "dumb-init --single-…"   2 days ago          Up 17 hours                             nova_compute
[root@ussuritest004 nova-compute]# docker restart 387f692c5b05
387f692c5b05
[root@ussuritest004 nova-compute]# docker ps -a | grep nova-compute
387f692c5b05        registry.kxdigit.com/kolla/centos-source-nova-compute:ussuri                "dumb-init --single-…"   2 days ago          Up 17 seconds                           nova_compute
[root@ussuritest004 nova-compute]#

4 Verification

(1) Verify with CirrOS

4.1.1 Create a flavor

[root@ussuritest001 ~]# openstack flavor create --public --ram 2048 --disk 20 --vcpus 2 m1.large.testgpu
[root@ussuritest001 ~]# openstack flavor set m1.large.testgpu --property pci_passthrough:alias='a1:1'

In pci_passthrough:alias='a1:1':
a1 is the name defined in the alias.
1 is the number of GPUs requested.
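
The property value is simply the alias name and the device count joined by a colon. As a sketch of that format, the hypothetical helper below (not an OpenStack command) builds the string and rejects counts that are not positive integers.

```shell
# alias_property is a hypothetical helper: build the flavor property that
# requests <count> devices of the PCI alias <name>.
# Usage: alias_property <name> <count>
alias_property() (
    name=$1
    count=$2
    case $count in
        ''|*[!0-9]*|0) echo "count must be a positive integer" >&2; exit 1 ;;
    esac
    printf 'pci_passthrough:alias=%s:%s\n' "$name" "$count"
)
```

For instance, on a host with two matching GPUs, `openstack flavor set m1.large.testgpu --property "$(alias_property a1 2)"` would request both of them for one instance.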

[root@ussuritest001 ~]# source /etc/kolla/admin-openrc.sh
[root@ussuritest001 ~]# openstack flavor list
+--------------------------------------+-------------+-------+------+-----------+-------+-----------+
| ID                                   | Name        |   RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+-------------+-------+------+-----------+-------+-----------+
| 1                                    | m1.tiny     |   512 |    1 |         0 |     1 | True      |
| 2                                    | m1.small    |  2048 |   20 |         0 |     1 | True      |
| 3                                    | m1.medium   |  4096 |   40 |         0 |     2 | True      |
| 4                                    | m1.large    |  8192 |   80 |         0 |     4 | True      |
| 5                                    | m1.xlarge   | 16384 |  160 |         0 |     8 | True      |
| 5b94a68e-9f34-44f7-ac5c-b0f0d83aaeed | 4CORE8G50G  |  8192 |   50 |         0 |     4 | True      |
| a5b481e6-92b2-4a4f-897a-1fe0f172705f | 4CORE8G100G |  8192 |  100 |         0 |     4 | True      |
+--------------------------------------+-------------+-------+------+-----------+-------+-----------+
[root@ussuritest001 ~]# openstack flavor create --public --ram 2048 --disk 20 --vcpus 2 m1.large.testgpu
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 20                                   |
| id                         | 33af7c87-6340-494d-827d-0afe38d52802 |
| name                       | m1.large.testgpu                     |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 2048                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+
[root@ussuritest001 ~]# openstack flavor list
+--------------------------------------+------------------+-------+------+-----------+-------+-----------+
| ID                                   | Name             |   RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+------------------+-------+------+-----------+-------+-----------+
| 1                                    | m1.tiny          |   512 |    1 |         0 |     1 | True      |
| 2                                    | m1.small         |  2048 |   20 |         0 |     1 | True      |
| 3                                    | m1.medium        |  4096 |   40 |         0 |     2 | True      |
| 33af7c87-6340-494d-827d-0afe38d52802 | m1.large.testgpu |  2048 |   20 |         0 |     2 | True      |
| 4                                    | m1.large         |  8192 |   80 |         0 |     4 | True      |
| 5                                    | m1.xlarge        | 16384 |  160 |         0 |     8 | True      |
| 5b94a68e-9f34-44f7-ac5c-b0f0d83aaeed | 4CORE8G50G       |  8192 |   50 |         0 |     4 | True      |
| a5b481e6-92b2-4a4f-897a-1fe0f172705f | 4CORE8G100G      |  8192 |  100 |         0 |     4 | True      |
+--------------------------------------+------------------+-------+------+-----------+-------+-----------+
[root@ussuritest001 ~]# openstack flavor set m1.large.testgpu --property pci_passthrough:alias='a1:1'
[root@ussuritest001 ~]#

4.1.2 Create an instance

5 References

https://docs.openstack.org/nova/latest/admin/pci-passthrough.html

https://blog.csdn.net/wangjinruifly/article/details/79620075?utm_medium=distribute.pc_relevant_download.none-task-blog-searchFromBaidu-7.nonecase&depth_1-utm_source=distribute.pc_relevant_download.none-task-blog-searchFromBaidu-7.nonecas

https://www.cnblogs.com/sammyliu/p/5179414.html

libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2021-01-22T04:01:08.752319Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2021-01-22T04:01:08.753610Z qemu-kvm: -device vfio-pci,host=0000:af:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:af:00.0: failed to setup container for group 63: No available IOMMU models

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 2200, in _do_build_and_run_instance
    filter_properties, request_spec, accel_uuids)
  File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 2478, in _build_and_run_instance
    instance_uuid=instance.uuid, reason=six.text_type(e))
nova.exception.RescheduledException: Build of instance 89719983-ccbd-4dde-b1aa-d23a3f20e36a was re-scheduled: internal error: qemu unexpectedly closed the monitor: 2021-01-22T04:01:08.752319Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2021-01-22T04:01:08.753610Z qemu-kvm: -device vfio-pci,host=0000:af:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:af:00.0: failed to setup container for group 63: No available IOMMU models
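
The "No available IOMMU models" message in the error above is commonly reported when VFIO has no IOMMU backend to attach to, for example when the vfio_iommu_type1 module is not loaded on the host. The reference to "group 63" suggests the IOMMU groups themselves exist. The original does not state the cause, so the following is only a sketch of one check that might narrow it down. `module_present` is a made-up helper, and it assumes the usual sysfs layout in which each loaded module has a directory under /sys/module.

```shell
# module_present is a hypothetical helper: succeeds if sysfs has an entry for
# the module. Hyphens are normalised to underscores, as the kernel does.
# Usage: module_present <module> [sysfs-module-root]   (default: /sys/module)
module_present() (
    name=$(printf '%s' "$1" | tr '-' '_')
    root=${2:-/sys/module}
    [ -d "$root/$name" ]
)
```

On the GPU host, `module_present vfio_iommu_type1 || modprobe vfio_iommu_type1` would load the module if it is missing, after which the instance build can be retried.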

posted on 2021-01-21 14:37  weiwei2021