[Repost] SPDK and NVMe prerequisites, part 1
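Author: 拖鞋花短裤
Link: https://www.jianshu.com/p/b11948e55d80
Source: Jianshu (简书)
Copyright on Jianshu belongs to the author. For any form of reproduction, please contact the author for authorization and cite the source.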
Concepts to understand beforehand
Linux kernel drivers:
UIO:
The official DPDK documentation at http://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#UIO explains this fairly clearly. An excerpt:
A small kernel module to set up the device, map device memory to user-space and register interrupts. In many cases, the standard uio_pci_generic module included in the Linux kernel can provide the uio capability.
For some devices which lack support for legacy interrupts, e.g. virtual function (VF) devices, the igb_uio module may be needed in place of uio_pci_generic.
UIO comes in two flavors:
UIO Driver
- The device tree node for the device can use whatever you want in the compatible property as it only has to match what is used in the kernel space driver as with any platform device driver
UIO Platform Device Driver
- The device tree node for the device needs to use "generic-uio" in its compatible property
[Figure: basic UIO framework]
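Workflow of a user-space driver:
- Load the kernel-side UIO device driver before starting the user-space driver.
- Start the user-space application and open the corresponding UIO device (/dev/uioX). From user space, a UIO device is a device node in the file system, just like any other device.
- Look up the device memory information, such as the mapping size (e.g. /sys/class/uio/uio0/maps/map0/size), in the corresponding sysfs directory.
- Call mmap() on the UIO device to map the device memory into the process address space.
- The application accesses the device hardware through the mapping to control the device.
- Call munmap() to remove the device memory mapping.
- Close the UIO device file.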
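The workflow above can be sketched in a few lines of Python. This is a minimal illustration rather than a real driver: `read_map_size` and `with_uio_mapping` are hypothetical helper names, the paths are passed in by the caller, and the only UIO-specific convention assumed is that mapping N is selected with an mmap offset of N times the page size.

```python
import mmap
import os


def read_map_size(size_path):
    """Parse a sysfs size file such as /sys/class/uio/uio0/maps/map0/size.

    The file holds the size as text, typically in hex (e.g. "0x1000").
    """
    with open(size_path) as f:
        return int(f.read().strip(), 0)


def with_uio_mapping(dev_path, size_path, callback, map_index=0):
    """Open the device node, map its memory, run callback(mem), then clean up."""
    size = read_map_size(size_path)                  # look up the size in sysfs
    fd = os.open(dev_path, os.O_RDWR | os.O_SYNC)    # open /dev/uioX
    try:
        # Assumption: UIO selects mapping N via offset = N * page size.
        mem = mmap.mmap(fd, size, mmap.MAP_SHARED,
                        mmap.PROT_READ | mmap.PROT_WRITE,
                        offset=map_index * mmap.PAGESIZE)
        try:
            return callback(mem)                     # access the device memory
        finally:
            mem.close()                              # munmap()
    finally:
        os.close(fd)                                 # close the device file
```

On a machine with a bound UIO device, a call might look like `with_uio_mapping("/dev/uio0", "/sys/class/uio/uio0/maps/map0/size", lambda mem: bytes(mem[:4]))`. The same helpers also work on an ordinary file, which is convenient for trying the flow without hardware.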
For more details on UIO, see: https://www.cnblogs.com/vlhn/p/7761869.html
VFIO:
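VFIO exposes the IOMMU interface to user space. The IOMMU is configured through ioctl calls so that the device's DMA address space is mapped to, and confined within, the virtual address space of the process. See:
1) https://www.kernel.org/doc/Documentation/vfio.txt
VFIO requires support from both the BIOS and the kernel, with I/O virtualization (Intel® VT-d) enabled.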
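One quick way to check whether the kernel has an active IOMMU is to look at the IOMMU groups it exposes in sysfs. The sketch below assumes the usual Linux layout, /sys/kernel/iommu_groups/<N>/devices/<BDF>; `list_iommu_groups` is a hypothetical helper name, not part of VFIO or SPDK.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [device BDFs]}.

    An empty result suggests the IOMMU is disabled or unsupported.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for name in os.listdir(root):
        dev_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(dev_dir):
            groups[int(name)] = sorted(os.listdir(dev_dir))
    return groups
```

If `list_iommu_groups()` comes back empty on real hardware, the BIOS setting or the kernel command line is the first thing to check.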
IOMMU:
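See https://nanxiao.me/iommu-introduction/. The IOMMU provides the mechanism through which I/O devices access physical memory. In virtualization, it performs the translation between guest VM memory addresses and host memory addresses.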
[Figure: IOMMU summary from AMD]
PCI BAR (base address register):
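Simply put, this is the PCI configuration mechanism: the configuration register header, the device numbering (B/D/F), and the corresponding software and hardware implementation, which together make PCI devices addressable.
The following passage, quoted from https://en.wikipedia.org/wiki/PCI_configuration_space, briefly explains how the BDF is divided and used for addressing.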
One of the major improvements the PCI Local Bus had over other I/O architectures was its configuration mechanism. In addition to the normal memory-mapped and I/O port spaces, each device function on the bus has a configuration space, which is 256 bytes long, addressable by knowing the eight-bit PCI bus, five-bit device, and three-bit function numbers for the device (commonly referred to as the BDF or B/D/F, as abbreviated from bus/device/function). This allows up to 256 buses, each with up to 32 devices, each supporting eight functions. A single PCI expansion card can respond as a device and must implement at least function number zero. The first 64 bytes of configuration space are standardized; the remainder are available for vendor-defined purposes.
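The 8/5/3-bit split described in the quote can be made concrete with a small parser. This is an illustrative sketch: `parse_bdf` and `pack_bdf` are hypothetical names, and the optional four-digit prefix is treated as the PCI domain, as in the `0000:80:04.0` notation used by Linux tools.

```python
import re

# [domain:]bus:device.function, all fields in hex
_BDF_RE = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_bdf(text):
    """Split a BDF string into (domain, bus, device, function)."""
    m = _BDF_RE.match(text)
    if not m:
        raise ValueError("not a BDF string: %r" % text)
    domain = int(m.group(1) or "0", 16)
    bus = int(m.group(2), 16)        # 8 bits: up to 256 buses
    device = int(m.group(3), 16)     # 5 bits: up to 32 devices per bus
    function = int(m.group(4), 16)   # 3 bits: up to 8 functions per device
    if device > 0x1F:
        raise ValueError("device number exceeds 5 bits: %r" % text)
    return domain, bus, device, function


def pack_bdf(bus, device, function):
    """Pack bus/device/function into the 16-bit value bus[15:8] dev[7:3] fn[2:0]."""
    return (bus << 8) | (device << 3) | function
```

For example, `parse_bdf("0000:80:04.0")` yields domain 0, bus 0x80, device 4, function 0, and `pack_bdf(0x80, 4, 0)` gives `0x8020`.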
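Below is the system information printed by a script that ships with SPDK. The drivers SPDK currently supports include NVMe, I/OAT (Intel's I/O Acceleration Technology), and virtio (a paravirtualized device abstraction interface specification, whose defined transports are PCI, MMIO, and Channel I/O).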
NVMe devices
BDF Numa Node Driver name Device name
I/OAT DMA
BDF Numa Node Driver Name
0000:00:04.0 0 vfio-pci
0000:80:04.0 1 vfio-pci
...
virtio
BDF Numa Node Driver Name Device Name
MMIO(memory-mapped I/O)
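MMIO and PMIO (port-mapped I/O) are complementary methods for performing I/O between the CPU and peripheral devices. With MMIO, I/O devices and memory share the same address space, so an address in a CPU instruction can refer either to memory or to a specific I/O device. Each I/O device monitors the CPU's address bus and responds to any CPU access to an address assigned to it, connecting the data bus to the hardware register of that device. CPU instructions can therefore access an I/O device exactly as they access memory. By analogy with DMA, which is a memory-to-device technique, MMIO is a CPU-to-device technique.
See https://en.wikipedia.org/wiki/Memory-mapped_I/O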
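To illustrate what "accessing a device like memory" looks like from software, the sketch below reads and writes 32-bit little-endian registers in a mapped region. The names `read32`, `write32`, and `make_fake_bar` are hypothetical, the register offset is made up, and an anonymous mapping stands in for real device memory. A real driver would use volatile loads and stores of the exact register width in C, which Python's `struct` module does not guarantee.

```python
import mmap
import struct


def read32(mem, offset):
    """Read a 32-bit little-endian register at the given byte offset."""
    return struct.unpack_from("<I", mem, offset)[0]


def write32(mem, offset, value):
    """Write a 32-bit little-endian register at the given byte offset."""
    struct.pack_into("<I", mem, offset, value & 0xFFFFFFFF)


def make_fake_bar():
    """Return an anonymous, zero-filled mapping standing in for a device BAR."""
    return mmap.mmap(-1, mmap.PAGESIZE)
```

With `bar = make_fake_bar()`, calling `write32(bar, 0x10, 0x00460001)` and then `read32(bar, 0x10)` returns the value just written. Against a real device, `bar` would be the mapping obtained from the UIO or VFIO device file instead.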
NVMe(non-volatile memory express)
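An optimized, high-performance, scalable host controller interface designed to meet the needs of enterprise and client systems that use PCIe-based SSDs. See www.nvmexpress.org
One officially recommended threading model is CPU core : thread : NVMe queue = 1:1:1: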
threading model for an application using SPDK is to spawn a fixed number of threads in a pool and dedicate a single NVMe queue pair to each thread. A further improvement would be to pin each thread to a separate CPU core, and often the SPDK documentation will use "CPU core" and "thread" interchangeably because we have this threading model in mind.
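The shape of that model can be sketched as follows. This is not SPDK code: `FakeQueuePair` is a hypothetical stand-in for an NVMe queue pair, `run_pool` is an illustrative name, and pinning uses `os.sched_setaffinity`, which is available on Linux only. The point is that each queue pair is touched by exactly one thread, so it needs no locks.

```python
import os
import threading


class FakeQueuePair:
    """Hypothetical stand-in for an NVMe queue pair; not an SPDK API."""

    def __init__(self, qid):
        self.qid = qid
        self.completed = []  # touched only by the owning thread, so no lock

    def submit(self, lba):
        self.completed.append(lba)


def worker(qpair, cpu, lbas):
    # Optionally pin this thread to one core (Linux only).
    if cpu is not None and hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {cpu})  # pid 0 means the calling thread
        except OSError:
            pass  # core not usable by this process; run unpinned
    for lba in lbas:
        qpair.submit(lba)


def run_pool(num_threads, lbas_per_thread):
    """Spawn a fixed pool of threads, each owning one dedicated queue pair."""
    cpus = sorted(os.sched_getaffinity(0)) if hasattr(os, "sched_getaffinity") else []
    qpairs = [FakeQueuePair(i) for i in range(num_threads)]
    threads = []
    for i, qp in enumerate(qpairs):
        cpu = cpus[i % len(cpus)] if cpus else None
        lbas = range(i * lbas_per_thread, (i + 1) * lbas_per_thread)
        t = threading.Thread(target=worker, args=(qp, cpu, lbas))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return qpairs
```

Calling `run_pool(3, 4)` returns three queue pairs, each holding only the requests submitted by its own thread.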
SPDK basic framework