PCI + resource + BAR

 

 

 https://www.slideshare.net/kentaroebisawa/20111015-pcie-sriov

[root@localhost ~]# lspci | grep -i ether
05:00.0 Ethernet controller: Huawei Technologies Co., Ltd. Hi1822 Family (2*25GE) (rev 45)
06:00.0 Ethernet controller: Huawei Technologies Co., Ltd. Hi1822 Family (2*25GE) (rev 45)
7d:00.0 Ethernet controller: Huawei Technologies Co., Ltd. HNS GE/10GE/25GE RDMA Network Controller (rev 21)
7d:00.1 Ethernet controller: Huawei Technologies Co., Ltd. HNS GE/10GE/25GE Network Controller (rev 21)
7d:00.2 Ethernet controller: Huawei Technologies Co., Ltd. HNS GE/10GE/25GE RDMA Network Controller (rev 21)
7d:00.3 Ethernet controller: Huawei Technologies Co., Ltd. HNS GE/10GE/25GE Network Controller (rev 21)
[root@localhost ~]# ls /sys/bus/pci/devices/0000\:05\:00.0/re
remove     rescan     reset      resource   resource0  resource2  resource4  revision
[root@localhost ~]# cat /sys/bus/pci/devices/0000\:05\:00.0/resource
0x0000080007b00000 0x0000080007b1ffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000080008a20000 0x0000080008a27fff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000080000200000 0x00000800002fffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
------------------------------------------------------------------
0x00000000e9200000 0x00000000e92fffff 0x0000000000046200
0x0000080007b20000 0x000008000829ffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000800082a0000 0x0000080008a1ffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000080000300000 0x0000080007afffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
[root@localhost ~]#

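A PCI function has six BAR slots. How those slots are used depends on the device design; an Intel NIC, for example, has a Memory BAR, an I/O BAR, and an MSI-X BAR.

The first six entries in the resource file (above the dashed line) are the device's six BARs; the entries after the dashed line are the expansion ROM and the SR-IOV VF BARs. Taking the Intel 82599 as an example again, the first two slots form the Memory BAR, the middle two the I/O BAR, and the last two the MSI-X BAR. On the Hi1822 shown above, all three BARs are 64-bit memory BARs; a 64-bit BAR occupies two consecutive slots, which is why every populated entry is followed by an all-zero one. Each entry has three columns:

Column 1: start address of the BAR
Column 2: end address of the BAR
Column 3: flags of the BAR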
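
As a rough illustration of the three columns, the Python sketch below parses text in the resource-file format and decodes a few of the flag bits. The flag constants are the ones defined in the kernel's include/linux/ioport.h; the function name parse_resource and the embedded SAMPLE (copied from the output above) are illustrative choices, not part of any existing tool.

```python
# Flag bits from the kernel's include/linux/ioport.h.
IORESOURCE_IO = 0x00000100
IORESOURCE_MEM = 0x00000200
IORESOURCE_PREFETCH = 0x00002000
IORESOURCE_READONLY = 0x00004000
IORESOURCE_MEM_64 = 0x00100000

FLAG_NAMES = [
    (IORESOURCE_IO, "io"),
    (IORESOURCE_MEM, "mem"),
    (IORESOURCE_PREFETCH, "prefetchable"),
    (IORESOURCE_READONLY, "read-only"),
    (IORESOURCE_MEM_64, "64-bit"),
]

# The six BAR rows shown above, followed by the expansion ROM row.
SAMPLE = """\
0x0000080007b00000 0x0000080007b1ffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000080008a20000 0x0000080008a27fff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000080000200000 0x00000800002fffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000e9200000 0x00000000e92fffff 0x0000000000046200
"""


def parse_resource(text):
    """Return (index, start, size, flag_names) for every populated row."""
    entries = []
    index = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # not a "start end flags" row
        start, end, flags = (int(field, 16) for field in fields)
        if flags != 0:  # all-zero rows are unused slots
            names = [name for bit, name in FLAG_NAMES if flags & bit]
            entries.append((index, start, end - start + 1, names))
        index += 1
    return entries


for index, start, size, names in parse_resource(SAMPLE):
    print(f"resource {index}: start={start:#x} size={size // 1024}K {', '.join(names)}")
```

On a real system the same function could be fed the contents of /sys/bus/pci/devices/<BDF>/resource. The sizes it reports for rows 0, 2, and 4 (128K, 32K, 1M) should agree with the [size=...] values that lspci prints for the same device.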

 

[root@localhost ~]# cat /sys/bus/pci/devices/0000\:05\:00.0/device 
0x0200
[root@localhost ~]# cat /sys/bus/pci/devices/0000\:05\:00.0/vendor 
0x19e5
[root@localhost ~]# lspci -d 19e5:0200 -nvv
05:00.0 0200: 19e5:0200 (rev 45)
        Subsystem: 19e5:d139
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        NUMA node: 0
        Region 0: Memory at 80007b00000 (64-bit, prefetchable) [size=128K]---------------------- all regions are memory BARs
        Region 2: Memory at 80008a20000 (64-bit, prefetchable) [size=32K]
        Region 4: Memory at 80000200000 (64-bit, prefetchable) [size=1M]
        Expansion ROM at e9200000 [disabled] [size=1M]
        Capabilities: [40] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [80] MSI: Enable- Count=1/32 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-
                Vector table: BAR=2 offset=00000000
                PBA: BAR=2 offset=00004000
        Capabilities: [b0] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [c0] Vital Product Data
                Product Name: Huawei IN200 2*100GE Adapter
                Read-only fields:
                        [PN] Part number: SP572
                End
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [200 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 120, Total VFs: 120, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 375e
                Supported Page Size: 00000553, System Page Size: 00000010
                Region 0: Memory at 0000080007b20000 (64-bit, prefetchable)
                Region 2: Memory at 00000800082a0000 (64-bit, prefetchable)
                Region 4: Memory at 0000080000300000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [310 v1] #19
        Capabilities: [4e0 v1] Device Serial Number 44-a1-91-ff-ff-a4-9b-eb
        Capabilities: [4f0 v1] Transaction Processing Hints
                Device specific mode supported
                No steering table available
        Capabilities: [600 v1] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
        Capabilities: [630 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Kernel driver in use: vfio-pci
        Kernel modules: hinic

06:00.0 0200: 19e5:0200 (rev 45)
        Subsystem: 19e5:d139
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        NUMA node: 0
        Region 0: [virtual] Memory at 80010400000 (64-bit, prefetchable) [size=128K]
        Region 2: [virtual] Memory at 80011320000 (64-bit, prefetchable) [size=32K]
        Region 4: [virtual] Memory at 80008b00000 (64-bit, prefetchable) [size=1M]
        Expansion ROM at e9300000 [disabled] [size=1M]
        Capabilities: [40] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [80] MSI: Enable- Count=1/32 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [a0] MSI-X: Enable- Count=32 Masked-
                Vector table: BAR=2 offset=00000000
                PBA: BAR=2 offset=00004000
        Capabilities: [b0] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [c0] Vital Product Data
                Product Name: Huawei IN200 2*100GE Adapter
                Read-only fields:
                        [PN] Part number: SP572
                End
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [200 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 120, Total VFs: 120, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 375e
                Supported Page Size: 00000553, System Page Size: 00000010
                Region 0: Memory at 0000080010420000 (64-bit, prefetchable)
                Region 2: Memory at 0000080010ba0000 (64-bit, prefetchable)
                Region 4: Memory at 0000080008c00000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [310 v1] #19
        Capabilities: [4e0 v1] Device Serial Number 44-a1-91-ff-ff-a4-9b-ec
        Capabilities: [4f0 v1] Transaction Processing Hints
                Device specific mode supported
                No steering table available
        Capabilities: [600 v1] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
        Capabilities: [630 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Kernel driver in use: vfio-pci
        Kernel modules: hinic

[root@localhost ~]# 

 

 

[root@localhost ~]# lspci -d 19e5:0200  -x
05:00.0 Ethernet controller: Huawei Technologies Co., Ltd. Hi1822 Family (2*25GE) (rev 45)
00: e5 19 00 02 06 04 10 00 45 00 00 02 08 00 00 00
10: 0c 00 b0 07 00 08 00 00 0c 00 a2 08 00 08 00 00
20: 0c 00 20 00 00 08 00 00 00 00 00 00 e5 19 39 d1
30: 00 00 40 e6 40 00 00 00 00 00 00 00 ff 00 00 00

06:00.0 Ethernet controller: Huawei Technologies Co., Ltd. Hi1822 Family (2*25GE) (rev 45)
00: e5 19 00 02 00 04 10 00 45 00 00 02 08 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 e5 19 39 d1
30: 00 00 30 e6 40 00 00 00 00 00 00 00 ff 00 00 00

[root@localhost ~]# 
[root@localhost ~]# ls /sys/bus/pci/devices/0000\:05\:00.0/resource*
/sys/bus/pci/devices/0000:05:00.0/resource   /sys/bus/pci/devices/0000:05:00.0/resource2
/sys/bus/pci/devices/0000:05:00.0/resource0  /sys/bus/pci/devices/0000:05:00.0/resource4
[root@localhost ~]# 


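Finding a region's size requires a "magic write": software writes all 1s to the BAR and reads it back, and the bits that stay 0 reveal the size. See "How is a PCI / PCIe BAR size determined?"

This memory is set up by the PCI device and provides information to the kernel.

Each BAR corresponds to an address range that serves as a separate communication channel to the PCI device.

The length of each region is defined by the hardware and communicated to software through the configuration registers.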
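
The sizing procedure can be sketched in Python against a simulated register. SimulatedBar32 is a hypothetical model of a 32-bit memory BAR, written only for this illustration; on real hardware the probing is done by firmware and the kernel through configuration-space accesses, and it should not be attempted from userspace on a device that is in use.

```python
class SimulatedBar32:
    """Hypothetical model of a 32-bit memory BAR.

    The hardware hardwires the low address bits to zero, so a region of
    `size` bytes only lets software change the bits above log2(size).
    """

    def __init__(self, size, flags=0x0):
        if size < 16 or size & (size - 1):
            raise ValueError("size must be a power of two, at least 16")
        self._addr_mask = ~(size - 1) & 0xFFFFFFFF
        self._flags = flags & 0xF  # bits 3-0: prefetchable, type, space
        self._value = 0

    def write(self, value):
        self._value = value & self._addr_mask  # non-writable bits stay 0

    def read(self):
        return self._value | self._flags


def probe_size(bar):
    """Write all 1s, read back, restore the old value, return the size."""
    original = bar.read()
    bar.write(0xFFFFFFFF)
    readback = bar.read()
    bar.write(original)
    addr_bits = readback & 0xFFFFFFF0  # drop the four flag bits
    return (~addr_bits & 0xFFFFFFFF) + 1


bar = SimulatedBar32(128 * 1024)
bar.write(0xF7300000)  # pretend firmware assigned this address
print(f"size={probe_size(bar):#x} address={bar.read():#x}")
```

The key step is the last line of probe_size: inverting the read-back address bits and adding 1 yields the region length, which is why BAR sizes are always powers of two.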

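Besides its length, each region has other hardware-defined attributes, most notably the resource type:

  • IORESOURCE_IO: must be accessed with inX and outX
  • IORESOURCE_MEM: must be accessed with ioreadX and iowriteX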

Several Linux kernel PCI functions take the BAR number as a parameter to identify which communication channel to use, for example:

mmio = pci_iomap(pdev, BAR, pci_resource_len(pdev, BAR));
pci_resource_flags(pdev, BAR);
pci_resource_start(pdev, BAR);
pci_resource_end(pdev, BAR);

Looking at the QEMU device source code, we see that the QEMU device registers this region as follows:

memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
                "edu-mmio", 1 << 20);
pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio); /* BAR number 0 */

This makes it clear that a BAR's attributes are defined by the hardware: here, BAR number 0, of memory type (PCI_BASE_ADDRESS_SPACE_MEMORY), with a region 1 MiB long (1 << 20).

 

[root@localhost ~]# lspci -s 05:00.0 -x
05:00.0 Ethernet controller: Huawei Technologies Co., Ltd. Hi1822 Family (2*25GE) (rev 45)
00: e5 19 00 02 06 04 10 00 45 00 00 02 08 00 00 00
10: 0c 00 b0 07 00 08 00 00 0c 00 a2 08 00 08 00 00
20: 0c 00 20 00 00 08 00 00 00 00 00 00 e5 19 39 d1
30: 00 00 40 e6 40 00 00 00 00 00 00 00 ff 00 00 00

 

A BAR records the address at which the device's region starts in memory space.

root@Ubuntu:~$ lspci -s 00:04.0 -x
00:04.0 USB controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 10)
00: 86 80 cd 24 06 00 00 00 10 20 03 0c 10 00 00 00
10: 00 10 02 f3 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 f4 1a 00 11
30: 00 00 00 00 00 00 00 00 00 00 00 00 05 04 00 00

root@Ubuntu:~$ lspci -s 00:04.0 -v
00:04.0 USB controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 10) (prog-if 20 [EHCI])
        Subsystem: Red Hat, Inc QEMU Virtual Machine
        Physical Slot: 4
        Flags: bus master, fast devsel, latency 0, IRQ 35
        Memory at f3021000 (32-bit, non-prefetchable) [size=4K]
        Kernel driver in use: ehci-pci
root@Ubuntu:~$ grep  00:04.0 /proc/iomem
  f3021000-f3021fff : 0000:00:04.0
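0xfff is 4095, so the range f3021000-f3021fff covers 4096 bytes, i.e. 4K. The region starts at 0xf3021000, which is where the CPU sees the USB device. This address is assigned by the BIOS during initialization and, in this example, is stored in BAR0. Why BAR0?

Before answering that, you need to be familiar with the PCI specification, in particular the Type 0 and Type 1 configuration headers: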

 

 

 https://www.thinbug.com/q/30190050



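Note that in both layouts the header type is defined in the third byte of the register at 0x0c (offset 0x0e), and it determines how the BARs are laid out. In this example it is 00, meaning a Type 0 header, so BAR0 (at offset 0x10) holds the bytes 00 10 02 f3.

You may wonder why this does not read f3021000 directly: lspci -x prints the bytes in address order, and PCI configuration space is little-endian, so the least significant byte comes first. (The term "endian" comes from Gulliver's Travels.)

BAR0 is typically in one of three states: uninitialized, written with all 1s (for sizing), or written with an address. Since the device is already up and running, we are in the third state. For this 4K memory BAR, bits 11-4 are hardwired to 0; bit 3 is 0 for non-prefetchable (NP) and 1 for prefetchable (P); bits 2-1 are 00 for 32-bit and 10 for 64-bit; bit 0 is 0 for a memory request and 1 for an I/O request.

0xf3021000
====>>>>
11110011000000100001000000000000
From this we can tell that the device requests 32-bit, non-prefetchable memory. The writable address field is bits 31-12, because 2^12 = 4K.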
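
The byte-order and bit-field rules above can be checked with a short Python sketch that decodes the first BAR from the "10:" row of lspci -x output. The function name decode_first_bar is an illustrative choice; the two sample rows are copied from the EHCI and Hi1822 dumps shown earlier.

```python
def decode_first_bar(row):
    """Decode the BAR at offset 0x10 from the "10:" row of `lspci -x`."""
    offset, _, hex_bytes = row.partition(":")
    if int(offset, 16) != 0x10:
        raise ValueError("expected the row that starts at offset 0x10")
    raw = bytes(int(byte, 16) for byte in hex_bytes.split())
    low = int.from_bytes(raw[0:4], "little")  # config space is little-endian
    if low & 0x1:  # bit 0 set: I/O space BAR
        return {"space": "io", "address": low & ~0x3}
    width = 64 if ((low >> 1) & 0x3) == 0b10 else 32  # bits 2-1: type
    address = low & ~0xF  # bits 3-0 are flags, not address bits
    if width == 64:  # the next slot holds the upper 32 address bits
        address |= int.from_bytes(raw[4:8], "little") << 32
    return {
        "space": "memory",
        "width": width,
        "prefetchable": bool(low & 0x8),  # bit 3
        "address": address,
    }


EHCI_ROW = "10: 00 10 02 f3 00 00 00 00 00 00 00 00 00 00 00 00"
HI1822_ROW = "10: 0c 00 b0 07 00 08 00 00 0c 00 a2 08 00 08 00 00"

for row in (EHCI_ROW, HI1822_ROW):
    bar = decode_first_bar(row)
    print(f"{bar['address']:#x} {bar['width']}-bit prefetchable={bar['prefetchable']}")
```

For the Hi1822 row the low flag nibble is 0xc (64-bit, prefetchable), and combining the two dwords gives 0x80007b00000, the same address lspci reports for Region 0.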

To look up more device and vendor IDs, use https://pcilookup.com/

 

 

[root@localhost ~]# lspci -s 05:00.0 -v
05:00.0 Ethernet controller: Huawei Technologies Co., Ltd. Hi1822 Family (2*25GE) (rev 45)
        Subsystem: Huawei Technologies Co., Ltd. Device d139
        Flags: bus master, fast devsel, latency 0, NUMA node 0
        Memory at 80007b00000 (64-bit, prefetchable) [size=128K]
        Memory at 80008a20000 (64-bit, prefetchable) [size=32K]
        Memory at 80000200000 (64-bit, prefetchable) [size=1M]
        Expansion ROM at e9200000 [disabled] [size=1M]
        Capabilities: [40] Express Endpoint, MSI 00
        Capabilities: [80] MSI: Enable- Count=1/32 Maskable+ 64bit+
        Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-
        Capabilities: [b0] Power Management version 3
        Capabilities: [c0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [200] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [310] #19
        Capabilities: [4e0] Device Serial Number 44-a1-91-ff-ff-a4-9b-eb
        Capabilities: [4f0] Transaction Processing Hints
        Capabilities: [600] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
        Capabilities: [630] Access Control Services
        Kernel driver in use: vfio-pci
        Kernel modules: hinic

[root@localhost ~]# grep  05:00.0 /proc/iomem
        e9200000-e92fffff : 0000:05:00.0
        80000200000-800002fffff : 0000:05:00.0
        80000300000-80007afffff : 0000:05:00.0
        80007b00000-80007b1ffff : 0000:05:00.0
        80007b20000-8000829ffff : 0000:05:00.0
        800082a0000-80008a1ffff : 0000:05:00.0
        80008a20000-80008a27fff : 0000:05:00.0
[root@localhost ~]#
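
To relate the /proc/iomem ranges above to the BAR sizes, the following Python sketch computes the length of each range owned by a device. The function name device_ranges is illustrative, and IOMEM_SAMPLE is simply the grep output above pasted into a string.

```python
IOMEM_SAMPLE = """\
        e9200000-e92fffff : 0000:05:00.0
        80000200000-800002fffff : 0000:05:00.0
        80000300000-80007afffff : 0000:05:00.0
        80007b00000-80007b1ffff : 0000:05:00.0
        80007b20000-8000829ffff : 0000:05:00.0
        800082a0000-80008a1ffff : 0000:05:00.0
        80008a20000-80008a27fff : 0000:05:00.0
"""


def device_ranges(text, bdf):
    """Return (start, size) for every /proc/iomem line owned by `bdf`."""
    ranges = []
    for line in text.splitlines():
        span, sep, owner = line.strip().partition(" : ")
        if not sep or not owner.endswith(bdf):
            continue  # different owner, or not a "range : owner" line
        start, end = (int(part, 16) for part in span.split("-"))
        ranges.append((start, end - start + 1))  # the end address is inclusive
    return ranges


for start, size in device_ranges(IOMEM_SAMPLE, "05:00.0"):
    print(f"{start:#014x} {size // 1024:>7}K")
```

The 128K, 32K, and 1M ranges are the three BARs and the 1M range at e9200000 is the expansion ROM. The three larger ranges start at the addresses listed under the SR-IOV capability in the lspci -vv output, and each is exactly 120 (the Total VFs value) times a power-of-two size: 120 x 64K twice and 120 x 1M once.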

 

 

 

PCI Access Without a Driver

At work recently, I had a new PCI device that I needed to experiment with. I was dreading writing a Linux kernel driver to talk to it. It turns out, Linux makes it possible to read and write to a PCI device's memory space without a driver! Woohoo!

Linux provides a sysfs interface to PCI devices. From that interface, the memory space can be mmaped and then read and written. No driver involved.

As a quick example, we can use lspci to get information about a particular device.

$ vendor="10ee" # Use your vendor ID
$ device="7014" # Use your device ID
$ lspci -d $vendor:$device -nvv
04:00.0 1180: 10ee:7014
    ...
    Region 0: Memory at f7300000 (32-bit, non-prefetchable) [size=128K]

Then we can look at the sysfs interface, at /sys/bus/pci/devices/. The first bit of data in the output of lspci gives the location of the device on the bus, that we can use when traversing the sysfs interface.

$ ls -alF /sys/bus/pci/devices/0000\:04\:00.0/
total 0
drwxr-xr-x 3 root root      0 Jul  1 12:42 ./
drwxr-xr-x 8 root root      0 Jul  1 12:42 ../
-rw-r--r-- 1 root root   4096 Jul  9 12:48 broken_parity_status
-r--r--r-- 1 root root   4096 Jul  1 12:42 class
-rw-r--r-- 1 root root   4096 Jul  9 12:44 config
-r--r--r-- 1 root root   4096 Jul  1 12:42 device
...
-r--r--r-- 1 root root   4096 Jul  1 12:43 resource
-rw------- 1 root root 131072 Jul  1 12:43 resource0
...
-r--r--r-- 1 root root   4096 Jul  1 12:42 vendor

This interface has some useful files like vendor and device that confirm that we have the right device. These are also useful for programmatically finding the correct device, rather than using lspci.

$ cat /sys/bus/pci/devices/0000\:04\:00.0/vendor
0x10ee
$ cat /sys/bus/pci/devices/0000\:04\:00.0/device
0x7014
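
One way to do that lookup programmatically is sketched below in Python: it walks the sysfs device directories and compares the vendor and device files. The function name find_pci_devices and its root parameter are assumptions made for this example (the parameter exists so the function can be pointed at a test directory).

```python
import os


def read_hex(path):
    """Read a sysfs attribute such as "0x10ee" and return it as an int."""
    with open(path) as attr:
        return int(attr.read().strip(), 16)


def find_pci_devices(vendor, device, root="/sys/bus/pci/devices"):
    """Return the bus addresses of all devices matching vendor:device."""
    if not os.path.isdir(root):
        return []  # no sysfs PCI tree on this system
    matches = []
    for bdf in sorted(os.listdir(root)):
        try:
            found = (read_hex(os.path.join(root, bdf, "vendor")),
                     read_hex(os.path.join(root, bdf, "device")))
        except (OSError, ValueError):
            continue  # entry vanished or attribute unreadable
        if found == (vendor, device):
            matches.append(bdf)
    return matches


# IDs from the lspci example above; prints [] if no such device is present.
print(find_pci_devices(0x10EE, 0x7014))
```

Each returned string (for example 0000:04:00.0) is the directory name to use when opening the resourceN files.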

Looking back at the lspci output, we can also find memory resources and addresses. These are represented as resource0...resourceN in the sysfs interface. That's what we use to get access to the PCI memory space.

Open the resource0 file (which can be some number other than 0 depending on the device).

int fd = open("/sys/bus/pci/devices/0000:04:00.0/resource0", O_RDWR | O_SYNC);

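Then mmap the file, using the region size from the lspci output. The kernel maps the BAR for us, so the physical address is not needed as an address hint.

size_t size = 128 * 1024; // 128K
void* void_memory = mmap(NULL, // let the kernel choose the virtual address
                         size,
                         PROT_READ | PROT_WRITE,
                         MAP_SHARED,
                         fd,
                         0);
if (void_memory == MAP_FAILED) { perror("mmap"); exit(1); }
// volatile: every access must actually reach the device
volatile uint16_t* memory = (volatile uint16_t*)void_memory;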

Now memory provides direct access to read and write the PCI memory space. We can hack away!

// Read the value of the first register
uint16_t first_register = memory[0];

// Write a value to the third register
memory[2] = 0x0007;

Now, this isn't the perfect scenario. For one, we need to be root to access this memory space. For two, there's no sign of interrupt handling anywhere.

But for basic poking around on a new device, it works pretty slick. No kernel module development required.

 

 

 

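The lspci command shows a lot of information about PCI-E devices, but what does each piece of that information mean?

The test environment is CentOS 7 x86_64.

My system has an Intel Corporation 82575EB Gigabit network card.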

[root@localhost ~]$ lspci
01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)                   

A PCI device is identified by three numbers: 1. the bus number, 2. the device number, and 3. the function number.
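
As a small illustration of this numbering, the Python sketch below splits an address such as 01:00.0 into its parts and packs them into the 16-bit ID formed by 8 bits of bus, 5 bits of device, and 3 bits of function. The function names parse_bdf and routing_id are illustrative, and the optional leading domain field follows the 0000:05:00.0 form used in sysfs.

```python
def parse_bdf(text):
    """Split "[domain:]bus:device.function" into four integers."""
    head, dot, function = text.rpartition(".")
    parts = head.split(":")
    if len(parts) == 2:
        parts = ["0"] + parts  # lspci omits domain 0000 by default
    if not dot or len(parts) != 3:
        raise ValueError(f"not a PCI address: {text!r}")
    domain, bus, device = (int(part, 16) for part in parts)
    function = int(function, 16)
    if bus > 0xFF or device > 0x1F or function > 0x7:
        raise ValueError(f"field out of range in {text!r}")
    return domain, bus, device, function


def routing_id(bus, device, function):
    """Pack bus (8 bits), device (5 bits), function (3 bits) into 16 bits."""
    return (bus << 8) | (device << 3) | function


for address in ("01:00.0", "0000:7d:00.3"):
    domain, bus, device, function = parse_bdf(address)
    print(f"{address}: bus={bus} device={device} function={function} "
          f"id={routing_id(bus, device, function):#06x}")
```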

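With the options -s (show only the specified device) and -vvv (verbose; -v and -vv print less detail), you can see the full configuration and status of a PCI-E device. The sections below explain what each part means.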

[root@localhost ~]$ lspci -s 01:00.0 -vvv
    • Basic card information
      A + or - after a flag indicates whether it is enabled (or has occurred) or not.

       

      01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)                   
      Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
      Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
      Latency: 0, Cache Line Size: 64 bytes
      Interrupt: pin A routed to IRQ 16
      Region 0: Memory at fbba0000 (32-bit, non-prefetchable) [size=128K]
      Region 1: Memory at fbb80000 (32-bit, non-prefetchable) [size=128K]
      Region 2: I/O ports at e020 [size=32]
      Region 3: Memory at fbbc4000 (32-bit, non-prefetchable) [size=16K]
      Expansion ROM at fbb60000 [disabled] [size=128K]
    • Power Management
      Power management for the PCI-E device.

       

      Capabilities: [40] Power Management version 2
              Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
              Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-

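      The main thing to look at is PME(D0+,D1-,D2-,D3hot+,D3cold+), which lists the power states from which the device can signal a power management event (PME): + means supported, - means not supported.
      D0 has two sub-states: Uninitialized and Active.
      D1 is the Light Sleep state.
      D2 is the Deep Sleep state.
      D3 is the Full Off state, which is further divided into D3cold and D3hot.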

    • Message Signaled Interrupts
      There are two kinds: MSI and MSI-X.

       

      Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
              Address: 0000000000000000  Data: 0000
      Capabilities: [60] MSI-X: Enable+ Count=10 Masked-
              Vector table: BAR=3 offset=00000000
              PBA: BAR=3 offset=00002000

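      MSI is the interrupt mechanism used by PCI Express. It is in-band (the interrupt travels over the same path as data, as a memory write), replacing the older out-of-band interrupt pins.
      MSI (introduced in PCI 2.2) supports 1, 2, 4, 8, 16, or 32 interrupt vectors per function.
      MSI-X (introduced in PCI 3.0) supports up to 2048 interrupt vectors per function.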

    • Express Endpoint
      Capabilities: [a0] Express (v2) Endpoint, MSI 00
              DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                      ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
              DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                      RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                      MaxPayload 256 bytes, MaxReadReq 128 bytes
              DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
              LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
                      ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
              LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                      ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
              LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
              DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
              DevCtl2: Completion Timeout: 16ms to 55ms, TimeoutDis-, LTR-, OBFF Disabled
              LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                       Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                       Compliance De-emphasis: -6dB
              LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                       EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-

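      DevCap: Device Capabilities
      MaxPayload is the largest payload a PCIe packet may carry, similar to the MTU in networking.
      DevCtl: Device Control
      Report errors shows which errors are reported: + means reported, - means not reported. For details, see Advanced Error Reporting below.
      MaxReadReq (PCIe Max Read Request) is the upper limit on the size of a PCIe read request.
      DevSta: Device Status
      The current device status, i.e. whether any errors have occurred.

      LnkCap: Link Capabilities
      The fastest link the device supports: PCI-Express 1.0 (2.5GT/s) at Width x4 = 10GT/s.
      LnkCtl: Link Control
      ASPM is PCI Express Active State Power Management; Disabled means link power savings are not in use.
      LnkSta: Link Status
      The link the device is currently running at: PCI-Express 1.0 (2.5GT/s) at Width x4 = 10GT/s, the same as LnkCap in the output above.

      DevCap2: additional Device Capabilities information.
      DevCtl2: additional Device Control information.
      LnkCtl2: additional Link Control information.
      LnkSta2: additional Link Status information.
      For the remaining registers, consult the PCI-E Specification from PCI-SIG: https://pcisig.com/specifications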
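
The speed-times-width arithmetic for LnkCap and LnkSta can be sketched in Python as follows. The raw figure is in GT/s; the usable bit rate is lower because of line encoding (8b/10b at 2.5 and 5 GT/s, 128b/130b at 8 and 16 GT/s). The function names are illustrative, and the result ignores packet and protocol overhead, so treat it as an upper bound rather than a measured throughput.

```python
import re

# Line-encoding efficiency per lane speed (GT/s).
ENCODING_EFFICIENCY = {
    2.5: 8 / 10,      # PCIe 1.x, 8b/10b
    5.0: 8 / 10,      # PCIe 2.x, 8b/10b
    8.0: 128 / 130,   # PCIe 3.x, 128b/130b
    16.0: 128 / 130,  # PCIe 4.x, 128b/130b
}


def parse_link(line):
    """Extract (speed in GT/s, lane count) from a LnkCap or LnkSta line."""
    match = re.search(r"Speed ([0-9.]+)GT/s, Width x(\d+)", line)
    if match is None:
        raise ValueError("no 'Speed ...GT/s, Width x...' field found")
    return float(match.group(1)), int(match.group(2))


def usable_gbps(speed_gt, width):
    """Raw rate times lane count, scaled by the encoding efficiency."""
    return speed_gt * width * ENCODING_EFFICIENCY[speed_gt]


for line in ("LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+",
             "LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk-"):
    speed, width = parse_link(line)
    print(f"{speed}GT/s x{width}: raw {speed * width:g} GT/s, "
          f"about {usable_gbps(speed, width):.1f} Gb/s after encoding")
```

The two sample lines are the LnkSta values of the 82575EB and of the Hi1822 shown earlier. Comparing the LnkSta result with the LnkCap result is a quick way to spot a card that has trained at a lower speed or narrower width than it supports.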

    • Advanced Error Reporting
      PCI Express error signaling can occur on the PCI Express link itself or on behalf of transactions initiated on the link.
      PCI Express defines two error reporting mechanisms: 1. baseline and 2. Advanced Error Reporting (AER).

       

      Capabilities: [100 v1] Advanced Error Reporting
              UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
              UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
              UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
              CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
              CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
              AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-

      UESta: Uncorrectable Error Status
      UEMsk: Uncorrectable Error Mask
      UESvrt: Uncorrectable Error Severity
      CESta: Correctable Error Status
      CEMsk: Correctable Error Mask
      AERCap: AER Capabilities and Control

    • Other information
      Capabilities: [140 v1] Device Serial Number 90-fb-a6-ff-ff-76-38-00
      Kernel driver in use: igb
      Kernel modules: igb

 

posted on 2020-08-24 09:39  tycoon3