DPDK Introduction and Architecture

Processing Packets in Linux: Main Stages

When a network card first receives a packet, it places it in a receive (RX) queue. From there, the packet is copied to main memory via the DMA (Direct Memory Access) mechanism.

Afterwards, the system needs to be notified of the new packet, and the data needs to be passed into a specially allocated buffer (Linux allocates one of these buffers for every packet). To do this, Linux uses an interrupt mechanism: interrupts are generated repeatedly as new packets enter the system. The packet then needs to be transferred to user space.

One bottleneck is already apparent: as more packets have to be processed, more resources are consumed, which negatively affects the overall system performance.

 

DPDK: How It Works

General Features

Let’s look at the following illustration:

[Illustration: packet processing in standard Linux networking (left) vs. with DPDK (right)]

On the left you see the traditional way packets are processed, and on the right, packet processing with DPDK. As we can see, in the second case the kernel doesn't step in at all: interactions with the network card are performed via special drivers and libraries.

If you’ve already read about DPDK or have ever used it, then you know that the ports receiving incoming traffic on network cards need to be unbound from the Linux kernel driver. This is done with the dpdk-devbind.py script (called dpdk_nic_bind.py in earlier DPDK versions).

How are ports then managed by DPDK? Every driver in Linux has bind and unbind files. That includes network card drivers:

ls /sys/bus/pci/drivers/ixgbe
bind  module  new_id  remove_id  uevent  unbind

To unbind a device from a driver, the device’s PCI address needs to be written to the driver’s unbind file. Similarly, to bind a device to another driver, its PCI address needs to be written to that driver’s bind file. More detailed information about this can be found in the kernel documentation.
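
As a rough illustration of that mechanism, here is a minimal C sketch that rebinds a hypothetical NIC at PCI address 0000:01:00.0 from ixgbe to vfio-pci by writing to these sysfs files (roughly what dpdk-devbind.py automates). The PCI address and driver names are assumptions, the vfio-pci module must already be loaded, the driver_override file requires a reasonably recent kernel, and root privileges are needed:

/* Sketch: rebind a hypothetical device 0000:01:00.0 from ixgbe to vfio-pci. */
#include <stdio.h>
#include <stdlib.h>

/* Write a string to a sysfs file, aborting on error. */
static void sysfs_write(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); exit(1); }
    if (fputs(value, f) == EOF) { perror(path); exit(1); }
    if (fclose(f) != 0) { perror(path); exit(1); }
}

int main(void)
{
    const char *pci_addr = "0000:01:00.0";   /* hypothetical device address */

    /* 1. Detach the device from its current kernel driver. */
    sysfs_write("/sys/bus/pci/drivers/ixgbe/unbind", pci_addr);

    /* 2. Tell the PCI core which driver should claim this device next
     *    (path contains the same address as pci_addr). */
    sysfs_write("/sys/bus/pci/devices/0000:01:00.0/driver_override", "vfio-pci");

    /* 3. Bind the device to vfio-pci (the module must already be loaded). */
    sysfs_write("/sys/bus/pci/drivers/vfio-pci/bind", pci_addr);
    return 0;
}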

The DPDK installation instructions tell us that our ports need to be managed by the vfio_pci, igb_uio, or uio_pci_generic driver. (We won’t be getting into details here; interested readers can find more in the corresponding documentation on kernel.org.)

These drivers make it possible to interact with devices from user space. They do include a kernel module, but it is only used to initialize devices and expose the PCI interface.

All further communication between the application and the network card is organized by the DPDK poll mode driver (PMD). DPDK has poll mode drivers for all supported network cards and virtual devices.
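
To make this concrete, below is a minimal sketch of how an application hands a port over to its PMD. It assumes rte_eal_init() has already been called, ignores most error handling, and takes an mbuf mempool as an argument (creating such a pool is sketched in the hugepages discussion below); the ring size is illustrative:

/* Sketch: configure one RX queue on a port and start its poll mode driver. */
#include <rte_ethdev.h>
#include <rte_mempool.h>

#define RX_RING_SIZE 1024

int setup_port(uint16_t port_id, struct rte_mempool *mbuf_pool)
{
    struct rte_eth_conf port_conf = {0};   /* default device configuration */

    /* Hand the port over to its poll mode driver: 1 RX queue, 0 TX queues. */
    if (rte_eth_dev_configure(port_id, 1, 0, &port_conf) != 0)
        return -1;

    /* Attach an RX descriptor ring to queue 0, fed from our mempool. */
    if (rte_eth_rx_queue_setup(port_id, 0, RX_RING_SIZE,
                               rte_eth_dev_socket_id(port_id),
                               NULL, mbuf_pool) != 0)
        return -1;

    /* Start the device: from now on packets land in the RX ring via DMA. */
    return rte_eth_dev_start(port_id);
}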

DPDK also requires hugepages to be configured. These are needed for allocating large chunks of memory and writing data to them. We can say that hugepages do the same job in DPDK that DMA does in traditional packet processing.
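
As a rough sketch of how this hugepage-backed memory is typically used, a DPDK application creates a packet-buffer (mbuf) mempool out of it with rte_pktmbuf_pool_create(). The pool and cache sizes below are illustrative, and hugepages are assumed to have been reserved before rte_eal_init() was called:

/* Sketch: create an mbuf pool in hugepage-backed memory. */
#include <stdlib.h>
#include <rte_debug.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

#define NUM_MBUFS  8191   /* number of mbufs in the pool */
#define MBUF_CACHE 250    /* per-lcore cache size */

struct rte_mempool *create_pool(void)
{
    struct rte_mempool *pool = rte_pktmbuf_pool_create(
        "MBUF_POOL", NUM_MBUFS, MBUF_CACHE,
        0,                           /* private data size */
        RTE_MBUF_DEFAULT_BUF_SIZE,   /* data room per mbuf */
        rte_socket_id());

    if (pool == NULL)
        rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
    return pool;
}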

We’ll discuss the nuances in more detail later, but for now, let’s go over the main stages of packet processing with DPDK (a minimal code sketch follows the list):

  1. Incoming packets go to a ring buffer (we’ll look at its setup in the next section). The application periodically checks this buffer for new packets.
  2. If the buffer contains new packet descriptors, the application will refer to the DPDK packet buffers in the specially allocated memory pool using the pointers in the packet descriptors.
  3. If the ring buffer does not contain any packets, the application polls the network devices managed by DPDK and then checks the ring again.
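
The sketch below shows what this polling loop looks like in application code, assuming the port and mempool have been set up as in the earlier sketches. The burst size and the byte counter are illustrative; a real application would do its own processing before freeing each mbuf:

/* Sketch: poll RX queue 0 of a port in a tight loop. */
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

void rx_loop(uint16_t port_id)
{
    struct rte_mbuf *bufs[BURST_SIZE];
    uint64_t total_bytes = 0;

    for (;;) {
        /* Poll the RX ring: returns up to BURST_SIZE packets, each an mbuf
         * living in the hugepage-backed mempool. */
        const uint16_t nb_rx = rte_eth_rx_burst(port_id, 0, bufs, BURST_SIZE);

        if (nb_rx == 0)
            continue;                  /* ring empty: poll again */

        for (uint16_t i = 0; i < nb_rx; i++) {
            total_bytes += rte_pktmbuf_pkt_len(bufs[i]); /* "process" packet */
            rte_pktmbuf_free(bufs[i]);                   /* return mbuf to pool */
        }
    }
}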

Let’s take a closer look at the DPDK’s internal structure.

 

 

https://blog.selectel.com/introduction-dpdk-architecture-principles/

 

Overview

A high-level overview of the path a packet takes through the Linux network stack:

  1. Driver is loaded and initialized.
  2. Packet arrives at the NIC from the network.
  3. Packet is copied (via DMA) to a ring buffer in kernel memory.
  4. Hardware interrupt is generated to let the system know a packet is in memory.
  5. Driver calls into NAPI to start a poll loop if one was not running already.
  6. ksoftirqd processes run on each CPU on the system. They are registered at boot time. The ksoftirqd processes pull packets off the ring buffer by calling the NAPI poll function that the device driver registered during initialization.
  7. Memory regions in the ring buffer that have had network data written to them are unmapped.
  8. Data that was DMA’d into memory is passed up the networking layer as an ‘skb’ for more processing.
  9. Packet steering happens to distribute packet processing load to multiple CPUs (in lieu of a NIC with multiple receive queues), if enabled.
  10. Packets are handed to the protocol layers from the queues.
  11. Protocol layers add them to receive buffers attached to sockets.

https://www.privateinternetaccess.com/blog/2016/01/linux-networking-stack-from-the-ground-up-part-1/
