Efficient data transfer through zero copy

I. Traditional data transfer

1. User mode and kernel mode

2. Context switches

The steps involved are:

1. The read() call causes a context switch (see Figure 2) from user mode to kernel mode.

  Internally a sys_read() (or equivalent) is issued to read the data from the file. The first copy (see Figure 1) is performed by the direct memory access (DMA) engine, which reads file contents from the disk and stores them into a kernel address space buffer.

2. The requested amount of data is copied from the read buffer into the user buffer, and the read() call returns.

  The return from the call causes another context switch from kernel back to user mode. Now the data is stored in the user address space buffer.

3. The send() socket call causes a context switch from user mode to kernel mode.

  A third copy is performed to put the data into a kernel address space buffer again. This time, though, the data is put into a different buffer, one that is associated with the destination socket.

4. The send() system call returns, creating the fourth context switch.

  Independently and asynchronously, a fourth copy happens as the DMA engine passes the data from the kernel buffer to the protocol engine.
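
The four steps above correspond to the familiar read-then-write loop. A minimal Java sketch follows; the class name TraditionalCopy and the 8 KB buffer size are illustrative assumptions, and `out` stands in for something like `socket.getOutputStream()`:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper illustrating the traditional read-then-send loop. */
class TraditionalCopy {

    /**
     * Copies a file to an output stream through a user-space buffer.
     * Returns the number of bytes written.
     */
    static long copy(Path file, OutputStream out) throws IOException {
        byte[] userBuffer = new byte[8192]; // buffer in the user address space (size is an assumption)
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // read(): data moves from the kernel read buffer into userBuffer
            while ((n = in.read(userBuffer)) != -1) {
                // write(): data moves from userBuffer back into a kernel buffer
                out.write(userBuffer, 0, n);
                total += n;
            }
        }
        out.flush();
        return total;
    }
}
```

Every pass through the loop crosses the user/kernel boundary twice, once for the read and once for the write, which is exactly the overhead the following section tries to remove.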

 

II. Zero copy

Notice that the second and third data copies are not actually required.

  The application does nothing other than cache the data and transfer it back to the socket buffer. Instead, the data could be transferred directly from the read buffer to the socket buffer. The transferTo() method of java.nio.channels.FileChannel lets you do exactly this.


The steps taken when you use transferTo() are:

1. The transferTo() method causes the file contents to be copied into a read buffer by the DMA engine. Then the data is copied by the kernel into the kernel buffer associated with the output socket.

2. The third copy happens as the DMA engine passes the data from the kernel socket buffers to the protocol engine.
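
From the application's side, the steps above collapse into a single call on the file's channel. The sketch below is one way to write it; the class name ZeroCopySend and the method name send are illustrative assumptions. In a real server the target would usually be a SocketChannel, and whether the kernel actually avoids the intermediate copies depends on the operating system:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper that sends a whole file with FileChannel.transferTo(). */
class ZeroCopySend {

    /** Transfers the file to the target channel and returns the bytes sent. */
    static long send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            while (position < size) {
                // transferTo() may move fewer bytes than requested, so keep going
                long sent = source.transferTo(position, size - position, target);
                if (sent <= 0) {
                    break; // guard against spinning, e.g. if the file shrank
                }
                position += sent;
            }
            return position;
        }
    }
}
```

No user-space buffer appears anywhere in this code: the data never has to enter the application's address space.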

 

III. Further improvement: eliminating the second and third data copies

This is an improvement: we've reduced the number of context switches from four to two and reduced the number of data copies from four to three (only one of which involves the CPU). But this does not yet get us to our goal of zero copy. We can further reduce the data duplication done by the kernel if the underlying network interface card supports gather operations.

In Linux kernels 2.4 and later, the socket buffer descriptor was modified to accommodate this requirement. This approach not only reduces multiple context switches but also eliminates the duplicated data copies that require CPU involvement. The user-side usage still remains the same, but the internals have changed:


1. The transferTo() method causes the file contents to be copied into a kernel buffer by the DMA engine.

2. No data is copied into the socket buffer.

Instead, only descriptors with information about the location and length of the data are appended to the socket buffer. The DMA engine passes data directly from the kernel buffer to the protocol engine, thus eliminating the last remaining CPU copy.

 


 

 

 

IV. Sources

1. IBM developerWorks

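2. DMA (Direct Memory Access)

  Direct memory access is a data-exchange mode in which data is read from or written to memory without passing through the CPU. In DMA mode, the CPU only needs to issue a command to the DMA controller, which handles the transfer and reports back to the CPU once it completes. This greatly reduces CPU utilization and saves system resources.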

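3. Scatter-gather

   Scatter-gather DMA allows data to be transferred to or from multiple memory areas in a single DMA transaction, which is equivalent to chaining several simple DMA requests together. Again, the motivation is to relieve the CPU of repeated I/O interrupts and data-copying work.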
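
Scatter-gather DMA itself happens in hardware and the kernel, so it cannot be shown directly from Java. As a loose API-level analogy only, Java's GatheringByteChannel lets one write call drain several separate buffers in order, without first concatenating them into one. The sketch below assumes a hypothetical class GatherWriteDemo and writes to a file channel for simplicity:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo of a gathering write: several buffers, one write call. */
class GatherWriteDemo {

    /** Writes header and body from two separate buffers; returns bytes written. */
    static long writeGathered(Path file, String header, String body) throws IOException {
        ByteBuffer[] parts = {
            ByteBuffer.wrap(header.getBytes(StandardCharsets.UTF_8)),
            ByteBuffer.wrap(body.getBytes(StandardCharsets.UTF_8))
        };
        long total = 0;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // write(ByteBuffer[]) gathers from each buffer in sequence;
            // loop in case a single call does not drain all of them
            while (hasRemaining(parts)) {
                total += ch.write(parts);
            }
        }
        return total;
    }

    private static boolean hasRemaining(ByteBuffer[] buffers) {
        for (ByteBuffer b : buffers) {
            if (b.hasRemaining()) {
                return true;
            }
        }
        return false;
    }
}
```

The idea is the same as in the hardware case: describe where the pieces live and how long they are, and let a lower layer collect them, instead of copying them into one contiguous buffer first.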

posted @ 2014-09-23 23:12  等风来。。