(4) Switching Performance – Connecting Linux Network Namespaces
http://www.opencloudblog.com/?p=96
In a previous article I showed several solutions to interconnect Linux network namespaces (http://www.opencloudblog.com). Four different solutions can be used, but which one is the best with respect to performance and resource usage? This is quite interesting when you are running Openstack Networking Neutron together with Openvswitch. The systems used to run these tests have Ubuntu 13.04 installed with kernel 3.8.0.30 and Openvswitch 1.9. The MTU of the IP interfaces is kept at the default value of 1500.
A test script has been used to collect the performance numbers. The test script creates two namespaces, connects them using different Linux network technologies, and measures throughput and CPU usage with iperf. The iperf server process is started in one namespace, the iperf client process in the other.
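The original test script is not reproduced here; a minimal sketch of the veth-pair case, assuming iperf (version 2) is installed and the commands are run as root, looks roughly like this:

```
#!/bin/bash
# Minimal sketch of the veth-pair test case (assumed setup, not the original script).
# Creates two namespaces, links them with one veth pair and runs iperf between them.

ip netns add ns1
ip netns add ns2

# one veth pair, one end in each namespace
ip link add veth-ns1 type veth peer name veth-ns2
ip link set veth-ns1 netns ns1
ip link set veth-ns2 netns ns2

ip netns exec ns1 ip addr add 10.0.0.1/24 dev veth-ns1
ip netns exec ns2 ip addr add 10.0.0.2/24 dev veth-ns2
ip netns exec ns1 ip link set dev veth-ns1 up
ip netns exec ns2 ip link set dev veth-ns2 up

# iperf server in ns1, client with a configurable number of threads in ns2
ip netns exec ns1 iperf -s &
sleep 1
ip netns exec ns2 iperf -c 10.0.0.1 -t 30 -P "${THREADS:-1}"
```

The thread counts in the tables below correspond to the -P option of the iperf client.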
IMPORTANT REMARK: The results are confusing at first sight. A deeper analysis shows that the configuration of the virtual network devices has a major impact on performance. The TSO and GSO settings play a very important role when using virtual network devices. I'll show an analysis in an upcoming article.
Single CPU (i5-2500 [3.3 GHz]) Kernel 3.13
The test system has a desktop Intel i5-2500 CPU @ 3.30 GHz providing four CPU cores, and 32 GByte of DDR3-10600 RAM providing around 160 GBit/s of memory throughput.
The test system is running Ubuntu 14.04 with kernel 3.13.0.24 and Openvswitch 2.0.1.
The results are shown in the table below. iperf has been run with 1, 2, 4, 8 and 16 threads. In the end, the limiting factor is CPU usage. The "efficiency" column is defined as network throughput in GBit/s per Gigahertz available on the CPUs.
Switch and connection type | no. of iperf threads | throughput [GBit/s] (tso gso lro gro on) | efficiency [GBit/s per CPU GHz] | throughput [GBit/s] (tso gso lro gro off)
---|---|---|---|---
one veth pair | 1 | 37.8 | 6.3 | 3.7 |
one veth pair | 2 | 65.0 | 5.4 | 7.9 |
one veth pair | 4 | 54.6 | 4.4 | 11.0 |
one veth pair | 8 | 40.7 | 3.2 | 11.4 |
one veth pair | 16 | 37.4 | 2.9 | 11.7 |
linuxbridge with two veth pairs | 1 | 33.3 | 5.5 | 2.7 |
linuxbridge with two veth pairs | 2 | 54.3 | 4.4 | 5.6 |
linuxbridge with two veth pairs | 4 | 43.9 | 3.4 | 6.9 |
linuxbridge with two veth pairs | 8 | 32.1 | 2.5 | 7.9 |
linuxbridge with two veth pairs | 16 | 34.0 | 2.6 | 7.9 |
openvswitch with two veth pairs | 1 | 35.0 | 5.9 | 3.2 |
openvswitch with two veth pairs | 2 | 51.5 | 4.2 | 6.7 |
openvswitch with two veth pairs | 4 | 47.3 | 3.8 | 8.7 |
openvswitch with two veth pairs | 8 | 36.0 | 2.8 | 7.5 |
openvswitch with two veth pairs | 16 | 36.5 | 2.8 | 9.4 |
openvswitch with two internal ovs ports | 1 | 37.0 | 6.2 | 3.3 |
openvswitch with two internal ovs ports | 2 | 65.6 | 5.5 | 6.4 |
openvswitch with two internal ovs ports | 4 | 74.3 | 5.7 | 6.3 |
openvswitch with two internal ovs ports | 8 | 74.3 | 5.7 | 10.9 |
openvswitch with two internal ovs ports | 16 | 73.4 | 5.6 | 12.6 |
The numbers show a drastic effect when TSO, GSO, LRO and GRO are switched off. TSO performs segmentation at the NIC level. In this software-only environment no segmentation is done by a NIC at all; the large packets (40 kBytes on average) are sent to the other side as one packet. The guiding factor is the packet rate.
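The offload settings can be toggled per device with ethtool; a hedged example for one veth endpoint (the device name veth-ns1 from the sketch above is an assumption) is:

```
# switch segmentation and receive offloads off on one veth endpoint
ip netns exec ns1 ethtool -K veth-ns1 tso off gso off gro off
# lro is usually fixed to off on veth devices and cannot be changed

# show the resulting offload settings
ip netns exec ns1 ethtool -k veth-ns1
```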
Single CPU (i5-2500 [3.3 GHz]) Kernel 3.8
The test system is running Ubuntu 13.04 with kernel 3.8.0.30 and Openvswitch 1.9.
The results are shown in the table below. iperf has been run with 1, 2, 4, 8 and 16 threads. In the end, the limiting factor is CPU usage. The "efficiency" column is defined as network throughput in GBit/s per Gigahertz available on the CPUs.
Switch and connection type | no. of iperf threads | throughput [GBit/s] | efficiency [GBit/s per CPU GHz]
---|---|---|---
one veth pair | 1 | 7.4 | 1.21 |
one veth pair | 2 | 13.5 | 1.15 |
one veth pair | 4 | 14.2 | 1.14 |
one veth pair | 8 | 15.3 | 1.17 |
one veth pair | 16 | 14.0 | 1.06 |
linuxbridge with two veth pairs | 1 | 3.9 | 0.62 |
linuxbridge with two veth pairs | 2 | 8.5 | 0.70 |
linuxbridge with two veth pairs | 4 | 8.8 | 0.69 |
linuxbridge with two veth pairs | 8 | 9.5 | 0.72 |
linuxbridge with two veth pairs | 16 | 9.1 | 0.69 |
openvswitch with two veth pairs | 1 | 4.5 | 0.80 |
openvswitch with two veth pairs | 2 | 9.7 | 0.82 |
openvswitch with two veth pairs | 4 | 10.7 | 0.85 |
openvswitch with two veth pairs | 8 | 11.3 | 0.86 |
openvswitch with two veth pairs | 16 | 10.7 | 0.81 |
openvswitch with two internal ovs ports | 1 | 41.9 | 6.91 |
openvswitch with two internal ovs ports | 2 | 69.1 | 5.63 |
openvswitch with two internal ovs ports | 4 | 75.5 | 5.74 |
openvswitch with two internal ovs ports | 8 | 67.0 | 5.08 |
openvswitch with two internal ovs ports | 16 | 74.3 | 5.63 |
The results show huge differences. The openvswitch using two internal openvswitch ports has the best throughput and the best efficiency.
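For reference, a minimal sketch of this variant, connecting two namespaces through Openvswitch internal ports (bridge and port names are my own assumptions), is:

```
# create an OVS bridge and two internal ports, then move the ports into the namespaces
ovs-vsctl add-br ovsbr0
ovs-vsctl add-port ovsbr0 ns1-port -- set interface ns1-port type=internal
ovs-vsctl add-port ovsbr0 ns2-port -- set interface ns2-port type=internal

ip netns add ns1
ip netns add ns2
ip link set ns1-port netns ns1
ip link set ns2-port netns ns2

ip netns exec ns1 ip addr add 10.0.1.1/24 dev ns1-port
ip netns exec ns2 ip addr add 10.0.1.2/24 dev ns2-port
ip netns exec ns1 ip link set dev ns1-port up
ip netns exec ns2 ip link set dev ns2-port up
```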
The short summary is:
- Use Openvswitch with Openvswitch internal ports. With one iperf thread you get 6.9 GBit/s throughput per CPU GHz. However, this solution does not allow iptables rules on the link.
- If you prefer the old linuxbridge and veth pairs, you get only 0.7 GBit/s per CPU GHz of throughput. With this solution it is possible to filter the traffic on the network namespace links.
The table shows some interesting effects, e.g.:
- The test with the ovs and two ovs ports shows a drop in performance between 4 and 16 threads. The CPU analysis shows that with 8 threads the CPU time used by softirqs doubles compared to 4 threads. The softirq time used by 16 threads is the same as for 4 threads.
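The article does not show how this CPU breakdown was obtained; one possible way, assuming the sysstat package is installed, is to watch the %soft column of mpstat while iperf is running:

```
# per-CPU utilization sampled once per second; %soft is the softirq share
mpstat -P ALL 1
```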
Openstack
If you are running Openstack Neutron, you should use Openvswitch and avoid linuxbridges. When connecting the Neutron Router/LBaaS/DHCP namespaces, DO NOT enable ovs_use_veth.
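In the Neutron agent configuration this corresponds to the ovs_use_veth option; a hedged example for the L3 and DHCP agents (file paths and the interface driver line reflect a typical Openvswitch setup of that era and are assumptions) is:

```
# /etc/neutron/l3_agent.ini and /etc/neutron/dhcp_agent.ini (example)
[DEFAULT]
interface_driver = neutron.agent.linux.interface.OVSInterfaceDriver
# keep the default: do not insert an extra veth pair between the namespace and the ovs bridge
ovs_use_veth = False
```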