An OpenStack MTU problem caused by a NIC driver

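We had a Pike-release OpenStack test environment using VLAN-mode networking, with a UGREEN USB 100 Mbps adapter as the data-network NIC, and ran into a VM networking problem. Two VMs on the same VLAN but on different hosts could ping each other, but could not SSH to each other.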

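ICMP getting through shows that the link itself is fine. SSH runs over TCP, and when it fails there are two common causes. One is that a firewall or security group blocks the SSH port; after checking, this turned out not to be the problem.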

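The other possibility is a wrong MTU setting.
In the original Ethernet design, the data carried in one link-layer frame is at least 46 bytes and at most 1500 bytes, the latter being the MTU (search Zhihu for the reasons). Data larger than 1500 bytes must be split into several pieces no larger than 1500 bytes before being handed to the link layer. Correspondingly, the frames a link-layer device receives range from 64 to 1518 bytes (Note 1). If a device on the Ethernet, such as a NIC or a switch, receives a frame larger than 1518 bytes (Note 2), by default it simply drops that frame (Note 3).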
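The relation between the 46 to 1500 byte payload range and the 64 to 1518 byte frame range can be checked with a little arithmetic. The sketch below assumes classic Ethernet II framing (14-byte header, 4-byte frame check sequence, optional 4-byte 802.1Q tag); the function and constant names are made up for illustration.

```python
# Sizes in bytes, assuming classic (non-jumbo) Ethernet II framing.
ETH_HEADER = 14  # 6 destination MAC + 6 source MAC + 2 frame type
ETH_FCS = 4      # frame check sequence, usually not visible in captures
VLAN_TAG = 4     # 802.1Q tag, present only on tagged frames


def frame_size(payload: int, vlan: bool = False, fcs: bool = True) -> int:
    """Return the frame size for a given link-layer payload size."""
    size = ETH_HEADER + payload
    if vlan:
        size += VLAN_TAG
    if fcs:
        size += ETH_FCS
    return size


print(frame_size(46))               # smallest frame: 64
print(frame_size(1500))             # largest untagged frame: 1518
print(frame_size(1500, vlan=True))  # a full-size tagged frame: 1522
```

Note that a capture tool typically shows frames without the frame check sequence, so sizes seen in a capture are 4 bytes smaller than the on-wire figures above.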

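Sometimes network communication fails precisely because an upper-layer protocol does not adapt the size of each transmission to the MTU, so some data is lost in transit. For SSH this should not happen in theory, because SSH uses TCP. During the handshake each end of a TCP connection advertises an MSS, the largest amount of data it will accept in one TCP segment, derived from its own MTU, and path MTU discovery is meant to cover smaller MTUs on devices along the path. Only when the MTU is set larger than some device can actually handle do frames exceed the limit and connections to certain networks fail. So, in theory, a wrong MTU on any single device should not break SSH.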
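As a rough illustration of how the MSS follows from the MTU, here is a minimal sketch. It assumes IPv4 and TCP headers of 20 bytes each with no options; the helper names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def advertised_mss(mtu: int) -> int:
    """MSS a host would advertise for an interface with the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(mtu_a: int, mtu_b: int) -> int:
    """Each side limits its segments to what the peer advertised, so the
    smaller of the two advertised values bounds the segment size."""
    return min(advertised_mss(mtu_a), advertised_mss(mtu_b))


print(advertised_mss(1500))       # 1460
print(effective_mss(1500, 1498))  # 1458
```

With both VMs at MTU 1500, full-size segments produce 1500-byte IP packets, which is exactly the case that failed here.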

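Still, it was worth a try. The VMs' default MTU was 1500 and SSH failed. With the MTU set to 1499 it still failed; with 1498 it worked. That was strange: why was the largest payload the link layer could carry 2 bytes short? At that point we did not yet know that the data network went out through a UGREEN USB NIC, and we began to suspect a bug caused by VXLAN-related configuration, so we decided to capture packets to locate the faulty device.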
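Trying MTU values one at a time works, but the search can also be automated. The sketch below is a plain binary search over a hypothetical probe callable that returns True when a given size gets through; it assumes that if one size works, every smaller size works too. In a real setup the probe might wrap something like ping with the don't-fragment option, which is left out here.

```python
from typing import Callable, Optional


def largest_working(probe: Callable[[int], bool], lo: int, hi: int) -> Optional[int]:
    """Return the largest size in [lo, hi] for which probe(size) is True,
    or None if no size in the range works."""
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe(mid):
            best = mid    # mid works, look for something larger
            lo = mid + 1
        else:
            hi = mid - 1  # mid fails, look for something smaller
    return best


def fake_probe(size: int) -> bool:
    # Stand-in for a real test; mimics the behaviour seen in this
    # environment, where 1498 worked and 1499 did not.
    return size <= 1498


print(largest_working(fake_probe, 576, 1500))  # 1498
```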

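In VLAN mode, the communication path between VMs on different hosts is fairly simple, as shown in the figure below:

[Figure: communication between VMs on different hosts in VLAN mode]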

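The gray devices are OVS ports, which act like network cables; the blue ones are physical or virtual layer-2 switching devices; the orange ones are physical or virtual NICs.
We ran ping -s 1471 192.168.11.106 from vm1 and captured packets on physical host 02. Capturing on eth1 worked and the packets showed up; capturing on qvo showed nothing. We then needed to determine whether eth1 dropped the packets it received, or whether they were lost while being handed from eth1 to qvo.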

Run the following commands on physical host 02:

(openvswitch-vswitchd)[root@compute01 /]# ovs-dpctl show
system@ovs-system:
	lookups: hit:7589478 missed:410086 lost:0
	flows: 54
	masks: hit:62147619 total:10 hit/pkt:7.77
	port 0: ovs-system (internal)
	port 1: br-int (internal)
	port 2: br-tun (internal)
	port 3: eth1
	port 4: br-data (internal)
	port 5: qvo34b16ff1-bb
	port 6: qvoe6366126-d8


(openvswitch-vswitchd)[root@compute01 /]# ovs-dpctl dump-flows | grep -r "6e:f2" 
recirc_id(0),in_port(3),eth(src=fa:16:3e:a9:6e:f2,dst=fa:16:3e:07:ee:2c),
eth_type(0x8100),vlan(vid=131,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)),
packets:2122, bytes:3223318, used:0.726s,
actions:pop_vlan,6
recirc_id(0),in_port(6),eth(src=fa:16:3e:07:ee:2c,dst=fa:16:3e:a9:6e:f2),
eth_type(0x0800),ipv4(frag=no), packets:1046, bytes:1582598, used:0.726s,
actions:push_vlan(vid=131,pcp=0),3

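We can see that the ICMP packets entering through port 3 (eth1) had their VLAN tag stripped by OVS and were handed to port 6 (qvoe6366126-d8). Yet a capture on qvo showed nothing, which means qvo dropped the packets handed over from eth1.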
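The packet and byte counters in the flow dump already hint at the packet sizes. The following sketch pulls the two counters out of a flow line and computes the average bytes per packet; the regular expression and function name are illustrative, and the sample lines are abbreviated copies of the two flows above.

```python
import re

# Matches the "packets:N, bytes:M" counters in a datapath flow line.
FLOW_STATS = re.compile(r"packets:(\d+), bytes:(\d+)")


def avg_packet_size(flow: str) -> float:
    """Return average bytes per packet for one flow line."""
    match = FLOW_STATS.search(flow)
    if match is None:
        raise ValueError("no packet/byte counters found")
    packets, nbytes = int(match.group(1)), int(match.group(2))
    return nbytes / packets


# Abbreviated copies of the two flows from the dump.
inbound = "in_port(3), packets:2122, bytes:3223318, used:0.726s, actions:pop_vlan,6"
outbound = "in_port(6), packets:1046, bytes:1582598, used:0.726s, actions:push_vlan(vid=131,pcp=0),3"

print(avg_packet_size(inbound))   # 1519.0
print(avg_packet_size(outbound))  # 1513.0
```

The packets arriving on eth1 average 6 bytes more than those leaving through port 6, and only 4 of those bytes are accounted for by the VLAN tag.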
Using tcpdump -i eth1 icmp -w packages.cap to save the packets on eth1 to packages.cap and opening the file in Wireshark shows the following:

[Wireshark screenshot]

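Looking closely at the screenshot, we found that the captured packet was 1519 bytes, over the 1518-byte limit, which is presumably why qvo dropped it. But our ping carried 1471 bytes of data; adding the 8-byte ICMP header, the 20-byte IP header, the 6-byte link-layer source address, the 6-byte destination address, the 2-byte frame type and the 4-byte VLAN tag should give only 1517 bytes, which is within the limit. Where did the extra 2 bytes come from? On closer inspection they sit at the end of the frame, and Wireshark decodes them as: vssmonitoring Ethernet trailer. A capture on qvo showed that the packets there did not contain this data, so the 2 extra bytes were added somewhere on the way from qvo to eth1. But why were they there at all? Searching Google for vssmonitoring + openstack turned up an article called openstack network mystery, whose author had hit the same problem. The author's explanation was: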
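The byte count above, and its link to the earlier observation that MTU 1498 worked while 1499 did not, can be written out as a small calculation. This is only a sketch: it assumes the 2-byte trailer is counted as part of the link-layer payload on the receiving side, which is a plausible reading of the symptoms rather than something verified here. All names are illustrative.

```python
ETH_HEADER = 14   # 6 source MAC + 6 destination MAC + 2 frame type
VLAN_TAG = 4
IP_HEADER = 20
ICMP_HEADER = 8


def captured_frame_len(ping_data: int, trailer: int = 0) -> int:
    """Length of a VLAN-tagged ICMP echo frame as seen in a capture."""
    return ETH_HEADER + VLAN_TAG + IP_HEADER + ICMP_HEADER + ping_data + trailer


def fits_mtu(ip_packet_len: int, trailer: int, mtu: int = 1500) -> bool:
    """Assumption: the trailer counts toward the payload checked against the MTU."""
    return ip_packet_len + trailer <= mtu


print(captured_frame_len(1471))             # expected: 1517
print(captured_frame_len(1471, trailer=2))  # observed: 1519
print(fits_mtu(1499, trailer=2))            # False, matches the failure at MTU 1499
print(fits_mtu(1498, trailer=2))            # True, matches the success at MTU 1498
```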

Finally, we’ve found the 2 additional bytes! But, where do they come from? Google tells us that this can be caused by the padding of packets at the network-driver level. I reconsidered my setup and identified the USB network adapter as the weakest link. To be honest, I suspected this might be the issue from the beginning, but I never imagined it would catch up with me in this way.

I downloaded and built the latest version of the driver and replaced the kernel module. Lo and behold, all my networking problems were gone! Pings of arbitrary payload sizes, SSH sessions, and file transfers all suddenly worked. In the end, my networking issues were caused by an issue with the driver that ships by default with the Linux kernel.

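The author blamed the driver of his USB NIC. We were also using a USB NIC, so it looked like the same problem. We downloaded the latest driver from the UGREEN website, installed it in our environment, and it did indeed solve our problem.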

posted on 2018-03-05 16:43 张宇飞