Topic: 『Channel Bonding/team』 (EXPERIMENTAL!)

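The Linux kernel supports two ways of aggregating multiple NICs: bond and team

bond

Pros: proven in long-term production use and highly stable; widely supported on kernel 2.4 and later

Cons: has to be controlled through sysfs or distribution-specific NIC configuration files, so it is harder to use; no performance advantage over team; only one form of monitor is allowed at a time, so arp and mii monitoring cannot be combined

team

Pros: a newer implementation whose runtime efficiency is somewhat better than bond, currently by about 5%; ships a userspace library (libteam) and a configuration file (teamd.conf), which improves usability; provides a mode API so users can write their own modes; several monitor methods can be combined at the same time, which helps with HA

Cons: works only on Linux systems running kernel 3.3 or later; older distributions such as Debian 7 (kernel 3.2) and RHEL 6 (kernel 2.6) cannot use it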

- Deploying team -

『Reference』https://github.com/jpirko/libteam/wiki/Infrastructure-Specification

1. Make sure libteam is installed; the latest version can be fetched as follows

$ git clone https://github.com/jpirko/libteam.git

2. Create the team device, attach or detach slave ports, and view the team-to-slave mapping

    Deployment usually combines "ip link" and "teamd"

Create the team
    ip link add dev team0 type team
Delete the team
    ip link del team0
Attach eth0 to the team
    ip link set eth0 master team0
Detach eth0
    ip link set eth0 nomaster
List all slave ports of the team device
    ip link | grep -P 'master\s*team0'
- OR -
    teamnl team0 ports
Show detailed information about the team device
    teamdctl team0 state -v
Run teamd
    teamd -f FILENAME -d    (long options: --config-file FILENAME --daemonize)

Note: as with bond, before a port is enslaved to a team its IP and route information must be cleared and the port must be in the DOWN state
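The preparation described in the note can be sketched as a small shell helper. The function names `run` and `enslave_port` and the `DRY_RUN` switch are assumptions made for this illustration and are not part of libteam; with `DRY_RUN=1` the commands are only printed, so the sequence can be reviewed without root privileges.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: clear a port's addresses and routes, take it down,
# then attach it to the team device.
enslave_port() {
    local port="$1" team="$2"
    run ip addr flush dev "$port"
    run ip route flush dev "$port"
    run ip link set "$port" down
    run ip link set "$port" master "$team"
}

# Dry run: only prints the four commands for eth0 and team0.
DRY_RUN=1
enslave_port eth0 team0
```

Unsetting `DRY_RUN` (and running as root, with team0 already created) executes the same four commands for real.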

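3. Modes and parameter configuration

    Five runners (run policies/modes) are currently provided upstream

    When each port connects to its own separate switch, the switches need no configuration; when more than one slave interface of the team device connects to the same switch, every policy except activebackup requires the switch to support EtherChannel, and the lacp policy additionally requires 802.3ad support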

The following runners are available:

broadcast - Simple runner which directs the team device to transmit packets via all ports

roundrobin - Simple runner which directs the team device to transmit packets in a round-robin fashion

activebackup - Watches for link changes and selects the active port to be used for data transfers

loadbalance - To do passive load balancing, the runner only sets up a BPF hash function which determines the port for packet transmit. To do active load balancing, the runner moves hashes among available ports, trying to reach perfect balance

lacp - Implements the 802.3ad LACP protocol. Can use the same Tx port selection possibilities as the loadbalance runner

    Monitor modes

ethtool - Uses the libteam library to get port ethtool state changes

arp_ping - ARP requests are sent through a port. If an ARP reply is received, the link is considered to be up. Target IP address, interval and other options can be set up in the config file

nsna_ping - Similar to arp_ping, only it uses the IPv6 Neighbour Solicitation and Neighbour Advertisement mechanism. This is an alternative to arp_ping and is handy in pure-IPv6 environments

    TeamX.conf is a JSON-style configuration file

JSON syntax in brief:

    A matched pair of '{}' marks a nesting level; entries are 'key: value' pairs joined by ':' and separated from each other by ','; everything except plain numbers and the literals true/false must be written inside double quotes

『COMMON OPTIONS』
"runner": {"name": "activebackup/lacp/broadcast/roundrobin/loadbalance"}

"device": "Desired name of new team device"

"hwaddr": "Desired hardware address of new team device. Usual MAC address format is accepted"

"link_watch": {"name": "ethtool/nsna_ping/arp_ping"}

- OR - # set the monitor for a specific port; the same form applies to the monitors under SPECIFIC OPTIONS below
"ports": {
     "eth0": {
          "link_watch": {"name": "ethtool/arp_ping/nsna_ping"}
     }
 }
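The fragments above only become usable once they are assembled into one JSON object. The following is a minimal sketch of that assembly, written to a temporary file from the shell; the port names eth0/eth1, the delay values and the temporary path are illustrative assumptions, and the `python3` call is used purely as an optional JSON syntax checker.

```shell
#!/usr/bin/env bash
# Write a complete teamd configuration to a temporary file.
# Port names and delay values are illustrative assumptions.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
{
    "device": "team0",
    "runner": {"name": "activebackup"},
    "link_watch": {"name": "ethtool"},
    "ports": {
        "eth0": {},
        "eth1": {"link_watch": {"name": "ethtool", "delay_up": 200, "delay_down": 100}}
    }
}
EOF

# Optional syntax check; skipped when python3 is not installed.
if command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$conf" >/dev/null && echo "valid JSON: $conf"
fi

# To start the daemon with this file (needs root):
#   teamd -f "$conf" -d
```

Here the top-level "link_watch" applies to every port, while the entry under "eth1" overrides it for that port only.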

『SPECIFIC OPTIONS』: ACTIVE-BACKUP RUNNER

"ports": {"eth0": {"prio": -10, "sticky": true}}

prio: Port priority. The higher number means higher priority. Default: 0

sticky: Flag which indicates if the port is sticky. If set, the port does not get unselected if another port with higher priority or better parameters becomes available. Default is false
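To see how prio and sticky interact, here is a small sketch of an active-backup configuration with two ports. The port names and priority values are assumptions chosen for illustration: eth0 has the higher priority and is marked sticky, so once it is the active port it should stay selected.

```shell
#!/usr/bin/env bash
# Sketch: active-backup team where eth0 is preferred and kept once selected.
# Port names and priority values are illustrative assumptions.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
{
    "device": "team0",
    "runner": {"name": "activebackup"},
    "link_watch": {"name": "ethtool"},
    "ports": {
        "eth0": {"prio": 100, "sticky": true},
        "eth1": {"prio": -10}
    }
}
EOF
echo "wrote $conf"

# With the daemon running (needs root), the selected port can be inspected with:
#   teamdctl team0 state
```

Note that sticky is a JSON boolean, so it is written without quotes.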

『SPECIFIC OPTIONS』: LOAD BALANCE RUNNER

"runner": {"tx_hash": ["eth", "ipv4", "ipv6"]}    Note: when one key takes several values, use array format

eth — Uses source and destination MAC addresses

vlan — Uses VLAN id

ipv4 — Uses source and destination IPv4 addresses

ipv6 — Uses source and destination IPv6 addresses

tcp — Uses source and destination TCP ports

udp — Uses source and destination UDP ports

sctp — Uses source and destination SCTP ports

    #List of fragment types which should be used for packet Tx hash computation
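A load-balance configuration that hashes on several of the fragment types listed above might look like the sketch below. The choice of fragment types and the port names are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: passive load balancing with a Tx hash built from L2, L3 and L4 fields.
# Port names and the selected fragment types are illustrative assumptions.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
{
    "device": "team0",
    "runner": {
        "name": "loadbalance",
        "tx_hash": ["eth", "ipv4", "ipv6", "tcp", "udp"]
    },
    "link_watch": {"name": "ethtool"},
    "ports": {"eth0": {}, "eth1": {}}
}
EOF
echo "wrote $conf"
```

This is the passive variant. As far as I know from teamd.conf(5), active balancing is switched on by adding "tx_balancer": {"name": "basic"} inside "runner"; treat that as something to verify against the man page of the installed version.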

『SPECIFIC OPTIONS』: LACP RUNNER

"runner": {"fast_rate": true}

　　#Specifies the rate at which our link partner is asked to transmit LACPDU packets. If this is true then packets will be sent once per second. Otherwise they will be sent every 30 seconds

"runner": {"tx_hash": [array]}

　　#Same as for load balance runner

"runner": {"agg_select_policy": "lacp_prio/lacp_prio_stable/bandwidth/count/port_options"}

lacp_prio - Aggregator with highest priority according to LACP standard will be selected. Aggregator priority is affected by per-port option lacp_prio

lacp_prio_stable - Same as previous one, except do not replace selected aggregator if it is still usable

bandwidth - Select aggregator with highest total bandwidth

count - Select aggregator with highest number of ports

port_options - Aggregator with highest priority according to per-port options prio and sticky will be selected. This means that the aggregator containing the port with the highest priority will be selected unless at least one of the ports in the currently selected aggregator is sticky

Default is lacp_prio

"ports": {"eth0": {"lacp_prio": -10}}

　　#Port priority according to LACP standard. The lower number means higher priority

"ports": {"eth0": {"lacp_key": 4}}

    #Port key according to LACP standard. It is only possible to aggregate ports with the same key. Default is 0
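Putting the LACP options together, a configuration could be sketched as follows. The port names, priorities and key value are assumptions for illustration; both ports share the same lacp_key so that they can end up in the same aggregator, and the switch side must have a matching 802.3ad group configured.

```shell
#!/usr/bin/env bash
# Sketch: LACP team with fast LACPDU rate and the default aggregator policy.
# Port names, priorities and the key value are illustrative assumptions.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
{
    "device": "team0",
    "runner": {
        "name": "lacp",
        "fast_rate": true,
        "tx_hash": ["eth", "ipv4", "ipv6"],
        "agg_select_policy": "lacp_prio"
    },
    "link_watch": {"name": "ethtool"},
    "ports": {
        "eth0": {"lacp_prio": 10, "lacp_key": 4},
        "eth1": {"lacp_prio": 20, "lacp_key": 4}
    }
}
EOF
echo "wrote $conf"
```

With lacp_prio, the lower number wins, so eth0 is the higher-priority port in this sketch.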

『SPECIFIC OPTIONS』: ETHTOOL LINK WATCH
"link_watch": {
     "name": "ethtool",
     "delay_up": 100,
     "delay_down": 50
   }
- OR -
"ports": {
     "eth0": {
          "link_watch": {
               "name": "ethtool",
               "delay_up": 200,
               "delay_down": 100
            }
       }
 }        
link_watch.delay_up | ports.PORTIFNAME.link_watch.delay_up (int). Value is a positive number in milliseconds. It is the delay between the link coming up and the runner being notified about it. Default is 0

link_watch.delay_down | ports.PORTIFNAME.link_watch.delay_down (int). Value is a positive number in milliseconds. It is the delay between the link going down and the runner being notified about it. Default is 0

『SPECIFIC OPTIONS』: ARP PING LINK WATCH
"link_watch": {
     "name": "arp_ping",
     "interval": 100,
     "init_wait": 1000,
     "missed_max": 3,
     "target_host": "10.1.0.100"
   }
    # interval: time between two arp_ping requests, in milliseconds
    # init_wait: delay between the port first joining the team and the first arp_ping
    # missed_max: maximum number of missed ARP replies; once exceeded, the port is judged to be down
    # target_host: target host of the arp_ping
- OR -
"ports": {
     "eth0": {
          "link_watch": {
               "name": "arp_ping",
               "interval": 100,
               "init_wait": 1000,
               "missed_max": 3,
               "target_host": "10.1.0.100"
            }
       }
 }
link_watch.interval | ports.PORTIFNAME.link_watch.interval (int). Value is a positive number in milliseconds. It is the interval between ARP requests being sent.

link_watch.init_wait | ports.PORTIFNAME.link_watch.init_wait (int). Value is a positive number in milliseconds. It is the delay between link watch initialization and the first ARP request being sent. Default is 0

link_watch.missed_max | ports.PORTIFNAME.link_watch.missed_max (int). Maximum number of missed ARP replies. If this number is exceeded, link is reported as down. Default is 3

link_watch.source_host | ports.PORTIFNAME.link_watch.source_host (hostname). Hostname to be converted to IP address which will be filled into ARP request as source address. Default is 0.0.0.0

link_watch.target_host | ports.PORTIFNAME.link_watch.target_host (hostname). Hostname to be converted to IP address which will be filled into ARP request as destination address.

link_watch.validate_active | ports.PORTIFNAME.link_watch.validate_active (bool). Validate received ARP packets on active ports. If this is not set, all incoming ARP packets will be considered as a good reply. Default is false

link_watch.validate_inactive | ports.PORTIFNAME.link_watch.validate_inactive (bool). Validate received ARP packets on inactive ports. If this is not set, all incoming ARP packets will be considered as a good reply. Default is false

link_watch.send_always | ports.PORTIFNAME.link_watch.send_always (bool). By default, ARP requests are sent on active ports only. This option allows sending even on inactive ports. Default: false
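The optional ARP ping settings described above can be added to the basic form. The sketch below turns on reply validation and probing on inactive ports; the addresses (taken from the examples in these notes) and the port names are placeholders, not recommendations.

```shell
#!/usr/bin/env bash
# Sketch: active-backup team monitored by arp_ping with validation enabled.
# Addresses and port names are illustrative placeholders.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
{
    "device": "team0",
    "runner": {"name": "activebackup"},
    "link_watch": {
        "name": "arp_ping",
        "interval": 100,
        "init_wait": 1000,
        "missed_max": 3,
        "source_host": "10.1.7.77",
        "target_host": "10.1.0.100",
        "validate_active": true,
        "validate_inactive": true,
        "send_always": true
    },
    "ports": {"eth0": {}, "eth1": {}}
}
EOF
echo "wrote $conf"
```

The host values are strings and therefore quoted, while the three boolean options are written as bare true.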

『SPECIFIC OPTIONS』: NS/NA PING LINK WATCH

link_watch.interval | ports.PORTIFNAME.link_watch.interval (int). Value is a positive number in milliseconds. It is the interval between sending NS packets

link_watch.init_wait | ports.PORTIFNAME.link_watch.init_wait (int). Value is a positive number in milliseconds. It is the delay between link watch initialization and the first NS packet being sent

link_watch.missed_max | ports.PORTIFNAME.link_watch.missed_max (int). Maximum number of missed NA reply packets. If this number is exceeded, link is reported as down. Default is 3

link_watch.target_host | ports.PORTIFNAME.link_watch.target_host (hostname). Hostname to be converted to IPv6 address which will be filled into NS packet as target address
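For completeness, an NS/NA ping configuration built from the options above could be sketched like this. The target 2001:db8::1 comes from the IPv6 documentation prefix and is only a placeholder, as are the port names and timing values.

```shell
#!/usr/bin/env bash
# Sketch: active-backup team monitored with IPv6 NS/NA probes.
# The target address is a documentation-prefix placeholder; replace it with a real neighbour.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
{
    "device": "team0",
    "runner": {"name": "activebackup"},
    "link_watch": {
        "name": "nsna_ping",
        "interval": 200,
        "init_wait": 1000,
        "missed_max": 3,
        "target_host": "2001:db8::1"
    },
    "ports": {"eth0": {}, "eth1": {}}
}
EOF
echo "wrote $conf"
```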

4. Example: startup script

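#!/usr/bin/env bash
# Interactively pick a runner and bring up team0 from the matching <runner>.conf
team()
{
    tM="team0"
    iP="10.1.7.77/24"
    # Stop any teamd instance that is still running
    pkill teamd 2>/dev/null
    # Slave ports must have no IP/route information and be DOWN before enslaving
    for i in {0..2}
    do
        ip addr flush dev "eth$i"
        ip route flush dev "eth$i"
        ip link set "eth$i" down
    done
    PS3="Select runner policy:"
    select x in "activebackup" "broadcast" "loadbalance" "lacp" "roundrobin"
    do
        # An invalid menu choice leaves $x empty; ask again
        [ -n "$x" ] || continue
        teamd --force-recreate --config-file "${x}.conf" --daemonize
        ip link set "$tM" up
        ip addr add "$iP" dev "$tM" scope link
        break
    done
}
team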

Posted 2015-09-06 14:03 by 范辉
