LVS (DR) + keepalived

LVS itself can load-balance, but it has no health checking: if a real server fails, LVS keeps forwarding requests to the failed real server and those requests fail.
keepalived adds health checks and also makes the LVS director itself highly available, removing the single point of failure. In fact, keepalived was originally written for LVS.
Test environment: four machines
192.168.75.61: LVS1 (BACKUP)
192.168.75.63: LVS2 (MASTER)
192.168.75.64: realserver1
192.168.75.65: realserver2
 
Install ipvsadm and keepalived on both LVS nodes:
yum install -y ipvsadm keepalived    # (in this article keepalived was actually built from source; see the sketch below)
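For reference, a minimal sketch of a source build (the version number and download URL are only examples, pick whatever release you actually use; the /usr/local prefix is what puts the config under /usr/local/etc/keepalived, the path used later in this article):
# example only: choose the release you want from keepalived.org
wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
tar xf keepalived-2.0.20.tar.gz && cd keepalived-2.0.20
./configure --prefix=/usr/local
make && make install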
 
Install nginx on the realserver nodes:
yum install -y nginx
 
Configuration scripts:
 
Script used on the two realserver nodes:
[root@VM-75-64 ~]# cat lvs_dr_rs.sh
#!/bin/bash
vip=192.168.75.55
set -x
# bind the VIP to lo:0 with a /32 mask and add a host route for it
ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up
route add -host $vip lo:0
# ARP tuning required for DR mode: the RS must not answer or advertise ARP for the VIP
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
Since DR-mode LVS had been configured on these machines before, the lo:0 interface already existed, so it has to be brought down manually first:
#ifconfig lo:0 down
Then run the script on both real servers.
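To double-check that the script took effect on a real server, you can verify the lo:0 address and the ARP sysctls (a quick sanity check; output will vary slightly by system):
ip addr show lo                          # should show 192.168.75.55/32 on lo
sysctl net.ipv4.conf.all.arp_ignore      # expect 1
sysctl net.ipv4.conf.all.arp_announce    # expect 2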
 
Configuration on the two LVS directors: there is no separate ipvsadm setup, everything is declared in the keepalived configuration file, which again shows that keepalived was made for LVS.
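For comparison, the IPVS rules that keepalived will build from the configuration below are roughly what you would otherwise create by hand:
/sbin/ipvsadm -A -t 192.168.75.55:80 -s rr
/sbin/ipvsadm -a -t 192.168.75.55:80 -r 192.168.75.64:80 -g -w 1
/sbin/ipvsadm -a -t 192.168.75.55:80 -r 192.168.75.65:80 -g -w 1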
keepalived configuration files on both nodes:
MASTER: 192.168.75.63
[root@VM-75-63 keepalived]# pwd
/usr/local/etc/keepalived
[root@VM-75-63 keepalived]# cat keepalived.conf
! Configuration File for keepalived
 
global_defs {
   router_id 75_63
   vrrp_skip_check_adv_addr
#   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
 
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.75.55
    }
}
 
virtual_server 192.168.75.55 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    protocol TCP
 
    real_server 192.168.75.64 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 10
        }
    }
 
    real_server 192.168.75.65 80 {
       weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
 
BACKUP: 192.168.75.61
[root@VM-75-61 keepalived]# pwd
/usr/local/etc/keepalived
[root@VM-75-61 keepalived]# cat keepalived.conf
! Configuration File for keepalived
 
global_defs {
   router_id 75_61
   vrrp_skip_check_adv_addr
#   vrrp_strict                                # must stay commented out, otherwise the VIP cannot be pinged
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
 
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.75.55
    }
}
 
virtual_server 192.168.75.55 80 {      # equivalent to: /sbin/ipvsadm -A -t 192.168.75.55:80 -s rr
    delay_loop 6                       # health-check interval: back-end servers are probed every 6 seconds
    lb_algo rr                         # LVS scheduling algorithm
    lb_kind DR                         # LVS forwarding mode
    persistence_timeout 360            # persistence timeout: connections from the same client within this window (360 s here) keep going to the same real server
    protocol TCP                       # protocol

    real_server 192.168.75.64 80 {     # equivalent to: /sbin/ipvsadm -a -t 192.168.75.55:80 -r 192.168.75.64:80 -g -w 1
        weight 1                       # weight
        TCP_CHECK {
            connect_port 80
            connect_timeout 3           # connection timeout for the check
            nb_get_retry 3              # retry count; strictly an HTTP_GET option and can usually be dropped from TCP_CHECK
            delay_before_retry 10       # seconds to wait before retrying (default 1 s)
        }
    }
 
    real_server 192.168.75.65 80 {
       weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
 
Once the keepalived configuration files are ready, start the keepalived processes.
Note: start the MASTER first, then the BACKUP!
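Before starting, recent keepalived builds can also syntax-check the configuration first (skip this if your build does not support the -t/--config-test option):
/usr/local/sbin/keepalived -t -f /usr/local/etc/keepalived/keepalived.conf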
[root@VM-75-63 init.d]# /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
[root@VM-75-63 init.d]# ps -ef |grep keep
root     27612     1  0 04:18 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     27613 27612  0 04:18 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     27614 27612  0 04:18 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     27619  8470  0 04:18 pts/0    00:00:00 grep keep
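Besides ps, you can confirm from syslog that the VRRP instance actually entered the MASTER state (the log file location depends on your distribution and syslog setup):
grep -i 'Entering MASTER STATE' /var/log/messages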
 
The processes are up; now check the interface state:
[root@VM-75-63 init.d]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:d1:74:a7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.75.63/24 brd 192.168.75.255 scope global eth0
    inet 192.168.75.55/32 scope global eth0                                    # this line is the virtual IP (VIP)
    inet6 fe80::20c:29ff:fed1:74a7/64 scope link
       valid_lft forever preferred_lft forever
 
Start keepalived on the BACKUP in the same way:
[root@VM-75-61 sysconfig]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:50:56:b3:00:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.75.61/24 brd 192.168.75.255 scope global eth0
    inet6 fe80::250:56ff:feb3:1/64 scope link
       valid_lft forever preferred_lft forever
 
There is no VIP here, because the VIP lives on the MASTER node!
OK, check the ipvsadm table on the MASTER node:
[root@VM-75-63 init.d]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.75.55:http rr
  -> 192.168.75.64:http           Route   1      0          0         
  -> 192.168.75.65:http           Route   1      0          0         
 
Here we did not have to configure the virtual server and real servers by hand; keepalived's configuration file did it for us, which is why the two are so closely tied.
And although the VIP is not on the BACKUP node, the ipvsadm rules do exist there as well:
[root@VM-75-61 sysconfig]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.75.55:http rr
  -> 192.168.75.64:http           Route   1      0          0         
  -> 192.168.75.65:http           Route   1      0          0         
 
Now test access from a browser: http://192.168.75.55
It cannot be reached, the connection times out. But when we access it from a Linux VM in the same subnet:
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.64
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.64
We get responses, and they alternate between the two real servers as the rr scheduler dictates. So why can't the office PC on the LAN reach it?
 
The office PC and the back-end real servers are in different subnets, and traffic between different subnets needs routing, so let's look at the real server's routing table:
[root@VM-75-64 ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.75.55   *               255.255.255.255 UH    0      0        0 lo
192.168.75.0    *               255.255.255.0   U     0      0        0 eth1
link-local      *               255.255.0.0     U     1002   0        0 eth1
default         192.168.75.55   0.0.0.0         UG    0      0        0 eth1
default         192.168.75.1    0.0.0.0         UG    0      0        0 eth1
 
The entry "192.168.75.55   *   255.255.255.255 UH   0   0   0 lo" is the host route for the VIP, bound to the lo device, which is expected. But further down there is also a default route whose gateway is 192.168.75.55. From what we learned earlier, in LVS DR mode the real servers must not use the VIP as their default gateway, so let's delete that route:
[root@VM-75-64 ~]# route del default gw 192.168.75.55
[root@VM-75-64 ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.75.55   *               255.255.255.255 UH    0      0        0 lo
192.168.75.0    *               255.255.255.0   U     0      0        0 eth1
link-local      *               255.255.0.0     U     1002   0        0 eth1
default         192.168.75.1    0.0.0.0         UG    0      0        0 eth1
 
OK, try the browser again:
 
It works now. Let's think about why:
in DR mode the real server returns the response directly to the client, in this case my office PC at 192.168.76.147. The reply carries the VIP (75.55, the address on lo) as its source, leaves through the real server's physical interface (75.64), and, because the client sits in another subnet, has to go out via the real default gateway 192.168.75.1. With the extra default route pointing at the VIP itself, 76.147 and 75.64 simply could not talk to each other, even though the host route for the VIP was in place.
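If you want to see this return path yourself, capture on the real server: requests arrive with the VIP as the destination address and the replies leave with the VIP as the source, going straight back to the client without passing through the director (interface name taken from the route table above; adjust to your setup):
tcpdump -ni eth1 host 192.168.75.55 and port 80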
 
 
Test 1:
Manually stop the nginx process on 75.64:
[root@VM-75-64 ~]# service nginx stop
Stopping nginx:                                            [  OK  ]
Then test the access again:
[root@Vm-75-60 ~]# curl http://192.168.75.55
curl: (7) couldn't connect to host
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
curl: (7) couldn't connect to host
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
curl: (7) couldn't connect to host
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
 
At first, whenever the round-robin lands on 75.64 the request fails while 75.65 keeps responding normally, but after roughly 10 seconds every request goes to 75.65: keepalived has detected that the back-end node stopped responding and directed all traffic to the healthy node.
As soon as nginx on 75.64 is started again, the real server immediately rejoins the pool and starts handling requests.
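A convenient way to watch this happen is to keep an eye on the IPVS table on the director while stopping and starting nginx; the 75.64 entry disappears once the health check fails and comes back when nginx is up again:
watch -n 1 ipvsadm -Ln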
 
Test 2:
keepalived's HA feature. Stop the keepalived process on the MASTER and see whether the VIP fails over to the BACKUP node:
[root@VM-75-63 keepalived]# ps -ef |grep keep
root     27612     1  0 04:18 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     27613 27612  0 04:18 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     27614 27612  0 04:18 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     27722  8470  0 06:40 pts/0    00:00:00 grep keep
[root@VM-75-63 keepalived]#
[root@VM-75-63 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:d1:74:a7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.75.63/24 brd 192.168.75.255 scope global eth0
    inet 192.168.75.55/32 scope global eth0                                            # the VIP is on the MASTER
    inet6 fe80::20c:29ff:fed1:74a7/64 scope link
       valid_lft forever preferred_lft forever
[root@VM-75-63 keepalived]# pkill keepalived
[root@VM-75-63 keepalived]# ps -ef |grep keep
root     27726  8470  0 06:41 pts/0    00:00:00 grep keep
[root@VM-75-63 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:d1:74:a7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.75.63/24 brd 192.168.75.255 scope global eth0
    inet6 fe80::20c:29ff:fed1:74a7/64 scope link                                        # the VIP is gone
       valid_lft forever preferred_lft forever
 
The VIP has left the MASTER; let's look at the BACKUP:
[root@VM-75-61 sysconfig]# ps -fe | grep keep
root     16821     1  0 09:49 ?        00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     16822 16821  0 09:49 ?        00:00:01 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     16823 16821  0 09:49 ?        00:00:01 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root     17076 31656  0 15:01 pts/0    00:00:00 grep keep
[root@VM-75-61 sysconfig]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:50:56:b3:00:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.75.61/24 brd 192.168.75.255 scope global eth0
    inet 192.168.75.55/32 scope global eth0
    inet6 fe80::250:56ff:feb3:1/64 scope link
       valid_lft forever preferred_lft forever
 
There it is, on 75.61! And the service is still available, so we now have an active/standby highly available setup.
 
Note: early in the experiment, while testing VIP failover, I found that after killing keepalived on the MASTER the VIP did not move at all: it stayed on the MASTER, and, stranger still, the VIP also appeared on the BACKUP, so both nodes held it at the same time. Why? A packet capture shows:
[root@VM-75-61 sysconfig]# tcpdump -i eth0 -p vrrp -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
15:05:11.078778 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:05:12.079844 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:05:13.080878 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:05:14.081893 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
 
So the VRRP advertisements really are being sent by 75.61, the BACKUP, which means 75.61 is now serving the VIP, and the service still works.
Once keepalived on the master is started again, the VIP on the backup disappears, and the capture shows:
[root@VM-75-61 sysconfig]# tcpdump -i eth0 -p vrrp -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
15:05:44.108400 IP 192.168.75.63 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 20
15:05:45.109462 IP 192.168.75.63 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 20
15:05:46.110508 IP 192.168.75.63 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 20
 
The master is serving again. So the situation is: the VIP only looked as if it had not moved; the high-availability mechanism itself was working. Why? Following hints found online: I had killed keepalived on the master with 'kill -9', a forced kill rather than a normal shutdown. Killed that way, keepalived never gets the chance to remove the VIP from the interface or to announce that it is releasing it, so the dead master still has the VIP configured while the BACKUP times out and adds it as well, and both hold it at once. When I stopped it without '-9', the VIP failed over cleanly and the service stayed available. So be careful when testing:
do not use 'kill -9'.
 
 
That is about it for making LVS DR mode highly available; I will keep adding to this as I find more.
 
The LVS state can be monitored and graphed through a custom Zabbix monitoring item; the monitoring data mainly comes from the following commands:
[root@VM-75-63 bin]# cat /proc/net/ip_vs
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP  C0A84B37:0050 rr
  -> C0A84B41:0050      Route   1      0          0         
  -> C0A84B40:0050      Route   1      0          0         
 
That is not very readable, so here is a friendlier view:
[root@VM-75-63 bin]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.75.55:http rr
  -> 192.168.75.64:http           Route   1      0          0         
  -> 192.168.75.65:http           Route   1      0          0         
This view shows the connection counts.
 
[root@VM-75-63 bin]# cat /proc/net/ip_vs_stats
   Total Incoming Outgoing         Incoming         Outgoing
   Conns  Packets  Packets            Bytes            Bytes
     542     58DB     3E99           8C1065           344D49
 
Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
       1        7        0              214                0
The file above shows the cumulative traffic totals and the current throughput; note that the counters in /proc/net/ip_vs_stats are printed in hexadecimal.
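As a sketch of how a monitoring item could read one of these counters, here is the total connection count converted from hex to decimal (assumes bash and awk, and that the totals are on the third line of the file as shown above):
printf '%d\n' 0x$(awk 'NR==3 {print $1}' /proc/net/ip_vs_stats)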
The specifics of defining the custom monitoring items are not covered here.

 

That's all, hope it helps!
