Fixing keepalived's constantly switching state

Reposted from: https://blog.csdn.net/weixin_43515220/article/details/104959814

=================

 

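While setting up an nginx + keepalived architecture, I found that the keepalived VIP kept migrating: after the master went down, the VIP would move to the backup, then back to the master, or it would appear on the backup one moment and disappear the next.

I investigated this as follows.

First, the environment:

Both master and backup run CentOS Linux release 7.7.1908 (Core), with the firewall disabled on both.

master IP: 192.168.218.131

backup IP: 192.168.218.132

VIP: 192.168.218.140

keepalived runs a script to check on Nginx; if Nginx dies, keepalived is stopped as well.

The Nginx configuration itself was fine; the problem lay with keepalived.

Here is the keepalived configuration on the master: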

! Configuration File for keepalived

global_defs {
# Addresses that receive notification emails
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
# Sender address for notification emails
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id NGINX_MASTER
}

vrrp_script check_nginx {
script "/usr/local/nginx/sbin/check_nginx.sh"
}

vrrp_instance VI_1 {
state MASTER
interface ens33 # change to match your network interface
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.218.140/24 # vip for k8s master
}
track_script {
check_nginx
}
}

 


Here is the keepalived configuration on the backup:

! Configuration File for keepalived

global_defs {
# Addresses that receive notification emails
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
# Sender address for notification emails
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id NGINX_BACKUP
}

vrrp_script check_nginx {
script "/usr/local/nginx/sbin/check_nginx.sh"
}

vrrp_instance VI_1 {
state BACKUP # BACKUP
interface ens33
virtual_router_id 51
priority 90 # lower priority than the master
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.218.140/24 # vip for k8s master
}
track_script {
check_nginx
}
}

 

Here is the script that keepalived runs, check_nginx.sh (already made executable):

count=$(ps -ef |grep nginx |egrep -cv "grep|$$")

if [ "$count" -eq 0 ];then
systemctl stop keepalived
fi

 

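Troubleshooting

After manually running pkill nginx on the master and checking Nginx's status, Nginx had stopped.

But when I then checked keepalived, it was still running.

I first looked at the system log on the backup: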

vim /var/log/messages

Mar 18 23:53:46 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens33 for 192.168.218.140
Mar 18 23:53:46 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:46 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:46 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:46 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:48 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Received advert with higher priority 100, ours 90
Mar 18 23:53:48 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 18 23:53:48 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) removing protocol VIPs.
Mar 18 23:53:51 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens33 for 192.168.218.140
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:52 nginx02 Keepalived_vrrp[1365]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 18 23:53:53 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Received advert with higher priority 100, ours 90
Mar 18 23:53:53 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 18 23:53:53 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) removing protocol VIPs.
Mar 18 23:53:57 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 18 23:53:58 nginx02 Keepalived_vrrp[1365]: VRRP_Instance(VI_1) Entering MASTER STATE

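The log shows the backup switching back and forth between BACKUP and MASTER state. According to the messages, it drops back to BACKUP each time because it receives an advert from a host with priority 100, which is higher than the backup's own priority of 90.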
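To put a number on the flapping, the transitions can be counted straight from the log. The following is a minimal sketch; the function name count_transitions is a hypothetical helper, and the default path simply mirrors the /var/log/messages file used in this article.

```shell
#!/usr/bin/env bash
# Count how often a keepalived instance entered a given state, based on
# the "Entering ... STATE" lines in a log file.
# count_transitions is an illustrative name, not part of keepalived.

count_transitions() {
    local state="$1"                        # MASTER or BACKUP
    local logfile="${2:-/var/log/messages}" # log file to scan
    # grep -c prints the number of matching lines (0 if none match).
    grep -c "Entering ${state} STATE" "$logfile"
}
```

For example, `count_transitions MASTER /var/log/messages` run on the backup gives the number of times it took over the VIP. A count that keeps growing while nothing else changes points to flapping.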

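That struck me as odd: isn't the host with priority 100 the master?

In other words, the master's keepalived priority had not changed even though Nginx had died.

After reading some good blog posts online, I learned the following:

When a keepalived vrrp_script exits abnormally (with a non-zero status), a negative weight (weight -N) can be used to lower keepalived's priority.

So, for my situation, I made the following changes: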

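# Modified keepalived vrrp_script: /usr/local/nginx/sbin/check_nginx.sh
# Count ps lines mentioning nginx, excluding grep itself and lines containing this script's PID ($$)
count=$(ps -ef |grep nginx |egrep -cv "grep|$$")

if [ "$count" -eq 0 ];then
systemctl stop keepalived
exit 1 # once nginx has died, return exit status 1 so the script counts as failed
fi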
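One possible variation, not what the author used, is a check that only reports health through its exit status and leaves the priority change to keepalived's weight. The sketch below assumes pgrep (from procps) is installed; process_running is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch of a health check that only reports status via its return code.
# process_running is an illustrative helper; pgrep is assumed to be available.

process_running() {
    local name="$1"
    # pgrep -x matches the exact process name, so neither this script
    # (check_nginx.sh) nor a grep helper can be counted by accident.
    pgrep -x "$name" > /dev/null
}

# In an actual check script the last line would be:
#   process_running nginx
# so the script exits 0 while nginx runs and non-zero otherwise.
```

Matching on the exact process name avoids filtering the ps output by hand for "grep" and the script's own PID.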

 

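Modify the keepalived configuration on both master and backup:

# Change the vrrp_script block on both hosts
vrrp_script check_nginx {
script "/usr/local/nginx/sbin/check_nginx.sh"
weight -20 # lower keepalived's priority by 20 when the script exits abnormally
interval 2 # run the script every 2 seconds
}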
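The arithmetic behind weight -20 can be spelled out with the priorities from this setup (100 on the master, 90 on the backup). This is a small sketch that models only the negative-weight case; effective_priority is a hypothetical helper, not a keepalived command.

```shell
#!/usr/bin/env bash
# effective_priority BASE WEIGHT SCRIPT_OK
# Models only a negative weight: while the tracked script fails
# (SCRIPT_OK=0), the weight is added to the base priority.
effective_priority() {
    local base="$1" weight="$2" script_ok="$3"
    if [ "$script_ok" -eq 1 ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

master=$(effective_priority 100 -20 0)   # nginx killed on the master
backup=$(effective_priority 90 -20 1)    # nginx healthy on the backup
echo "master=$master backup=$backup"     # prints: master=80 backup=90
```

With the master at 80 and the backup at 90, the backup now advertises the higher priority, so it takes over the VIP and keeps it.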

 

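After restoring keepalived and Nginx to their original state, I ran pkill nginx on the master again.

This time the VIP moved to the backup, and the constant back-and-forth migration no longer occurred.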
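One way to confirm where the VIP currently lives is to look for it in the interface's address list. The sketch below reads `ip -4 -o addr show` style output on stdin; has_vip is a hypothetical helper name, and the interface and address are the ones from this article.

```shell
#!/usr/bin/env bash
# has_vip VIP
# Reads `ip -4 -o addr show` style output on stdin and succeeds if the
# given address is listed. has_vip is an illustrative helper name.
has_vip() {
    local vip="$1"
    # -F treats the pattern as a fixed string; anchoring on "inet " and "/"
    # stops 192.168.218.14 from matching 192.168.218.140.
    grep -qF "inet ${vip}/"
}

# On the backup one might run:
#   ip -4 -o addr show dev ens33 | has_vip 192.168.218.140 && echo "VIP is here"
```

Running the same check on both hosts a few times in a row shows whether the VIP stays put or is still moving.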

Here is the backup's system log, /var/log/messages:

Mar 19 01:05:25 nginx02 Keepalived_vrrp[1417]: /usr/local/nginx/sbin/check_nginx.sh exited with status 127
Mar 19 01:05:26 nginx02 Keepalived_vrrp[1417]: /usr/local/nginx/sbin/check_nginx.sh exited with status 127
Mar 19 01:05:27 nginx02 Keepalived_vrrp[1417]: /usr/local/nginx/sbin/check_nginx.sh exited with status 127
Mar 19 01:05:28 nginx02 Keepalived_vrrp[1417]: /usr/local/nginx/sbin/check_nginx.sh exited with status 127
Mar 19 01:05:29 nginx02 Keepalived_vrrp[1417]: /usr/local/nginx/sbin/check_nginx.sh exited with status 127
Mar 19 01:05:30 nginx02 Keepalived_vrrp[1417]: VRRP_Script(check_nginx) succeeded

... (some log lines omitted) ...

Mar 19 01:06:40 nginx02 Keepalived_vrrp[1417]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens33 for 192.168.218.140
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: Sending gratuitous ARP on ens33 for 192.168.218.140
Mar 19 01:06:41 nginx02 Keepalived_vrrp[1417]: Sending gratuitous ARP on ens33 for 192.168.218.140

 

posted @ 2023-02-09 16:25 larybird