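High availability with keepalived
What is high availability:
"High availability" describes a system that has been specifically designed to reduce downtime and keep its services highly available. (Goal: reduce downtime.)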
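- High availability for the load balancer:
To mask a failure of the load balancer, a backup machine is set up. Both the primary server and the backup run a high-availability monitoring program and watch each other's health by exchanging messages such as "I am alive". When the backup does not receive such a message within a given time, it takes over the primary server's service IP and continues to provide the service.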
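The takeover rule described above comes down to a timeout comparison. The following is a minimal sketch of that decision only, not of how keepalived implements it; the function name `backup_should_take_over` and the sample timestamps are made up for illustration.

```shell
# Hypothetical helper: decide whether the backup should take over.
# Arguments: time of the last heartbeat, current time (both in epoch
# seconds), and the allowed silence in seconds.
backup_should_take_over() {
    last_seen=$1
    now=$2
    timeout=$3
    # Take over only when the primary has been silent longer than the timeout.
    [ $((now - last_seen)) -gt "$timeout" ]
}

# Example: the last "I am alive" arrived at t=100 and the timeout is 30 s.
if backup_should_take_over 100 131 30; then
    echo "take over the service IP"
else
    echo "stay in backup"
fi
```

With 31 seconds of silence against a 30 second timeout, the example takes the first branch.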
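High-availability clusters:
- keepalived
- slb
High availability for what?
For business-critical services (for example the LVS director, or the master in a master-slave database setup).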
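keepalived
Concept
keepalived was originally designed for LVS load balancing, to monitor the state of each node in an LVS cluster; VRRP support was added later. Through VRRP (Virtual Router Redundancy Protocol), it keeps the network service running without interruption when an individual node goes down.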
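Features
- Manages the LVS load-balancing software
- Performs health checks on LVS cluster nodes
- Provides failover (high availability) for system network services
How it works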
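In VRRP, each node in a group advertises a priority, and the node with the highest priority acts as master and holds the virtual IP. The sketch below illustrates only that selection rule, using the priorities 100 and 90 that appear in the configuration files in this article. The function name `elect_master` is made up, and tie-breaking is deliberately left out.

```shell
# Reads "name priority" pairs on stdin (one per line, separated by a
# single space) and prints the name with the highest priority.
elect_master() {
    sort -k2,2nr | head -n 1 | cut -d' ' -f1
}

# Both nodes alive: the node with priority 100 wins.
printf '%s\n' 'master 100' 'slave 90' | elect_master

# The master has disappeared: the remaining node becomes master.
printf '%s\n' 'slave 90' | elect_master
```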
Highly available nginx load balancing with keepalived
Environment
OS | Hostname | IP |
---|---|---|
rhel8 | master | 192.168.94.141 |
rhel8 | slave | 192.168.94.143 |
//On both master and slave: stop the firewall, switch SELinux to permissive mode, and install the keepalived and nginx packages
systemctl stop firewalld.service
setenforce 0
yum -y install keepalived nginx
//On both nodes: enable nginx at boot, and give each node a different page so the result is easy to see
systemctl enable --now nginx
[root@master ~]# echo 'master'> /usr/share/nginx/html/index.html
[root@slave ~]# echo 'slave'> /usr/share/nginx/html/index.html
keepalived configuration
Master configuration
//Write the scripts
[root@master ~]# mkdir /script/
[root@master ~]# cd /script/
[root@master script]# cat Ngchcek.sh
#!/bin/bash
nginx_status=$(ps -ef|grep -Ev "grep|$0"|grep '\bnginx\b'|wc -l) #number of running nginx processes
if [ "$nginx_status" -lt 1 ];then #if no nginx process is running, stop keepalived
    systemctl stop keepalived
fi
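To see what the pipeline in the check script counts, the sketch below applies the same kind of filter to canned `ps -ef` output instead of the live process table. The function name `count_nginx_lines` and the sample lines are made up, and `\b` assumes GNU grep, as the script itself does.

```shell
# Same idea as the pipeline in the check script, but reading ps-style
# lines from stdin so it can be tried without a running nginx.
count_nginx_lines() {
    # Drop the grep process itself, then count lines containing the word
    # "nginx". "|| true" keeps the exit status at 0 when the count is 0.
    grep -v 'grep' | grep -c '\bnginx\b' || true
}

# Made-up sample: an nginx master, one worker, and a grep process.
count_nginx_lines <<'EOF'
root  1021    1 0 21:41 ?     00:00:00 nginx: master process /usr/sbin/nginx
nginx 1022 1021 0 21:41 ?     00:00:00 nginx: worker process
root  2040 1988 0 21:50 pts/0 00:00:00 grep nginx
EOF
```

One thing this makes visible: any command line that contains "nginx" as a word is counted, for example `tail -f /var/log/nginx/access.log`, so the count can stay above zero while nginx itself is down.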
[root@master script]# cat notify.sh
#!/bin/bash
VIP=$2 #the second argument passed in is used as the VIP
sendmail (){ #define a sendmail function
    subject="${VIP}: keepalived state has changed"
    content="`date +'%F %T'`: `hostname`'s state changed to master" #mail subject and body
    echo "$content" | mail -s "$subject" 1@2.com #send to the given mailbox
}
case "$1" in #branch on the first argument
master) #the node has become master
    nginx_status=$(ps -ef|grep -Ev "grep|$0"|grep '\bnginx\b'|wc -l)
    if [ "$nginx_status" -lt 1 ];then #if no nginx process is running
        systemctl start nginx #start nginx
    fi
    sendmail #call the sendmail function
    ;;
backup) #the node has become backup
    nginx_status=$(ps -ef|grep -Ev "grep|$0"|grep '\bnginx\b'|wc -l)
    if [ "$nginx_status" -gt 0 ];then #if nginx is running
        systemctl stop nginx #stop it
    fi
    ;;
*)
    echo "Usage:$0 master|backup VIP" #any other argument prints the usage
    ;;
esac
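The branching in notify.sh can be checked without touching any service. The sketch below is a hypothetical dry-run: `notify_action` is a made-up function that mirrors the `case` statement and prints the action instead of calling `systemctl` or `mail`.

```shell
# Hypothetical dry-run of the branching in notify.sh.
# Arguments: the new state and the current number of nginx processes.
notify_action() {
    state=$1
    nginx_count=$2
    case "$state" in
        master)
            if [ "$nginx_count" -lt 1 ]; then
                echo "start nginx, send mail"
            else
                echo "send mail"
            fi
            ;;
        backup)
            if [ "$nginx_count" -gt 0 ]; then
                echo "stop nginx"
            else
                echo "nothing to do"
            fi
            ;;
        *)
            echo "usage"
            ;;
    esac
}

notify_action master 0   # node becomes master while nginx is down
notify_action backup 2   # node becomes backup while nginx is running
```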
[root@master ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id lb01 #router id of this node: lb01
}
vrrp_script nginx_check { #VRRP check script
    script "/script/Ngchcek.sh" #path of the check script created above
    interval 10 #how often the script runs, in seconds
    weight -20
}
vrrp_instance VI_1 { #VRRP instance
    state MASTER #initial state: master
    interface ens160 #name of the local network interface
    virtual_router_id 51 #virtual router id
    priority 100 #priority 100
    advert_int 10 #VRRP advertisement (heartbeat) interval in seconds; the default is 1
    authentication { #authentication
        auth_type PASS #password authentication
        auth_pass fxx #the password
    }
    virtual_ipaddress { #the VIP
        192.168.94.250
    }
    track_script {
        nginx_check
    }
    notify_master "/script/notify.sh master 192.168.94.141"
    notify_backup "/script/notify.sh backup 192.168.94.143"
}
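virtual_server 192.168.94.250 80 { #virtual server definition
    delay_loop 6 #health-check polling interval, in seconds
    lb_algo rr #round-robin scheduling
    lb_kind DR #DR (direct routing) forwarding mode
    persistence_timeout 50 #session persistence timeout, in seconds
    protocol TCP #protocol: TCP
    real_server 192.168.94.141 80 { #real server definition
        weight 1 #weight 1
        TCP_CHECK {
            connect_port 80 #port to connect to
            connect_timeout 3 #connection timeout, in seconds
            nb_get_retry 3 #number of retries
            delay_before_retry 3 #delay before a retry, in seconds
        }
    }
    real_server 192.168.94.143 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}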
//Enable keepalived at boot and start it
[root@master ~]# systemctl enable --now keepalived.service
//Check the IP addresses
[root@master ~]# ip a |grep ens160
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 192.168.94.141/24 brd 192.168.94.255 scope global dynamic noprefixroute ens160
inet 192.168.94.250/32 scope global ens160 #the VIP is up
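The check above is done by eye. For use in a script, a small helper can answer whether a node currently holds the VIP. This is a sketch: `has_vip` is a made-up name, and it only looks for an `inet <address>/` entry in `ip a` style output read from stdin.

```shell
# Reads `ip a` style output on stdin and succeeds if the given IPv4
# address is configured. grep -F treats the dots in the address literally.
has_vip() {
    grep -qF "inet $1/"
}

# Usage on a live host (address and interface taken from this article):
#   ip a show ens160 | has_vip 192.168.94.250 && echo "VIP is on this node"
```

The trailing `/` in the pattern keeps `192.168.94.25` from matching `192.168.94.250`.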
Slave configuration
[root@slave ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id lb02 #router id of this node: lb02
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 51
    priority 90
    nopreempt
    advert_int 10 #VRRP advertisement (heartbeat) interval in seconds; the default is 1
    authentication {
        auth_type PASS
        auth_pass fxx
    }
    virtual_ipaddress {
        192.168.94.250
    }
    notify_master "/script/notify.sh master 192.168.94.141"
    notify_backup "/script/notify.sh backup 192.168.94.143"
}
virtual_server 192.168.94.250 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.94.141 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.94.143 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
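The `advert_int 10` setting also determines how long the slave waits before declaring the master dead when the master vanishes without notice, for example on power loss. RFC 3768 (VRRPv2) defines this wait as 3 × Advertisement_Interval + Skew_Time, with Skew_Time = (256 − Priority) / 256 seconds. The sketch below evaluates that formula for the values in this article. It assumes keepalived runs VRRP version 2 and follows the RFC timing; `master_down_interval` is a made-up helper name.

```shell
# master_down_interval <advert_int in seconds> <priority of the backup>
# VRRPv2 (RFC 3768): 3 * Advertisement_Interval + (256 - Priority) / 256
master_down_interval() {
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

# Values from the slave configuration: advert_int 10, priority 90.
master_down_interval 10 90   # prints 30.648
```

So with a 10 second advertisement interval, an unannounced master failure takes roughly half a minute to detect; with the default `advert_int 1` the same formula gives a little under 4 seconds.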
[root@slave ~]# mkdir /script
[root@master script]# scp /script/notify.sh root@192.168.94.143:/script
notify.sh 100% 583 562.4KB/s 00:00
[root@slave ~]# systemctl enable --now keepalived.service
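Both nodes must agree on `virtual_router_id`, `advert_int` and `auth_pass`, otherwise they ignore each other's advertisements. The sketch below compares such values between two configuration files. The function names and the `/tmp/slave.conf` path in the usage comment are hypothetical, and the parser is deliberately naive: it takes the first line whose first word is the keyword and ignores `#` comments.

```shell
# vrrp_value <config file> <keyword>
# Prints the first value that follows the keyword, ignoring "#" comments.
vrrp_value() {
    awk -v key="$2" '{ sub(/#.*/, "") } $1 == key { print $2; exit }' "$1"
}

# same_vrrp_setting <file A> <file B> <keyword>
# Succeeds when both files carry the same value for the keyword.
same_vrrp_setting() {
    [ "$(vrrp_value "$1" "$3")" = "$(vrrp_value "$2" "$3")" ]
}

# Usage, assuming the slave's file was copied to /tmp/slave.conf:
#   for key in virtual_router_id advert_int auth_pass; do
#       same_vrrp_setting /etc/keepalived/keepalived.conf /tmp/slave.conf "$key" \
#           || echo "mismatch: $key"
#   done
```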
Testing
//Make sure that on the slave keepalived is running and nginx is stopped
[root@slave ~]# systemctl status keepalived.service |grep running
Active: active (running) since Thu 2021-05-20 21:41:24 CST; 19min ago
[root@slave ~]# systemctl status nginx.service |grep dead
Active: inactive (dead)
//Make sure both services are running on the master
[root@master ~]# systemctl status keepalived.service |grep running
Active: active (running) since Thu 2021-05-20 21:41:24 CST; 19min ago
[root@master ~]# systemctl status nginx.service |grep running
Active: active (running) since Thu 2021-05-20 21:41:28 CST; 19min ago
Access the VIP
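Rather than reloading the page by hand, the VIP can be polled from a third machine while the failover steps are carried out, so the moment the answer changes from one node's page to the other's is visible. The sketch below is one way to do this. `poll_vip` and the `POLL_INTERVAL` variable are made-up names; the fetch command is passed in as arguments so the loop can be tried without a network.

```shell
# poll_vip <count> <command...>
# Runs the fetch command <count> times and prints one result per try.
# POLL_INTERVAL is the pause between tries in seconds (default 1).
poll_vip() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        # A failing or timed-out fetch is reported as "unreachable".
        body=$("$@" 2>/dev/null) || body="unreachable"
        printf '%s\n' "$body"
        i=$((i + 1))
        sleep "${POLL_INTERVAL:-1}"
    done
}

# Usage against the VIP from this article:
#   poll_vip 60 curl -s --max-time 2 http://192.168.94.250/
```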
//Simulate an nginx failure on the master
[root@master ~]# pkill nginx
[root@master ~]# systemctl status nginx.service
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; di>
Active: inactive (dead)
//The check script finds no nginx process, so it stops keepalived
[root@master ~]# systemctl status keepalived.service
● keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service;>
Active: inactive (dead) since Thu 2021-05-20 22:04:55 CST; >
Process: 1616 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPT>
Main PID: 1617 (code=exited, status=0/SUCCESS)
Access the VIP again
//On the slave, nginx has been started
[root@slave ~]# systemctl status nginx.service
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor p>
Active: active (running) since Thu 2021-05-20 22:04:55 CST; 3min 53s ago
When the master recovers
//Start nginx and keepalived on the master
[root@master ~]# systemctl start nginx.service
[root@master ~]# systemctl start keepalived.service
//nginx on the slave is now down
[root@slave ~]# systemctl status nginx.service
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor p>
Active: failed (Result: timeout) since Thu 2021-05-20 22:13:05 CST; 1min >
Access the VIP