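High Availability with Keepalived
Originally published on cu: 2017-03-27
References:
- keepalived user guide: http://www.keepalived.org/pdf/UserGuide.pdf
- Installation guide: the INSTALL file in the unpacked source tarball
This article covers installing keepalived, a basic configuration, and using it to provide high availability for haproxy.
I. Environment preparation
1. Operating system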
CentOS-7-x86_64-Everything-1511
2. Keepalived version
As of 2017-03-22, the keepalived version is 1.3.5:
http://www.keepalived.org/software/keepalived-1.3.5.tar.gz
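3. Topology
- node1 and node2 are two servers virtualized on VMware ESXi; their front-end addresses are 10.11.4.151/152 and their back-end addresses are 192.168.4.151/152;
- Web1 is a server created with docker, with nginx and php installed and running; its IP address is 192.168.4.171;
- Web2 and Web3 are set up the same way as Web1, with IP addresses 192.168.4.172/173;
- The plan is to deploy keepalived and haproxy on node1 and node2, and to use keepalived to provide the VIP 10.11.4.150 for high availability;
- For the haproxy configuration, see http://www.cnblogs.com/netonline/p/7593762.html. After adjustment, static pages are directed to index.html on web1/2, dynamic pages to index.php on web1/2, and everything else to web3;
- Taking web1 as an example, set up a test page so that the verification results are easy to read later.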
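The test page can be as simple as a file that names the server it came from. A minimal sketch, assuming a hypothetical helper make_test_page and assuming nginx serves files from /usr/share/nginx/html (the original does not state the web root, so adjust the path):

```shell
#!/bin/sh
# make_test_page NAME IP: print a tiny HTML page that identifies the backend.
# Hypothetical helper for illustration; not part of nginx or haproxy.
make_test_page() {
    printf '<html><body><h1>%s (%s)</h1></body></html>\n' "$1" "$2"
}

# Example for web1. On the real host, redirect the output into the web root,
# e.g. (assumed path):
#   make_test_page web1 192.168.4.171 > /usr/share/nginx/html/index.html
make_test_page web1 192.168.4.171
```

With a different name and address on each backend, the page returned through the VIP shows which server answered.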
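II. Installing and configuring Keepalived
All of the following steps are performed on node1; for node2, follow the node1 steps with the appropriate changes.
1. Dependencies
# Upgrade or install the related packages; not every one of them necessarily needs installing.
# libnl3-devel, ipset-devel, iptables-devel, libnfnetlink-devel, popt, popt-static, popt-devel and so on are usually not preinstalled.
# net-snmp-devel is only needed if the related feature is enabled.
[root@elk-node1 ~]# yum install openssl-devel libnl3-devel ipset-devel iptables-devel libnfnetlink-devel popt popt-static popt-devel gcc kernel-headers kernel-devel net-snmp-devel -y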
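To see which of these packages are actually missing before running yum, something like the following can help. It is only a sketch: missing_packages is a hypothetical helper, and it relies on "rpm -q" returning a non-zero status for packages that are not installed.

```shell
#!/bin/sh
# missing_packages PKG...: print the packages that `rpm -q` reports as not installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}

# Only run the query where rpm is available (for example on the CentOS nodes).
if command -v rpm >/dev/null 2>&1; then
    missing_packages openssl-devel libnl3-devel ipset-devel iptables-devel \
        libnfnetlink-devel popt popt-static popt-devel gcc kernel-headers \
        kernel-devel net-snmp-devel
fi
```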
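2. Download
[root@elk-node1 ~]# cd /usr/local/src/
[root@elk-node1 src]# wget http://www.keepalived.org/software/keepalived-1.3.5.tar.gz
3. Build and install
# Before building, run "./configure --help" to see the available build options.
# This build does not pass "--with-kernel-dir". Pointing the build at the kernel with "--with-kernel-dir=/usr/src/kernels/(version)" is generally considered better, but the environment here is simple and no obvious problems appeared in actual use.
# The option is left out because on CentOS 7 the build could not find the "linux/netlink.h" header when it was used, even though the header exists in the corresponding directory; a search turned up no solution.
[root@elk-node1 src]# tar -zxvf keepalived-1.3.5.tar.gz
[root@elk-node1 src]# cd keepalived-1.3.5
[root@elk-node1 keepalived-1.3.5]# ./configure --prefix=/usr/local/keepalived
[root@elk-node1 keepalived-1.3.5]# make
[root@elk-node1 keepalived-1.3.5]# make install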
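A quick sanity check after "make install" is to confirm that the binary landed under the chosen prefix and to print its version. A minimal sketch, assuming the prefix from the configure line above; is_executable is a hypothetical helper:

```shell
#!/bin/sh
# Install prefix, taken from the ./configure line above.
PREFIX=/usr/local/keepalived

# is_executable PATH: succeed if PATH exists and is executable.
is_executable() {
    [ -x "$1" ]
}

if is_executable "$PREFIX/sbin/keepalived"; then
    # -v prints the keepalived version information.
    "$PREFIX/sbin/keepalived" -v
else
    echo "keepalived not found under $PREFIX/sbin" >&2
fi
```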
4. Configure start on boot
1) Startup-related commands
# Symlinks
[root@elk-node1 ~]# cd /usr/local/keepalived/
[root@elk-node1 keepalived]# ln -s /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@elk-node1 keepalived]# ln -s /usr/local/keepalived/sbin/keepalived /usr/sbin/
2) Configuration file
# Symlink
[root@elk-node1 keepalived]# mkdir -p /etc/keepalived
[root@elk-node1 keepalived]# ln -s /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
3) Start on boot
# On CentOS 7 the source install does not provide "/etc/rc.d/init.d/keepalived" (the init script) by default, so it has to be created by hand. This assumes the startup commands, configuration file and so on have been placed in the directories the script expects.
# When starting the service, you may need to run "systemctl daemon-reload" and then restart keepalived.
[root@elk-node1 keepalived]# touch /etc/rc.d/init.d/keepalived
[root@elk-node1 keepalived]# chmod +x /etc/rc.d/init.d/keepalived
[root@elk-node1 keepalived]# vim /etc/rc.d/init.d/keepalived
#!/bin/sh
#
# keepalived   High Availability monitor built upon LVS and VRRP
#
# chkconfig:   - 86 14
# description: Robust keepalive facility to the Linux Virtual Server project \
#              with multilayer TCP/IP stack checks.

### BEGIN INIT INFO
# Provides: keepalived
# Required-Start: $local_fs $network $named $syslog
# Required-Stop: $local_fs $network $named $syslog
# Should-Start: smtpdaemon httpd
# Should-Stop: smtpdaemon httpd
# Default-Start:
# Default-Stop: 0 1 2 3 4 5 6
# Short-Description: High Availability monitor built upon LVS and VRRP
# Description:       Robust keepalive facility to the Linux Virtual Server
#                    project with multilayer TCP/IP stack checks.
### END INIT INFO

# Source function library.
. /etc/rc.d/init.d/functions

exec="/usr/sbin/keepalived"
prog="keepalived"
config="/etc/keepalived/keepalived.conf"

[ -e /etc/sysconfig/$prog ] && . /etc/sysconfig/$prog

lockfile=/var/lock/subsys/keepalived

start() {
    [ -x $exec ] || exit 5
    [ -e $config ] || exit 6
    echo -n $"Starting $prog: "
    daemon $exec $KEEPALIVED_OPTIONS
    retval=$?
    echo
    [ $retval -eq 0 ] && touch $lockfile
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    killproc $prog
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}

restart() {
    stop
    start
}

reload() {
    echo -n $"Reloading $prog: "
    killproc $prog -1
    retval=$?
    echo
    return $retval
}

force_reload() {
    restart
}

rh_status() {
    status $prog
}

rh_status_q() {
    rh_status &>/dev/null
}

case "$1" in
    start)
        rh_status_q && exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart)
        $1
        ;;
    reload)
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
exit $?

# Enable start on boot
[root@elk-node1 keepalived]# chkconfig --add keepalived
[root@elk-node1 keepalived]# chkconfig --level 35 keepalived on
[root@elk-node1 keepalived]# vim /usr/lib/systemd/system/keepalived.service
# Change PIDFile as follows:
PIDFile=/var/run/keepalived.pid
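The PIDFile change can also be made without an editor. The sketch below uses a hypothetical helper, set_pidfile, and demonstrates it on a scratch file whose contents are made up for illustration (they are not the unit file shipped with keepalived):

```shell
#!/bin/sh
# set_pidfile UNIT_FILE PID_PATH: rewrite the PIDFile= line of a unit file.
# Writes to a temporary copy first, then replaces the original.
set_pidfile() {
    sed "s|^PIDFile=.*|PIDFile=$2|" "$1" > "$1.new" && mv "$1.new" "$1"
}

# Demonstration on a scratch file, so nothing on the system is touched.
tmp_unit=$(mktemp)
printf '[Service]\nType=forking\nPIDFile=/old/path/keepalived.pid\n' > "$tmp_unit"
set_pidfile "$tmp_unit" /var/run/keepalived.pid
grep '^PIDFile=' "$tmp_unit"
rm -f "$tmp_unit"
```

On the node itself the call would be "set_pidfile /usr/lib/systemd/system/keepalived.service /var/run/keepalived.pid", followed by "systemctl daemon-reload" so that systemd picks up the change.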
5. Keepalived configuration file
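[root@elk-node1 ~]# vim /usr/local/keepalived/etc/keepalived/keepalived.conf
#=====================================================
# keepalived.conf
#-----------------------------------------------------
# 1. The keepalived configuration file is organized in blocks; each block is enclosed in {}
# 2. Lines starting with "#" or "!" are comments
# 3. The configuration falls into three categories:
#    (1) Global configuration: settings that apply to keepalived as a whole
#    (2) VRRPD configuration: the core part, which implements keepalived's high availability
#    (3) LVS configuration
#=====================================================
! Configuration File for keepalived

########################
# Global configuration
########################
# global_defs marks the global configuration block
global_defs {
    # notification_email sets the alert e-mail recipients; several may be listed, one per line.
    # E-mail alerts require the local Sendmail service to be running.
    notification_email {
        root@localhost.local
    }
    # Sender address, SMTP server address, and SMTP server connection timeout
    notification_email_from root@localhost.local
    smtp_server 10.11.4.151
    smtp_connect_timeout 30
    # Identifier of the server running keepalived; shown in the subject of alert e-mails
    router_id Haproxy_DEVEL
}

######################
# Service check configuration
######################
# Service probe; chk_haproxy is the name of the check, and a return value of 0 means the service is healthy
vrrp_script chk_haproxy {
    script "/usr/local/keepalived/etc/chk_haproxy.sh"
    # Probe once every second
    interval 1
    # Add 2 to the weight while haproxy is up
    # weight 2
}

######################
# VRRPD configuration
######################
# vrrp_instance marks the VRRPD configuration block; VI_1 is the instance name
vrrp_instance VI_1 {
    # Role of this keepalived node: MASTER (must be upper case) makes this host the primary, BACKUP makes it the standby.
    # Non-preemptive mode is used here, and nopreempt only takes effect on BACKUP, so both hosts are configured as BACKUP.
    state BACKUP
    # Network interface used for HA monitoring
    interface eth0
    # Virtual router ID, a number from 1 to 255.
    # All nodes of the same VRRP instance use the same ID: MASTER_ID = BACKUP_ID
    virtual_router_id 51
    # Node priority; the larger the number, the higher the priority.
    # Within one vrrp_instance: MASTER_PRIORITY > BACKUP_PRIORITY
    priority 100
    # Interval, in seconds, of the synchronization checks between MASTER and BACKUP
    advert_int 1
    # For production use, non-preemptive mode is recommended to avoid frequent switching and flapping
    nopreempt
    # Authentication type and password for communication between nodes; the main types are PASS and AH.
    # Within one vrrp_instance, MASTER and BACKUP must use the same password.
    authentication {
        auth_type PASS
        auth_pass 987654
    }
    # Virtual IP address (VIP), also called the floating IP.
    # Several may be listed, one per line.
    # keepalived adds the VIP to the system in the manner of "ip address add".
    virtual_ipaddress {
        10.11.4.150
    }
    # Script tracking; refers to the service check defined above
    track_script {
        chk_haproxy
    }
}

##############################################
# LVS configuration; here keepalived only provides high availability and does not do LVS
##############################################
# virtual_server marks the LVS configuration block
# Format: virtual_server VIP port [IP and port separated by a space]
# virtual_server 10.11.4.150 443 {
    # Health check interval, in seconds
#    delay_loop 6
    # Load scheduling algorithm; rr and wlc are common, others include lc, lblc, sh and dh
#    lb_algo rr
    # LVS forwarding mechanism; NAT, TUN and DR are available
#    lb_kind NAT
    # Session persistence time. This is very useful for dynamic pages and offers a good way to handle session sharing in a cluster.
    # A user's requests keep going to the same real server until the persistence time (the maximum idle time) is exceeded,
    # i.e. a user who does nothing on a dynamic page for 50s may then be dispatched to another node.
#    persistence_timeout 50
    # Forwarding protocol
#    protocol TCP
    # Start of a real server section [the IP is the real IP address]
    # Format: real_server realIP port [IP and port separated by a space]
#    real_server 192.168.201.100 443 {
        # Weight of the real server node; the larger the number, the higher the weight
#        weight 1
        # Health check: SSL_GET
#        SSL_GET {
            # URL to check over SSL; several may be given
#            url {
                # Full URL path
#                path /index.html
                # Digest of the SSL check result; it can be obtained with the genhash tool:
                # [root@elk-node1 bin]# /usr/local/keepalived/bin/genhash -s 192.168.4.171 -p 80 -u /index.html
#                digest ff20ad2481f97b1754ef3e12ecd3a9cc
#            }
#            url {
#                path /mrtg/
#                digest 9b3a0c85a887a256d6939da88aabd8cd
#            }
            # Connection timeout, in seconds
#            connect_timeout 3
            # Number of retries
#            nb_get_retry 3
            # Delay between retries
#            delay_before_retry 3
#        }
#    }
#}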
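The original only says that node2 needs "appropriate changes" and does not list them. One plausible change, since both nodes run as BACKUP with nopreempt, is to give node2 a lower priority than node1. A minimal sketch that derives node2's file from node1's; node2_conf is a hypothetical helper and the value 90 is an assumption, chosen only because it is lower than 100:

```shell
#!/bin/sh
# node2_conf PRIORITY: read node1's keepalived.conf on stdin and print a copy
# whose "priority" line carries the given value. Comment lines are left alone
# because the pattern only matches lines that start with optional whitespace.
node2_conf() {
    sed -E "s/^([[:space:]]*priority[[:space:]]+)[0-9]+/\1$1/"
}

# Demonstration on a two-line fragment of the configuration above.
printf '    virtual_router_id 51\n    priority 100\n' | node2_conf 90
```

On node1, "node2_conf 90 < /usr/local/keepalived/etc/keepalived/keepalived.conf" prints a candidate file for node2; virtual_router_id, the authentication block and the VIP stay identical on both nodes.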
6. Keepalived check script
# Check whether haproxy is running; if not, try to start it; if that fails, restart keepalived so that the VIP moves.
[root@elk-node1 ~]# touch /usr/local/keepalived/etc/chk_haproxy.sh
[root@elk-node1 ~]# chmod 755 /usr/local/keepalived/etc/chk_haproxy.sh
[root@elk-node1 ~]# vim /usr/local/keepalived/etc/chk_haproxy.sh
#!/bin/bash
# Check the haproxy process. If there is none, try to start it once and
# check again after 3s. If there is still none, restart keepalived to change state.
# 2017-03-22 v0.1
if [ "$(ps -C haproxy --no-header | wc -l)" -eq 0 ]; then
    /etc/rc.d/init.d/haproxy start
    sleep 3
    if [ "$(ps -C haproxy --no-header | wc -l)" -eq 0 ]; then
        /etc/rc.d/init.d/keepalived restart
    fi
fi

# Another way to check the haproxy process
#killall -0 haproxy
#if [[ $? -ne 0 ]]; then
#    /etc/rc.d/init.d/keepalived restart
#fi
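Because the configuration treats a return value of 0 as "healthy", a variant of the script can simply report the result through its exit status instead of restarting keepalived itself. This is a sketch of that alternative, not the script used in this article. service_alive and check_service are hypothetical helpers, and how keepalived reacts to a non-zero status (with no weight set, the instance is expected to enter FAULT state) should be confirmed against the keepalived documentation:

```shell
#!/bin/sh
# service_alive NAME: succeed if at least one process called NAME is running.
service_alive() {
    [ "$(ps -C "$1" --no-header 2>/dev/null | wc -l)" -gt 0 ]
}

# check_service NAME [START_COMMAND...]: return 0 if NAME is running. Otherwise
# run START_COMMAND once (if given), wait RECHECK_DELAY seconds (default 3),
# and return 0 only if NAME is running afterwards.
check_service() {
    name=$1
    shift
    service_alive "$name" && return 0
    if [ "$#" -gt 0 ]; then
        "$@" >/dev/null 2>&1 || true
    fi
    sleep "${RECHECK_DELAY:-3}"
    service_alive "$name"
}
```

In chk_haproxy.sh this would be called as "check_service haproxy /etc/rc.d/init.d/haproxy start" followed by "exit $?".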
III. Verification
1. Start
[root@elk-node1 ~]# service keepalived start
[root@elk-node2 ~]# service keepalived start
2. Check the logs
1) Node1
[root@elk-node1 ~]# tailf /var/log/messages
- Starts in BACKUP mode;
- Switches to MASTER mode;
- Obtains the VIP 10.11.4.150 and starts sending gratuitous ARP announcements.
2)Node2
[root@elk-node2 ~]# tailf /var/log/messages
- The two related child processes start;
- After starting, the node enters BACKUP mode.
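Rather than reading the whole log, the state changes can be filtered out. A minimal sketch, assuming the keepalived log lines carry "Keepalived" in the program tag and use the "MASTER STATE" / "BACKUP STATE" / "FAULT STATE" wording referred to in this article; vrrp_states is a hypothetical helper:

```shell
#!/bin/sh
# vrrp_states: read syslog lines on stdin and keep only keepalived lines that
# mention a VRRP state (MASTER, BACKUP or FAULT).
vrrp_states() {
    grep -E 'Keepalived.*(MASTER|BACKUP|FAULT) STATE'
}

# Run against the system log only where it exists and is readable.
if [ -r /var/log/messages ]; then
    vrrp_states < /var/log/messages | tail -n 20
fi
```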
3. VIP
# The VIP is added to the system with "ip address add", so it does not show up in "ifconfig" output
[root@elk-node1 ~]# ip address show eth0
The eth0 interface on node1 has obtained the VIP 10.11.4.150.
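The same check can be scripted, which is handy when comparing both nodes. A minimal sketch; has_vip is a hypothetical helper that looks for the address in the one-line output of "ip -o -4 addr show":

```shell
#!/bin/sh
# has_vip VIP: read `ip -o -4 addr show` output on stdin and succeed if VIP is listed.
# The fixed-string match on "inet <VIP>/" avoids matching longer addresses.
has_vip() {
    grep -qF "inet $1/"
}

VIP=10.11.4.150
IFACE=eth0
if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show dev "$IFACE" 2>/dev/null | has_vip "$VIP"; then
        echo "$VIP is on $IFACE"
    else
        echo "$VIP is not on $IFACE"
    fi
fi
```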
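4. Failover
1) Haproxy failure and restart
[root@elk-node1 ~]# date ; service haproxy stop
[root@elk-node1 ~]# date ; service haproxy status
- Stop the haproxy service by hand;
- Because the keepalived configuration file references a script that restarts haproxy, the service is running again within 1s.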
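To put a number on the recovery time instead of comparing two "date" outputs by eye, a small polling loop can be used. A minimal sketch; wait_until is a hypothetical helper with a one-second resolution:

```shell
#!/bin/sh
# wait_until TIMEOUT COMMAND...: run COMMAND once per second until it succeeds.
# Prints the number of seconds waited and returns 0, or returns 1 once
# TIMEOUT seconds have passed without success.
wait_until() {
    timeout=$1
    shift
    waited=0
    while ! "$@" >/dev/null 2>&1; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    echo "$waited"
}
```

On node1 this could be used as "service haproxy stop; wait_until 10 service haproxy status", which prints how many whole seconds passed before the status check succeeded again.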
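2) Node1 log
- The log shows the haproxy service stopping and then being started again;
- Keepalived enters FAULT STATE and then moves to BACKUP STATE;
- The VIP is removed from node1's eth0 interface.
3) Node2 log
- Node2 moves to MASTER STATE;
- Node2 obtains the VIP 10.11.4.150 and starts sending gratuitous ARP announcements.
4) Node2 VIP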
[root@elk-node2 ~]# ip address show eth0
The eth0 interface on node2 has obtained the VIP 10.11.4.150.