Implementing Redis Failover with Keepalived
I. Environment
OS version: RHEL 5.4 (64-bit)
Redis version: 2.8.17
Keepalived version: 1.1.15
master: 10.142.130.81
slave: 10.142.130.82
Virtual IP address (VIP): 10.142.130.83 (the address exposed to clients)
Redis install path: /app/tomcat/redis
Redis port: 6379
Keepalived configuration path: /etc/keepalived
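II. Design
1. When both master and slave are healthy, the VIP is bound to the master. The master serves clients and the slave replicates from it as a standby.
2. When the master goes down and the slave is healthy, the VIP floats to the slave, which takes over the service and turns off replication.
3. When the original master recovers, it syncs data from the slave but does not take the service back; it now acts as the standby.
4. When the slave goes down and the master is healthy, the master takes over the service and turns off replication. Once the slave recovers, it syncs data from the master and acts as the standby.
5. The cycle then repeats.
Note: RDB snapshot persistence is enabled on both master and slave.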
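The role changes in the design come down to two Redis commands: SLAVEOF NO ONE on the host that takes over, and SLAVEOF <peer> on the host that becomes the standby. A minimal sketch of that mapping follows; the function name redis_cmd_for_state and its arguments are illustrative assumptions and are not used by the scripts in this article.

```shell
#!/bin/bash
# Hypothetical helper: print the Redis command that matches a Keepalived state.
# $1 = state (MASTER or BACKUP), $2 = peer Redis IP, $3 = peer Redis port (default 6379).
redis_cmd_for_state() {
    local state="$1" peer_ip="$2" peer_port="${3:-6379}"
    case "${state}" in
        MASTER) echo "SLAVEOF NO ONE" ;;                   # stop replicating, serve clients
        BACKUP) echo "SLAVEOF ${peer_ip} ${peer_port}" ;;  # replicate from the peer
        *)      echo "unknown state: ${state}" >&2; return 1 ;;
    esac
}

# On the master host (10.142.130.81) the peer is the slave host.
redis_cmd_for_state MASTER 10.142.130.82
redis_cmd_for_state BACKUP 10.142.130.82
```

Each host passes the other host's address as the peer, which is why the two redis_backup.sh scripts later in this article differ only in the IP they use.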
III. Configuration Steps
1. Install Redis on both master and slave
# su -
# tar zxf redis-2.8.17.tar.gz
# cd redis-2.8.17
# make && make install
# cd ..
# mv redis-2.8.17 /app/tomcat/redis
# chown -R tomcat.app /app/tomcat/redis
2. Configure Redis
# su - tomcat
# cd /app/tomcat/redis
# mkdir bin rdb conf log
# mv redis.conf sentinel.conf ./conf/
# find /app/tomcat/redis/ -maxdepth 1 -type f -delete
# cd src
# mv mkreleasehdr.sh redis-benchmark redis-check-aof redis-check-dump redis-cli redis-sentinel redis-server ../bin/
Edit /app/tomcat/redis/conf/redis.conf and change the following settings:
daemonize yes
pidfile /app/tomcat/redis/redis.pid
port 6379
logfile /app/tomcat/redis/log/redis.log
dir /app/tomcat/redis/rdb
# Maximum memory Redis may use: 15 GB here; adjust to your environment.
maxmemory 16106127360
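The maxmemory value is a plain byte count, so it helps to see where 16106127360 comes from. As a small sketch (the function name gib_to_bytes is a hypothetical helper, not part of Redis):

```shell
#!/bin/bash
# Hypothetical helper: convert a size in GiB to the byte count used by maxmemory.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 15   # prints 16106127360
```

The comments at the top of the stock redis.conf also describe unit suffixes such as 1gb, which may be easier to read than a raw byte count if your build accepts them for this directive.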
3. Install Keepalived on both master and slave
# su
# tar zxf keepalived-1.1.15.tar.gz
# cd keepalived-1.1.15
# ./configure --sysconf=/etc --with-kernel-dir=/usr/src/kernels/2.6.18-164.el5-x86_64
# make
# make install
# cp /usr/local/sbin/keepalived /sbin/keepalived
4. Keepalived configuration on the master
# cd /etc/keepalived
# >keepalived.conf
# vi keepalived.conf #add the following content
! Configuration File for keepalived
global_defs {
router_id redis-master
}
vrrp_script Monitor_redis {
script "/etc/keepalived/scripts/redis_monitor.sh"
interval 2
weight 2
}
vrrp_instance VI_1 {
state BACKUP
interface bond0
virtual_router_id 51
mcast_src_ip 10.142.130.81
priority 100
nopreempt
advert_int 1
authentication {
auth_type PASS
auth_pass 1122
}
track_script {
Monitor_redis
}
virtual_ipaddress {
10.142.130.83
}
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh
}
# mkdir scripts
# cd scripts
# vi redis_monitor.sh #Redis health-check script
#!/bin/bash
value=$(/app/tomcat/redis/bin/redis-cli -h 10.142.130.81 -p 6379 get name)
if [ "${value}" == "test" ]; then
exit 0
else
/etc/init.d/keepalived stop
exit 1
fi
# vi redis_master.sh #runs on transition to MASTER; turns off replication
#!/bin/bash
REDISCLI="/app/tomcat/redis/bin/redis-cli"
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[master]" >> ${LOGFILE}
date >> ${LOGFILE}
echo "Being master...." >> ${LOGFILE} 2>&1
echo "Run SLAVEOF NO ONE cmd ..." >> ${LOGFILE}
${REDISCLI} SLAVEOF NO ONE >> ${LOGFILE} 2>&1
# vi redis_backup.sh #runs on transition to BACKUP; turns on replication
#!/bin/bash
REDISCLI="/app/tomcat/redis/bin/redis-cli"
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[backup]" >> ${LOGFILE}
date >> ${LOGFILE}
echo "Being slave...." >> ${LOGFILE} 2>&1
sleep 10
echo "Run SLAVEOF cmd ..." >> ${LOGFILE}
${REDISCLI} SLAVEOF 10.142.130.82 6379 >> ${LOGFILE} 2>&1
# vi redis_fault.sh
#!/bin/bash
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
# vi redis_stop.sh
#!/bin/bash
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
Make the scripts executable:
# chmod +x /etc/keepalived/scripts/*.sh
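The redis_monitor.sh script above stops Keepalived as soon as a single read of the check key fails, so one slow or dropped reply is enough to trigger a failover. A hedged variant that retries before giving up could look like the sketch below. The function name check_redis, the REDIS_GET override variable, and the default of three attempts are all assumptions made for illustration; they are not part of the original setup.

```shell
#!/bin/bash
# Sketch: tolerate transient failures by retrying the check-key read.
# REDIS_GET is a hypothetical override hook so the logic can be tried without Redis.
REDIS_GET="${REDIS_GET:-/app/tomcat/redis/bin/redis-cli -h 10.142.130.81 -p 6379 get name}"

check_redis() {
    local retries="${1:-3}" i=0 value
    while [ "${i}" -lt "${retries}" ]; do
        value=$(${REDIS_GET} 2>/dev/null)
        # Success as soon as one attempt returns the expected value.
        [ "${value}" = "test" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Possible use inside redis_monitor.sh:
# check_redis 3 || { /etc/init.d/keepalived stop; exit 1; }
```

The trade-off is slower detection: with three attempts and a one-second pause, a real outage takes a few seconds longer to be acted on, and the whole check should still finish well within what the vrrp_script interval allows.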
5. Keepalived configuration on the slave
# cd /etc/keepalived
# >keepalived.conf
# vi keepalived.conf #add the following content
! Configuration File for keepalived
global_defs {
router_id redis_backup
}
vrrp_script Monitor_redis {
script "/etc/keepalived/scripts/redis_monitor.sh"
interval 2
weight 2
}
vrrp_instance VI_1 {
state BACKUP
interface bond0
virtual_router_id 51
mcast_src_ip 10.142.130.82
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1122
}
track_script {
Monitor_redis
}
virtual_ipaddress {
10.142.130.83
}
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh
}
# mkdir scripts
# cd scripts
# vi redis_monitor.sh #Redis health-check script
#!/bin/bash
value=$(/app/tomcat/redis/bin/redis-cli -h 10.142.130.82 -p 6379 get name)
if [ "${value}" == "test" ]; then
exit 0
else
/etc/init.d/keepalived stop
exit 1
fi
# vi redis_master.sh
#!/bin/bash
REDISCLI="/app/tomcat/redis/bin/redis-cli"
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[master]" >> ${LOGFILE}
date >> ${LOGFILE}
echo "Being master...." >> ${LOGFILE} 2>&1
echo "Run SLAVEOF NO ONE cmd ..." >> ${LOGFILE}
${REDISCLI} SLAVEOF NO ONE >> ${LOGFILE} 2>&1
# vi redis_backup.sh
#!/bin/bash
REDISCLI="/app/tomcat/redis/bin/redis-cli"
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[backup]" >> ${LOGFILE}
date >> ${LOGFILE}
echo "Being slave...." >> ${LOGFILE} 2>&1
sleep 10
echo "Run SLAVEOF cmd ..." >> ${LOGFILE}
${REDISCLI} SLAVEOF 10.142.130.81 6379 >> ${LOGFILE} 2>&1
# vi redis_fault.sh
#!/bin/bash
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[fault]" >> ${LOGFILE}
date >> ${LOGFILE}
# vi redis_stop.sh
#!/bin/bash
LOGFILE="/app/tomcat/redis/log/keepalived-redis-state.log"
echo "[stop]" >> ${LOGFILE}
date >> ${LOGFILE}
Make the scripts executable:
# chmod +x /etc/keepalived/scripts/*.sh
IV. Startup
To simplify later administration, write a simple script that restarts Redis:
# vi restart-redis.sh
#!/bin/bash
ps -ef|awk '/app\/tomcat\/redis\/bin\/redis-server/{print $2}'|xargs kill -9
/app/tomcat/redis/bin/redis-server /app/tomcat/redis/conf/redis.conf
Startup order
1. Start Redis on both master and slave.
2. On both master and slave, set the key and value used by the health check:
# /app/tomcat/redis/bin/redis-cli set name test
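3. Start Keepalived on the slave first. As root, run /etc/init.d/keepalived start
4. Start Keepalived on the master the same way.
Notes:
Run ip addr show bond0 to see which host holds the VIP, and run redis-cli info to check the replication roles. The host holding the VIP must be the Redis master.
If it is not, stop Keepalived manually on the appropriate host to correct it; the VIP and the replication roles must agree. Once they are right before first use, later automatic switchovers should generally work.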
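The manual check described in the notes can be scripted. The following is a minimal sketch, assuming the VIP appears in the ip addr output as an inet line with a prefix length and that redis-cli info replication prints a role: field; the function name vip_role_status is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: report whether the VIP holder and the Redis role agree on this host.
# Both inputs are passed in as text so the logic can be tried by hand.
vip_role_status() {
    local ip_output="$1" info_output="$2" vip="${3:-10.142.130.83}"
    local has_vip=no role
    # Match the VIP followed by "/" (prefix length) to avoid partial matches.
    if echo "${ip_output}" | grep -qF "inet ${vip}/"; then
        has_vip=yes
    fi
    # INFO lines end with CR LF; strip the CR before reading the role field.
    role=$(echo "${info_output}" | tr -d '\r' | awk -F: '/^role:/{print $2}')
    if [ "${has_vip}" = "yes" ] && [ "${role}" != "master" ]; then
        echo "MISMATCH: VIP is bound here but Redis role is ${role}"
        return 1
    fi
    echo "OK: has_vip=${has_vip} role=${role}"
}

# Typical call on either host:
# vip_role_status "$(ip addr show bond0)" "$(/app/tomcat/redis/bin/redis-cli info replication)"
```

The sketch only flags the case the notes warn about, a VIP holder that is not the Redis master. It does not try to repair anything.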
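V. Monitoring
When the master dies, the slave takes over. How is the failed master then brought back so that it rejoins as a slave? This requires deploying a monitoring script on each host, run once every minute. The script is as follows: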
# cat monitor_redis.sh
#!/bin/bash
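# Keepalived normally runs as three processes (parent, checker, VRRP).
num_proc=$(/bin/ps -ef | grep -c '[k]eepalived')
active=$(/app/tomcat/redis/bin/redis-cli get name)
if [ "${num_proc}" != "3" ]; then
    if [ "${active}" != "test" ]; then
        # Redis is down or the check key is missing: restart Redis and re-read the key.
        /app/tomcat/bin/restart-redis.sh &>/dev/null
        active=$(/app/tomcat/redis/bin/redis-cli get name)
        if [ "$(/bin/ps -ef | grep -c '[r]edis-server')" == "1" ] && [ "${active}" == "test" ]; then
            sleep 30
            /usr/bin/sudo /etc/init.d/keepalived start &>/dev/null
        else
            # Key not readable yet: set it, re-read it, and start Keepalived only if it is there.
            /app/tomcat/redis/bin/redis-cli set name test
            active=$(/app/tomcat/redis/bin/redis-cli get name)
            [ "${active}" == "test" ] && /usr/bin/sudo /etc/init.d/keepalived start &>/dev/null
        fi
    else
        /usr/bin/sudo /etc/init.d/keepalived start &>/dev/null
    fi
fi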
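One way to run the script every minute is cron. The sketch below only builds the crontab line. The script location /app/tomcat/bin/monitor_redis.sh is a hypothetical path (the article does not say where the script is stored), and cron_line is an illustrative helper name.

```shell
#!/bin/bash
# Sketch: build a crontab line that runs a script once a minute.
# The script path passed in is an assumption; adjust it to the real location.
cron_line() {
    printf '* * * * * %s >/dev/null 2>&1\n' "$1"
}

cron_line /app/tomcat/bin/monitor_redis.sh

# To install it for the current user (appends to any existing entries):
# ( crontab -l 2>/dev/null; cron_line /app/tomcat/bin/monitor_redis.sh ) | crontab -
```

Because the script calls sudo, the account that owns the crontab needs passwordless sudo rights for /etc/init.d/keepalived, and on some RHEL setups sudo may refuse to run without a terminal unless requiretty is relaxed for that account.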
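VI. Summary
1. Keepalived's master and backup are normally designated by state and priority, but those two settings alone cannot meet the switchover requirements above. Here state is set to BACKUP on both hosts and priority decides which host initially becomes master. In addition, the host with the higher priority needs nopreempt, which keeps Keepalived from reclaiming the master role after it recovers; it takes over automatically only after the other side goes down.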
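2. Keepalived can start properly only after Redis has started successfully and a get returns the configured value.
3. As dump.rdb grows, Redis takes longer and longer to load the data fully into memory after a restart, so the sleep time in monitor_redis.sh needs to be adjusted accordingly.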
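Instead of tuning a fixed sleep as the dump grows, one option is to poll Redis until it reports that loading has finished. This is a sketch under assumptions: it relies on the loading field of the INFO persistence section, and the names wait_for_load and REDIS_INFO as well as the 300-second default timeout are illustrative, not part of the original scripts.

```shell
#!/bin/bash
# Sketch: poll until Redis reports that the RDB load has finished, with a timeout.
# REDIS_INFO is a hypothetical override hook so the loop can be tried without Redis.
REDIS_INFO="${REDIS_INFO:-/app/tomcat/redis/bin/redis-cli info persistence}"

wait_for_load() {
    local timeout="${1:-300}" waited=0
    while [ "${waited}" -lt "${timeout}" ]; do
        # INFO reports "loading:0" once the dataset is fully in memory.
        if ${REDIS_INFO} 2>/dev/null | tr -d '\r' | grep -q '^loading:0$'; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Possible use in monitor_redis.sh in place of the fixed sleep:
# wait_for_load 300 && /usr/bin/sudo /etc/init.d/keepalived start
```

If Redis is not running at all, the poll simply times out and returns non-zero, so the caller can skip starting Keepalived for that round.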