Deploying a Redis Sentinel cluster with Docker
A simple Dockerfile, used here to demonstrate Sentinel failover
FROM centos:latest
MAINTAINER BIXIAOYU
RUN groupadd -r redis && useradd -r -g redis redis
RUN yum -y update && yum -y install epel-release && yum -y install redis && yum -y install net-tools
EXPOSE 6379
# docker build --no-cache -t redis .
Start the Docker container instances
docker run -itd --name redis-master --net=mynetwork -p 6383:6379 --ip 172.60.0.2 redis
docker run -itd --name redis-slave1 --net=mynetwork -p 6384:6379 --ip 172.60.0.3 redis
docker run -itd --name redis-slave2 --net=mynetwork -p 6385:6379 --ip 172.60.0.4 redis
docker run -itd --name redis-sentinel1 --net=mynetwork -p 22536:26379 --ip 172.60.0.5 redis
docker run -itd --name redis-sentinel2 --net=mynetwork -p 22537:26379 --ip 172.60.0.6 redis
docker run -itd --name redis-sentinel3 --net=mynetwork -p 22538:26379 --ip 172.60.0.7 redis
The mynetwork network has to exist before the containers are started; create it first with:
# docker network create --subnet=172.60.0.0/16 mynetwork
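Static IPs are easy to get wrong when typing six docker run lines by hand: Docker refuses to start a container whose --ip is already taken on the network. The following is a minimal sketch (not part of the original setup) that checks a node plan before any container is started. The NODES table, check_plan and docker_run are illustrative names, and the addresses assume the Redis nodes sit at 172.60.0.2 to 172.60.0.4 and the Sentinels at 172.60.0.5 to 172.60.0.7, which is what the Sentinel log in this post suggests.

```python
import ipaddress

SUBNET = ipaddress.ip_network("172.60.0.0/16")

# (container name, host port, container port, static IP) -- assumed layout
NODES = [
    ("redis-master", 6383, 6379, "172.60.0.2"),
    ("redis-slave1", 6384, 6379, "172.60.0.3"),
    ("redis-slave2", 6385, 6379, "172.60.0.4"),
    ("redis-sentinel1", 22536, 26379, "172.60.0.5"),
    ("redis-sentinel2", 22537, 26379, "172.60.0.6"),
    ("redis-sentinel3", 22538, 26379, "172.60.0.7"),
]


def check_plan(nodes, subnet):
    """Return a list of problems; an empty list means the plan looks usable."""
    problems = []
    ip_owner = {}
    port_owner = {}
    for name, host_port, _container_port, ip in nodes:
        if ipaddress.ip_address(ip) not in subnet:
            problems.append(f"{name}: {ip} is outside {subnet}")
        if ip in ip_owner:
            problems.append(f"{name}: IP {ip} already used by {ip_owner[ip]}")
        else:
            ip_owner[ip] = name
        if host_port in port_owner:
            problems.append(
                f"{name}: host port {host_port} already used by {port_owner[host_port]}"
            )
        else:
            port_owner[host_port] = name
    return problems


def docker_run(name, host_port, container_port, ip,
               network="mynetwork", image="redis"):
    """Build the docker run command line for one node."""
    return (f"docker run -itd --name {name} --net={network} "
            f"-p {host_port}:{container_port} --ip {ip} {image}")


for problem in check_plan(NODES, SUBNET):
    print("PROBLEM:", problem)
for node in NODES:
    print(docker_run(*node))
```

Only when check_plan reports nothing are the printed commands worth pasting into a shell.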
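【Configuration】
Set up master-replica replication first. The replication configuration is covered in https://www.cnblogs.com/bixiaoyu/p/10706811.html, so it is not repeated here; the focus of this post is the Sentinel configuration.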
# Addresses to listen on; 0.0.0.0 accepts connections on every interface
bind 0.0.0.0
# Turn protected mode off so that other hosts can connect without a password
protected-mode no
port 26379
dir "/tmp"
sentinel myid 6d0d4099c13cdeab018e1f2005455be6f1cd6f6b
# Master to monitor: name, IP, port, quorum
sentinel monitor mymaster 172.60.0.3 6379 2
sentinel config-epoch mymaster 1
sentinel leader-epoch mymaster 1
logfile "/var/log/redis/sentinel.log"
# The myid, epoch and known-* lines are written by Sentinel itself at runtime
sentinel known-slave mymaster 172.60.0.2 6379
sentinel known-slave mymaster 172.60.0.4 6379
sentinel known-sentinel mymaster 172.60.0.7 26379 dfce433e021aa3e82276974aa12fa0684fb0b4f0
sentinel known-sentinel mymaster 172.60.0.6 26379 ceb363cf84103950cfa2a785816c4e8a36c02143
sentinel current-epoch 1
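Only a few of the lines above have to be written by hand; Sentinel adds the rest when it rewrites its own configuration file. As a rough sketch, the hand-written part could be generated as follows. The helper name render_sentinel_conf is made up for illustration. The master address 172.60.0.2 in the usage line assumes the state before any failover, and the timing values are the ones that SENTINEL masters reports later in this post.

```python
def render_sentinel_conf(master_name, master_ip, master_port, quorum,
                         down_after_ms=30000, failover_timeout_ms=180000,
                         parallel_syncs=1):
    """Render the operator-written part of a sentinel.conf as text."""
    lines = [
        "bind 0.0.0.0",
        "protected-mode no",
        "port 26379",
        'dir "/tmp"',
        'logfile "/var/log/redis/sentinel.log"',
        # "sentinel monitor" has to come before the per-master options below.
        f"sentinel monitor {master_name} {master_ip} {master_port} {quorum}",
        f"sentinel down-after-milliseconds {master_name} {down_after_ms}",
        f"sentinel failover-timeout {master_name} {failover_timeout_ms}",
        f"sentinel parallel-syncs {master_name} {parallel_syncs}",
    ]
    return "\n".join(lines) + "\n"


# Assumed initial topology: the master is 172.60.0.2 and the quorum is 2.
print(render_sentinel_conf("mymaster", "172.60.0.2", 6379, 2), end="")
```

The same text can be used for all three Sentinel containers, since each Sentinel discovers the replicas and the other Sentinels through the master.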
Pick any Sentinel node and look at its log: the initial master is 172.60.0.2.
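【Testing failover】
First use pkill to simulate a crash of the master, then watch what happens on any of the Sentinel nodes.
The Sentinel log below shows that 172.60.0.3 was elected as the new master.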
[root@ef77b61448fc /]# tail /var/log/redis/sentinel.log
68:X 21 Apr 07:51:26.472 * +sentinel sentinel ceb363cf84103950cfa2a785816c4e8a36c02143 172.60.0.6 26379 @ mymaster 172.60.0.2 6379
68:X 21 Apr 07:53:24.154 * +sentinel sentinel dfce433e021aa3e82276974aa12fa0684fb0b4f0 172.60.0.7 26379 @ mymaster 172.60.0.2 6379
68:X 21 Apr 08:16:05.724 # +sdown master mymaster 172.60.0.2 6379
68:X 21 Apr 08:16:05.895 # +new-epoch 1
68:X 21 Apr 08:16:05.895 # +vote-for-leader dfce433e021aa3e82276974aa12fa0684fb0b4f0 1
68:X 21 Apr 08:16:06.276 # +config-update-from sentinel dfce433e021aa3e82276974aa12fa0684fb0b4f0 172.60.0.7 26379 @ mymaster 172.60.0.2 6379
68:X 21 Apr 08:16:06.276 # +switch-master mymaster 172.60.0.2 6379 172.60.0.3 6379
68:X 21 Apr 08:16:06.276 * +slave slave 172.60.0.4:6379 172.60.0.4 6379 @ mymaster 172.60.0.3 6379
68:X 21 Apr 08:16:06.276 * +slave slave 172.60.0.2:6379 172.60.0.2 6379 @ mymaster 172.60.0.3 6379
(Log output as seen on the Sentinel node)
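The line that matters in this log is +switch-master, which carries the master name followed by the old and the new address. The following small sketch pulls that event out of a log excerpt. The function name find_switches is made up for illustration, and the sample text is copied from the log above.

```python
import re

# +switch-master <master name> <old ip> <old port> <new ip> <new port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def find_switches(log_text):
    """Return (master name, (old ip, old port), (new ip, new port)) tuples."""
    return [
        (m["name"],
         (m["old_ip"], int(m["old_port"])),
         (m["new_ip"], int(m["new_port"])))
        for m in SWITCH_RE.finditer(log_text)
    ]


SAMPLE_LOG = """\
68:X 21 Apr 08:16:05.724 # +sdown master mymaster 172.60.0.2 6379
68:X 21 Apr 08:16:05.895 # +new-epoch 1
68:X 21 Apr 08:16:06.276 # +switch-master mymaster 172.60.0.2 6379 172.60.0.3 6379
68:X 21 Apr 08:16:06.276 * +slave slave 172.60.0.4:6379 172.60.0.4 6379 @ mymaster 172.60.0.3 6379
"""

for name, old, new in find_switches(SAMPLE_LOG):
    print(f"{name}: {old[0]}:{old[1]} -> {new[0]}:{new[1]}")
```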
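Now restart the old master 172.60.0.2 that was killed with pkill, and see what changes on the new master 172.60.0.3.
The restarted old master rejoins as a replica. Replication is still in progress at this point, so its offset has not yet caught up with the master's.
【Summary】
From a client connected to a Sentinel node, run SENTINEL masters to list every monitored master together with its state.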
127.0.0.1:26379> SENTINEL masters
1)  1) "name"
    2) "mymaster"    # name of the monitored master
    3) "ip"
    4) "172.60.0.3"    # IP of the monitored master
    5) "port"
    6) "6379"
    7) "runid"
    8) "dd3696a2793e4e19892fca48793d75cec3f07bea"    # run ID of the monitored master
    9) "flags"
   10) "master"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "847"
   19) "last-ping-reply"
   20) "847"
   21) "down-after-milliseconds"
   22) "30000"    # time in ms a node must be unreachable before this Sentinel considers it down
   23) "info-refresh"
   24) "6480"
   25) "role-reported"
   26) "master"
   27) "role-reported-time"
   28) "2547025"
   29) "config-epoch"
   30) "1"
   31) "num-slaves"    # number of replicas known for this master
   32) "2"
   33) "num-other-sentinels"    # number of other Sentinels monitoring this master
   34) "2"
   35) "quorum"    # number of Sentinels that must agree the master is unreachable before it is marked objectively down
   36) "2"
   37) "failover-timeout"    # failover timeout in ms
   38) "180000"
   39) "parallel-syncs"    # number of replicas that may resync with the new master at the same time after a failover
   40) "1"
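The reply is a flat list of alternating field names and values, which is awkward to read. The sketch below turns such a list into a dictionary and derives the two numbers that govern a failover: the quorum needed to declare the master objectively down, and the majority of all Sentinels that must authorize the failover itself. The function names are illustrative, and the sample list keeps only a few of the fields shown above.

```python
def pairs_to_dict(flat):
    """Turn ["name", "mymaster", "ip", ...] into {"name": "mymaster", ...}."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items")
    return dict(zip(flat[0::2], flat[1::2]))


def failover_requirements(master):
    """Summarise how many Sentinels are needed at each stage of a failover."""
    total = int(master["num-other-sentinels"]) + 1  # the others plus this one
    return {
        "sentinels": total,
        # Sentinels that must agree before the master is objectively down.
        "quorum": int(master["quorum"]),
        # Sentinels that must vote for the leader that performs the failover.
        "majority": total // 2 + 1,
    }


# A few of the fields from the SENTINEL masters output in this post.
REPLY = [
    "name", "mymaster",
    "ip", "172.60.0.3",
    "port", "6379",
    "num-slaves", "2",
    "num-other-sentinels", "2",
    "quorum", "2",
]

master = pairs_to_dict(REPLY)
print(master["name"], master["ip"], failover_requirements(master))
```

With three Sentinels and a quorum of 2, both thresholds are 2, so the setup tolerates the loss of one Sentinel.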
Run SENTINEL slaves mymaster to see the replicas; two of them are listed at this point.
1)  1) "name"
    2) "172.60.0.2:6379"
    3) "ip"
    4) "172.60.0.2"
    5) "port"
    6) "6379"
    7) "runid"
    8) "3ede55439a3ce6fb1ab171ed7fd6b6c639725966"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "537"
   19) "last-ping-reply"
   20) "537"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "4867"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "2525163"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "172.60.0.3"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "566056"
2)  1) "name"
    2) "172.60.0.4:6379"
    3) "ip"
    4) "172.60.0.4"
    5) "port"
    6) "6379"
    7) "runid"
    8) "f17cfcfc4b9217e1f5a3c0d0a2c55d82da46c37e"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "304"
   19) "last-ping-reply"
   20) "304"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "3480"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "2865146"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "172.60.0.3"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "566326"
Look up the address and port of the current master
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "172.60.0.3"
2) "6379"
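Clients normally use this command to find the master, rather than hard-coding its address. As a rough illustration of what happens on the wire, the sketch below encodes the command in the Redis protocol (RESP) and decodes a two-element reply using only the standard library. The function names are made up for this example, and the parser is deliberately simplistic: it assumes the whole reply arrives in a single recv call and that no value contains a line break. A real application would more likely use an existing Redis client library.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_bulk_array(reply):
    """Decode a RESP array of bulk strings; a nil array becomes None."""
    lines = reply.split(b"\r\n")
    if not lines[0].startswith(b"*"):
        raise ValueError("not a RESP array")
    count = int(lines[0][1:])
    if count < 0:
        return None
    # After the header, entries alternate: "$<length>" and then the payload.
    return [lines[2 + 2 * i].decode() for i in range(count)]


def get_master_addr(host, port, master_name, timeout=1.0):
    """Ask one Sentinel for the current master; returns (ip, port) or None."""
    request = encode_command("SENTINEL", "get-master-addr-by-name", master_name)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        reply = sock.recv(4096)  # simplification: assumes one complete reply
    parsed = parse_bulk_array(reply)
    return None if parsed is None else (parsed[0], int(parsed[1]))


# Decoding a canned reply that matches the output shown above.
print(parse_bulk_array(b"*2\r\n$10\r\n172.60.0.3\r\n$4\r\n6379\r\n"))
```

From the Docker host, get_master_addr("127.0.0.1", 22536, "mymaster") would query redis-sentinel1 through its published port. The address it returns is the container's IP on mynetwork, so it may only be reachable from inside that network.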
Run sentinel failover mymaster to force a switch of the master. Afterwards the master has changed from 172.60.0.3 to 172.60.0.2.
To be continued...