Redis Sentinel High-Availability Architecture

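There are many high-availability architectures for Redis today, such as keepalived+redis, Redis Cluster, twemproxy, and codis, each with its own strengths and weaknesses. This post sets those aside and focuses on the Redis Sentinel high-availability architecture.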

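Sentinel's main functions are:

  • Monitoring: it periodically checks whether Redis is running as expected;
  • Notification: if a Redis node runs into trouble, it can notify another process (for example, its clients);
  • Automatic failover: when a master becomes unavailable, Sentinel elects one of the master's slaves (if there is more than one) as the new master, and the remaining slaves are reconfigured to replicate from the address of the promoted slave.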

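For more detailed configuration and background, I recommend reading the following articles. I won't repeat them here and will go straight to the setup: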

http://segmentfault.com/a/1190000002680804

http://segmentfault.com/a/1190000002685515

The Redis Sentinel architecture is shown in the figure below:

 

Redis Sentinel recommends running three or more Sentinel nodes; the articles linked above explain why.
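The recommendation comes down to majority arithmetic: a failover has to be authorized by a majority of the known Sentinels, so the group keeps working only while fewer than half of them are lost. A minimal sketch of that arithmetic (the function names are illustrative and not part of Redis):

```python
def majority(total_sentinels):
    """Smallest number of Sentinels that forms a strict majority."""
    return total_sentinels // 2 + 1


def tolerated_failures(total_sentinels):
    """How many Sentinels can be lost while a majority is still reachable."""
    return total_sentinels - majority(total_sentinels)


for n in (1, 2, 3, 5):
    print("%d sentinels: majority=%d, tolerated failures=%d"
          % (n, majority(n), tolerated_failures(n)))
```

With two Sentinels, losing either one blocks failover, which is why three is the practical minimum. The five Sentinels used in this post can lose two and still fail over.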

Environment:

Five Redis Sentinel servers:

10.36.30.203
10.36.30.204
10.37.124.202
10.37.124.203
10.37.124.204

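Don't think of this as wasteful. It makes monitoring Redis safer and more reliable, and the Sentinels can be reused: one Sentinel group can monitor multiple Redis instances, so the servers are not wasted.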

Two Redis servers, one master and one slave:

10.69.25.173  master
10.69.30.170 slave

The configuration file on all five Sentinels is as follows:

port 26379
dir "/data/redis/sentinel/26379"
daemonize yes
logfile "/data/redis/sentinel/26379/sentinel.log"

# 6379
sentinel monitor master-6379 10.69.25.173 6379 3
sentinel down-after-milliseconds master-6379 15000
sentinel parallel-syncs master-6379 1
sentinel failover-timeout master-6379 180000
sentinel client-reconfig-script master-6379 /sh/redis/notify.py
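Because one Sentinel group can monitor several masters, each additional master only needs its own block like the "# 6379" one above. Here is a small sketch that renders such blocks as text. The helper name sentinel_stanza and the second master (master-6380 on port 6380) are made up for illustration and are not part of this setup:

```python
def sentinel_stanza(name, host, port, quorum,
                    down_after_ms=15000, failover_timeout_ms=180000):
    """Render the per-master block of sentinel.conf as text."""
    lines = [
        "# %d" % port,
        "sentinel monitor %s %s %d %d" % (name, host, port, quorum),
        "sentinel down-after-milliseconds %s %d" % (name, down_after_ms),
        "sentinel parallel-syncs %s 1" % name,
        "sentinel failover-timeout %s %d" % (name, failover_timeout_ms),
    ]
    return "\n".join(lines)


masters = [
    ("master-6379", "10.69.25.173", 6379),
    ("master-6380", "10.69.25.173", 6380),  # hypothetical second instance
]
config = "\n\n".join(sentinel_stanza(name, host, port, quorum=3)
                     for name, host, port in masters)
print(config)
```

Generating the blocks from one list keeps the five sentinel.conf files identical, which matters because every Sentinel must agree on the master names.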

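In this configuration, sentinel client-reconfig-script master-6379 /sh/redis/notify.py sends an alert email after a master/slave switch. For the meaning of the other parameters, see the articles linked above. Create the referenced directories yourself beforehand.
The notify.py script is shown below. It must exist on all five servers, because you cannot know in advance which Sentinel will be elected leader (I have not seen anyone online cover sending alert emails on failover):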

#!/usr/bin/python
#coding:utf8

import sys
import time
import smtplib
import logging
from email.mime.text import MIMEText
from email.message import Message
from email.header import Header


alarm_mail =['xxxxxx@163.com']

def main():
  
    failover_time=time.strftime("%Y-%m-%d %H:%M:%S")

    logging.basicConfig(level=logging.DEBUG,
                format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                datefmt='%Y-%m-%d %H:%M:%S',
                filename='/sh/redis/failover.log',
                filemode='a')

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
    console.setFormatter(formatter)
    logging.getLogger('').addHandler(console)

    mail_host='xxxxx'
    mail_port=25
    mail_user='xxxxxxx'
    mail_pass='xxxxxxxx'
    mail_send_from = 'xxxxxxx'

    def send_mail(to_list,sub,content):
        me=mail_send_from
        msg = MIMEText(content, _subtype='html', _charset='utf-8')
        msg['Subject'] = Header(sub,'utf-8')
        msg['From'] = Header(me,'utf-8')
        msg['To'] = ", ".join(to_list)
        try:
            smtp = smtplib.SMTP()
            smtp.connect(mail_host,mail_port)
            smtp.login(mail_user,mail_pass)
            smtp.sendmail(me,to_list, msg.as_string())
            smtp.close()
            return True
        except Exception as error:
            logging.error("Failed to send mail: %s" % (error))
            return False

    try:
        master_name = sys.argv[1]
        role = sys.argv[2]
        from_ip = sys.argv[4]
        from_port = sys.argv[5]
        to_ip = sys.argv[6]
        to_port = sys.argv[7]
    except Exception as error:
        logging.error('Failed to read arguments from Sentinel: %s ' % (error))
        sys.exit(1)

    sub='redis %s failover' % (master_name)
    notify_message = "%s %s failover finished. Sentinel found redis master %s:%s down and failed over to slave %s:%s" % (failover_time,master_name,from_ip,from_port,to_ip,to_port)

    # Only the Sentinel that led the failover sends the mail, so it goes out once.
    if role == 'leader':
        logging.info(notify_message)
        send_mail(alarm_mail,sub,notify_message)

if __name__ == "__main__":
    main()
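For reference, Sentinel calls a client-reconfig-script with seven arguments in this order: master-name, role, state, from-ip, from-port, to-ip, to-port. That is why the script above reads sys.argv[1] and sys.argv[2] and then skips to sys.argv[4]. A minimal sketch of that mapping, with illustrative names (ReconfigEvent, parse_reconfig_args) and a sample argument list whose state value is only a placeholder:

```python
from collections import namedtuple

ReconfigEvent = namedtuple(
    "ReconfigEvent",
    "master_name role state from_ip from_port to_ip to_port")


def parse_reconfig_args(argv):
    """Map sys.argv-style arguments (argv[0] is the script path) to fields."""
    if len(argv) < 8:
        raise ValueError("expected 7 arguments, got %d" % (len(argv) - 1))
    fields = list(argv[1:8])
    fields[4] = int(fields[4])  # from_port
    fields[6] = int(fields[6])  # to_port
    return ReconfigEvent(*fields)


# Sample values only; "<state>" stands in for whatever Sentinel passes.
sample = ["/sh/redis/notify.py", "master-6379", "leader", "<state>",
          "10.69.25.173", "6379", "10.69.30.170", "6379"]
event = parse_reconfig_args(sample)
print(event.role, event.from_ip, event.to_ip)
```

Naming the fields makes it harder to mix up the old and new master addresses than indexing sys.argv directly.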

10.69.25.173  master

10.69.30.170  slave

Install Redis on both machines yourself and set up replication between them.

 

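Now start Sentinel on each of the five Sentinel servers. There are two ways to start it (the other is redis-server sentinel.conf --sentinel); see the articles above for details.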

redis-sentinel sentinel.conf

After startup, check the log on any one of the servers. It shows output like this:

[18219] 12 Dec 09:56:47.161 # Sentinel runid is f3086fc39145cb3d832785899699050d2c7f3b08
[18219] 12 Dec 09:56:47.161 # +monitor master master-6379 10.69.25.173 6379 quorum 1
[18219] 12 Dec 09:56:47.183 * +slave slave 10.69.30.170:6379 10.69.30.170 6379 @ master-6379 10.69.25.173 6379

Here +slave means that a slave was discovered.

Now look at the log on another Sentinel server:

[1480] 12 Dec 09:58:37.250 # Sentinel runid is 812f9f8b860dcc73d4b587e3bdf85df13808a3cd
[1480] 12 Dec 09:58:37.250 # +monitor master master-6379 10.69.25.173 6379 quorum 1
[1480] 12 Dec 09:58:38.252 * +slave slave 10.69.30.170:6379 10.69.30.170 6379 @ master-6379 10.69.25.173 6379
[1480] 12 Dec 09:58:38.304 * +sentinel sentinel 10.36.30.204:26379 10.36.30.204 26379 @ master-6379 10.69.25.173 6379
[1480] 12 Dec 09:58:38.388 * +sentinel sentinel 10.37.124.202:26379 10.37.124.202 26379 @ master-6379 10.69.25.173 6379
[1480] 12 Dec 09:58:38.461 * +sentinel sentinel 10.37.124.203:26379 10.37.124.203 26379 @ master-6379 10.69.25.173 6379
[1480] 12 Dec 09:58:39.423 * +sentinel sentinel 10.37.124.204:26379 10.37.124.204 26379 @ master-6379 10.69.25.173 6379

+sentinel means that another Sentinel server was discovered. At this point the whole cluster is up and working.
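Event lines such as +slave and +sentinel follow a regular shape, so they are easy to pick out of the log with a script. The sketch below assumes the log format shown above (pid in brackets, timestamp, a # or * marker, then the event); the names LOG_LINE and parse_event are illustrative:

```python
import re

LOG_LINE = re.compile(
    r"^\[(?P<pid>\d+)\] (?P<time>\d+ \w+ [\d:.]+) [#*] "
    r"(?P<event>[+-][a-z-]+) (?P<details>.*)$")


def parse_event(line):
    """Return (event, details) for an event line, or None for other lines."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return match.group("event"), match.group("details")


lines = [
    "[1480] 12 Dec 09:58:37.250 # Sentinel runid is 812f9f8b860dcc73d4b587e3bdf85df13808a3cd",
    "[1480] 12 Dec 09:58:38.252 * +slave slave 10.69.30.170:6379 10.69.30.170 6379 @ master-6379 10.69.25.173 6379",
    "[1480] 12 Dec 09:58:38.304 * +sentinel sentinel 10.36.30.204:26379 10.36.30.204 26379 @ master-6379 10.69.25.173 6379",
]
for line in lines:
    print(parse_event(line))
```

Counting the +sentinel events on one node is a quick way to confirm that it has found the other four Sentinels.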

First, connect to a Sentinel (any one will do) to see which server is currently the master:

redis-cli -p 26379
127.0.0.1:26379> info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master-6379,status=ok,address=10.69.25.173:6379,slaves=1,sentinels=5
127.0.0.1:26379> 

The current master is 10.69.25.173:6379. Now kill the Redis process on that server and see whether a failover happens:

pkill -9 redis

An alert email is sent as well.
Checking again shows that the former slave has become the master:

127.0.0.1:26379> info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=master-6379,status=ok,address=10.69.30.170:6379,slaves=1,sentinels=5
127.0.0.1:26379> 
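The test boils down to comparing the master0 line before and after the kill, and that comparison can be scripted. A small sketch that parses the line into a dictionary (parse_master_line is an illustrative helper, not a Redis or redis-py function):

```python
def parse_master_line(line):
    """Turn 'master0:name=...,address=ip:port,...' into a dict."""
    _, _, body = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in body.split(","))
    host, _, port = fields["address"].rpartition(":")
    fields["address"] = (host, int(port))
    fields["slaves"] = int(fields["slaves"])
    fields["sentinels"] = int(fields["sentinels"])
    return fields


before = parse_master_line(
    "master0:name=master-6379,status=ok,address=10.69.25.173:6379,slaves=1,sentinels=5")
after = parse_master_line(
    "master0:name=master-6379,status=ok,address=10.69.30.170:6379,slaves=1,sentinels=5")

if before["address"] != after["address"]:
    print("master moved from %s:%d to %s:%d"
          % (before["address"] + after["address"]))
```

The name stays master-6379 across the failover; only the address changes, so clients should key on the name and look the address up.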

Likewise, if the killed Redis is started again, it is reconfigured as a slave of 10.69.30.170. The Sentinel log for the whole sequence:

[1480] 12 Dec 10:01:48.921 # +new-epoch 1
[1480] 12 Dec 10:01:48.933 # +vote-for-leader 92517289efcb4ae695eff3e064fde7f4e0e43a1f 1
[1480] 12 Dec 10:01:48.955 # +sdown master master-6379 10.69.25.173 6379
[1480] 12 Dec 10:01:48.955 # +odown master master-6379 10.69.25.173 6379 #quorum 1/1
[1480] 12 Dec 10:01:48.955 # Next failover delay: I will not start a failover before Sat Dec 12 10:07:49 2015
[1480] 12 Dec 10:01:50.067 # +config-update-from sentinel 10.37.124.203:26379 10.37.124.203 26379 @ master-6379 10.69.25.173 6379
[1480] 12 Dec 10:01:50.067 # +switch-master master-6379 10.69.25.173 6379 10.69.30.170 6379
[1480] 12 Dec 10:01:50.067 * +slave slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
[1480] 12 Dec 10:02:05.109 # +sdown slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
[1480] 12 Dec 10:03:19.241 # -sdown slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
[1480] 12 Dec 10:03:29.219 * +convert-to-slave slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379
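The key line in this log is +switch-master, whose payload is the master name followed by the old address and the new address. A sketch that extracts the payload from log lines in the format shown above (both function names are illustrative):

```python
def parse_switch_master(details):
    """Parse '<name> <old-ip> <old-port> <new-ip> <new-port>'."""
    name, old_ip, old_port, new_ip, new_port = details.split()
    return {
        "name": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


def find_switch(log_lines):
    """Return the first +switch-master event in the lines, or None."""
    marker = "+switch-master "
    for line in log_lines:
        if marker in line:
            return parse_switch_master(line.split(marker, 1)[1])
    return None


log = [
    "[1480] 12 Dec 10:01:50.067 # +config-update-from sentinel 10.37.124.203:26379 10.37.124.203 26379 @ master-6379 10.69.25.173 6379",
    "[1480] 12 Dec 10:01:50.067 # +switch-master master-6379 10.69.25.173 6379 10.69.30.170 6379",
    "[1480] 12 Dec 10:01:50.067 * +slave slave 10.69.25.173:6379 10.69.25.173 6379 @ master-6379 10.69.30.170 6379",
]
print(find_switch(log))
```

In this log the switch arrives through +config-update-from, which means this Sentinel was not the failover leader; it learned the new configuration from 10.37.124.203.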

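So how do clients learn that a failover has happened? In Java, the Jedis client handles this conveniently. For PHP or Python, we can do the check ourselves. Another option is DNS: update the DNS record after the switch.
Here is a simple daemon I wrote in Python (I don't know PHP, alas).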

#!/usr/bin/python
# Simple daemon: ask the Sentinels for the current master, then push tasks to it.
import binascii
import os
import time

import redis

sentinel_server=['10.36.30.203:26379','10.36.30.204:26379','10.37.124.202:26379','10.37.124.203:26379','10.37.124.204:26379']

def queue(host,port):
    # Push a random 32-character hex string onto the task queue.
    task=binascii.hexlify(os.urandom(16)).decode('ascii')
    r = redis.Redis(host=host, port=port, db=0)
    r.lpush('low_task_queue',task)

def get_sentinel():
    # Return (host, port) of the master reported by the first Sentinel that answers.
    for server in sentinel_server:
        host,port=server.split(':')
        try:
            r = redis.Redis(host=host, port=int(port))
            address=r.info('sentinel')['master0']['address'].split(':')
            return address[0],int(address[1])
        except Exception as error:
            print('connect to sentinel error: %s' % (error))
    raise RuntimeError('no sentinel reachable')

if __name__ == "__main__":
    master = None
    while True:
        try:
            if master is None:
                master = get_sentinel()
            queue(master[0],master[1])
        except Exception as error:
            print('connect redis error %s' % (error))
            master = None   # look the master up again on the next pass
            time.sleep(1)   # avoid a tight retry loop while failover is in progress
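The core of the daemon above is the "ask each Sentinel in turn until one answers" loop. The sketch below isolates that loop so it can be exercised without a live cluster. The ask callable is a stand-in for the real network query and is an assumption made for illustration; fake_ask simulates one unreachable Sentinel. For production use, redis-py also ships a redis.sentinel.Sentinel class with a discover_master method that serves the same purpose.

```python
def discover_master(sentinels, ask):
    """Return (host, port) from the first Sentinel that answers.

    sentinels: list of 'host:port' strings.
    ask: callable(host, port) -> (master_host, master_port); may raise.
    """
    errors = []
    for entry in sentinels:
        host, _, port = entry.rpartition(":")
        try:
            return ask(host, int(port))
        except Exception as error:
            errors.append("%s: %s" % (entry, error))
    raise RuntimeError("no sentinel reachable: " + "; ".join(errors))


def fake_ask(host, port):
    # Simulated query: the first Sentinel is down, the others answer.
    if host == "10.36.30.203":
        raise IOError("connection refused")
    return ("10.69.30.170", 6379)


sentinels = ["10.36.30.203:26379", "10.36.30.204:26379"]
print(discover_master(sentinels, fake_ask))
```

Raising when no Sentinel answers, instead of silently keeping a stale address, lets the caller decide whether to retry or stop.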

 

With DNS in the picture, the architecture could look like this:

That's it for a simple test; further testing is left to you.

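Summary:

Redis Sentinel is a fairly reliable way to achieve high availability, and I plan to use it in production later. Note that three or more Sentinel nodes are recommended. It is more dependable than keepalived+redis, which also cannot manage multiple instances and is more troublesome in that respect.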

 

References:

http://segmentfault.com/a/1190000002680804

http://segmentfault.com/a/1190000002685515

http://redis.io/topics/sentinel-clients

https://pypi.python.org/pypi/redis/

 

posted @ 2015-12-12 10:12  yayun