Redis Cluster is a facility for sharding data across multiple Redis nodes.
Redis Cluster does not support commands that operate on multiple keys mapped to different hash slots: executing them would require moving data between nodes, which cannot match standalone Redis performance
and could lead to unpredictable behavior under heavy load.
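Strictly speaking, multi-key commands fail only when the keys hash to different slots. Hash tags (only the {...} part of a key name is hashed) force related keys into the same slot, where multi-key commands do work. A quick illustration with made-up key names:
127.0.0.1:7000> MSET {user1000}.name Alice {user1000}.surname Smith
OK
127.0.0.1:7000> MGET {user1000}.name {user1000}.surname
1) "Alice"
2) "Smith"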
Redis Cluster also provides a degree of availability during partitions: in practical terms, the cluster keeps processing commands when some nodes fail or become unreachable.
Advantages of Redis Cluster:
Data is automatically sharded across the different nodes.
The cluster can keep processing commands when a subset of nodes fails or becomes unreachable.
Data sharding in Redis Cluster
Redis Cluster does not use consistent hashing; instead it introduces the concept of hash slots.
The cluster has 16384 hash slots. Each key is run through CRC16 and the result taken modulo 16384 to decide which slot the key lives in.
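Any node can tell you which slot a key maps to via the CLUSTER KEYSLOT command, which computes exactly this CRC16-mod-16384 value (the key name here is arbitrary):
127.0.0.1:7000> CLUSTER KEYSLOT foo
(integer) 12182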
Each node in the cluster is responsible for a portion of the hash slots. For example, in a 3-node cluster:
Node A holds slots 0 to 5500.
Node B holds slots 5501 to 11000.
Node C holds slots 11001 to 16383.
This design makes it easy to add or remove nodes. For instance, to add a new node D, I move some slots from nodes A, B, and C over to D.
To remove node A, I move A's slots to B and C, then drop the now-empty node A from the cluster.
Because moving hash slots from one node to another does not require stopping service, adding or removing nodes, or changing
how many slots a node holds, never makes the cluster unavailable.
The master-slave replication model of Redis Cluster
So that the cluster stays usable when a subset of nodes fails or cannot reach the majority, Redis Cluster uses a master-slave replication model in which every hash slot has from 1 (the master itself) to N replicas.
In our example cluster with nodes A, B, and C, if node B fails and there is no replication, the cluster becomes unavailable because the slot range 5501-11000 is gone. If instead we give every master a slave when creating the cluster (A1, B1, C1), the cluster consists of three master nodes and three slave nodes; when B fails, the cluster elects B1 as the new master and keeps serving, so no slots go missing and the cluster stays available.
However, if B and B1 fail at the same time, the cluster is still unavailable.
Redis Cluster consistency guarantees
Redis Cluster does not guarantee strong consistency. In practice this means that, under specific conditions, the cluster can lose writes it has already acknowledged to the client.
The first reason is that the cluster uses asynchronous replication. A write proceeds as follows:
The client writes a command to master B.
Master B replies to the client with the command's status.
Master B propagates the write to its slaves B1, B2, and B3.
The master replicates the write only after replying to the client: if every request had to wait for replication to complete,
the master's throughput would drop dramatically. This is a trade-off between performance and consistency.
Note: Redis Cluster may provide a way to perform synchronous writes in the future.
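In the meantime, the closest existing knob is the WAIT command (available since Redis 3.0): it blocks the client until the preceding writes have been acknowledged by a given number of slaves, trading latency for durability without being true synchronous replication. An illustrative session:
127.0.0.1:7000> SET key1 value1
OK
127.0.0.1:7000> WAIT 1 100
(integer) 1
WAIT takes the number of replicas to wait for and a timeout in milliseconds, and returns how many replicas actually acknowledged the write.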
The other scenario in which the cluster can lose writes is a network partition that leaves a client isolated with a minority of instances that includes at least one master.
For example, take a six-node cluster with masters A, B, C, their slaves A1, B1, C1, and a client Z1.
Suppose a network partition splits the cluster in two: the majority side holds nodes A, C, A1, B1, and C1, while the minority side holds node B and the client
Z1. Z1 can still write to master B. If the partition heals quickly, the cluster carries on normally; but if it lasts long enough for the majority
side to elect B1 as the new master, the writes Z1 sent to B in the meantime are lost.
Note that there is a bound on how long Z1 can keep sending writes to B during the partition: this window is the node timeout, an important Redis Cluster configuration option.
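The node timeout can be inspected like any other setting (the value shown is the default, not a recommendation):
127.0.0.1:7000> CONFIG GET cluster-node-timeout
1) "cluster-node-timeout"
2) "15000"
After cluster-node-timeout milliseconds, a master on the minority side of a partition stops accepting writes, which is what bounds the window of lost writes described above.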
1. Cluster setup:
Create six Redis Cluster instances on ports 7000-7005:
Create the 6 redis instances; the prepared redis_v1.2.tar.gz package can be used for this.
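For reference, a minimal sketch of the cluster-related settings each instance needs in its redis.conf (ports and file names assumed to follow this setup; the packaged configs may differ):
port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf
cluster-node-timeout 15000
appendonly yes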
Install the dependencies (redis-trib.rb is a Ruby script and needs the redis gem):
yum install -y ruby rubygems
gem sources -a http://ruby.sdutlinux.org/
gem install redis --version 3.0.1
Create the cluster:
/usr/local/redis/7000/sbin/redis-trib.rb create --replicas 1 \
192.168.30.141:7000 \
192.168.30.141:7001 \
192.168.30.141:7002 \
192.168.30.141:7003 \
192.168.30.141:7004 \
192.168.30.141:7005
#Parameter notes
create
--replicas 1 means we want one slave node created for every master node in the cluster
#The creation run looks like this:
[root@work-01 sbin]# /usr/local/redis/7000/sbin/redis-trib.rb create --replicas 1 192.168.30.141:7000 192.168.30.141:7001 192.168.30.141:7002 192.168.30.141:7003 192.168.30.141:7004 192.168.30.141:7005
>>> Creating cluster
Connecting to node 192.168.30.141:7000: OK
Connecting to node 192.168.30.141:7001: OK
Connecting to node 192.168.30.141:7002: OK
Connecting to node 192.168.30.141:7003: OK
Connecting to node 192.168.30.141:7004: OK
Connecting to node 192.168.30.141:7005: OK
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.30.141:7000
192.168.30.141:7001
192.168.30.141:7002
Adding replica 192.168.30.141:7003 to 192.168.30.141:7000
Adding replica 192.168.30.141:7004 to 192.168.30.141:7001
Adding replica 192.168.30.141:7005 to 192.168.30.141:7002
M: 6e3d07bfab1f6423e42d321cda33e907d2532d41 192.168.30.141:7000
slots:0-5460 (5461 slots) master
M: 72a8718117113835f29001ad5379eed51188483c 192.168.30.141:7001
slots:5461-10922 (5462 slots) master
M: 7122a00b25cac401f04cf0a9a795c6644ca8f1f4 192.168.30.141:7002
slots:10923-16383 (5461 slots) master
S: 15dd1525de8d5153302e17f1a458bc88a982709d 192.168.30.141:7003
replicates 6e3d07bfab1f6423e42d321cda33e907d2532d41
S: 6b42a5a1dc3db98bf04b78f7fc1f91876651adde 192.168.30.141:7004
replicates 72a8718117113835f29001ad5379eed51188483c
S: a626848063395b4db3ff1799f25000adeeb719ff 192.168.30.141:7005
replicates 7122a00b25cac401f04cf0a9a795c6644ca8f1f4
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.....
>>> Performing Cluster Check (using node 192.168.30.141:7000)
M: 6e3d07bfab1f6423e42d321cda33e907d2532d41 192.168.30.141:7000
slots:0-5460 (5461 slots) master
M: 72a8718117113835f29001ad5379eed51188483c 192.168.30.141:7001
slots:5461-10922 (5462 slots) master
M: 7122a00b25cac401f04cf0a9a795c6644ca8f1f4 192.168.30.141:7002
slots:10923-16383 (5461 slots) master
M: 15dd1525de8d5153302e17f1a458bc88a982709d 192.168.30.141:7003
slots: (0 slots) master
replicates 6e3d07bfab1f6423e42d321cda33e907d2532d41
M: 6b42a5a1dc3db98bf04b78f7fc1f91876651adde 192.168.30.141:7004
slots: (0 slots) master
replicates 72a8718117113835f29001ad5379eed51188483c
M: a626848063395b4db3ff1799f25000adeeb719ff 192.168.30.141:7005
slots: (0 slots) master
replicates 7122a00b25cac401f04cf0a9a795c6644ca8f1f4
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
#Check the status of the cluster nodes:
[root@work-01 sbin]# /usr/local/redis/7000/sbin/redis-trib.rb check 192.168.30.141:7000
Connecting to node 192.168.30.141:7000: OK
Connecting to node 192.168.30.141:7005: OK
Connecting to node 192.168.30.141:7001: OK
Connecting to node 192.168.30.141:7002: OK
Connecting to node 192.168.30.141:7003: OK
>>> Performing Cluster Check (using node 192.168.30.141:7000)
M: 6e3d07bfab1f6423e42d321cda33e907d2532d41 192.168.30.141:7000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: a626848063395b4db3ff1799f25000adeeb719ff 192.168.30.141:7005
slots: (0 slots) slave
replicates 7122a00b25cac401f04cf0a9a795c6644ca8f1f4
M: 72a8718117113835f29001ad5379eed51188483c 192.168.30.141:7001
slots:5461-10922 (5462 slots) master
0 additional replica(s)
M: 7122a00b25cac401f04cf0a9a795c6644ca8f1f4 192.168.30.141:7002
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 15dd1525de8d5153302e17f1a458bc88a982709d 192.168.30.141:7003
slots: (0 slots) slave
replicates 6e3d07bfab1f6423e42d321cda33e907d2532d41
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
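#A quick sanity check before going further: connect with redis-cli -c (which follows redirections). The key is arbitrary; the slot shown is what 'foo' should hash to given the allocation above:
[root@work-01 sbin]# /usr/local/redis/7000/sbin/redis-cli -c -p 7000
127.0.0.1:7000> set foo bar
-> Redirected to slot [12182] located at 192.168.30.141:7002
OK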
#Add a new node to the cluster. A newly added node always joins as a master with no data and no slots; when the cluster needs to promote a slave to replace a failed master, this empty node is not a candidate:
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-trib.rb add-node 192.168.30.141:8000 192.168.30.141:7000
>>> Adding node 192.168.30.141:8000 to cluster 192.168.30.141:7000
Connecting to node 192.168.30.141:7000: OK
Connecting to node 192.168.30.141:7005: OK
Connecting to node 192.168.30.141:7001: OK
Connecting to node 192.168.30.141:7002: OK
Connecting to node 192.168.30.141:7004: OK
Connecting to node 192.168.30.141:7003: OK
>>> Performing Cluster Check (using node 192.168.30.141:7000)
M: 6e3d07bfab1f6423e42d321cda33e907d2532d41 192.168.30.141:7000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: a626848063395b4db3ff1799f25000adeeb719ff 192.168.30.141:7005
slots: (0 slots) slave
replicates 7122a00b25cac401f04cf0a9a795c6644ca8f1f4
M: 72a8718117113835f29001ad5379eed51188483c 192.168.30.141:7001
slots:5461-10922 (5462 slots) master
1 additional replica(s)
M: 7122a00b25cac401f04cf0a9a795c6644ca8f1f4 192.168.30.141:7002
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 6b42a5a1dc3db98bf04b78f7fc1f91876651adde 192.168.30.141:7004
slots: (0 slots) slave
replicates 72a8718117113835f29001ad5379eed51188483c
S: 15dd1525de8d5153302e17f1a458bc88a982709d 192.168.30.141:7003
slots: (0 slots) slave
replicates 6e3d07bfab1f6423e42d321cda33e907d2532d41
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Connecting to node 192.168.30.141:8000: OK
>>> Send CLUSTER MEET to node 192.168.30.141:8000 to make it join the cluster.
[OK] New node added correctly.
#Allocate slots to the new node:
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-trib.rb reshard 192.168.30.141:8001
How many slots do you want to move (from 1 to 16384)? 2000 #enter the number of slots to migrate when prompted
What is the receiving node ID? 29f6f0c3f208425b640aa37cda1ac91a04cc2a3b #the node ID that will receive these slots
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots. #'all' takes slots from every master
Type 'done' once you entered all the source nodes IDs. #or enter the IDs of the masters to take slots from, ending with 'done'
Source node #1:all
Do you want to proceed with the proposed reshard plan (yes/no)? yes #after the list of slots to move is printed, type yes to start moving the slots and their data.
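#Once the move finishes, re-running the earlier check from any node should show the receiving node owning the migrated ranges (output elided):
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-trib.rb check 192.168.30.141:7000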
#Add a slave node. Example: make 8006 a slave of the new master (8000). Adding a slave online forces the master to dump its entire dataset
and transfer it to the slave, which then loads the RDB file into memory. During the RDB transfer the master may struggle to serve requests, and the whole process consumes a lot of I/O, so proceed with care.
#Join the node to the cluster
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-trib.rb add-node 192.168.30.141:8006 192.168.30.141:7000
>>> Adding node 192.168.30.141:8006 to cluster 192.168.30.141:7000
.....................................................................................................................................
>>> Check slots coverage...
[OK] All 16384 slots covered.
Connecting to node 192.168.30.141:8006: OK
>>> Send CLUSTER MEET to node 192.168.30.141:8006 to make it join the cluster.
[OK] New node added correctly.
#The ID given is the master node's ID
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-cli -c -p 8006
127.0.0.1:8006> cluster replicate c5773ba506164530294f914485ead94aa953a43a
OK
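#The initial sync can be watched from the new slave's side; master_link_status and master_sync_in_progress are standard INFO replication fields (output illustrative):
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-cli -p 8006 info replication | grep -E 'master_link_status|master_sync_in_progress'
master_link_status:up
master_sync_in_progress:0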
#View node information
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-cli -p 7000 cluster nodes
a626848063395b4db3ff1799f25000adeeb719ff 192.168.30.141:7005 slave 7122a00b25cac401f04cf0a9a795c6644ca8f1f4 0 1432772934467 6 connected
7122a00b25cac401f04cf0a9a795c6644ca8f1f4 192.168.30.141:7002 master - 0 1432772933961 3 connected 12175-16383
6b42a5a1dc3db98bf04b78f7fc1f91876651adde 192.168.30.141:7004 slave 72a8718117113835f29001ad5379eed51188483c 0 1432772934467 5 connected
c5773ba506164530294f914485ead94aa953a43a 192.168.30.141:8000 master - 0 1432772934970 9 connected 244-665 5461-6127 10923-11588
eddeae2e313f6573276016ccf36a00579081328f 192.168.30.141:8006 slave c5773ba506164530294f914485ead94aa953a43a 0 1432772934467 9 connected
4172ac92054f360dd25c5d9c36f6bad3d673d8da 192.168.30.141:8005 slave 29f6f0c3f208425b640aa37cda1ac91a04cc2a3b 0 1432772933455 11 connected
6e3d07bfab1f6423e42d321cda33e907d2532d41 192.168.30.141:7000 myself,master - 0 0 1 connected 1251-5460
29b0339d4536535e94a8a1f2351c6ddeedabeb0f 192.168.30.141:8002 master - 0 1432772934970 8 connected
72a8718117113835f29001ad5379eed51188483c 192.168.30.141:7001 master - 0 1432772934467 2 connected 6713-10922
6430db232341c5856bfa28677e7240d75fad441d 192.168.30.141:8004 slave 29b0339d4536535e94a8a1f2351c6ddeedabeb0f 0 1432772932955 12 connected
15dd1525de8d5153302e17f1a458bc88a982709d 192.168.30.141:7003 slave 6e3d07bfab1f6423e42d321cda33e907d2532d41 0 1432772933961 4 connected
29f6f0c3f208425b640aa37cda1ac91a04cc2a3b 192.168.30.141:8001 master - 0 1432772933961 10 connected 0-243 666-1250 6128-6712 11589-12174
#Remove a slave node
[root@work-01 tmp]# /usr/local/redis/8000/sbin/redis-trib.rb del-node 192.168.30.141:8006 'eddeae2e313f6573276016ccf36a00579081328f'
>>> Removing node eddeae2e313f6573276016ccf36a00579081328f from cluster 192.168.30.141:8006
Connecting to node 192.168.30.141:8006: OK
Connecting to node 192.168.30.141:7001: OK
Connecting to node 192.168.30.141:8001: OK
Connecting to node 192.168.30.141:8002: OK
Connecting to node 192.168.30.141:7002: OK
Connecting to node 192.168.30.141:7000: OK
Connecting to node 192.168.30.141:7003: OK
Connecting to node 192.168.30.141:8005: OK
Connecting to node 192.168.30.141:8004: OK
Connecting to node 192.168.30.141:8000: OK
Connecting to node 192.168.30.141:7004: OK
Connecting to node 192.168.30.141:7005: OK
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
Removing a master node:
Before a master can be deleted, all of its slots must first be moved away with reshard; only then can the now-empty node be removed. (At present a single reshard can only migrate the departing
master's slots to one destination node.)
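An illustrative emptying reshard, using IDs from the cluster nodes output above (8000 holds 244-665, 5461-6127 and 10923-11588, i.e. 1755 slots; 8001's ID is used as the destination):
[root@work-01 tmp]# /usr/local/redis/7000/sbin/redis-trib.rb reshard 192.168.30.141:8000
How many slots do you want to move (from 1 to 16384)? 1755
What is the receiving node ID? 29f6f0c3f208425b640aa37cda1ac91a04cc2a3b
Source node #1:c5773ba506164530294f914485ead94aa953a43a
Source node #2:done
Do you want to proceed with the proposed reshard plan (yes/no)? yes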
With the node's slots removed, run:
[root@work-01 tmp]# redis-trib.rb del-node 192.168.30.141:8000 'c5773ba506164530294f914485ead94aa953a43a'
Summary: Redis Cluster feels like a good fit purely as a cache: it has no central node and availability is improved. I would strongly advise against using it as a primary database, though: the node timeout
has a large impact on the whole cluster, so on the network side you must guarantee high availability.
Reference: http://www.redis.cn/topics/cluster-tutorial.html