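Building and Deploying Redis Cluster: A Distributed Solution for Redis
Preface
The previous post, "Installing and Deploying Redis Sentinel for Redis High Availability", made Redis highly available, mainly covering the case where the master goes down. In that setup, however, every node holds the same data, so once the data set grows too large Redis itself slows down. Redis Cluster, officially released with Redis 3.0, addresses the need for a distributed Redis: when a single machine hits memory, concurrency, or traffic bottlenecks, the Cluster architecture can be used to spread the load across nodes.
This post walks through building a Redis Cluster and performing some simple client operations.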
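Environment
Redis version: redis-3.0.0
Linux: CentOS 6.7
IP: 192.168.11.202; different ports host different Redis instances
Client: Jedis, on Spring Boot
Setting up the Redis Cluster environment
Preparing the nodes
192.168.11.202:6382, 192.168.11.202:6383, 192.168.11.202:6384, 192.168.11.202:6385, 192.168.11.202:6386 and 192.168.11.202:6387 form the initial cluster
192.168.11.202:6388 and 192.168.11.202:6389 are used later when scaling out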
redis-6382.conf
port 6382
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6382.log"
dbfilename "dump-6382.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6382.conf"
redis-6383.conf
port 6383
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6383.log"
dbfilename "dump-6383.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6383.conf"
redis-6384.conf
port 6384
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6384.log"
dbfilename "dump-6384.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6384.conf"
redis-6385.conf
port 6385
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6385.log"
dbfilename "dump-6385.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6385.conf"
redis-6386.conf
port 6386
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6386.log"
dbfilename "dump-6386.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6386.conf"
redis-6387.conf
port 6387
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6387.log"
dbfilename "dump-6387.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6387.conf"
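The six configuration files differ only in the port number, so they can be generated rather than copied by hand. The following is a minimal sketch of that idea; the class name ClusterConfGenerator and its render method are hypothetical helpers invented for illustration, not something shipped with Redis.

```java
import java.util.ArrayList;
import java.util.List;

/** Renders the per-port configuration text used in this post (hypothetical helper). */
final class ClusterConfGenerator {

    /** Builds the configuration for one instance; only the port-specific values vary. */
    static String render(String ip, int port, String password, String dataDir) {
        List<String> lines = new ArrayList<>();
        lines.add("port " + port);
        lines.add("bind " + ip);
        lines.add("requirepass \"" + password + "\"");
        lines.add("daemonize yes");
        lines.add("logfile \"" + port + ".log\"");
        lines.add("dbfilename \"dump-" + port + ".rdb\"");
        lines.add("dir \"" + dataDir + "\"");
        lines.add("masterauth \"" + password + "\"");
        lines.add("cluster-enabled yes");
        lines.add("cluster-node-timeout 15000");
        lines.add("cluster-config-file \"nodes-" + port + ".conf\"");
        return String.join("\n", lines) + "\n";
    }

    public static void main(String[] args) {
        // Print the six configurations of the initial cluster, one after another.
        for (int port = 6382; port <= 6387; port++) {
            System.out.println("# redis-" + port + ".conf");
            System.out.print(render("192.168.11.202", port, "myredis", "/opt/soft/redis/cluster_data"));
        }
    }
}
```

Writing each rendered string to redis-<port>.conf would reproduce the files listed above; the same helper also covers ports 6388 and 6389 used later for scaling out.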
Starting all the nodes
[root@slave1 redis_cluster]# cd /opt/redis-3.0.0/redis_cluster/
[root@slave1 redis_cluster]# ./../src/redis-server redis-6382.conf
[root@slave1 redis_cluster]# ./../src/redis-server redis-6383.conf
[root@slave1 redis_cluster]# ./../src/redis-server redis-6384.conf
[root@slave1 redis_cluster]# ./../src/redis-server redis-6385.conf
[root@slave1 redis_cluster]# ./../src/redis-server redis-6386.conf
[root@slave1 redis_cluster]# ./../src/redis-server redis-6387.conf
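Creating the cluster
Once all the nodes are started, each node knows only about itself; the nodes are not yet aware of one another.
We use redis-trib.rb to build the cluster quickly. redis-trib.rb is a cluster management tool written in Ruby; internally it uses the CLUSTER commands to simplify common operational tasks such as creating and checking a cluster, migrating slots, and rebalancing.
If you are interested, you can also build the cluster by hand, step by step, with the CLUSTER commands; doing so makes it clear how redis-trib.rb automates the process.
The command is shown below; --replicas 1 means that each master is given one replica.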
[root@slave1 src]# cd /opt/redis-3.0.0/src/
[root@slave1 src]# ./redis-trib.rb create --replicas 1 192.168.11.202:6382 192.168.11.202:6383 192.168.11.202:6384 192.168.11.202:6385 192.168.11.202:6386 192.168.11.202:6387
During creation the tool prints its plan for assigning master and replica roles, as shown below
>>> Creating cluster
Connecting to node 192.168.11.202:6382: OK
Connecting to node 192.168.11.202:6383: OK
Connecting to node 192.168.11.202:6384: OK
Connecting to node 192.168.11.202:6385: OK
Connecting to node 192.168.11.202:6386: OK
Connecting to node 192.168.11.202:6387: OK
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.11.202:6382
192.168.11.202:6383
192.168.11.202:6384
Adding replica 192.168.11.202:6385 to 192.168.11.202:6382
Adding replica 192.168.11.202:6386 to 192.168.11.202:6383
Adding replica 192.168.11.202:6387 to 192.168.11.202:6384
M: 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382
   slots:0-5460 (5461 slots) master
M: 3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383
   slots:5461-10922 (5462 slots) master
M: 10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384
   slots:10923-16383 (5461 slots) master
S: 7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385
   replicates 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe
S: 4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386
   replicates 3771e67edab547deff6bd290e1a07b23646906ee
S: a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387
   replicates 10b3789bb30889b5e6f67175620feddcd496d19e
Can I set the above configuration? (type 'yes' to accept):
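For why 192.168.11.202:6382, 192.168.11.202:6383 and 192.168.11.202:6384 become the masters, see item 1 under Notes. Once we accept the plan by typing yes, redis-trib.rb performs the node handshake and the slot assignment, with the following output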
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 192.168.11.202:6382)
M: 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382
   slots:0-5460 (5461 slots) master
M: 3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383
   slots:5461-10922 (5462 slots) master
M: 10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384
   slots:10923-16383 (5461 slots) master
M: 7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385
   slots: (0 slots) master
   replicates 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe
M: 4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386
   slots: (0 slots) master
   replicates 3771e67edab547deff6bd290e1a07b23646906ee
M: a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387
   slots: (0 slots) master
   replicates 10b3789bb30889b5e6f67175620feddcd496d19e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
All 16384 slots have been assigned, so the cluster was created successfully.
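The ranges in the plan (0-5460, 5461-10922, 10923-16383) come from splitting the 16384 slots as evenly as possible across the three masters. The sketch below reproduces those ranges by rounding a running fractional cursor; that redis-trib.rb computes them in exactly this way is an assumption on my part, and the class name SlotAllocation is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits the 16384 hash slots into contiguous ranges, one per master (illustrative sketch). */
final class SlotAllocation {
    static final int TOTAL_SLOTS = 16384;

    /** Returns one {first, last} pair per master, covering 0..16383 without gaps. */
    static List<int[]> split(int masters) {
        List<int[]> ranges = new ArrayList<>();
        double slotsPerNode = (double) TOTAL_SLOTS / masters;
        int first = 0;
        double cursor = 0.0;
        for (int i = 0; i < masters; i++) {
            int last = (int) Math.round(cursor + slotsPerNode - 1);
            // The last master always takes whatever remains.
            if (i == masters - 1 || last > TOTAL_SLOTS - 1) {
                last = TOTAL_SLOTS - 1;
            }
            ranges.add(new int[] {first, last});
            first = last + 1;
            cursor += slotsPerNode;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : split(3)) {
            System.out.println(r[0] + "-" + r[1] + " (" + (r[1] - r[0] + 1) + " slots)");
        }
    }
}
```

With three masters the middle node ends up with 5462 slots and the other two with 5461 each, which matches the plan printed above.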
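Checking cluster integrity
The redis-trib.rb check command does this job. It only needs the address of any one node in the cluster to check the whole cluster, as follows
redis-trib.rb check 192.168.11.202:6382, with the following output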
Connecting to node 192.168.11.202:6382: OK
Connecting to node 192.168.11.202:6385: OK
Connecting to node 192.168.11.202:6383: OK
Connecting to node 192.168.11.202:6384: OK
Connecting to node 192.168.11.202:6387: OK
Connecting to node 192.168.11.202:6386: OK
>>> Performing Cluster Check (using node 192.168.11.202:6382)
M: 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385
   slots: (0 slots) slave
   replicates 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe
M: 3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387
   slots: (0 slots) slave
   replicates 10b3789bb30889b5e6f67175620feddcd496d19e
S: 4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386
   slots: (0 slots) slave
   replicates 3771e67edab547deff6bd290e1a07b23646906ee
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
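[OK] All 16384 slots covered. means that every slot in the cluster has been assigned to a node.
With slots introduced, a key is first mapped to a slot and the slot is then mapped to the node that owns it; the overall data flow is shown in the figure below:
[Figure: data flow from key to slot to node]
For why slots were introduced, see item 3 under Notes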
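To make the key-to-slot step concrete: Redis Cluster maps a key to a slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16. Below is a minimal sketch of that calculation. The class name KeySlot is hypothetical, and the sketch deliberately ignores hash tags (the {...} syntax), so it only applies to plain keys like the ones used in this post.

```java
import java.nio.charset.StandardCharsets;

/** Computes the Redis Cluster hash slot of a plain key (sketch; hash tags are not handled). */
final class KeySlot {
    static final int TOTAL_SLOTS = 16384;

    /** CRC16, XMODEM variant: polynomial 0x1021, initial value 0, processed MSB first. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                if ((crc & 0x8000) != 0) {
                    crc = ((crc << 1) ^ 0x1021) & 0xFFFF;
                } else {
                    crc = (crc << 1) & 0xFFFF;
                }
            }
        }
        return crc;
    }

    /** Maps a key to one of the 16384 slots. */
    static int slotOf(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % TOTAL_SLOTS;
    }

    public static void main(String[] args) {
        System.out.println("name   -> slot " + slotOf("name"));
        System.out.println("weight -> slot " + slotOf("weight"));
    }
}
```

The slot printed for name can be compared with the slot number that redis-cli reports when it redirects a get name command in the next section.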
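Simple Redis Cluster operations
Connect to the cluster; any node will do. The -c option turns on cluster support, so redirections are followed automatically
[root@slave1 redis_cluster]# ./../src/redis-cli -h 192.168.11.202 -p 6382 -a myredis -c
Once connected to the cluster you can run Redis commands, as shown below (the sample output mentions node 6388, so it was evidently captured after the scale-out described later in this post)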
192.168.11.202:6382> get name
-> Redirected to slot [5798] located at 192.168.11.202:6388
"youzhibing"
192.168.11.202:6388> set weight 112
-> Redirected to slot [16280] located at 192.168.11.202:6384
OK
192.168.11.202:6384> get weight
"112"
192.168.11.202:6384>
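Behind each "Redirected to slot" line, the node that was asked replies with a MOVED error of the form MOVED <slot> <host>:<port>, and a cluster-aware client reconnects to the address it names. The following sketch shows how such a reply could be parsed; the class MovedRedirect is a hypothetical helper written for this post, not part of redis-cli or Jedis.

```java
/** Parses a Redis Cluster MOVED error such as "MOVED 5798 192.168.11.202:6388" (illustrative sketch). */
final class MovedRedirect {
    final int slot;
    final String host;
    final int port;

    private MovedRedirect(int slot, String host, int port) {
        this.slot = slot;
        this.host = host;
        this.port = port;
    }

    static MovedRedirect parse(String error) {
        String[] parts = error.trim().split(" ");
        if (parts.length != 3 || !parts[0].equals("MOVED")) {
            throw new IllegalArgumentException("not a MOVED reply: " + error);
        }
        int slot = Integer.parseInt(parts[1]);
        // Split host and port at the last colon of the address.
        int colon = parts[2].lastIndexOf(':');
        String host = parts[2].substring(0, colon);
        int port = Integer.parseInt(parts[2].substring(colon + 1));
        return new MovedRedirect(slot, host, port);
    }

    public static void main(String[] args) {
        MovedRedirect r = parse("MOVED 5798 192.168.11.202:6388");
        System.out.println("retry slot " + r.slot + " on " + r.host + ":" + r.port);
    }
}
```

Without -c, redis-cli simply prints the MOVED error and leaves the reconnect to you.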
Connecting and operating from a client (Jedis)
redis-cluster.properties
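#cluster
redis.cluster.host=192.168.11.202
redis.cluster.port=6382,6383,6384,6385,6386,6387
# Redis read/write timeout (ms)
redis.cluster.socketTimeout=1000
# Redis connection timeout (ms)
redis.cluster.connectionTimeOut=3000
# maximum number of connection attempts
redis.cluster.maxAttempts=10
# maximum number of redirections
redis.cluster.maxRedirects=5
# master connection password
redis.password=myredis
# connection pool
# maximum number of connections in the pool (a negative value means no limit)
redis.pool.maxActive=150
# maximum number of idle connections in the pool
redis.pool.maxIdle=10
# minimum number of idle connections in the pool
redis.pool.minIdle=1
# maximum wait in milliseconds when borrowing a connection; a negative value blocks indefinitely, default -1
redis.pool.maxWaitMillis=3000
# maximum number of connections examined per eviction run
redis.pool.numTestsPerEvictionRun=50
# interval between eviction scans (ms)
redis.pool.timeBetweenEvictionRunsMillis=3000
# minimum idle time before a connection may be evicted (ms)
redis.pool.minEvictableIdleTimeMillis=1800000
# idle time after which a connection is evicted, provided more than minIdle idle connections remain (ms)
redis.pool.softMinEvictableIdleTimeMillis=10000
# validate a connection when borrowing it, default false
redis.pool.testOnBorrow=true
# validate connections while they are idle, default false
redis.pool.testWhileIdle=true
# validate a connection before returning it to the pool
redis.pool.testOnReturn=true
# whether to block when the pool is exhausted: false throws an exception, true blocks until the timeout; default true
redis.pool.blockWhenExhausted=true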
RedisClusterConfig.java
package com.lee.redis.config.cluster;

import java.util.HashSet;
import java.util.Set;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;
import org.springframework.util.StringUtils;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPoolConfig;

import com.alibaba.fastjson.JSON;
import com.lee.redis.exception.LocalException;

@Configuration
@PropertySource("redis/redis-cluster.properties")
public class RedisClusterConfig {

    private static final Logger LOGGER = LoggerFactory.getLogger(RedisClusterConfig.class);

    // pool
    @Value("${redis.pool.maxActive}")
    private int maxTotal;
    @Value("${redis.pool.maxIdle}")
    private int maxIdle;
    @Value("${redis.pool.minIdle}")
    private int minIdle;
    @Value("${redis.pool.maxWaitMillis}")
    private long maxWaitMillis;
    @Value("${redis.pool.numTestsPerEvictionRun}")
    private int numTestsPerEvictionRun;
    @Value("${redis.pool.timeBetweenEvictionRunsMillis}")
    private long timeBetweenEvictionRunsMillis;
    @Value("${redis.pool.minEvictableIdleTimeMillis}")
    private long minEvictableIdleTimeMillis;
    @Value("${redis.pool.softMinEvictableIdleTimeMillis}")
    private long softMinEvictableIdleTimeMillis;
    @Value("${redis.pool.testOnBorrow}")
    private boolean testOnBorrow;
    @Value("${redis.pool.testWhileIdle}")
    private boolean testWhileIdle;
    @Value("${redis.pool.testOnReturn}")
    private boolean testOnReturn;
    @Value("${redis.pool.blockWhenExhausted}")
    private boolean blockWhenExhausted;

    // cluster
    @Value("${redis.cluster.host}")
    private String host;
    @Value("${redis.cluster.port}")
    private String port;
    @Value("${redis.cluster.socketTimeout}")
    private int socketTimeout;
    @Value("${redis.cluster.connectionTimeOut}")
    private int connectionTimeOut;
    @Value("${redis.cluster.maxAttempts}")
    private int maxAttempts;
    @Value("${redis.cluster.maxRedirects}")
    private int maxRedirects;
    @Value("${redis.password}")
    private String password;

    @Bean
    public JedisPoolConfig jedisPoolConfig() {
        JedisPoolConfig jedisPoolConfig = new JedisPoolConfig();
        jedisPoolConfig.setMaxTotal(maxTotal);
        jedisPoolConfig.setMaxIdle(maxIdle);
        jedisPoolConfig.setMinIdle(minIdle);
        jedisPoolConfig.setMaxWaitMillis(maxWaitMillis);
        jedisPoolConfig.setNumTestsPerEvictionRun(numTestsPerEvictionRun);
        jedisPoolConfig.setTimeBetweenEvictionRunsMillis(timeBetweenEvictionRunsMillis);
        jedisPoolConfig.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);
        jedisPoolConfig.setSoftMinEvictableIdleTimeMillis(softMinEvictableIdleTimeMillis);
        jedisPoolConfig.setTestOnBorrow(testOnBorrow);
        jedisPoolConfig.setTestWhileIdle(testWhileIdle);
        jedisPoolConfig.setTestOnReturn(testOnReturn);
        jedisPoolConfig.setBlockWhenExhausted(blockWhenExhausted);
        return jedisPoolConfig;
    }

    @Bean
    public JedisCluster jedisCluster(JedisPoolConfig jedisPoolConfig) {
        if (StringUtils.isEmpty(host)) {
            LOGGER.info("Redis cluster host is not configured");
            throw new LocalException("Redis cluster host is not configured");
        }
        if (StringUtils.isEmpty(port)) {
            LOGGER.info("Redis cluster port is not configured");
            throw new LocalException("Redis cluster port is not configured");
        }
        // Hosts are separated by ","; the port groups of different hosts are separated by ";",
        // and the ports within one group by ",".
        String[] hosts = host.split(",");
        String[] portArray = port.split(";");
        if (hosts.length != portArray.length) {
            LOGGER.info("Number of Redis cluster hosts does not match the number of port groups");
            throw new LocalException("Number of Redis cluster hosts does not match the number of port groups");
        }
        Set<HostAndPort> redisNodes = new HashSet<HostAndPort>();
        for (int i = 0; i < hosts.length; i++) {
            String ports = portArray[i];
            String[] hostPorts = ports.split(",");
            // The loop variable is named nodePort so that it does not shadow the port field.
            for (String nodePort : hostPorts) {
                HostAndPort node = new HostAndPort(hosts[i], Integer.parseInt(nodePort));
                redisNodes.add(node);
            }
        }
        LOGGER.info("Set<RedisNode> : {}", JSON.toJSONString(redisNodes));
        return new JedisCluster(redisNodes, connectionTimeOut, socketTimeout, maxAttempts, password, jedisPoolConfig);
    }
}
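The host and port properties are combined into seed nodes by splitting hosts on ",", port groups on ";" and the ports within a group on ",". The standalone sketch below isolates that parsing so it can be tried without Spring or Jedis; the class SeedNodes is hypothetical, and it returns plain host:port strings where the real configuration builds HostAndPort objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Turns the host and port properties into "host:port" seed nodes (illustrative sketch). */
final class SeedNodes {

    /**
     * @param hostProp hosts separated by ",", for example "192.168.11.202"
     * @param portProp one port group per host separated by ";", ports within a group by ","
     */
    static List<String> parse(String hostProp, String portProp) {
        String[] hosts = hostProp.split(",");
        String[] portGroups = portProp.split(";");
        if (hosts.length != portGroups.length) {
            throw new IllegalArgumentException("host count does not match port group count");
        }
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < hosts.length; i++) {
            for (String p : portGroups[i].split(",")) {
                // Integer.parseInt rejects anything that is not a valid port number string.
                nodes.add(hosts[i].trim() + ":" + Integer.parseInt(p.trim()));
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(parse("192.168.11.202", "6382,6383,6384,6385,6386,6387"));
    }
}
```

With the properties from this post there is one host and one port group, which yields the six seed nodes 6382 through 6387.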
ApplicationCluster.java
package com.lee.redis;

import org.springframework.boot.Banner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableAutoConfiguration
@ComponentScan(basePackages={"com.lee.redis.config.cluster"})
public class ApplicationCluster {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(ApplicationCluster.class);
        app.setBannerMode(Banner.Mode.OFF); // whether to print the banner
        // app.setApplicationContextClass(); // specify the Spring application context class
        app.setWebEnvironment(false);
        app.run(args);
    }
}
RedisClusterTest.java
package com.lee.redis;

import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import com.alibaba.fastjson.JSON;

import redis.clients.jedis.JedisCluster;

@RunWith(SpringRunner.class)
@SpringBootTest(classes = ApplicationCluster.class)
public class RedisClusterTest {

    private static final Logger LOGGER = LoggerFactory.getLogger(RedisClusterTest.class);

    @Autowired
    private JedisCluster jedisCluster;

    @Test
    public void initTest() {
        String name = jedisCluster.get("name");
        LOGGER.info("name is {}", name);

        // list operations
        long count = jedisCluster.lpush("list:names", "陈芸"); // lpush returns the length of the list after the push
        LOGGER.info("count = {}", count);
        long nameLen = jedisCluster.llen("list:names");
        LOGGER.info("list:names lens is {}", nameLen);
        List<String> nameList = jedisCluster.lrange("list:names", 0, nameLen);
        LOGGER.info("names : {}", JSON.toJSONString(nameList));
    }
}
Run the initTest method in RedisClusterTest.java; the result is as follows
......
2018-03-06 09:56:05|INFO|com.lee.redis.RedisClusterTest|name is youzhibing
2018-03-06 09:56:05|INFO|com.lee.redis.RedisClusterTest|count = 3
2018-03-06 09:56:05|INFO|com.lee.redis.RedisClusterTest|list:names lens is 3
2018-03-06 09:56:05|INFO|com.lee.redis.RedisClusterTest|names : ["陈芸","沈复","沈复"]
......
Cluster scaling and failover
Scaling out the cluster
New nodes: 192.168.11.202:6388 and 192.168.11.202:6389, with configuration files essentially the same as before
redis-6388.conf
port 6388
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6388.log"
dbfilename "dump-6388.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6388.conf"
redis-6389.conf
port 6389
bind 192.168.11.202
requirepass "myredis"
daemonize yes
logfile "6389.log"
dbfilename "dump-6389.rdb"
dir "/opt/soft/redis/cluster_data"
masterauth "myredis"
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-6389.conf"
Start nodes 6388 and 6389
[root@slave1 redis_cluster]# cd /opt/redis-3.0.0/redis_cluster/
[root@slave1 redis_cluster]# ./../src/redis-server redis-6388.conf
[root@slave1 redis_cluster]# ./../src/redis-server redis-6389.conf
Join the cluster:
[root@slave1 redis_cluster]# ./../src/redis-trib.rb add-node 192.168.11.202:6388 192.168.11.202:6382
# add 6389 as a replica of 6388
[root@slave1 redis_cluster]# ./../src/redis-trib.rb add-node --slave --master-id e073db09e7aaed3c20d133726a26c8994932262c 192.168.11.202:6389 192.168.11.202:6382
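Migrating slots and data
Use the redis-trib.rb reshard command to reshard the slots:
[root@slave1 redis_cluster]# ./../src/redis-trib.rb reshard 192.168.11.202:6382
When the prompt How many slots do you want to move (from 1 to 16384)? appears, it is asking how many slots to move; we enter 4096
When the prompt What is the receiving node ID? appears, it is asking which master should receive the moved slots; we enter the node ID of 6388: e073db09e7aaed3c20d133726a26c8994932262c. Only one target node can be specified (the node ID can simply be copied and pasted)
Next, enter the IDs of the source nodes and finish with done. Here I used all, which moves 4096 slots from all the existing masters to 6388
Before the data is migrated, the tool prints the plan of every slot move from its source node to the target node; after checking it, type yes to run the migration
If no error occurs along the way, the migration completes successfully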
Cluster failover
All the keys on 6388
192.168.11.202:6388> keys *
1) "list:names"
2) "name"
192.168.11.202:6388>
Kill the 6388 process
[root@slave1 redis_cluster]# ps -ef | grep redis-server | grep 6388
root 8280 1 0 Mar05 ? 00:05:07 ./../src/redis-server 192.168.11.202:6388 [cluster]
[root@slave1 redis_cluster]# kill -9 8280
# view the cluster nodes
[root@slave1 redis_cluster]# ./../src/redis-cli -h 192.168.11.202 -p 6382 -a myredis -c
192.168.11.202:6382> cluster nodes
4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386 slave 3771e67edab547deff6bd290e1a07b23646906ee 0 1520304517911 5 connected
0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382 myself,master - 0 0 1 connected 1394-5460
a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387 slave 10b3789bb30889b5e6f67175620feddcd496d19e 0 1520304514886 6 connected
10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384 master - 0 1520304516904 3 connected 12318-16383
3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383 master - 0 1520304513879 2 connected 7106-10922
e073db09e7aaed3c20d133726a26c8994932262c 192.168.11.202:6388 master,fail - 1520304485678 1520304484473 10 disconnected 0-1393 5461-7105 10923-12317
37de0d2dc1c267760156d4230502fa96a6bba64d 192.168.11.202:6389 slave e073db09e7aaed3c20d133726a26c8994932262c 0 1520304515895 10 connected
7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385 slave 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 0 1520304518923 4 connected
192.168.11.202:6382>
# look up the key name
192.168.11.202:6382> get name
-> Redirected to slot [5798] located at 192.168.11.202:6389
"youzhibing"
192.168.11.202:6389> keys *
1) "list:names"
2) "name"
192.168.11.202:6389>
6389 has become a master and taken over the role 6388 used to play. The cluster state is still ok, and the service it provides to clients is unaffected
Restart the 6388 service
[root@slave1 redis_cluster]# ./../src/redis-server redis-6388.conf
# view the cluster nodes
192.168.11.202:6389> cluster nodes
0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382 master - 0 1520304789567 1 connected 1394-5460
37de0d2dc1c267760156d4230502fa96a6bba64d 192.168.11.202:6389 myself,master - 0 0 12 connected 0-1393 5461-7105 10923-12317
7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385 slave 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 0 1520304788061 1 connected
a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387 slave 10b3789bb30889b5e6f67175620feddcd496d19e 0 1520304787556 3 connected
3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383 master - 0 1520304786550 2 connected 7106-10922
e073db09e7aaed3c20d133726a26c8994932262c 192.168.11.202:6388 slave 37de0d2dc1c267760156d4230502fa96a6bba64d 0 1520304786047 12 connected
10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384 master - 0 1520304785542 3 connected 12318-16383
4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386 slave 3771e67edab547deff6bd290e1a07b23646906ee 0 1520304788561 2 connected
192.168.11.202:6389>
As you can see, once 6388 has started successfully it is still part of the cluster, but now as a replica of 6389
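The role change is visible in the cluster nodes output itself: each line lists the node ID, address, flags, master ID (- for a master), ping and pong timestamps, config epoch, link state, and then any slot ranges. The sketch below parses one such line. The class ClusterNodeLine is a hypothetical helper, and the field layout is the one seen in the Redis 3.0 output in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** One parsed line of CLUSTER NODES output (illustrative sketch, not part of Jedis). */
final class ClusterNodeLine {
    final String id;
    final String address;
    final List<String> flags;
    final String masterId;     // "-" when the node is a master
    final String linkState;
    final List<String> slots;  // slot ranges such as "0-1393"; empty for replicas

    private ClusterNodeLine(String[] f) {
        id = f[0];
        address = f[1];
        flags = Arrays.asList(f[2].split(","));
        masterId = f[3];
        linkState = f[7];
        slots = new ArrayList<>(Arrays.asList(f).subList(8, f.length));
    }

    static ClusterNodeLine parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 8) {
            throw new IllegalArgumentException("unexpected CLUSTER NODES line: " + line);
        }
        return new ClusterNodeLine(fields);
    }

    boolean isMaster() { return flags.contains("master"); }

    boolean isFailed() { return flags.contains("fail"); }

    public static void main(String[] args) {
        ClusterNodeLine n = parse("e073db09e7aaed3c20d133726a26c8994932262c 192.168.11.202:6388 "
                + "master,fail - 1520304485678 1520304484473 10 disconnected 0-1393 5461-7105 10923-12317");
        System.out.println(n.address + " master=" + n.isMaster() + " failed=" + n.isFailed() + " slots=" + n.slots);
    }
}
```

Feeding it the 6388 line from before and after the restart shows the flags going from master,fail to slave, with the master ID field now pointing at 6389.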
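Scaling in the cluster
1. We take nodes 6389 and 6388 offline, starting by migrating their slots
From the cluster node information we know that 6389 is responsible for slots 0-1393, 5461-7105 and 10923-12317. We now migrate 0-1393 to 6382, 5461-7105 to 6383, and 10923-12317 to 6384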
How many slots do you want to move (from 1 to 16384)?1394
What is the receiving node ID? 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:37de0d2dc1c267760156d4230502fa96a6bba64d
Source node #2:done
......
Do you want to proceed with the proposed reshard plan (yes/no)? yes
How many slots do you want to move (from 1 to 16384)? 1645
What is the receiving node ID? 3771e67edab547deff6bd290e1a07b23646906ee
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:37de0d2dc1c267760156d4230502fa96a6bba64d
Source node #2:done
......
Do you want to proceed with the proposed reshard plan (yes/no)? yes
How many slots do you want to move (from 1 to 16384)?1395
What is the receiving node ID? 10b3789bb30889b5e6f67175620feddcd496d19e
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:37de0d2dc1c267760156d4230502fa96a6bba64d
Source node #2:done
......
Do you want to proceed with the proposed reshard plan (yes/no)? yes
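The three numbers typed at the prompts (1394, 1645 and 1395) are simply the sizes of the three ranges held by 6389, since a range first-last contains last - first + 1 slots. A small sketch of that arithmetic follows; the class SlotRanges is a hypothetical helper.

```java
/** Counts the slots in ranges written the way CLUSTER NODES prints them (illustrative sketch). */
final class SlotRanges {

    /** "0-1393" contains 1394 slots; a single slot such as "42" counts as 1. */
    static int count(String range) {
        String[] ends = range.trim().split("-");
        int first = Integer.parseInt(ends[0]);
        int last = Integer.parseInt(ends[ends.length - 1]);
        return last - first + 1;
    }

    /** Sums the sizes of whitespace-separated ranges. */
    static int total(String ranges) {
        int sum = 0;
        for (String range : ranges.trim().split("\\s+")) {
            sum += count(range);
        }
        return sum;
    }

    public static void main(String[] args) {
        for (String range : new String[] {"0-1393", "5461-7105", "10923-12317"}) {
            System.out.println(range + " -> " + count(range) + " slots");
        }
    }
}
```

Entering each count together with 6389 as the only source node moves exactly that many slots to the chosen receiver.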
After the slots have been migrated, the cluster node information shows that 6388 no longer has any slots assigned
192.168.11.202:6382> cluster nodes
3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383 master - 0 1520333368013 16 connected 5461-10922
0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382 myself,master - 0 0 13 connected 0-5460
10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384 master - 0 1520333372037 17 connected 10923-16383
4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386 slave 3771e67edab547deff6bd290e1a07b23646906ee 0 1520333370024 16 connected
a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387 slave 10b3789bb30889b5e6f67175620feddcd496d19e 0 1520333370525 17 connected
37de0d2dc1c267760156d4230502fa96a6bba64d 192.168.11.202:6389 slave e073db09e7aaed3c20d133726a26c8994932262c 0 1520333369017 15 connected
7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385 slave 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 0 1520333367008 13 connected
e073db09e7aaed3c20d133726a26c8994932262c 192.168.11.202:6388 master - 0 1520333371031 15 connected
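2. Forgetting the nodes
Because the nodes in the cluster constantly exchange node information with one another through Gossip messages, a robust mechanism is needed to make every node in the cluster forget a node that is going offline. In other words, the other nodes must stop exchanging Gossip messages with it.
The redis-trib.rb del-node command takes a node offline. Remove the replica first and then the master, to avoid an unnecessary full resynchronization. The commands are as follows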
[root@slave1 redis_cluster]# ./../src/redis-trib.rb del-node 192.168.11.202:6389 37de0d2dc1c267760156d4230502fa96a6bba64d
[root@slave1 redis_cluster]# ./../src/redis-trib.rb del-node 192.168.11.202:6388 e073db09e7aaed3c20d133726a26c8994932262c
The cluster node information is now as follows
192.168.11.202:6382> cluster nodes
3771e67edab547deff6bd290e1a07b23646906ee 192.168.11.202:6383 master - 0 1520333828887 16 connected 5461-10922
0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 192.168.11.202:6382 myself,master - 0 0 13 connected 0-5460
10b3789bb30889b5e6f67175620feddcd496d19e 192.168.11.202:6384 master - 0 1520333827377 17 connected 10923-16383
4f36b08d8067a003af45dbe96a5363f348643509 192.168.11.202:6386 slave 3771e67edab547deff6bd290e1a07b23646906ee 0 1520333826880 16 connected
a583def1e6a059e4fdb3592557fd6ab691fd61ec 192.168.11.202:6387 slave 10b3789bb30889b5e6f67175620feddcd496d19e 0 1520333829892 17 connected
7649466ec006e0902a7f1578417247a6d5540c47 192.168.11.202:6385 slave 0ec055f9daa5b4f570e6a4c4d46e5285d16e0afe 0 1520333827879 13 connected
All 16384 slots are assigned to nodes and the cluster state is ok; nodes 6389 and 6388 have been taken offline successfully
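The claim that all 16384 slots are still covered can be checked mechanically from the ranges in the cluster nodes output. The sketch below marks every slot named by the masters' ranges and reports whether any slot is left unowned; the class SlotCoverage is a hypothetical helper, and in practice redis-trib.rb check performs this verification for you.

```java
/** Checks whether a set of slot ranges covers all 16384 slots (illustrative sketch). */
final class SlotCoverage {
    static final int TOTAL_SLOTS = 16384;

    /** Each argument is a range such as "0-5460" or a single slot such as "42". */
    static boolean fullyCovered(String... ranges) {
        boolean[] seen = new boolean[TOTAL_SLOTS];
        for (String range : ranges) {
            String[] ends = range.trim().split("-");
            int first = Integer.parseInt(ends[0]);
            int last = Integer.parseInt(ends[ends.length - 1]);
            for (int slot = first; slot <= last; slot++) {
                seen[slot] = true;
            }
        }
        for (boolean covered : seen) {
            if (!covered) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Ranges owned by 6382, 6383 and 6384 after the scale-in.
        System.out.println(fullyCovered("0-5460", "5461-10922", "10923-16383"));
    }
}
```

Dropping any one of the three ranges makes the check fail, which is the situation in which the cluster stops reporting an ok state.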
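Notes
1. When creating a cluster, redis-trib.rb tries as far as possible to keep a master and its replica off the same machine, so it may reorder the nodes. The order of the node list determines the roles: masters come first, followed by replicas
2. When redis-trib.rb creates a cluster, every node address given must belong to a node that holds no slots and no data; otherwise it refuses to create the cluster
3. Virtual slots were adopted mainly to address the shortcomings of consistent-hashing partitioning, which does not work well with a small number of nodes. The number of virtual slots (Redis Cluster's slots range from 0 to 16383) is generally far larger than the number of nodes, and each node is responsible for a certain number of slots. The extra layer of slot mapping between the data and the nodes sidesteps the small-node-count problem
4. When Jedis connects to a Redis Cluster, it is enough to configure any one reachable node rather than all of them, because every node holds the information for the whole cluster. The flow Jedis follows when executing a key command is shown in the figure below, and interested readers can look at the Jedis source code. Configuring all the nodes is, of course, the more thorough approach
[Figure: Jedis key command execution flow]
5. Scaling and failover are transparent to clients: as long as the overall cluster state is ok, client requests receive normal responses
6. The cluster state is ok, and the cluster can serve requests normally, only when all 16384 slots are assigned to nodes. So whether the cluster is being scaled out or scaled in, all 16384 slots must end up correctly assigned to nodes
References
《Redis开发与运维》 (Redis Development and Operations)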