Redis Cluster Setup and Management
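server1: 10.103.16.34, server2: 10.103.16.35, Redis version: 3.2
Environment: a Redis cluster of six nodes in total. The plan is three masters on server1 (ports 7001, 7002, 7003) and three slaves on server2 (ports 7001, 7002, 7003), although redis-trib.rb makes the final master/slave assignment when the cluster is created. Giving every master a slave keeps the cluster highly available. Note that Redis replication is asynchronous.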
1: Unpack and compile Redis
cd /home/maxiangqian
tar xzf redis-3.2.8.tar.gz
cd redis-3.2.8
yum install gcc
make
2: Create the Redis directories
mkdir -p /home/redis7001/data /home/redis7001/log /home/redis7001/tmp
mkdir -p /home/redis7002/data /home/redis7002/log /home/redis7002/tmp
mkdir -p /home/redis7003/data /home/redis7003/log /home/redis7003/tmp
3: Write a configuration file for each of the three nodes on server1 and server2
port 7001
timeout 300
daemonize yes
pidfile "/home/redis7001/tmp/redis_7001.pid"
loglevel notice
logfile "/home/redis7001/log/redis_7001.log"
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/home/redis7001/data"
slave-serve-stale-data yes
#slave-read-only yes # "yes" makes the slave read-only
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
#appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
requirepass "maxiangqianredis"
masterauth "maxiangqianredis"
#cluster
cluster-enabled yes
cluster-config-file /home/redis7001/nodes7001.conf
cluster-node-timeout 5000
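The above is the configuration file for redis7001. The other instances use the same content with the port and paths changed. Then start the instance: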
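Because the instances differ only in port and paths, the per-node files can be generated instead of edited by hand. The following is a minimal sketch, assuming the /home/redis<port>/ layout used above. The function name render_cluster_conf is made up for illustration, and only the port-dependent directives plus the cluster settings are rendered, so this is not a complete configuration.

```python
def render_cluster_conf(port: int, base: str = "/home") -> str:
    """Return a minimal cluster-mode redis.conf for one instance.

    Only the port-dependent directives and the cluster settings from the
    configuration above are rendered; everything else is the same on
    every node and is left out of this sketch.
    """
    home = f"{base}/redis{port}"
    lines = [
        f"port {port}",
        "daemonize yes",
        f'pidfile "{home}/tmp/redis_{port}.pid"',
        f'logfile "{home}/log/redis_{port}.log"',
        f'dir "{home}/data"',
        "appendonly yes",
        "cluster-enabled yes",
        f"cluster-config-file {home}/nodes{port}.conf",
        "cluster-node-timeout 5000",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for port in (7001, 7002, 7003):
        print(f"# ---- redis{port}.conf ----")
        print(render_cluster_conf(port))
```

Writing each result to /home/redis<port>/redis<port>.conf would match the path passed to redis-server in this walkthrough.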
redis-server /home/redis7001/redis7001.conf
Let's look at the startup log:
1574:M 03 May 16:22:53.444 * No cluster configuration found, I'm 363ecec54c92c2548dcab016146bdb4c104e5e84
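Start the other instances in the same way, so that all six are running:
server1 7001
server1 7002
server1 7003
server2 7001
server2 7002
server2 7003
OK, six Redis instances are now running. The next step is to form the cluster.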
redis-trib.rb create --replicas 1 10.103.16.34:7001 10.103.16.34:7002 10.103.16.34:7003 10.103.16.35:7001 10.103.16.35:7002 10.103.16.35:7003
Running it fails with an error:
/usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- redis (LoadError)
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from /home/maxiangqian/redis-3.2.8/src/redis-trib.rb:25
We need to install the following packages:
yum -y install zlib ruby rubygems
gem install redis
Then run the cluster creation again:
[root@localhost redis7003]# redis-trib.rb create --replicas 1 10.103.16.34:7001 10.103.16.34:7002 10.103.16.34:7003 10.103.16.35:7001 10.103.16.35:7002 10.103.16.35:7003
>>> Creating cluster
[ERR] Sorry, can't connect to node 10.103.16.34:7001
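Argh, another error. The cause is these two settings in the configuration file:
requirepass "maxiangqianredis"
masterauth "maxiangqianredis"
redis-trib.rb in this version cannot authenticate against the nodes, so password authentication has to be added after the cluster has been created, and the two parameters must be set to the same value. For now we leave authentication unconfigured (comment out both lines and restart the instances) and try again: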
[root@localhost redis7003]# redis-trib.rb create --replicas 1 10.103.16.34:7001 10.103.16.34:7002 10.103.16.34:7003 10.103.16.35:7001 10.103.16.35:7002 10.103.16.35:7003
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
10.103.16.35:7001
10.103.16.34:7001
10.103.16.35:7002
Adding replica 10.103.16.34:7002 to 10.103.16.35:7001
Adding replica 10.103.16.35:7003 to 10.103.16.34:7001
Adding replica 10.103.16.34:7003 to 10.103.16.35:7002
M: 363ecec54c92c2548dcab016146bdb4c104e5e84 10.103.16.34:7001
   slots:5461-10922 (5462 slots) master
S: 93a0e8d405959480fcbd310a5d15a92346c69d43 10.103.16.34:7002
   replicates d015a22abc57c021f568973f4f1c03c7a5c7b772
S: 78f77749f9f9a5f0d7c99427e0311912a3fa04e7 10.103.16.34:7003
   replicates 89147e5837e378b69233dd2b8290267975719bc4
M: d015a22abc57c021f568973f4f1c03c7a5c7b772 10.103.16.35:7001
   slots:0-5460 (5461 slots) master
M: 89147e5837e378b69233dd2b8290267975719bc4 10.103.16.35:7002
   slots:10923-16383 (5461 slots) master
S: ce9d635236567ccde4c864f78863fa0a4b26f25a 10.103.16.35:7003
   replicates 363ecec54c92c2548dcab016146bdb4c104e5e84
Can I set the above configuration? (type 'yes' to accept):
OK, this time the nodes are reachable and redis-trib.rb proposes a layout. Type yes to accept it.
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join..
>>> Performing Cluster Check (using node 10.103.16.34:7001)
M: 363ecec54c92c2548dcab016146bdb4c104e5e84 10.103.16.34:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 78f77749f9f9a5f0d7c99427e0311912a3fa04e7 10.103.16.34:7003
   slots: (0 slots) slave
   replicates 89147e5837e378b69233dd2b8290267975719bc4
M: d015a22abc57c021f568973f4f1c03c7a5c7b772 10.103.16.35:7001
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 89147e5837e378b69233dd2b8290267975719bc4 10.103.16.35:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: ce9d635236567ccde4c864f78863fa0a4b26f25a 10.103.16.35:7003
   slots: (0 slots) slave
   replicates 363ecec54c92c2548dcab016146bdb4c104e5e84
S: 93a0e8d405959480fcbd310a5d15a92346c69d43 10.103.16.34:7002
   slots: (0 slots) slave
   replicates d015a22abc57c021f568973f4f1c03c7a5c7b772
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
The cluster is now set up.
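The slot ranges in the output above (0-5460, 5461-10922, 10923-16383) are what you get when the 16384 hash slots are divided as evenly as possible among the masters. Below is a minimal sketch of such a division. It was written to reproduce the ranges shown in this post, not copied from redis-trib.rb, and split_slots is a hypothetical helper name.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(master_count: int) -> list:
    """Divide slots 0..16383 into contiguous, near-equal inclusive ranges.

    Returns a list of (first_slot, last_slot) tuples, one per master.
    """
    if master_count < 1:
        raise ValueError("need at least one master")
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Round half up explicitly, so the result does not depend on
        # Python's round-half-to-even behaviour.
        last = math.floor(cursor + per_master - 1 + 0.5)
        if index == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1  # the final master takes whatever is left
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(f"{first}-{last} ({last - first + 1} slots)")
```

With three masters the middle range holds 5462 slots and the other two hold 5461, which matches the counts printed by redis-trib.rb above.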
[root@mxqmongodb2 sa]# redis-cli -c -p 7001
127.0.0.1:7001> get name
-> Redirected to slot [5798] located at 10.103.16.34:7001
"txt"
10.103.16.34:7001> exit
[root@mxqmongodb2 sa]# redis-cli -c -p 7002
127.0.0.1:7002> get name
-> Redirected to slot [5798] located at 10.103.16.34:7001
"txt"
10.103.16.34:7001> exit
[root@mxqmongodb2 sa]# redis-cli -c -p 7003
127.0.0.1:7003> get name
-> Redirected to slot [5798] located at 10.103.16.34:7001
"txt"
5: Next, let's look at the basic cluster information:
[root@localhost redis7003]# redis-cli -p 7001 cluster nodes
78f77749f9f9a5f0d7c99427e0311912a3fa04e7 10.103.16.34:7003 slave 89147e5837e378b69233dd2b8290267975719bc4 0 1493879665448 5 connected
d015a22abc57c021f568973f4f1c03c7a5c7b772 10.103.16.35:7001 master - 0 1493879663946 4 connected 0-5460
89147e5837e378b69233dd2b8290267975719bc4 10.103.16.35:7002 master - 0 1493879664948 5 connected 10923-16383
ce9d635236567ccde4c864f78863fa0a4b26f25a 10.103.16.35:7003 slave 363ecec54c92c2548dcab016146bdb4c104e5e84 0 1493879665949 6 connected
93a0e8d405959480fcbd310a5d15a92346c69d43 10.103.16.34:7002 slave d015a22abc57c021f568973f4f1c03c7a5c7b772 0 1493879664446 4 connected
363ecec54c92c2548dcab016146bdb4c104e5e84 10.103.16.34:7001 myself,master - 0 0 1 connected 5461-10922
The cluster has six nodes: three masters and three slaves. Each master line also records the hash slots assigned to it, as we can see here:
10.103.16.35:7001 master - 0 1493879663946 4 connected 0-5460
10.103.16.34:7001 myself,master - 0 0 1 connected 5461-10922
10.103.16.35:7002 master - 0 1493879664948 5 connected 10923-16383
The hash slots can also be redistributed among the nodes. We are now going to move the first 100 hash slots of 10.103.16.35:7001 to 10.103.16.34:7001:
[root@localhost redis7003]# redis-trib.rb reshard 10.103.16.34:7001
It then prompts for the number of slots and for the source and destination of the move:
How many slots do you want to move (from 1 to 16384)? 100
What is the receiving node ID? 363ecec54c92c2548dcab016146bdb4c104e5e84
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:d015a22abc57c021f568973f4f1c03c7a5c7b772
Source node #2:done
After that the migration runs. Once it has finished, print the node information again:
[root@localhost redis7003]# redis-cli -p 7001 cluster nodes
78f77749f9f9a5f0d7c99427e0311912a3fa04e7 10.103.16.34:7003 slave 89147e5837e378b69233dd2b8290267975719bc4 0 1493881167965 5 connected
d015a22abc57c021f568973f4f1c03c7a5c7b772 10.103.16.35:7001 master - 0 1493881166460 4 connected 101-5460
89147e5837e378b69233dd2b8290267975719bc4 10.103.16.35:7002 master - 0 1493881166962 5 connected 10923-16383
ce9d635236567ccde4c864f78863fa0a4b26f25a 10.103.16.35:7003 slave 363ecec54c92c2548dcab016146bdb4c104e5e84 0 1493881167465 7 connected
93a0e8d405959480fcbd310a5d15a92346c69d43 10.103.16.34:7002 slave d015a22abc57c021f568973f4f1c03c7a5c7b772 0 1493881167965 4 connected
363ecec54c92c2548dcab016146bdb4c104e5e84 10.103.16.34:7001 myself,master - 0 0 7 connected 0-100 5461-10922
We can clearly see that the migration succeeded.
[root@localhost redis7003]# redis-cli -p 7001 cluster nodes | grep master
d015a22abc57c021f568973f4f1c03c7a5c7b772 10.103.16.35:7001 master - 0 1493883826713 4 connected 101-5460
89147e5837e378b69233dd2b8290267975719bc4 10.103.16.35:7002 master - 0 1493883827213 5 connected 10923-16383
363ecec54c92c2548dcab016146bdb4c104e5e84 10.103.16.34:7001 myself,master - 0 0 7 connected 0-100 5461-10922
Now we take the master 10.103.16.35:7001 down, restart it, and look at the cluster information:
[root@mxqmongodb2 sa]# /home/maxiangqian/redis-3.2.8/src/redis-cli -p 7001
127.0.0.1:7001> SHUTDOWN
not connected> exit
[root@mxqmongodb2 sa]# redis-server /home/redis7001/redis7001.conf
Then print the cluster information again:
[root@localhost redis7003]# redis-cli -p 7001 cluster nodes
78f77749f9f9a5f0d7c99427e0311912a3fa04e7 10.103.16.34:7003 slave 89147e5837e378b69233dd2b8290267975719bc4 0 1493884247801 5 connected
d015a22abc57c021f568973f4f1c03c7a5c7b772 10.103.16.35:7001 slave 93a0e8d405959480fcbd310a5d15a92346c69d43 0 1493884247300 8 connected
89147e5837e378b69233dd2b8290267975719bc4 10.103.16.35:7002 master - 0 1493884246798 5 connected 10923-16383
ce9d635236567ccde4c864f78863fa0a4b26f25a 10.103.16.35:7003 slave 363ecec54c92c2548dcab016146bdb4c104e5e84 0 1493884246298 7 connected
93a0e8d405959480fcbd310a5d15a92346c69d43 10.103.16.34:7002 master - 0 1493884248301 8 connected 101-5460
363ecec54c92c2548dcab016146bdb4c104e5e84 10.103.16.34:7001 myself,master - 0 0 7 connected 0-100 5461-10922
The output shows that the former master 10.103.16.35:7001 has become a slave, and its former slave 10.103.16.34:7002 has been promoted to master and now owns slots 101-5460.
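Comparing two cluster nodes listings by eye works for six nodes but gets tedious quickly. The following small sketch reports which addresses changed role between two snapshots. The names roles and role_changes are hypothetical helpers, and the parsing assumes the Redis 3.2 line format shown in this post.

```python
def roles(cluster_nodes_output: str) -> dict:
    """Map 'ip:port' to 'master' or 'slave' from CLUSTER NODES output."""
    result = {}
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        address = fields[1]
        flags = fields[2].split(",")  # e.g. ["myself", "master"]
        result[address] = "master" if "master" in flags else "slave"
    return result


def role_changes(before: str, after: str) -> dict:
    """Return {address: (old_role, new_role)} for nodes whose role changed."""
    old, new = roles(before), roles(after)
    return {
        address: (old[address], new[address])
        for address in old
        if address in new and old[address] != new[address]
    }
```

Feeding it the listings taken before and after the SHUTDOWN above would flag exactly the two nodes that swapped roles.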
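To add a new node to the cluster:
./redis-trib.rb add-node 10.103.16.34:7004 10.103.16.34:7001
This adds 10.103.16.34:7004 to the cluster as a new master. Note, however, that at this point it is a master without any hash slots, so it does not store any data.
To turn the new node into a slave instead, add it and then run cluster replicate on it:
./redis-trib.rb add-node 10.103.16.34:7004 10.103.16.34:7001
redis 10.103.16.34:7004> cluster replicate 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
This makes the new node a slave of the node whose ID is 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e.
To remove a node:
./redis-trib.rb del-node 127.0.0.1:7000 `<node-id>`
One thing to keep in mind: a master must be empty before it can be removed, so first move its hash slots to the other masters.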
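CLUSTER commands
Redis has many commands, and once cluster mode is enabled there is a whole further set of CLUSTER commands. Here they are (http://redis.io/commands/cluster-addslots):
CLUSTER INFO: Print information about the cluster.
CLUSTER NODES: List all nodes currently known to the cluster, together with information about them.
// Nodes
CLUSTER MEET <ip> <port>: Add the node at ip and port to the cluster, making it a member.
CLUSTER FORGET <node_id>: Remove the node identified by node_id from the cluster.
CLUSTER REPLICATE <node_id>: Make the current node a slave of the node identified by node_id.
CLUSTER SAVECONFIG: Save the node's cluster configuration file to disk.
CLUSTER ADDSLOTS <slot> [slot ...]: Assign one or more slots to the current node.
CLUSTER DELSLOTS <slot> [slot ...]: Remove the assignment of one or more slots from the current node.
CLUSTER FLUSHSLOTS: Remove all slots assigned to the current node, leaving it with no slots at all.
CLUSTER SETSLOT <slot> NODE <node_id>: Assign slot to the node identified by node_id.
CLUSTER SETSLOT <slot> MIGRATING <node_id>: Mark slot on the current node as migrating to the node identified by node_id.
CLUSTER SETSLOT <slot> IMPORTING <node_id>: Mark slot as being imported into the current node from the node identified by node_id.
CLUSTER SETSLOT <slot> STABLE: Cancel the import or migration of slot.
// Keys
CLUSTER KEYSLOT <key>: Compute which slot the key belongs to.
CLUSTER COUNTKEYSINSLOT <slot>: Return the number of keys currently in slot.
CLUSTER GETKEYSINSLOT <slot> <count>: Return up to count keys from slot.
// Added
CLUSTER SLAVES <node-id>: Return the list of slaves of a master node.
Let's try them out one by one.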
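Building on the theory and practice from the previous post, I set up another cluster. This time it runs on 9 ports (7000-7008) and uses the machine's real IP instead of 127.0.0.1: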
redis-trib.rb create --replicas 1 192.168.33.13:7000 192.168.33.13:7001 \
  192.168.33.13:7002 192.168.33.13:7003 192.168.33.13:7004 192.168.33.13:7005 \
  192.168.33.13:7006 192.168.33.13:7007 192.168.33.13:7008
This command creates a Redis Cluster with 4 masters and 5 slaves. OK, now let's try the CLUSTER * commands listed above one by one.
cluster info
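This command shows various information about the cluster you are connected to.
[root@web3 7008]# redis-cli -c -p 7000
127.0.0.1:7000> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:9
cluster_size:4
cluster_current_epoch:9
cluster_my_epoch:1
cluster_stats_messages_sent:41417
cluster_stats_messages_received:41417
cluster_state: the state of the cluster. ok means the cluster is working; if at least one slot is unassigned or failed, the state is fail.
cluster_slots_assigned: how many slots have been assigned. 16384 means all slots are assigned.
cluster_slots_ok: how many slots are in a healthy state. 16384 means all of them.
cluster_slots_pfail: how many slots are in the PFAIL (possibly failing) state. The failure is only suspected, not confirmed, so these slots keep serving requests.
cluster_slots_fail: how many slots are in the FAIL state. These slots have failed and are offline until the problem is fixed.
cluster_known_nodes: the total number of nodes known to the cluster, including nodes still in the handshake state that may not be full members yet. Here there are 9.
cluster_size: the number of master nodes serving at least one hash slot in the cluster. Put simply, the number of masters.
cluster_current_epoch: the node's local current epoch. It is used to create unique, increasing version numbers during failover.
cluster_my_epoch: the config epoch of the node we are talking to, that is, the configuration version currently assigned to this node.
cluster_stats_messages_sent: the number of messages sent over the node-to-node binary cluster bus.
cluster_stats_messages_received: the number of messages received over the node-to-node binary cluster bus.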
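Since cluster info is plain key:value text, it is easy to turn into a health check. A minimal sketch follows. parse_cluster_info and is_healthy are hypothetical helper names, and the definition of "healthy" used here (state ok, all slots assigned and ok, none failed) is this sketch's own assumption rather than anything Redis prescribes.

```python
def parse_cluster_info(text: str) -> dict:
    """Parse 'key:value' lines from CLUSTER INFO into a dictionary."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:  # ignore lines without a colon, such as blank lines
            info[key] = value
    return info


def is_healthy(info: dict, total_slots: int = 16384) -> bool:
    """True when the state is ok and every slot is assigned and healthy."""
    return (
        info.get("cluster_state") == "ok"
        and int(info.get("cluster_slots_assigned", 0)) == total_slots
        and int(info.get("cluster_slots_ok", 0)) == total_slots
        and int(info.get("cluster_slots_fail", 0)) == 0
    )
```

The text would typically come from running redis-cli -p 7000 cluster info and capturing its standard output.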
cluster nodes
Returns information about all the nodes in the cluster. This is one of the most frequently used commands.
127.0.0.1:7008> cluster nodes
8916fb224bbae3dc0291ca47e066dca0a62fba19 192.168.33.13:7004 slave 3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 0 1446115185933 5 connected
404cf1ecf54d4df46d5faaec4103cfdf67888ad2 192.168.33.13:7001 master - 0 1446115184929 2 connected 4096-8191
a035546046a607487436cf354c187b1712edf39b 192.168.33.13:7006 slave 6f5cd78ee644c1df9756fc11b3595403f51216cc 0 1446115184929 7 connected
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1446115185432 3 connected 8192-12287
f325d80e770ce319e4490818a49bad033cce942c 192.168.33.13:7008 myself,slave 3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 0 0 9 connected
e357bea5151b32a971c1f7a5788271106195f99a 192.168.33.13:7005 slave 404cf1ecf54d4df46d5faaec4103cfdf67888ad2 0 1446115186435 6 connected
3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 192.168.33.13:7000 master - 0 1446115184426 1 connected 0-4095
6650a95b874cacf399f174cb7a1b3802fc9bcef9 192.168.33.13:7007 slave 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 0 1446115184426 8 connected
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 master - 0 1446115184426 4 connected 12288-16383
First, the structure of each line:
<id> <ip:port> <flags> <master> <ping-sent> <pong-recv> <config-epoch> <link-state> <slot> <slot> ... <slot>
[node ID] [ip:port] [flags (master, myself, slave)] [- or the ID of the node's master] [Unix time in milliseconds at which the pending ping was sent, 0 if there is no pending ping] [Unix time in milliseconds at which the last pong was received] [config epoch] [link state] [slots]
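The field layout above maps directly onto a small record type. The sketch below parses one line in the Redis 3.x format shown in this post. ClusterNode and parse_node_line are hypothetical names, and later Redis versions print the address with an extra @cport suffix, which this sketch does not split off.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterNode:
    node_id: str
    address: str               # "ip:port"
    flags: list                # e.g. ["myself", "master"]
    master_id: Optional[str]   # None when the column is "-"
    ping_sent: int             # ms timestamp, 0 = no pending ping
    pong_recv: int             # ms timestamp of the last pong
    config_epoch: int
    link_state: str            # "connected" or "disconnected"
    slots: list                # raw tokens such as "0-4095" or "16382"


def parse_node_line(line: str) -> ClusterNode:
    """Parse one CLUSTER NODES line (Redis 3.x field order)."""
    f = line.split()
    if len(f) < 8:
        raise ValueError(f"not a CLUSTER NODES line: {line!r}")
    return ClusterNode(
        node_id=f[0],
        address=f[1],
        flags=f[2].split(","),
        master_id=None if f[3] == "-" else f[3],
        ping_sent=int(f[4]),
        pong_recv=int(f[5]),
        config_epoch=int(f[6]),
        link_state=f[7],
        slots=f[8:],
    )
```

Masters can then be selected with a list comprehension such as [n for n in nodes if "master" in n.flags], the programmatic equivalent of the grep master used elsewhere in this post.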
cluster meet
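Adds the node specified by ip and port to the cluster, making it a member.
Usually we add a node with redis-trib.rb add-node 192.168.33.13:7009 192.168.33.13:7000, which does not require connecting with the redis-cli client.
The cluster meet command can do the same. Usage:
cluster meet <ip> <port>
Let's try it: create a new node on port 7009, then use this command to add it to the cluster: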
127.0.0.1:7000> cluster meet 192.168.33.13 7009
OK
127.0.0.1:7000> cluster nodes
....
70795a3a7b93b7d059124e171cd46ba1683d6b7d 192.168.33.13:7009 master - 0 1446198910590 0 connected
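7009 has joined the cluster. As before, it has no slots assigned yet. Slot assignment is covered in detail with the commands below.
cluster forget
Removes a node from the cluster. It is similar to:
redis-trib.rb del-node 192.168.33.13:7009 `<node-id>`
The same rules apply: a slave can be removed directly, but a master that still holds slots must have them migrated away first.
Let's remove 192.168.33.13:7009, the node added in the previous step. It is a master, but it has no slots assigned yet, so let's try removing it.
Usage:
cluster forget <node_id>
Here goes:
127.0.0.1:7000> cluster forget 70795a3a7b93b7d059124e171cd46ba1683d6b7d
OK
It returned OK, so the command succeeded.
Now look at the node list again: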
127.0.0.1:7000> cluster nodes
a035546046a607487436cf354c187b1712edf39b 192.168.33.13:7006 slave 6f5cd78ee644c1df9756fc11b3595403f51216cc 0 1448519211988 7 connected
f325d80e770ce319e4490818a49bad033cce942c 192.168.33.13:7008 slave 3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 0 1448519212994 9 connected
e357bea5151b32a971c1f7a5788271106195f99a 192.168.33.13:7005 slave 404cf1ecf54d4df46d5faaec4103cfdf67888ad2 0 1448519213499 6 connected
8916fb224bbae3dc0291ca47e066dca0a62fba19 192.168.33.13:7004 slave 3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 0 1448519212994 5 connected
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 master - 0 1448519211485 4 connected 12288-16383
3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 192.168.33.13:7000 myself,master - 0 0 1 connected 0-4095
6650a95b874cacf399f174cb7a1b3802fc9bcef9 192.168.33.13:7007 slave 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 0 1448519212493 8 connected
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1448519213499 3 connected 8192-12287
404cf1ecf54d4df46d5faaec4103cfdf67888ad2 192.168.33.13:7001 master - 0 1448519213499 2 connected 4096-8191
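Good, the node is gone from this list.
However, it has not really been removed. CLUSTER FORGET only deletes the node from the node table of the node that receives the command (7000 here). It neither stops the 7009 process nor clears what 7009 itself knows about the cluster, and unless the same command reaches every other node within about 60 seconds, gossip adds the node back.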
[vagrant@web3 7009]$ redis-trib.rb check 192.168.33.13:7009
Connecting to node 192.168.33.13:7009: OK
Connecting to node 192.168.33.13:7004: OK
Connecting to node 192.168.33.13:7007: OK
Connecting to node 192.168.33.13:7000: OK
Connecting to node 192.168.33.13:7008: OK
Connecting to node 192.168.33.13:7006: OK
Connecting to node 192.168.33.13:7003: OK
Connecting to node 192.168.33.13:7005: OK
Connecting to node 192.168.33.13:7001: OK
Connecting to node 192.168.33.13:7002: OK
The process is still running too.
[vagrant@web3 7009]$ ps -ef|grep redis
root 3017 1 0 Nov23 ? 00:04:24 redis-server *:7009 [cluster]
And it still accepts connections:
[vagrant@web3 7009]$ redis-cli -p 7009 -c
127.0.0.1:7009> cluster nodes
70795a3a7b93b7d059124e171cd46ba1683d6b7d 192.168.33.13:7009 myself,master - 0 0 0 connected
Annoying! Never mind, let's move on.
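CLUSTER FORGET only affects the node that receives it, so removing a node for real means sending the command to every remaining node; the Redis documentation describes a 60-second window before gossip re-adds a forgotten node. The sketch below builds the redis-cli invocations for that. forget_commands is a hypothetical helper, and it only produces command strings rather than running anything.

```python
def forget_commands(remaining: list, node_id: str) -> list:
    """Build one 'redis-cli ... cluster forget' command per remaining node.

    remaining: 'ip:port' addresses of the nodes that stay in the cluster.
    node_id:   ID of the node that should be forgotten.
    """
    commands = []
    for address in remaining:
        host, _, port = address.rpartition(":")
        commands.append(
            f"redis-cli -h {host} -p {port} cluster forget {node_id}"
        )
    return commands


if __name__ == "__main__":
    # The nine nodes of the test cluster in this post (ports 7000-7008).
    nodes = [f"192.168.33.13:{port}" for port in range(7000, 7009)]
    target = "70795a3a7b93b7d059124e171cd46ba1683d6b7d"  # ID of 7009
    for command in forget_commands(nodes, target):
        print(command)
```

The forgotten instance keeps running after this. If it should go away entirely, stop it separately, for example with SHUTDOWN.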
cluster replicate
Makes the current node a slave of the node specified by node_id.
Since 7009 was not removed just now, let's use this command to make it a slave of 7003.
Usage:
cluster replicate <master_nodeId>
First connect to 7009:
[root@web3 7009]# redis-cli -p 7009 -c
127.0.0.1:7009> cluster replicate 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce
OK
It returned OK, so it worked. Let's check:
127.0.0.1:7009> cluster nodes
...
b3917e10123230f2f5b0e2c948a7eeda7f88ccf7 192.168.33.13:7009 myself,slave 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 0 0 0 connected
357bea5151b32a971c1f7a5788271106195f99a 192.168.33.13:7003 master - 0 1448525721782 4 connected 12288-16383
OK, the setting took effect. Now exit the CLI and check with redis-trib:
[root@web3 7009]# redis-trib.rb check 192.168.33.13:7000
Connecting to node 192.168.33.13:7009: OK
M: 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003
   slots:12288-16383 (4096 slots) master
   2 additional replica(s)
S: 6650a95b874cacf399f174cb7a1b3802fc9bcef9 192.168.33.13:7007
   slots: (0 slots) slave
   replicates 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce
S: b3917e10123230f2f5b0e2c948a7eeda7f88ccf7 192.168.33.13:7009
   slots: (0 slots) slave
   replicates 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce
Success!
cluster saveconfig
Saves the node's cluster configuration file to disk.
Let's try it:
127.0.0.1:7009> cluster saveconfig
OK
OK means it worked: the command overwrites the nodes.conf file in the node's configuration directory. This is useful if the nodes file gets lost for some reason, because the command writes out a fresh, up-to-date node configuration file.
To show that the file really is regenerated, we can delete nodes.conf in the 7009 directory first:
[root@web3 7009]# ll
total 52
-rw-r--r-- 1 root root 0 Nov 26 08:14 appendonly.aof
-rw-r--r-- 1 root root 18 Nov 26 08:14 dump.rdb
-rw-r--r-- 1 root root 1269 Nov 26 08:50 nodes.conf
-rw-r--r-- 1 root root 41550 Oct 30 03:40 redis.conf
[root@web3 7009]# rm -rf nodes.conf
[root@web3 7009]# ll
total 42
-rw-r--r-- 1 root root 0 Nov 26 08:14 appendonly.aof
-rw-r--r-- 1 root root 18 Nov 26 08:14 dump.rdb
-rw-r--r-- 1 root root 41550 Oct 30 03:40 redis.conf
[root@web3 7009]# redis-cli -p 7009 -c
127.0.0.1:7009> cluster saveconfig
OK
127.0.0.1:7009> exit
[root@web3 7009]# ll
total 52
-rw-r--r-- 1 root root 0 Nov 26 08:14 appendonly.aof
-rw-r--r-- 1 root root 18 Nov 26 08:14 dump.rdb
-rw-r--r-- 1 root root 1269 Nov 26 08:51 nodes.conf
-rw-r--r-- 1 root root 41550 Oct 30 03:40 redis.conf
[root@web3 7009]#
cluster delslots
Removes one or more slots from the current node. It only works on the slots of the node you are connected to; it has no effect on other nodes' slots.
Since only masters hold slots, the command is only meaningful on a master. Running it on a slave achieves nothing.
Usage:
cluster delslots <slot> [slot ...]
Let's look at the current slot assignment:
[root@web3 7009]# redis-cli -p 7009 -c cluster nodes| grep master
3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 192.168.33.13:7000 master - 0 1448529511113 1 connected 0-4095
404cf1ecf54d4df46d5faaec4103cfdf67888ad2 192.168.33.13:7001 master - 0 1448529511113 2 connected 4096-8191
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1448529509101 3 connected 8192-12287
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 master - 0 1448529510609 4 connected 12288-16383
There are 4 masters. Let's delete the 3 slots 16381, 16382 and 16383.
Here goes:
[root@web3 7009]# redis-cli -p 7003
127.0.0.1:7003> cluster delslots 16381 16382 16383
OK
127.0.0.1:7003> cluster nodes
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 myself,master - 0 0 4 connected 12288-16380
Look, 7003 is indeed 3 slots short. Now check cluster info:
127.0.0.1:7003> cluster info
cluster_state:fail
cluster_slots_assigned:16381
cluster_slots_ok:16381
cluster_state:fail means the cluster is down! Why would deleting just 3 slots break it? Because by default (cluster-require-full-coverage yes) the cluster only serves requests when all 16384 slots are assigned. With 3 slots unassigned, it fails.
Reading data naturally fails as well:
127.0.0.1:7003> get name
(error) CLUSTERDOWN The cluster is down
A check with redis-trib confirms it:
[root@web3 7009]# redis-trib.rb check 192.168.33.13:7000
...
...
[ERR] Nodes don't agree about configuration!
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
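redis-trib.rb check reports that not all 16384 slots are covered, but it does not say which ones are missing. The following minimal sketch computes the gap from the slot columns of cluster nodes. expand and uncovered_slots are hypothetical helper names, and the input is assumed to be the raw slot tokens of each master line.

```python
TOTAL_SLOTS = 16384


def expand(tokens: list) -> set:
    """Expand slot tokens such as '12288-16380' or '16382' into a set."""
    slots = set()
    for token in tokens:
        if token.startswith("["):
            continue  # entries like [16383-<-id] mark open slots, not ownership
        first, _, last = token.partition("-")
        # A single slot has no "-", so 'last' is empty and we reuse 'first'.
        slots.update(range(int(first), int(last or first) + 1))
    return slots


def uncovered_slots(per_master_tokens: list) -> list:
    """Return the sorted list of slots that no master claims."""
    covered = set()
    for tokens in per_master_tokens:
        covered |= expand(tokens)
    return sorted(set(range(TOTAL_SLOTS)) - covered)


if __name__ == "__main__":
    # Slot columns of the four masters after the delslots call above.
    masters = [["0-4095"], ["4096-8191"], ["8192-12287"], ["12288-16380"]]
    print(uncovered_slots(masters))
```

An empty result means full coverage, which is the condition the cluster needs before it reports cluster_state:ok again under the default settings.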
So how do we recover? The next command is the answer.
cluster addslots
Assigns one or more slots to the current node.
Usage:
cluster addslots <slot> [slot ...]
Let's use it to add the 3 slots deleted above back to 7003:
127.0.0.1:7003> cluster addslots 16381 16382 16383
OK
127.0.0.1:7003>
It returned OK. Let's check that it really worked:
127.0.0.1:7003> cluster nodes
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 myself,master - 0 0 4 connected 12288-16383
The slots are back. Is the cluster up again?
127.0.0.1:7003> cluster info
cluster_state:ok
Reading data works again too:
127.0.0.1:7003> get name
-> Redirected to slot [5798] located at 192.168.33.13:7001
"123"
192.168.33.13:7001>
cluster flushslots
Removes all slots from the current node, leaving it with no slots assigned.
Let's pick on 7003 again. That's what it gets for being last 😄
[root@web3 ~]# redis-cli -p 7003 -c
127.0.0.1:7003> cluster flushslots
OK
It returned OK. In theory all of 7003's slots are now gone, so the cluster should be down as well. Let's see:
127.0.0.1:7003> cluster info
cluster_state:fail
cluster_slots_assigned:12288
cluster_slots_ok:12288
Sure enough, all 4096 slots of 7003 were removed and the cluster is down. Check with redis-trib:
[root@web3 ~]# redis-trib.rb check 192.168.33.13:7000
M: 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003
   slots: (0 slots) master
   2 additional replica(s)
[ERR] Nodes don't agree about configuration!
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
Poor 7003 has no slots left at all.
cluster setslot <slot> node <node_id>
Assigns slot to the node specified by node_id, in the slot table of the node that receives the command. If the receiving node currently owns the slot and still holds keys in it, the command returns an error.
All of 7003's slots have just been freed, so let's use this command to hand some of them over.
Let's move a few of the slots in the range 12288-16383 to 7002.
127.0.0.1:7003> cluster setslot 16383 node 6f5cd78ee644c1df9756fc11b3595403f51216cc
OK
127.0.0.1:7003> cluster setslot 16382 node 6f5cd78ee644c1df9756fc11b3595403f51216cc
OK
Annoyingly, this command can only move one slot at a time!
Let's see whether the slots we just moved have arrived:
127.0.0.1:7003> cluster nodes
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1448611602575 3 connected 8192-12287 16382-16383
Sure enough, the 2 slots now show up under 7002. Next, hand 16382 over to 7000:
127.0.0.1:7003> cluster setslot 16382 node 3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da
OK
Did it work?
127.0.0.1:7003> cluster nodes
3d2b7dccfc45ae2eb7aeb9e0bf001b0ac8f7b3da 192.168.33.13:7000 master - 0 1448611827539 1 connected 0-4095 16382
Yes, it shows up there.
cluster setslot <slot> migrating <destination-node-id>
Marks slot on the current node as migrating to the node specified by node_id.
Let's try it with 16383 and move it back.
[root@web3 7000]# redis-cli -p 7002 -c
127.0.0.1:7002> cluster setslot 16383 migrating 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce
(error) ERR I'm not the owner of hash slot 16383
It failed! It says 16383 is not one of 7002's slots, so it cannot be migrated. How can that be?
Let me look from 7002:
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 myself,master - 0 0 3 connected 8192-12287
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 master - 0 1448619265700 4 connected 12288-16383
Indeed it isn't. From 7002's point of view, 7003 still owns 12288-16383, even though I cleared all of 7003's slots.
What does 7003 say?
[root@web3 7000]# redis-cli -p 7003 -c
Have a look:
127.0.0.1:7003> cluster nodes
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 myself,master - 0 0 4 connected
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1448619384580 3 connected 8192-12287 16383
The two sides disagree!
Let me exit and run a check.
[root@web3 7000]# redis-trib.rb check 192.168.33.13:7000
The result:
M: 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003
   slots: (0 slots) master
   2 additional replica(s)
M: 6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002
   slots:8192-12287 (4096 slots) master
   1 additional replica(s)
So 16383 and 16382 were never really moved by setslot <slot> node <node_id>. The most likely explanation is that flushslots and setslot ... node were sent to 7003 only, so they changed 7003's own slot table while every other node kept its old view, and the check reports what each node claims for itself. Let's leave it at that and continue with the next command.
cluster setslot <slot> importing <node_id>
Marks slot as being imported into the current node from the node specified by node_id.
The previous command failed, so let's see how this one behaves:
127.0.0.1:7003> cluster setslot 16383 importing 6f5cd78ee644c1df9756fc11b3595403f51216cc
OK
127.0.0.1:7003>
This asks 7003 to import 16383 from 7002, and the command was accepted.
127.0.0.1:7003> cluster nodes
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 myself,master - 0 0 4 connected [16383-<-6f5cd78ee644c1df9756fc11b3595403f51216cc]
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1448621921706 3 connected 8192-12287 16383
So that is how an importing slot is displayed. Yet 16383 is still on 7002. That is because importing only opens the slot for import on the receiving node. Nothing moves until the source node is set to migrating, the keys are transferred with MIGRATE, and the slot is finally assigned with setslot ... node.
Run a check:
[root@web3 7000]# redis-trib.rb check 192.168.33.13:7000
...
M: 35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003
   slots: (0 slots) master
   2 additional replica(s)
...
>>> Check for open slots...
[WARNING] Node 192.168.33.13:7003 has slots in importing state (16383).
[WARNING] The following slots are open: 16383
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
It reports the cluster as broken and the slot as being imported. Let's skip ahead to the next command:
cluster setslot <slot> stable
Cancels the import or migration of slot.
Nice, this command lets us back out. Let's try it right away:
127.0.0.1:7003> cluster setslot 16383 stable
OK
Has it gone back?
127.0.0.1:7003> cluster nodes
35bdcb51ceeff00f9cc608fa1b4364943c7c07ce 192.168.33.13:7003 myself,master - 0 0 4 connected
6f5cd78ee644c1df9756fc11b3595403f51216cc 192.168.33.13:7002 master - 0 1448623146511 3 connected 8192-12287 16383
It has: the importing marker is gone. Run a check again:
[ERR] Nodes don't agree about configuration!
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
We are back to the earlier error message. Fine, on to the remaining commands.
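For reference, the Redis Cluster documentation describes moving a slot as a sequence of steps rather than a single setslot call: mark the slot as importing on the destination, mark it as migrating on the source, move the keys with MIGRATE, then announce the new owner with setslot ... node. The sketch below only builds that sequence for one slot as text. migration_plan is a hypothetical helper, the batch size and timeout are arbitrary values, and <key> is a placeholder for each key returned by getkeysinslot.

```python
def migration_plan(slot, src, dst, batch=100, timeout_ms=5000):
    """Return ordered (node_address, command) steps for moving one slot.

    src and dst are dicts with 'id', 'host' and 'port' entries.
    Nothing is executed; the result is meant to be read or printed.
    """
    src_addr = f"{src['host']}:{src['port']}"
    dst_addr = f"{dst['host']}:{dst['port']}"
    return [
        # 1. The destination opens the slot for import.
        (dst_addr, f"CLUSTER SETSLOT {slot} IMPORTING {src['id']}"),
        # 2. The source marks the slot as migrating.
        (src_addr, f"CLUSTER SETSLOT {slot} MIGRATING {dst['id']}"),
        # 3. Repeat the next two steps until GETKEYSINSLOT returns nothing.
        (src_addr, f"CLUSTER GETKEYSINSLOT {slot} {batch}"),
        (src_addr, f"MIGRATE {dst['host']} {dst['port']} <key> 0 {timeout_ms}"),
        # 4. Tell both sides who the new owner is (destination first).
        (dst_addr, f"CLUSTER SETSLOT {slot} NODE {dst['id']}"),
        (src_addr, f"CLUSTER SETSLOT {slot} NODE {dst['id']}"),
    ]


if __name__ == "__main__":
    node_7002 = {"id": "6f5cd78ee644c1df9756fc11b3595403f51216cc",
                 "host": "192.168.33.13", "port": 7002}
    node_7003 = {"id": "35bdcb51ceeff00f9cc608fa1b4364943c7c07ce",
                 "host": "192.168.33.13", "port": 7003}
    for address, command in migration_plan(16383, node_7002, node_7003):
        print(f"{address}> {command}")
```

This is essentially the procedure that redis-trib.rb reshard automates, which is why the reshard earlier in this post left all nodes in agreement while the manual setslot experiments did not.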
cluster keyslot <key>
Computes which slot a key belongs to. This is a handy feature, since you often want to know which node a key lives on. Let's see whether it works.
127.0.0.1:7000> cluster keyslot name
(integer) 5798
127.0.0.1:7000> get name
-> Redirected to slot [5798] located at 192.168.33.13:7001
"123"
192.168.33.13:7001>
This is a simple calculation, as mentioned before: CRC16('name') % 16384 = 5798
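The calculation can be reproduced outside Redis. The sketch below implements the CRC16 variant named in the Redis Cluster specification (XMODEM: polynomial 0x1021, initial value 0) bit by bit, plus the hash-tag rule under which only the text between the first { and the following } is hashed. The function names crc16_xmodem and key_slot are this sketch's own.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honouring hash tags."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    for key in ("name", "name7", "{user1000}.following"):
        print(key, key_slot(key))
```

Keys that share a hash tag, such as {user1000}.following and {user1000}.followers, land in the same slot, which is what makes multi-key operations on them possible in a cluster.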
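cluster countkeysinslot <slot>
Counts how many keys a slot contains. Another practical little feature.
Let's try it:
192.168.33.13:7001> cluster countkeysinslot 5798
(integer) 1
This is a test setup, so the slot holds very little data. The 1 key reported here is presumably the name key from above.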
cluster getkeysinslot <slot> <count>
Returns up to count keys from a slot. Also quite a useful feature.
Let's try it:
127.0.0.1:7000> cluster getkeysinslot 5798 1
(empty list or set)
The result is empty because the command only looks at keys stored on the node it is sent to, and slot 5798 lives on 7001, not on 7000. Let's set a few more values:
192.168.33.13:7000> set name1 yangyi
-> Redirected to slot [12933] located at 192.168.33.13:7003
OK
192.168.33.13:7003> set name2 yangyi
-> Redirected to slot [742] located at 192.168.33.13:7000
OK
192.168.33.13:7000> set name3 yangyi
-> Redirected to slot [4807] located at 192.168.33.13:7001
OK
192.168.33.13:7001> set name4 yangyi
-> Redirected to slot [8736] located at 192.168.33.13:7002
OK
192.168.33.13:7002> set name5 yangyi
-> Redirected to slot [12801] located at 192.168.33.13:7003
OK
192.168.33.13:7003> set name6 yangyi
-> Redirected to slot [610] located at 192.168.33.13:7000
OK
192.168.33.13:7000> set name7 yangyi
-> Redirected to slot [4675] located at 192.168.33.13:7001
OK
192.168.33.13:7001>
192.168.33.13:7001> cluster getkeysinslot 4675 1
1) "name7"
cluster slaves <node-id>
Returns the list of slaves of a master node.
192.168.33.13:7001> cluster slaves 7b39b81b5ba94a9f4d96931dd0879cc13dab6f07
1) "f8c7a3113497d8d828bdb05fec4041b382e5fd0a 192.168.33.13:7005 slave 7b39b81b5ba94a9f4d96931dd0879cc13dab6f07 0 1450671948824 6 connected"
Each entry has the same format as a line of cluster nodes output.