Redis Cluster Online Migration, Part 1: Migrating Data Online to the New Cluster (Hands-on, Part 1)

Migration background:
1. The existing Redis cluster lives in data center A and must be moved to the new data center B.
2. The current environment must stay stable throughout.
3. We migrate online, because the existing cluster holds a large amount of data.
4. Building a brand-new Redis cluster from scratch would be much simpler, but is not an option here.
5. 10.128.51.14 (ht4) is in data center A; 10.121.51.30 (ht20) is in data center B.

First, a quick introduction to Redis Cluster.

Redis Cluster shards data across multiple nodes; even if some nodes fail or become unreachable, the cluster can keep serving requests.
Official Cluster support arrived in Redis 3.0.
If every master has a replica, then when a master goes down or loses contact with the majority of the cluster,
its replica is promoted to master and keeps serving, so the cluster stays available.
In this setup I deploy Redis as multiple instances on a single machine.

Before installing, set up the cluster management tool redis-trib.rb, which requires a Ruby environment:

# First install the Ruby dependencies.
[root@ht20 redis]# cat /etc/redhat-release 
CentOS Linux release 7.9.2009 (Core)
[root@ht20 redis]# uname -r
3.10.0-1160.45.1.el7.x86_64

[root@node1 ~]# yum install ruby rubygems -y
[root@node1 ~]# cd /usr/local/src/ 
[root@ht20 src]# wget https://rubygems.org/downloads/redis-3.3.3.gem
--2022-03-01 10:53:14--  https://rubygems.org/downloads/redis-3.3.3.gem
Resolving rubygems.org (rubygems.org)... 151.101.65.227, 151.101.1.227, 151.101.129.227, ...
Connecting to rubygems.org (rubygems.org)|151.101.65.227|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 92672 (90K) [application/octet-stream]
Saving to: ‘redis-3.3.3.gem’

100%[============================================================================================>]
92,672 428KB/s in 0.2s

2022-03-01 10:53:15 (428 KB/s) - ‘redis-3.3.3.gem’ saved [92672/92672]

[root@ht20 src]# gem install redis-3.3.3.gem
Successfully installed redis-3.3.3
Parsing documentation for redis-3.3.3
Installing ri documentation for redis-3.3.3
1 gem installed

Step 1: Create the directories:
/data/redis_mai1/redis
/data/redis_mai2/redis
/data/redis_mai3/redis
/data/redis_mai4/redis
/data/redis_mai5/redis
/data/redis_mai6/redis

Step 2: Copy Redis from another machine. The listing below is what I scp'd over directly from an existing Redis cluster, into each of the six directories.

[root@ht20 redis]# ll
total 77196
drwxr-xr-x 2 root root       59 Mar  1 09:51 data
-rw-r--r-- 1 root root      158 Mar  1 09:29 keys.redis
-rwxr-xr-x 1 root root  2451134 Mar  1 09:29 redis-benchmark
-rwxr-xr-x 1 root root  5777399 Mar  1 09:29 redis-check-aof
-rwxr-xr-x 1 root root  5777399 Mar  1 09:29 redis-check-rdb
-rwxr-xr-x 1 root root  2617215 Mar  1 09:29 redis-cli
-rw-r--r-- 1 root root     1558 Mar  1 09:43 redis.conf
-rw-r--r-- 1 root root 50779877 Mar  1 09:51 redis.log
-rwxr-xr-x 1 root root  5777399 Mar  1 09:29 redis-sentinel
-rwxr-xr-x 1 root root  5777399 Mar  1 09:29 redis-server
-rwxr-xr-x 1 root root    65991 Mar  1 09:29 redis-trib.rb

Check the version (no need to start the server):

 [root@ht20 redis]# ./redis-cli -v
 redis-cli 4.0.14


 

Step 3: Edit the configuration file accordingly (the redis.conf in each of the six directories must be modified).

bind 10.121.51.30 127.0.0.1  // IP addresses of the local interfaces to listen on
protected-mode yes
port 7736  // listening port
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes  // run Redis in the background
supervised no
pidfile /var/run/redis_7736.pid  // pid file
loglevel notice
logfile "/data/redis_mai6/redis/redis.log"  // log file location
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /data/redis_mai6/redis/data/
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
slave-announce-ip 10.121.51.30
slave-announce-port 7736
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble no
lua-time-limit 5000  // 5 seconds
cluster-enabled yes  // enable cluster mode
cluster-config-file nodes.conf  // generated automatically when a cluster-mode instance starts
cluster-node-timeout 15000  // node timeout in milliseconds; a fairly short timeout (15 seconds)
cluster-announce-ip 10.121.51.30
cluster-announce-port 7736
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
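Only a handful of values differ between the six redis.conf files (port, pidfile, log/data paths, announce port). A hedged sketch of generating them from one template with sed — the `__PORT__`/`__DIR__` markers and the /tmp base path are illustrative assumptions, not part of the original setup:

```shell
# Generate per-instance redis.conf fragments from a minimal template.
# The template markers and BASE=/tmp/... are assumptions for illustration.
BASE=/tmp/redis_conf_demo
mkdir -p "$BASE"
cat > "$BASE/redis.conf.tpl" <<'EOF'
port __PORT__
pidfile /var/run/redis___PORT__.pid
logfile "__DIR__/redis.log"
dir __DIR__/data/
cluster-enabled yes
cluster-announce-port __PORT__
EOF
for i in 1 2 3 4 5 6; do
  dir="$BASE/redis_mai${i}/redis"
  port=$((7730 + i))          # 7731..7736, matching the six instances
  mkdir -p "$dir/data"
  sed -e "s|__PORT__|$port|g" -e "s|__DIR__|$dir|g" \
      "$BASE/redis.conf.tpl" > "$dir/redis.conf"
done
```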
This is what the copied-over nodes.conf looks like. When rebuilding the cluster this file must be deleted, because each Redis node regenerates it automatically on startup.
41767372c36cc268872e86d96660a32bf540624f 10.121.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1636343114419 43 connected
ff34a5fadf42078332445bcf41a2b59416fd92da 10.121.51.30:7734@17734 slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1636343115000 44 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.121.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1636343116780 46 connected
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.121.51.30:7736@17736 myself,master - 0 1636343113000 46 connected 10923-16383
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.121.51.30:7735@17735 master - 0 1636343115654 43 connected 5461-10922
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.121.51.30:7731@17731 master - 0 1636343116656 44 connected 0-5460
vars currentEpoch 47 lastVoteEpoch 44
// The file records each node's name (ID), IP, port, role, connection state, and so on.


Because the whole Redis directory was copied over, go into every node and delete a few files under data/ so that nothing interferes when they are regenerated — all three files are recreated automatically.

Step 4: Delete the three files under the data directory.
[root@ht20 data]#cd /data/redis_mai1/redis/data/

[root@ht20 data]# ll
total 2296
-rw-r--r-- 1 root root 1442571 Mar 1 12:52 appendonly.aof
-rw-r--r-- 1 root root 897617 Mar 1 12:49 dump.rdb

// By default, the Redis server periodically walks the dataset and writes a point-in-time memory snapshot to a file named "dump.rdb"; this persistence mechanism is called snapshotting (RDB)
-rw-r--r-- 1 root root 1432 Mar 1 12:49 nodes.conf

Delete those three files from the data directory of each of the six instances.

You may well ask: why go to all this trouble instead of just using a fresh Redis package? You could indeed — the result would be the same. I copied an existing deployment only to stay compatible with what we already run.

When creating the cluster you may hit the following error; it happens when the contents of data/ were not deleted. Here I started only two nodes as a test.

(A node joining a cluster must be empty — no slot assignments and no data — otherwise it cannot join.)


[root@ht20 redis]# ./redis-trib.rb create --replicas 1 10.121.51.30:7731 10.129.51.30:7732
>>> Creating cluster
[ERR] Node 10.121.51.30:7732 is not empty. Either the node already knows other nodes 
(check with CLUSTER NODES) or contains some key in database 0.

// I only started two instances here, so it reports that at least 6 instances (i.e. 6 nodes) are required; nodes can also be placed on different servers.
[root@ht20 redis]# ./redis-trib.rb create --replicas 1 10.129.51.30:7731 10.129.51.30:7732
>>> Creating cluster
*** ERROR: Invalid configuration for cluster creation.
*** Redis Cluster requires at least 3 master nodes.  // at least 3 masters
*** This is not possible with 2 nodes and 1 replicas per node.
*** At least 6 nodes are required.  // at least 6 nodes in total

 

Now for the full procedure: installing and starting the six instances (six instances on one machine).

1. Copy the whole Redis directory over; the six instance directories are laid out as /data/redis_mai*.
2. Edit redis.conf for each instance (port, IP, etc.).
3. Delete the three files under data/: appendonly.aof, dump.rdb, nodes.conf.

4. Start each instance.

 [root@ht20 redis]#/data/redis_mai1/redis/redis-server /data/redis_mai1/redis/redis.conf
 [root@ht20 redis]#/data/redis_mai2/redis/redis-server /data/redis_mai2/redis/redis.conf
 [root@ht20 redis]#/data/redis_mai3/redis/redis-server /data/redis_mai3/redis/redis.conf
 [root@ht20 redis]#/data/redis_mai4/redis/redis-server /data/redis_mai4/redis/redis.conf
 [root@ht20 redis]#/data/redis_mai5/redis/redis-server /data/redis_mai5/redis/redis.conf
 [root@ht20 redis]#/data/redis_mai6/redis/redis-server /data/redis_mai6/redis/redis.conf
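Steps 3 and 4 can be collapsed into one loop. A sketch — the /tmp base is an assumption so the cleanup part can be dry-run anywhere (the stale files are simulated with touch), and the actual start line is left commented out:

```shell
# Clean stale state for each instance and (in production) start it.
# BASE=/tmp/... is an assumption for a safe dry run; use /data on the real host.
BASE=/tmp/redis_start_demo
for i in 1 2 3 4 5 6; do
  d="$BASE/redis_mai${i}/redis"
  mkdir -p "$d/data"
  # simulate stale files copied over from the old cluster, then remove them
  touch "$d/data/appendonly.aof" "$d/data/dump.rdb" "$d/data/nodes.conf"
  rm -f "$d/data/appendonly.aof" "$d/data/dump.rdb" "$d/data/nodes.conf"
  # on the real host, start the instance afterwards:
  # "$d/redis-server" "$d/redis.conf"
done
```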

Check the six instance processes — all started normally:

[root@ht20 data]# ps -ef | grep redis
root 57063 1 0 12:39 ? 00:00:01 /data/redis_mai1/redis/redis-server 10.121.51.30:7731 [cluster]
root 58394 1 0 12:45 ? 00:00:00 /data/redis_mai2/redis/redis-server 10.121.51.30:7732 [cluster]
root 58881 1 0 12:47 ? 00:00:00 /data/redis_mai3/redis/redis-server 10.121.51.30:7733 [cluster]
root 58893 1 0 12:47 ? 00:00:00 /data/redis_mai4/redis/redis-server 10.121.51.30:7734 [cluster]
root 58905 1 0 12:47 ? 00:00:00 /data/redis_mai5/redis/redis-server 10.121.51.30:7735 [cluster]
root 58939 1 0 12:47 ? 00:00:00 /data/redis_mai6/redis/redis-server 10.121.51.30:7736 [cluster]

// At this point the instances are running, but they cannot discover each other yet; the cluster still has to be configured.

 

5. Cluster configuration. (Note: 14 and 30 sit in two different data centers with open network connectivity between them — so after the create command ran, these nodes ended up mixed together with the original cluster.)

Note: the "--replicas 1" argument sets how many replicas each master gets. At a 1:1 ratio (one master, one replica),
with the required minimum of 3 masters, the total node count cannot fall below 6, or the cluster cannot be created.
A Redis cluster has exactly 16384 slots; by default they are distributed evenly across the masters,
though you can assign them explicitly, and later node additions or removals can redistribute them.
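The even split redis-trib performs can be sketched with plain shell arithmetic (a sketch only — the exact boundary where the one leftover slot lands may differ from redis-trib's choice):

```shell
# Split 16384 slots across 3 masters as evenly as possible.
SLOTS=16384
MASTERS=3
per=$(( SLOTS / MASTERS ))      # 5461 slots each
rem=$(( SLOTS % MASTERS ))      # 1 slot left over
start=0
m=1
while [ $m -le $MASTERS ]; do
  count=$per
  [ $m -le $rem ] && count=$(( count + 1 ))   # first $rem masters get one extra slot
  end=$(( start + count - 1 ))
  echo "master$m: slots $start-$end ($count slots)"
  start=$(( end + 1 ))
  m=$(( m + 1 ))
done
```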

[root@ht20 redis]# ./redis-trib.rb create --replicas 1 10.121.51.30:7731 \
    10.121.51.30:7732 10.121.51.30:7733 10.121.51.30:7734 \
    10.121.51.30:7735 10.121.51.30:7736
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
10.121.51.30:7731
10.121.51.30:7732
10.121.51.30:7733
Adding replica 10.121.51.30:7735 to 10.121.51.30:7731
Adding replica 10.121.51.30:7736 to 10.121.51.30:7732
Adding replica 10.121.51.30:7734 to 10.121.51.30:7733
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731
   slots:0-5460 (5461 slots) master
M: 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.121.51.30:7732
   slots:5461-10922 (5462 slots) master
M: d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733
   slots:10923-16383 (5461 slots) master
S: 27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734
   replicates d46f032ea50763de8353fd530535412df6ffdc00
S: 4d9500ff2ee1432094178dad628007858e876e99 10.121.51.30:7735
   replicates 91b7f8c79c91c7edd77458c332f0b9299bdb94d4
S: bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736
   replicates d1c7d99e13a2d7317baf883dffa906470a606641
Can I set the above configuration? (type 'yes' to accept): yes

After you accept the plan by typing "yes", the nodes begin communicating and negotiate the hash-slot assignment.
The final report lists the slots assigned to each master; with all 16384 hash slots allocated, the cluster is created successfully.
Once you hit Enter:
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join................
>>> Performing Cluster Check (using node 10.129.51.30:7731)
S: d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731
   slots: (0 slots) slave
   replicates fb28e95eb37b8d827c0f48800c4d08e64c6fc335
S: d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733
   slots: (0 slots) slave
   replicates 07c99eae8176eb0e985c5a9914f42900d18fb39b
M: 07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736
   slots:10923-16383 (5461 slots) master
   2 additional replica(s)
M: 27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734
   slots: (0 slots) master
   0 additional replica(s)
S: ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734
   slots: (0 slots) slave
   replicates fb28e95eb37b8d827c0f48800c4d08e64c6fc335
S: 41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732
   slots: (0 slots) slave
   replicates b9a55f739a65e6277e40cdbb806b5443c8aad66e
S: 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732
   slots: (0 slots) slave
   replicates b9a55f739a65e6277e40cdbb806b5443c8aad66e
M: fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731
   slots:0-5460 (5461 slots) master
   2 additional replica(s)
M: bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736
   slots: (0 slots) master
   0 additional replica(s)
S: 4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733
   slots: (0 slots) slave
   replicates 07c99eae8176eb0e985c5a9914f42900d18fb39b
M: b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735
   slots:5461-10922 (5462 slots) master
   2 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
// Note: because another cluster already existed in the other environment, the nodes automatically linked up with it.

 

6. Log in to the console and take a look


// Log in to any cluster node (whichever you like) to inspect and test the data in the cluster

[root@ht20 redis]# ./redis-cli -c -h 10.129.51.30 -p 7731
10.121.51.30:7731> get kingking
-> Redirected to slot [5153] located at 10.128.51.14:7731
(nil)
10.128.51.14:7731> set a 1111
-> Redirected to slot [15495] located at 10.128.51.14:7736
OK
10.128.51.14:7736> get a
"1111"
10.128.51.14:7736> exit

// When connecting, you must pass -c. Without it you can still run cluster commands, but operations such as `set key value` will return an error.
[root@ht20 redis]# ./redis-cli -c -h 10.129.51.30 -p 7732
10.129.51.30:7732> set bb aaaaaa
-> Redirected to slot [8620] located at 10.128.51.14:7735
OK
10.128.51.14:7735> get bb
"aaaaaa"

// Notice that some of the data ended up in the other Redis cluster

// Connecting to a single node without -c does not enable cluster (redirect) mode, so it is of little use — commands will error out:

[root@ht20 redis]# ./redis-cli -h 10.121.51.30 -p 7732
10.121.51.30:7732> get bb
(error) MOVED 8620 10.128.51.14:7735

Other CLUSTER commands worth knowing:

CLUSTER INFO: print cluster information.
CLUSTER NODES: list all nodes currently known to the cluster and their details.
CLUSTER MEET <ip> <port>: add the node at ip:port to the cluster.
CLUSTER ADDSLOTS <slot> [slot ...]: assign one or more slots to the current node.
CLUSTER DELSLOTS <slot> [slot ...]: remove the assignment of one or more slots from the current node.
CLUSTER SLOTS: list slot ranges and the nodes serving them.
CLUSTER SLAVES <node_id>: list the replicas of the given node, e.g. cluster slaves 11f9169577352c33d85ad0d1ca5f5bf0deba3209
(this actually reads nodes.conf).
CLUSTER REPLICATE <node_id>: make the current node a replica of the given node.
CLUSTER SAVECONFIG: save the cluster configuration file manually; by default the cluster saves it automatically whenever the configuration changes.
CLUSTER KEYSLOT <key>: show which slot a key maps to, e.g. cluster keyslot 9223372036854742675.
CLUSTER FLUSHSLOTS: remove all slots assigned to the current node, leaving it with no slot assignments at all.
CLUSTER COUNTKEYSINSLOT <slot>: return the number of key-value pairs currently in the slot.
CLUSTER GETKEYSINSLOT <slot> <count>: return up to count key names from the slot, e.g. cluster getkeysinslot 202 3.
CLUSTER SETSLOT <slot> NODE <node_id>: assign the slot to the given node; if the slot is assigned to another node, that node must delete it first.
CLUSTER SETSLOT <slot> MIGRATING <node_id>: mark the slot on this node as migrating to the given node.
CLUSTER SETSLOT <slot> IMPORTING <node_id>: mark the slot as being imported into this node from the node given by node_id.
CLUSTER SETSLOT <slot> STABLE: cancel an import or migration of the slot.
CLUSTER FAILOVER: trigger a manual failover.
CLUSTER FORGET <node_id>: remove the given node from the cluster so it cannot complete a handshake; the ban expires after 60 s, after which the two nodes can handshake again.
CLUSTER RESET [HARD|SOFT]: reset cluster state; SOFT clears knowledge of other nodes but keeps this node's ID, HARD also regenerates the ID. Without an argument, SOFT is used.
CLUSTER COUNT-FAILURE-REPORTS <node_id>: return the number of failure reports for the given node.
CLUSTER SET-CONFIG-EPOCH: set the node's config epoch; only allowed before the node joins a cluster.
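CLUSTER KEYSLOT can be reproduced offline: Redis hashes the key — or, if the key contains a hash tag, only the substring between the first `{` and the following `}` — with CRC16 (XMODEM variant) and takes the result modulo 16384. A self-contained bash sketch for ASCII keys (the function names are mine, not Redis commands):

```shell
# Compute a key's cluster slot locally: CRC16 (XMODEM, polynomial 0x1021) mod 16384.
crc16() {
  local s=$1 crc=0 i c b
  for (( i = 0; i < ${#s}; i++ )); do
    printf -v c '%d' "'${s:i:1}"            # character -> byte value (ASCII keys)
    crc=$(( (crc ^ (c << 8)) & 0xFFFF ))
    for b in 1 2 3 4 5 6 7 8; do
      if (( crc & 0x8000 )); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo $(( crc % 16384 ))
}

keyslot() {
  local key=$1 tag
  if [[ $key == *\{*\}* ]]; then            # hash-tag rule: hash only the text inside the first {...}
    tag=${key#*\{}
    tag=${tag%%\}*}
    [ -n "$tag" ] && key=$tag               # an empty {} falls back to the whole key
  fi
  crc16 "$key"
}

keyslot "user:1000"
keyslot "{user1000}.following"              # same slot as every other {user1000}.* key
```

Keys sharing a hash tag land in the same slot, which is what makes multi-key operations possible in a cluster.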

// Use redis-trib to check whether the cluster is healthy
[root@ht20 redis]#  ./redis-trib.rb check 10.121.51.30:7731
>>> Performing Cluster Check (using node 10.121.51.30:7731)
S: d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731  // replica 1
   slots: (0 slots) slave
   replicates fb28e95eb37b8d827c0f48800c4d08e64c6fc335
S: d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733  // replica 2
   slots: (0 slots) slave
   replicates 07c99eae8176eb0e985c5a9914f42900d18fb39b
M: 07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736  // master 5
   slots:10923-16383 (5461 slots) master
   2 additional replica(s)
M: 27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734  // master 4
   slots: (0 slots) master
   0 additional replica(s)
S: ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734  // replica 3
   slots: (0 slots) slave
   replicates fb28e95eb37b8d827c0f48800c4d08e64c6fc335
S: 41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732   // replica 4
   slots: (0 slots) slave
   replicates b9a55f739a65e6277e40cdbb806b5443c8aad66e
S: 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.121.51.30:7732  // replica 5
   slots: (0 slots) slave
   replicates b9a55f739a65e6277e40cdbb806b5443c8aad66e
M: fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731   // master 3
   slots:0-5460 (5461 slots) master
   2 additional replica(s)
M: bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736   // master 2
   slots: (0 slots) master
   0 additional replica(s)
S: 4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733   // replica 6
   slots: (0 slots) slave
   replicates 07c99eae8176eb0e985c5a9914f42900d18fb39b
M: b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735   // master 1
   slots:5461-10922 (5462 slots) master
   2 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
// Counting them up, instance 5 is missing; it turned out the redis.conf of the fifth instance had a wrong IP configured. I'll leave it as-is for now.

 

Cluster-related commands

Inspect the cluster — cluster info is extremely useful:
[root@ht20 redis]# ./redis-cli -c -p 7731 cluster info
cluster_state:ok
cluster_slots_assigned:16384  // number of slots assigned
cluster_slots_ok:16384  // slots correctly assigned
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6  // currently 6 known nodes
cluster_size:3    // nodes in the cluster that hold slots, i.e. the masters
cluster_current_epoch:48
cluster_my_epoch:44
cluster_stats_messages_ping_sent:295
cluster_stats_messages_pong_sent:336
cluster_stats_messages_sent:631
cluster_stats_messages_ping_received:336
cluster_stats_messages_pong_received:295
cluster_stats_messages_received:631
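For monitoring, the two fields that matter most are cluster_state and cluster_slots_assigned. A minimal health-check sketch — fed a canned sample here instead of a live `redis-cli cluster info` call; note that real output uses CRLF line endings, hence the `tr -d '\r'`:

```shell
# Parse `cluster info` output and decide whether the cluster is healthy.
# In production: info=$(redis-cli -c -p 7731 cluster info); here we use a canned sample.
info='cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384'
state=$(printf '%s\n' "$info" | tr -d '\r' | awk -F: '$1=="cluster_state"{print $2}')
slots=$(printf '%s\n' "$info" | tr -d '\r' | awk -F: '$1=="cluster_slots_assigned"{print $2}')
if [ "$state" = "ok" ] && [ "$slots" -eq 16384 ]; then
  status=healthy
else
  status=degraded
fi
echo "cluster is $status (state=$state, slots=$slots)"
```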

If the cluster is down, the output looks like this:

[root@ht20 redis]# ./redis-cli -c -p 7731 cluster info
  cluster_state:fail
  cluster_slots_assigned:2380
  cluster_slots_ok:2380
  cluster_slots_pfail:0
  cluster_slots_fail:0
  cluster_known_nodes:1
  cluster_size:1
  cluster_current_epoch:0
  cluster_my_epoch:0
  cluster_stats_messages_sent:0
  cluster_stats_messages_received:0

// Connecting to a single node without the -c flag, the node reports the cluster failure:
[root@ht20 redis]# ./redis-cli -h 10.121.51.30 -p 7731

  10.121.51.30:7731> get king
  (error) CLUSTERDOWN Hash slot not served

// We can list information about all nodes

[root@ht20 redis]# ./redis-cli -c -h 10.129.51.30 -p 7731
10.129.51.30:7731> cluster nodes 
d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646112764069 46 connected
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 myself,slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646112761000 1 connected
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646112766077 46 connected 10923-16383  // the trailing 10923-16383 is the slot range
27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734@17734 master - 0 1646112764000 4 connected
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646112766000 44 connected
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646112762071 43 connected
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646112766000 43 connected
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 master - 0 1646112763073 44 connected 0-5460  // slot range
bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736@17736 master - 0 1646112767573 6 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646112767078 46 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646112767578 43 connected 5461-10922  // slot range
// This actually reads the nodes.conf file under data/
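Since `cluster nodes` output is just whitespace-separated text (mirroring nodes.conf), the slot-owning masters can be pulled out with awk. A sketch over a canned two-line sample; against a live node you would pipe `redis-cli -c -p 7731 cluster nodes` in instead:

```shell
# Extract "address slot-range" for every master that currently owns slots.
# Field layout: id addr flags master-id ping pong epoch link-state [slots...]
sample='fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 master - 0 1646112763073 44 connected 0-5460
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 myself,slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646112761000 1 connected'
owners=$(printf '%s\n' "$sample" | awk '$3 ~ /master/ && $9 != "" {print $2, $9}')
echo "$owners"
```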
// We can also view the slots and their owning nodes:
[root@ht20 redis]# ./redis-cli -c -h 10.121.51.30 -p 7731
10.121.51.30:7731> cluster slots
1) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "10.128.51.14"
      2) (integer) 7736
      3) "07c99eae8176eb0e985c5a9914f42900d18fb39b"
   4) 1) "10.128.51.14"
      2) (integer) 7733
      3) "4bee54a5049524a6280fdb8a8de00a5bb94ccaa1"
   5) 1) "10.121.51.30"
      2) (integer) 7733
      3) "d1c7d99e13a2d7317baf883dffa906470a606641"
2) 1) (integer) 0  // slot range (0-5460)
   2) (integer) 5460
   3) 1) "10.128.51.14"
      2) (integer) 7731
      3) "fb28e95eb37b8d827c0f48800c4d08e64c6fc335"
   4) 1) "10.121.51.30"
      2) (integer) 7731
      3) "d46f032ea50763de8353fd530535412df6ffdc00"
   5) 1) "10.128.51.14"
      2) (integer) 7734
      3) "ff34a5fadf42078332445bcf41a2b59416fd92da"
3) 1) (integer) 5461  // slot range (5461-10922)
   2) (integer) 10922
   3) 1) "10.128.51.14"
      2) (integer) 7735
      3) "b9a55f739a65e6277e40cdbb806b5443c8aad66e"
   4) 1) "10.128.51.14"
      2) (integer) 7732
      3) "41767372c36cc268872e86d96660a32bf540624f"
   5) 1) "10.121.51.30"
      2) (integer) 7732
      3) "91b7f8c79c91c7edd77458c332f0b9299bdb94d4"
10.121.51.30:7731> 

10.129.51.30:7731> get bb
-> Redirected to slot [8620] located at 10.128.51.14:7735
"aaaaaa"
// cluster countkeysinslot <slot> returns the number of keys in the given hash slot
10.128.51.14:7735> cluster countkeysinslot 7735
(integer) 0

 

// The info output is quite rich; use it to inspect master/replica relationships.

[root@ht20 redis]# ./redis-cli -c -h 10.121.51.30 -p 7731
10.121.51.30:7731> info
# Server
redis_version:4.0.14
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:a754ba616e326a74
redis_mode:cluster
os:Linux 3.10.0-1160.45.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:4.8.5
process_id:57063
run_id:409144476e070fcf4813ccacef38effdc9952534
tcp_port:7731
uptime_in_seconds:15517
uptime_in_days:0
hz:10
lru_clock:1957924
executable:/data/redis_fpmai1/redis/redis-server
config_file:/data/redis_fpmai1/redis/redis.conf

# Clients
connected_clients:129
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory
used_memory:7421344
used_memory_human:7.08M
used_memory_rss:15884288
used_memory_rss_human:15.15M
used_memory_peak:9244376
used_memory_peak_human:8.82M
used_memory_peak_perc:80.28%
used_memory_overhead:4937420
used_memory_startup:1445024
used_memory_dataset:2483924
used_memory_dataset_perc:41.56%
total_system_memory:267306831872
total_system_memory_human:248.95G
used_memory_lua:39936
used_memory_lua_human:39.00K
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:2.14
mem_allocator:jemalloc-4.0.3
active_defrag_running:0
lazyfree_pending_objects:0

# Persistence
loading:0
rdb_changes_since_last_save:61
rdb_bgsave_in_progress:0
rdb_last_save_time:1646124910
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:4972544
aof_enabled:1
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:0
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:12689408
aof_current_size:4539890
aof_base_size:1509571
aof_pending_rewrite:0
aof_buffer_length:0
aof_rewrite_buffer_length:0
aof_pending_bio_fsync:0
aof_delayed_fsync:0

# Stats
total_connections_received:1705
total_commands_processed:25957
instantaneous_ops_per_sec:1
total_net_input_bytes:4558664
total_net_output_bytes:4948669
instantaneous_input_kbps:0.04
instantaneous_output_kbps:6.26
rejected_connections:0
sync_full:2
sync_partial_ok:0
sync_partial_err:2
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
evicted_keys:0
keyspace_hits:51
keyspace_misses:26
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1254
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0

# Replication
role:slave
master_host:10.128.51.14
master_port:7731
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:1542999173
slave_priority:100
slave_read_only:1
connected_slaves:1
slave0:ip=10.128.51.14,port=7734,state=online,offset=1542999173,lag=0
master_replid:2dd82c47c221ccb5b19871366f533f5856413f74
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1542999173
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1541950598
repl_backlog_histlen:1048576

# CPU
used_cpu_sys:9.87
used_cpu_user:5.31
used_cpu_sys_children:0.25
used_cpu_user_children:0.60

# Cluster
cluster_enabled:1

# Keyspace
db0:keys=3189,expires=708,avg_ttl=0
10.121.51.30:7731> 

 

// A follow-up: instance 5 failed to join because of the wrong IP in its redis.conf; its nodes.conf under /data shows the problem.


  [root@ht20 data]# pwd
  /data/redis_fpmai5/redis/data
  [root@ht20 data]# cat nodes.conf 
  32ee19af1f6a534c4014b9c41d387666f049354b :0@0 myself,master - 0 0 0 connected
  vars currentEpoch 0 lastVoteEpoch 0

// Next, add the instance-5 node to the cluster

// Add a node to the cluster
[root@ht20 redis]# ./redis-trib.rb add-node 10.129.51.30:7735 10.129.51.30:7731
>>> Adding node 10.129.51.30:7735 to cluster 10.129.51.30:7731
>>> Performing Cluster Check (using node 10.129.51.30:7731)
S: d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731
   slots: (0 slots) slave
   replicates fb28e95eb37b8d827c0f48800c4d08e64c6fc335
S: d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733
   slots: (0 slots) slave
   replicates 07c99eae8176eb0e985c5a9914f42900d18fb39b
M: 07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736
   slots:10923-16383 (5461 slots) master
   2 additional replica(s)
M: 27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734
   slots: (0 slots) master
   0 additional replica(s)
S: ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734
   slots: (0 slots) slave
   replicates fb28e95eb37b8d827c0f48800c4d08e64c6fc335
S: 41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732
   slots: (0 slots) slave
   replicates b9a55f739a65e6277e40cdbb806b5443c8aad66e
S: 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732
   slots: (0 slots) slave
   replicates b9a55f739a65e6277e40cdbb806b5443c8aad66e
M: fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731
   slots:0-5460 (5461 slots) master
   2 additional replica(s)
M: bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736
   slots: (0 slots) master
   0 additional replica(s)
S: 4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733
   slots: (0 slots) slave
   replicates 07c99eae8176eb0e985c5a9914f42900d18fb39b
M: b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735
   slots:5461-10922 (5462 slots) master
   2 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 10.129.51.30:7735 to make it join the cluster.
[OK] New node added correctly.

 

 

Some background on slots (worth understanding before migrating them: a slot can be thought of as a data bucket;
the cluster has a fixed number of slots, each unique within the cluster — here three masters, each holding one slot range).
This is also what makes migrating data between nodes straightforward.

The slot mechanism

Assigning slots: cluster addslots <slot>. A slot can be assigned to only one node, all 16384 slots must be assigned, and assignments must not conflict across nodes. For example:

  10.121.51.30:7730> cluster addslots 0

  OK
  10.121.51.30:7730> cluster addslots 0   # conflict
  (error) ERR Slot 0 is already busy

Scripts to add slots in batch (one ADDSLOTS call per slot):

node1:
#!/bin/bash
n=0
for ((i=n;i<=5461;i++))
do
   /usr/local/bin/redis-cli -h 192.168.100.134 -p 17021 -a dxy CLUSTER ADDSLOTS $i
done

node2:
#!/bin/bash
n=5462
for ((i=n;i<=10922;i++))
do
   /usr/local/bin/redis-cli -h 192.168.100.135 -p 17021 -a dxy CLUSTER ADDSLOTS $i
done

node3:
#!/bin/bash
n=10923
for ((i=n;i<=16383;i++))
do
   /usr/local/bin/redis-cli -h 192.168.100.136 -p 17021 -a dxy CLUSTER ADDSLOTS $i
done
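The three loops above issue one redis-cli call per slot, which is thousands of round trips. ADDSLOTS accepts many slots at once, so the whole range can go in a single call per node. A dry-run sketch using the same hosts and password as the scripts above (the `run` wrapper only prints a summary; drop it to execute for real):

```shell
# One ADDSLOTS call per node instead of thousands: pass the whole range at once.
run() { nargs=$#; echo "would run: $1 $2 $3 ... ($nargs args)"; }   # dry run
run /usr/local/bin/redis-cli -h 192.168.100.134 -p 17021 -a dxy CLUSTER ADDSLOTS $(seq 0 5461)
run /usr/local/bin/redis-cli -h 192.168.100.135 -p 17021 -a dxy CLUSTER ADDSLOTS $(seq 5462 10922)
run /usr/local/bin/redis-cli -h 192.168.100.136 -p 17021 -a dxy CLUSTER ADDSLOTS $(seq 10923 16383)
total=$(( 5462 + 5461 + 5461 ))   # slots covered by the three ranges
echo "total slots: $total"
```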

Removing a slot: cluster delslots <slot>

10.121.51.31:7731> cluster delslots 0
OK
10.121.51.31:7731> cluster info
cluster_state:fail
cluster_slots_assigned:16383  // was 16384 before
cluster_slots_ok:16383
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:3
cluster_size:3
cluster_current_epoch:2
cluster_my_epoch:1
cluster_stats_messages_sent:4482
cluster_stats_messages_received:4482
If the 16384 slots are not fully assigned, the cluster is not operational.

How to migrate slots — currently all slots are on 10.128.51.14:

[root@ht20 redis]# ./redis-cli -c -h 10.129.51.30 -p 7731
10.129.51.30:7731> cluster nodes 
d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646112764069 46 connected
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 myself,slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646112761000 1 connected
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646112766077 46 connected 10923-16383  
// the trailing 10923-16383 is the slot range
27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734@17734 master - 0 1646112764000 4 connected
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646112766000 44 connected
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646112762071 43 connected
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646112766000 43 connected
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 master - 0 1646112763073 44 connected 0-5460 
 // slot range — remember that this node is currently a master; after many of the later operations it becomes a slave.
bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736@17736 master - 0 1646112767573 6 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646112767078 46 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646112767578 43 connected 5461-10922 // slot range
// This actually reads the nodes.conf file under data/
// Machine 14 sits in the other data center; 30 is in the new one.

 

The planned slot migration has two goals:
1) We are vacating the old data center, so machine 14 must be freed and the cluster moved onto 30.
2) A slip-up along the way left the fifth instance unrecognized by cluster nodes, and it needs repairing.

// Add the instance-5 node into the cluster. The cluster previously had 11 nodes (5 masters, 6 replicas); once added, instance 5 automatically became a master, making it 6 masters and 6 replicas.
[root@ht20 redis]# ./redis-trib.rb add-node 10.129.51.30:7735 10.129.51.30:7731  // two addresses: the new node first, then any node already in the cluster
(the output is identical to the add-node run shown earlier)

 

Slot migration: moving the slots from the masters on 14 to the masters on 30.

Step 1: Run cluster nodes, then organize the resulting node list:

10.129.51.30:7735> cluster nodes
// masters on machine 30
27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734@17734 master - 0 1646125938000 4 connected
32ee19af1f6a534c4014b9c41d387666f049354b 10.121.51.30:7735@17735 myself,master - 0 1646125936000 0 connected
bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736@17736 master - 0 1646125939755 6 connected
// replicas on machine 30
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646125940000 43 connected
d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731@17731 slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646125941758 44 connected
d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646125940757 46 connected
// replicas on machine 14
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646125942263 43 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646125943767 46 connected
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 slave fb28e95eb37b8d827c0f48800c4d08e64c6fc335 0 1646125938000 44 connected
// masters on machine 14
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 master - 0 1646125942765 44 connected 0-5460
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646125939560 43 connected 5461-10922
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646125937757 46 connected 10923-16383

All slots, 0-16383, need to be moved over to the corresponding nodes on machine 30. (I had this list organized earlier, then deliberately shut down one of the masters; the layout above is still the clearest, so I'm keeping it here. After the restart you'll find the masters have changed.)

Step 2: Run the migration against the source nodes — three masters need to be moved over.

Part one: the 7734 master migration.
[root@ht20 redis]# ./redis-cli -c -h 10.121.51.30 -p 7734 10.121.51.30:7734> cluster nodes 07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646130359542 46 connected 10923-16383 准备要迁出的 b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646130363549 43 connected 5461-10922 准备要迁出的 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.121.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646130362544 43 connected fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave ff34a5fadf42078332445bcf41a2b59416fd92da 0 1646130363549 49 connected ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 master - 0 1646130362000 49 connected 0-5460 要迁移的源 27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734@17734 myself,master - 0 1646130357000 4 connected 要迁入的目标 4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646130358541 46 connected 41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646130360544 43 connected bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736@17736 master - 0 1646130359037 6 connected 32ee19af1f6a534c4014b9c41d387666f049354b 10.121.51.30:7735@17735 master - 0 1646130358000 0 connected d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646130363000 46 connected d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731@17731 slave ff34a5fadf42078332445bcf41a2b59416fd92da 0 1646130364546 49 connected
// 27f7e8dede5b68581486ce8cefdd656032baed70 is the target node id; slot 5460 is the slot being moved
10.128.51.14:7734> cluster setslot 5460 MIGRATING 27f7e8dede5b68581486ce8cefdd656032baed70
OK
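A note on what this command does: SETSLOT MIGRATING is only the first half of the standard per-slot resharding handshake. The full sequence is SETSLOT IMPORTING on the target, SETSLOT MIGRATING on the source, MIGRATE for each key in the slot, and finally SETSLOT NODE to hand over ownership. A toy, in-memory sketch of that ordering (placeholder node names "src"/"dst"; no real Redis involved):

```python
# Toy model of Redis Cluster's per-slot resharding handshake.
# "src"/"dst" are placeholder node names; this does not talk to Redis.

class ToyCluster:
    def __init__(self):
        self.owner = {}      # slot -> owning node
        self.keys = {}       # slot -> keys currently stored in the slot
        self.migrating = {}  # slot -> destination (flag set on the source)
        self.importing = {}  # slot -> source (flag set on the destination)

    def setslot_importing(self, slot, src):   # step 1: run on the destination
        self.importing[slot] = src

    def setslot_migrating(self, slot, dst):   # step 2: run on the source
        self.migrating[slot] = dst

    def migrate_keys(self, slot, dst_keys):   # step 3: MIGRATE, key by key
        while self.keys.get(slot):
            dst_keys.append(self.keys[slot].pop())

    def setslot_node(self, slot, dst):        # step 4: finalize ownership
        self.owner[slot] = dst
        self.migrating.pop(slot, None)
        self.importing.pop(slot, None)

cluster = ToyCluster()
cluster.owner[5460] = "src"
cluster.keys[5460] = ["k1", "k2"]
dst_keys = []

cluster.setslot_importing(5460, "src")
cluster.setslot_migrating(5460, "dst")
cluster.migrate_keys(5460, dst_keys)
cluster.setslot_node(5460, "dst")
print(cluster.owner[5460], sorted(dst_keys))  # dst ['k1', 'k2']
```

The key point the toy captures is the ordering: both flags must be set before any key moves, and SETSLOT NODE clears both flags once the slot is empty on the source.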

// Check progress. To see the arrow ->, you must be logged in on the instance the slot is migrating out of.
10.128.51.14:7734> cluster nodes
// Only on the source master can you see the arrow ->, which marks a slot in migration; other nodes do not show this change.
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.121.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646132546000 43 connected
27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734@17734 master - 0 1646132548000 4 connected   // migration target
d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646132546000 46 connected
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646132546811 43 connected
bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736@17736 master - 0 1646132547518 6 connected
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646132545000 46 connected 10923-16383   // to be migrated out
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave ff34a5fadf42078332445bcf41a2b59416fd92da 0 1646132546000 49 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646132547312 46 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646132548814 43 connected 5461-10922   // to be migrated out
d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731@17731 slave ff34a5fadf42078332445bcf41a2b59416fd92da 0 1646132548000 49 connected
32ee19af1f6a534c4014b9c41d387666f049354b 10.121.51.30:7735@17735 master - 0 1646132545514 0 connected
// Here is the change: the slot is visibly in migration. ff34a5fadf42078332445bcf41a2b59416fd92da is the source node id.
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 myself,master - 0 1646132543000 49 connected 0-5460 [5460->-27f7e8dede5b68581486ce8cefdd656032baed70]
// Next, check whether the slots still hold data. CLUSTER GETKEYSINSLOT returns up to N key names stored in the given slot on the connected node.
10.128.51.14:7734> cluster getkeysinslot 100 2
(empty list or set)
10.128.51.14:7734> cluster getkeysinslot 201 3
(empty list or set)
10.128.51.14:7734> cluster getkeysinslot 302 2
(empty list or set)
All of them are empty.
An empty result does not necessarily mean nothing was migrated; let's look at another master (id 07c99eae8176eb0e985c5a9914f42900d18fb39b, match it against the output above).
[root@ht20 redis]# ./redis-cli -c -h 10.128.51.14 -p 7736
10.128.51.14:7736> cluster getkeysinslot 10923 10926
1) "4240247369"   // this slot still has data

// A few hours later, once I felt it was safe, I shut down the redis 7734 instance on 10.128.51.14.

  [root@ht4 redis]# /data/redis_mai1/redis/redis-cli -c -h 10.128.51.14 -p 7733  // connect to 7733
  10.128.51.14:7733> cluster nodes
  d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731@17731 master - 0 1646146691954 60 connected 0-5460  // already moved over
  32ee19af1f6a534c4014b9c41d387666f049354b 10.121.51.30:7735@17735 master - 0 1646146691554 0 connected
  bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736@17736 master - 0 1646146687000 6 connected
  91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.121.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646146690000 43 connected
  fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646146690947 60 connected
  41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646146689945 43 connected
  ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 master,fail - 1646146668988 1646146665880 49 disconnected  // already stopped
  4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 myself,slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646146688000 45 connected
  d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646146693959 46 connected
  b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646146688000 43 connected 5461-10922   // to be migrated out
  27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734@17734 master - 0 1646146692957 4 connected 
  07c99eae8176eb0e985c5a9914f42900d18fb39b 10.121.51.14:7736@17736 master - 0 1646146687936 46 connected 10923-16383  // to be migrated out
  10.128.51.14:7733>

   I ran the same procedure for the others (note: you must connect to the matching instance's console each time).

  2. Migrating the 7735 master

   [root@ht4 redis]# /data/redis_mai1/redis/redis-cli -c -h 10.128.51.14 -p 7735  // connect to 7735

   10.121.51.30:7735> set aaaaaa kingking      // set a key -> value
   -> Redirected to slot [11924] located at 10.128.51.14:7736  // 11924 is the slot number
   OK

    [root@ht4 redis]# /data/redis_mai1/redis/redis-cli -c -h 10.128.51.14 -p 7736

    10.128.51.14:7736> cluster getkeysinslot 11924 2  // list up to 2 keys from this slot
    1) "a2707b52fb94381977b22720f7dc752b63699adbffad9d9bd3b8b2b85349cffa79999ca07c7a8662f6d8afd58089f072c8f36b3772f74c4a093a06f4c3
5eb479059e293a83a10011"
   2) "aaaaaa"  // this is the key we just set
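The redirect to slot [11924] is deterministic: Redis hashes the key with CRC16 (the XModem variant) and takes the result modulo 16384. A small standalone sketch of that calculation, so you can predict where any key will land without touching the cluster (pure Python; the hash-tag handling follows the cluster spec):

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: poly 0x1021, init 0x0000, no reflection (as used by Redis Cluster)."""
    crc = 0
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_hash_slot(key: bytes) -> int:
    """Slot a key maps to; only a non-empty {...} hash tag is hashed, if present."""
    s = key.find(b"{")
    if s != -1:
        e = key.find(b"}", s + 1)
        if e > s + 1:
            key = key[s + 1:e]
    return crc16_xmodem(key) & 16383

print(key_hash_slot(b"aaaaaa"))  # 11924, matching the redirect above
```

This is handy during a migration like this one: you can compute a key's slot offline and then query the master that owns that slot directly.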

    // Now connect to 7735 again

    [root@ht4 redis]# /data/redis_mai1/redis/redis-cli -c -h 10.128.51.14 -p 7735

 10.128.51.14:7735> cluster setslot 10922 MIGRATING 32ee19af1f6a534c4014b9c41d387666f049354b  // migrate out to 10.121.51.30:7735

   10.128.51.14:7735> cluster getkeysinslot 5451 2
   (empty list or set)
   10.128.51.14:7735> cluster getkeysinslot 5580 3
  (empty list or set)
  10.128.51.14:7735> cluster getkeysinslot 5490 4
  (empty list or set)
  10.128.51.14:7735> cluster getkeysinslot 5482 3

 

   // No data left for now.

   10.128.51.14:7735> cluster nodes
   27f7e8dede5b68581486ce8cefdd656032baed70 10.121.51.30:7734@17734 master - 0 1646148888438 4 connected
   d46f032ea50763de8353fd530535412df6ffdc00 10.121.51.30:7731@17731 master - 0 1646148887435 60 connected 0-5460  // migration out completed
   32ee19af1f6a534c4014b9c41d387666f049354b 10.121.51.30:7735@17735 master - 0 1646148888000 0 connected
   4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646148889434 46 connected
   d1c7d99e13a2d7317baf883dffa906470a606641 10.121.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646148886000 46 connected
   41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646148889000 43 connected
   07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646148889000 46 connected 10923-16383  // to be migrated out
   fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646148886427 60 connected
   bc5051aef1a756c274907d23198fbb932c384151 10.121.51.30:7736@17736 master - 0 1646148890941 6 connected
   91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.121.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646148890441 43 connected
   // migrating out
   b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 myself,master - 0 1646148882000 43 connected 5461-10922 [10922->-32ee19af1f6a534c4014b9c41d387666f049354b]
   ff34a5fadf42078332445bcf41a2b59416fd92da :0@0 master,fail,noaddr - 1646146669049 1646146665845 49 disconnected

 

  3. Migrating the 7736 master

  [root@ht4 redis]# /data/redis_mai1/redis/redis-cli -c -h 10.128.51.14 -p 7736  // connect to 7736

 10.128.51.14:7736> cluster setslot 16383 MIGRATING bc5051aef1a756c274907d23198fbb932c384151  // migrate out to 10.121.51.30:7736

// Let's look at how node 7736 changes. ...
10.128.51.14:7736> cluster nodes
// You must be logged in on 7736 to see the arrow -> at work.
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646148060771 43 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646148059000 43 connected 5461-10922   // migrating out
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 master - 0 1646148057770 60 connected 0-5460       // already migrated in, done earlier
bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736@17736 master - 0 1646148062779 6 connected               // migrating in right now, as this is written
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 myself,master - 0 1646148053000 46 connected 10923-16383 [16383->-bc5051aef1a756c274907d23198fbb932c384151]   // migrating out
d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646148056000 46 connected
32ee19af1f6a534c4014b9c41d387666f049354b 10.129.51.30:7735@17735 master - 0 1646148058000 0 connected              // awaiting migration in
27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734@17734 master - 0 1646148060576 4 connected              // awaiting migration in
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646148059570 60 connected
ff34a5fadf42078332445bcf41a2b59416fd92da :0@0 master,fail,noaddr - 1646146669048 1646146666847 49 disconnected    // already stopped
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646148060000 43 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646148061772 46 connected

// Still migrating out; we can see data.
// Syntax: CLUSTER GETKEYSINSLOT <slot> <count>: returns up to <count> keys from the slot, e.g. cluster getkeysinslot 202 3
10.128.51.14:7736>  cluster getkeysinslot 10924 3 
1) "card_bin_data_623022"
2) "login_eyJ0eXAiOiJsb2dpbkpXVCibG9naW50b2tlbiIsImFsZyI6IkhTMjU2In0.eyJ1c2VyaWQiOiI5MDU4NjgyNDk2NTgyNDEwMzIiLCJleHAiOjE2M
zg5Mzk4MTYsImlhdCI6MTYzODkzOTgxNn0.1KV4uLqavTRBD4U_549j-7BehiXlA8HtFyHJbg6g3cc_bnz_user_type"

  10.128.51.14:7736> cluster getkeysinslot 11924 3
  1) "a2707b52fb94381977b22720f7dc752b63699adbffad9d9bd3b8b2b85349cffa79999ca07c7a8662f6d8afd58089f072c8f36b3772f74c4
a093a06f4c35eb479059e293a83a10 011"
  2) "aaaaaa"
  3) "access:admin78.news.delete"

From here, proceed exactly as for 7734 to complete the migration.

 The final result:

10.129.51.30:7736> cluster nodes
27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734@17734 master - 0 1646180960000 4 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave d1c7d99e13a2d7317baf883dffa906470a606641 0 1646180963601 62 connected
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646180963254 46 connected
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646180966505 60 connected
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 0 1646180965504 63 connected
32ee19af1f6a534c4014b9c41d387666f049354b 10.129.51.30:7735@17735 master - 0 1646180963497 0 connected
d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733@17733 master - 0 1646180964497 62 connected 10923-16383
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646180966005 60 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 slave 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 0 1646180961496 63 connected
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 master - 0 1646180962494 60 connected 0-5460
bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736@17736 myself,master - 0 1646180962000 6 connected
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 master - 0 1646180965000 63 connected 5461-10922
All slots have been moved to the new cluster's nodes.
Let's test once more.

 10.129.51.30:7736> set test1 testtest
 -> Redirected to slot [4768] located at 10.129.51.30:7731
 OK
 10.129.51.30:7731> cluster getkeysinslot 4768 2
 1) "login_eyJ0eXAiOiJsb2dpbkpXVCIsImxvZ2luX3Rva2VuIjoibG9naW50b2tlbiIsImFsZyI6IkhTMjU2In0.eyJ1c2VyaWQiOiI0MjM5OTI5NTI4IiwiZX
hwIjoxNjQ2NDQzOTcwLCJpYXQiOjE2NDU1Nzk5NzB9.nc-ESGj4Od8A61D_90bNfxKSvlJ1mu2hFXPXym6cxh4stp_org"
 2) "test1"
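Beyond spot-checking single keys, full slot coverage can be verified mechanically by parsing the cluster nodes output and confirming the masters together cover 0-16383. A rough standalone sketch (the sample abbreviates the node ids from the output above):

```python
def covered_slots(cluster_nodes_output: str) -> set:
    """Collect every slot number assigned to a master in `cluster nodes` output."""
    slots = set()
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if "master" not in fields[2]:     # flags field: skip slaves
            continue
        for item in fields[8:]:           # slot ranges start after "connected"
            if item.startswith("["):      # skip in-flight [slot->-node] markers
                continue
            lo, _, hi = item.partition("-")
            slots.update(range(int(lo), int(hi or lo) + 1))
    return slots

sample = """\
d1c7d99e 10.129.51.30:7733@17733 master - 0 0 62 connected 10923-16383
91b7f8c7 10.129.51.30:7732@17732 master - 0 0 63 connected 5461-10922
d46f032e 10.129.51.30:7731@17731 myself,master - 0 0 60 connected 0-5460
07c99eae 10.128.51.14:7736@17736 slave d1c7d99e 0 0 62 connected
"""
print(covered_slots(sample) == set(range(16384)))  # True -> full coverage
```

Any slot missing from the returned set would mean part of the keyspace is unreachable after the move.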

 Let's verify another way, from the application side inside a k8s pod:

// before migration
d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646131344055 46 connected
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 myself,slave ff34a5fadf42078332445bcf41a2b59416fd92da 0 1646131343000 1 connected
07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 master - 0 1646131341857 46 connected 10923-16383
27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734@17734 master - 0 1646131341000 4 connected
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 master - 0 1646131340000 49 connected 0-5460
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646131343859 43 connected
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 slave b9a55f739a65e6277e40cdbb806b5443c8aad66e 0 1646131342000 43 connected
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave ff34a5fadf42078332445bcf41a2b59416fd92da 0 1646131342859 49 connected
32ee19af1f6a534c4014b9c41d387666f049354b 10.129.51.30:7735@17735 master - 0 1646131342554 0 connected
bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736@17736 master - 0 1646131343000 6 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave 07c99eae8176eb0e985c5a9914f42900d18fb39b 0 1646131341000 46 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 master - 0 1646131343000 43 connected 5461-10922

// After migration, all nodes on ht4 have become slaves and the ht20 nodes are masters (I sorted this output as a whole)

d1c7d99e13a2d7317baf883dffa906470a606641 10.129.51.30:7733@17733 master - 0 1646182634648 62 connected 10923-16383
91b7f8c79c91c7edd77458c332f0b9299bdb94d4 10.129.51.30:7732@17732 master - 0 1646182632142 63 connected 5461-10922
d46f032ea50763de8353fd530535412df6ffdc00 10.129.51.30:7731@17731 myself,master - 0 1646182631000 60 connected 0-5460
27f7e8dede5b68581486ce8cefdd656032baed70 10.129.51.30:7734@17734 master - 0 1646182631000 4 connected
32ee19af1f6a534c4014b9c41d387666f049354b 10.129.51.30:7735@17735 master - 0 1646182630538 0 connected
bc5051aef1a756c274907d23198fbb932c384151 10.129.51.30:7736@17736 master - 0 1646182630000 6 connected

07c99eae8176eb0e985c5a9914f42900d18fb39b 10.128.51.14:7736@17736 slave d1c7d99e13a2d7317baf883dffa906470a606641 0 1646182634151 62 connected
ff34a5fadf42078332445bcf41a2b59416fd92da 10.128.51.14:7734@17734 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646182632000 60 connected
41767372c36cc268872e86d96660a32bf540624f 10.128.51.14:7732@17732 slave 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 0 1646182633149 63 connected
fb28e95eb37b8d827c0f48800c4d08e64c6fc335 10.128.51.14:7731@17731 slave d46f032ea50763de8353fd530535412df6ffdc00 0 1646182630000 60 connected
4bee54a5049524a6280fdb8a8de00a5bb94ccaa1 10.128.51.14:7733@17733 slave d1c7d99e13a2d7317baf883dffa906470a606641 0 1646182636154 62 connected
b9a55f739a65e6277e40cdbb806b5443c8aad66e 10.128.51.14:7735@17735 slave 91b7f8c79c91c7edd77458c332f0b9299bdb94d4 0 1646182635153 63 connected

Back at the console, be sure to verify against the matching instance; otherwise, running the statement below on ht20's 7735 returns nothing, because that slot lives elsewhere.
All slots are now assigned to the new cluster's nodes, and the nodes on the two machines are mutually master/slave.

 [root@ht4 redis]# /data/redis_fpmai1/redis/redis-cli -c -h 10.128.51.14 -p 7736
  10.128.51.14:7736> cluster getkeysinslot 10924 3
  1) "card_bin_data_623022"
  2) "login_eyJ0eXAiOiJsb2dpbkpXVCIsImxvZ2luX3Rva2VuIjoibG9naW50b2tlbiIsImFsZyI6IkhTMjU2In0.eyJ1c2VyaWDk2NTgyNDEwMzIiLC
     JleHAiOjE2Mzg5Mzk4MTYsImlhdCI6MTYzODkzOTgxNn0.1KV4uLqavTRBD4U_549j-7BehiXlA8HtFyHJbg6g3cc_bnz_user_type

 [root@ht20 redis]# ./redis-cli -c -h 10.129.51.30 -p 7733
  10.129.51.30:7733> cluster getkeysinslot 10924 3
  1) "card_bin_data_623022"
  2) "login_eyJ0eXAiOiJsb2dpbkpXVCIsImxvZ2luX3Rva2VuIjoibG9naW50b2tlbiIsImFsZyI6IkhTMjU2In0.eyJ1c2VyaWQiOiI5MDU4NjgyNDk2NTgyNDEwMzIiLC
  JleHAiOjE2Mzg5Mzk4MTYsImlhdCI6MTYzODkzOTgxNn0.1KV4uLqavTRBD4U_549j-7BehiXlA8HtFyHJbg6g3cc_bnz_user_type"

  Since the application side is deployed as pods, simply redeploy the pods; we are now running on the new cluster, with the old data intact.

 

 

  

posted @ 2022-03-01 15:51  jinzi