Redis Cluster: Building and Managing a Cluster
##################
redis-cli --cluster commands:
[work@a8-dba-cloud-db00.wh cdrom]$ ./redis_7001/bin/redis-cli --cluster help
Cluster Manager Commands:
  create         host1:port1 ... hostN:portN
                 --cluster-replicas <arg>
  check          <host:port> or <host> <port> - separated by either colon or space
                 --cluster-search-multiple-owners
  info           <host:port> or <host> <port> - separated by either colon or space
  fix            <host:port> or <host> <port> - separated by either colon or space
                 --cluster-search-multiple-owners
                 --cluster-fix-with-unreachable-masters
  reshard        <host:port> or <host> <port> - separated by either colon or space
                 --cluster-from <arg>
                 --cluster-to <arg>
                 --cluster-slots <arg>
                 --cluster-yes
                 --cluster-timeout <arg>
                 --cluster-pipeline <arg>
                 --cluster-replace
  rebalance      <host:port> or <host> <port> - separated by either colon or space
                 --cluster-weight <node1=w1...nodeN=wN>
                 --cluster-use-empty-masters
                 --cluster-timeout <arg>
                 --cluster-simulate
                 --cluster-pipeline <arg>
                 --cluster-threshold <arg>
                 --cluster-replace
  add-node       new_host:new_port existing_host:existing_port
                 --cluster-slave
                 --cluster-master-id <arg>
  del-node       host:port node_id
  call           host:port command arg arg .. arg
                 --cluster-only-masters
                 --cluster-only-replicas
  set-timeout    host:port milliseconds
  import         host:port
                 --cluster-from <arg>
                 --cluster-from-user <arg>
                 --cluster-from-pass <arg>
                 --cluster-from-askpass
                 --cluster-copy
                 --cluster-replace
  backup         host:port backup_directory
  help

For check, fix, reshard, del-node, set-timeout, info, rebalance, call, import, backup you can specify the host and port of any working node in the cluster.

Cluster Manager Options:
  --cluster-yes  Automatic yes to cluster commands prompts
View cluster node information:
# redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -h 192.168.31.33 -p 7006
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7006> cluster nodes
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676568002000 2 connected 5461-10922
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676568004000 2 connected
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 myself,slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676568002000 3 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676568002984 1 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676568004010 3 connected 10923-16383
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676568005032 1 connected 0-5460
127.0.0.1:7006>
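How to read each line, field by field:
- the 40-character node ID;
- the node's ip:port, followed by @ and the cluster bus port;
- flags: whether the node is a master or a slave, and whether it is the node you are logged in to (myself);
- if the node is a slave, the node ID of its master; if the node is a master, "-";
- ping-sent: the time (Unix time in milliseconds) at which the currently pending PING was sent, or 0 if no PING is pending;
- pong-recv: the time (Unix time in milliseconds) at which the last PONG was received;
- the node's configuration epoch (config epoch);
- the state of the cluster bus link to the node (connected or disconnected);
- if the node is a master, the slots assigned to it, with multiple slot ranges separated by spaces; if the node is a slave, this field is empty.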
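The field list above can be turned into a small parser. The following is a minimal sketch, not part of redis-cli: the `ClusterNode` class and `parse_cluster_node` function are hypothetical names introduced here, and the code assumes the nine-field layout shown in the transcripts of this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterNode:
    node_id: str
    host: str
    port: int
    bus_port: Optional[int]
    flags: List[str]
    master_id: Optional[str]   # None when the line shows "-"
    ping_sent: int
    pong_recv: int
    config_epoch: int
    link_state: str
    slots: List[str]           # e.g. ["0-1364", "5461-6826"]; empty for replicas


def parse_cluster_node(line: str) -> ClusterNode:
    """Split one `cluster nodes` line into the nine fields described above."""
    parts = line.split()
    addr, _, bus = parts[1].partition("@")
    bus = bus.split(",")[0]    # assumption: some versions append ",hostname"
    host, _, port = addr.rpartition(":")
    return ClusterNode(
        node_id=parts[0],
        host=host,
        port=int(port),
        bus_port=int(bus) if bus else None,
        flags=parts[2].split(","),
        master_id=None if parts[3] == "-" else parts[3],
        ping_sent=int(parts[4]),
        pong_recv=int(parts[5]),
        config_epoch=int(parts[6]),
        link_state=parts[7],
        slots=parts[8:],
    )
```

Feeding it the 7006 line from the output above yields a node whose flags are `myself` and `slave`, whose master_id is the ID of 7003, and whose slot list is empty.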
Check cluster health: redis-cli --cluster check ip:port
- ip:port can be any node in the cluster
[work@a8-dba-cloud-db00.wh cdrom]$ ./redis_7001/bin/redis-cli -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n --cluster check 192.168.31.33:7001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.31.33:7001 (06f1a071...) -> 8557242 keys | 8192 slots | 2 slaves.
192.168.31.33:7002 (3b2010b0...) -> 4278581 keys | 4096 slots | 1 slaves.
192.168.31.33:7003 (331628fc...) -> 4278607 keys | 4096 slots | 1 slaves.
[OK] 17114430 keys in 3 masters.
1044.58 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.31.33:7001)
M: 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001
   slots:[0-6826],[10923-12287] (8192 slots) master
   2 additional replica(s)
S: 930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004
   slots: (0 slots) slave
   replicates 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
S: 3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005
   slots: (0 slots) slave
   replicates 3b2010b088f81db2dbae5f674f133aeef293d2e8
S: 3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006
   slots: (0 slots) slave
   replicates 331628fcc89591eafd378d5c4bf15c67be4a60d7
M: 3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
S: 2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007
   slots: (0 slots) slave
   replicates 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
M: 331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[work@a8-dba-cloud-db00.wh cdrom]$
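The "All 16384 slots covered" line can be reproduced offline from the slot ranges each master reports. This is only an illustrative sketch of the idea, not how redis-cli implements the check; `expand_ranges` and `check_coverage` are hypothetical helper names.

```python
from typing import Dict, Iterable, Set, Tuple

TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def expand_ranges(ranges: Iterable[str]) -> Set[int]:
    """Turn ["0-6826", "10923-12287"] into the set of individual slot numbers."""
    slots: Set[int] = set()
    for item in ranges:
        start, _, end = item.partition("-")
        end = end or start                 # a single slot is written without "-"
        slots.update(range(int(start), int(end) + 1))
    return slots


def check_coverage(owners: Dict[str, Iterable[str]]) -> Tuple[int, int]:
    """Return (missing_slots, doubly_owned_slots) for a master -> ranges map."""
    seen: Set[int] = set()
    overlaps: Set[int] = set()
    for ranges in owners.values():
        owned = expand_ranges(ranges)
        overlaps |= seen & owned
        seen |= owned
    return TOTAL_SLOTS - len(seen), len(overlaps)


# Slot ranges copied from the check output above.
layout = {
    "7001": ["0-6826", "10923-12287"],
    "7002": ["6827-10922"],
    "7003": ["12288-16383"],
}
print(check_coverage(layout))
```

A result of zero missing and zero doubly owned slots corresponds to the two [OK] lines in the check output.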
View how slots are distributed across the cluster:
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli --cluster info 192.168.31.33:7001 -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.31.33:7001 (06f1a071...) -> 523551 keys | 8192 slots | 1 slaves.
192.168.31.33:7002 (3b2010b0...) -> 261542 keys | 4096 slots | 2 slaves.
192.168.31.33:7003 (331628fc...) -> 261696 keys | 4096 slots | 1 slaves.
[OK] 1046789 keys in 3 masters.
63.89 keys per slot on average.
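For monitoring scripts it can help to turn the per-master summary lines into numbers. The sketch below assumes the exact line layout shown in the output above (which may differ between redis-cli versions); `parse_info_summary` and `keys_per_slot` are hypothetical names.

```python
import re
from typing import Dict, List

# Matches lines such as:
# 192.168.31.33:7001 (06f1a071...) -> 523551 keys | 8192 slots | 1 slaves.
SUMMARY_LINE = re.compile(
    r"^(\S+) \((\w+)\.\.\.\) -> (\d+) keys \| (\d+) slots \| (\d+) slaves\.$"
)


def parse_info_summary(text: str) -> List[Dict[str, object]]:
    """Extract address, id prefix, key count, slot count and replica count."""
    rows: List[Dict[str, object]] = []
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            addr, id_prefix, keys, slots, replicas = match.groups()
            rows.append({
                "addr": addr,
                "id_prefix": id_prefix,
                "keys": int(keys),
                "slots": int(slots),
                "replicas": int(replicas),
            })
    return rows


def keys_per_slot(rows: List[Dict[str, object]]) -> float:
    """Average keys per slot over all masters, as in the last line of the output."""
    total_keys = sum(int(r["keys"]) for r in rows)
    total_slots = sum(int(r["slots"]) for r in rows)
    return total_keys / total_slots
```

Applied to the three summary lines above, the totals are 1046789 keys over 16384 slots, which rounds to the 63.89 keys per slot that redis-cli prints.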
Add a new master node: redis-cli --cluster add-node new_ip:new_port ip:port
Notes:
- ip:port is any node in the cluster
[work@a8-dba-cloud-db00.wh cdrom]$ ./redis_7001/bin/redis-cli -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n --cluster add-node 192.168.31.33:7007 192.168.31.33:7001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.31.33:7007 to cluster 192.168.31.33:7001
>>> Performing Cluster Check (using node 192.168.31.33:7001)
M: 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004
   slots: (0 slots) slave
   replicates 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
S: 3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005
   slots: (0 slots) slave
   replicates 3b2010b088f81db2dbae5f674f133aeef293d2e8
S: 3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006
   slots: (0 slots) slave
   replicates 331628fcc89591eafd378d5c4bf15c67be4a60d7
M: 3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Getting functions from cluster
>>> Send FUNCTION LIST to 192.168.31.33:7007 to verify there is no functions in it
>>> Send FUNCTION RESTORE to 192.168.31.33:7007
>>> Send CLUSTER MEET to node 192.168.31.33:7007 to make it join the cluster.
[OK] New node added correctly.
[work@a8-dba-cloud-db00.wh cdrom]$
View cluster nodes and cluster info:
Notes:
- 7007 has not been assigned any hash slots yet, so it holds no data:
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -p 7007
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7007> cluster nodes
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676571158534 2 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676571160548 3 connected 10923-16383
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676571158000 2 connected 5461-10922
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676571159000 1 connected
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 myself,master - 0 1676571157000 0 connected
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676571159540 1 connected 0-5460
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676571158000 3 connected
127.0.0.1:7007> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:7
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:0
cluster_stats_messages_ping_sent:55257
cluster_stats_messages_pong_sent:52603
cluster_stats_messages_meet_sent:4
cluster_stats_messages_fail_sent:5
cluster_stats_messages_sent:107869
cluster_stats_messages_ping_received:52603
cluster_stats_messages_pong_received:55260
cluster_stats_messages_received:107863
total_cluster_links_buffer_limit_exceeded:0
127.0.0.1:7007>
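Assign slots to the new master node:
(1) Interactive reshard. redis-cli prompts for the following:
- number of slots: work out how many slots to migrate from the number of masters (counting the newly added master);
- destination node_id: the node_id of the new master;
- slot source: all, meaning every existing master node.
redis-cli --cluster reshard new_ip:new_port
(2) Automated reshard:
Command: redis-cli --cluster reshard new_ip:new_port --cluster-from node_id1,node_id2,... --cluster-to node_id --cluster-slots N --cluster-yes
Notes: with millions of keys, expect the slot migration to take ten minutes or more.
- new_ip:new_port is the ip and port of the new master node;
- --cluster-from node_id1,node_id2,... names the masters that give up slots. It is a comma-separated list of master node IDs; specify whichever you need, usually all existing masters in the cluster;
- --cluster-to node_id is the node_id of the new master. Each reshard run accepts only one destination node;
- --cluster-slots N assigns N slots to the new master. This value is normally 16384 divided by the total number of masters. Here there were three masters and one was added, giving four, so 16384/4=4096;
- --cluster-yes makes the cluster manager answer "yes" to every prompt, so the reshard runs non-interactively. Without this option, reshard stops and asks for confirmation, which is unnecessary here;
- when a large amount of data has to be migrated, the reshard takes a long time; you can run cluster nodes repeatedly to watch the progress.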
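The parameters above can be assembled programmatically. The following is a minimal sketch that only builds the argument list and does not run anything; `slots_for_new_master` and `build_reshard_argv` are hypothetical helper names, while the flags themselves are the ones listed in the redis-cli help output.

```python
from typing import List

TOTAL_SLOTS = 16384


def slots_for_new_master(master_count_after_add: int) -> int:
    """Even share of slots for one master, e.g. 16384 // 4 = 4096."""
    return TOTAL_SLOTS // master_count_after_add


def build_reshard_argv(entry_node: str, from_ids: List[str], to_id: str,
                       slots: int, assume_yes: bool = True) -> List[str]:
    """Build the non-interactive reshard command as an argv list."""
    argv = [
        "redis-cli", "--cluster", "reshard", entry_node,
        "--cluster-from", ",".join(from_ids),   # comma-separated source masters
        "--cluster-to", to_id,                  # exactly one destination per run
        "--cluster-slots", str(slots),
    ]
    if assume_yes:
        argv.append("--cluster-yes")            # answer "yes" to every prompt
    return argv


argv = build_reshard_argv(
    "192.168.31.33:7007",
    ["06f1a071ffac504d14f6448fd08bc56a12ee4c2e",
     "3b2010b088f81db2dbae5f674f133aeef293d2e8",
     "331628fcc89591eafd378d5c4bf15c67be4a60d7"],
    "2c545c46c6c3d148bf6500e06b1fdf5887416c40",
    slots_for_new_master(4),
)
print(" ".join(argv))
```

The list could be passed to `subprocess.run` once authentication options are added. Keeping it as a list rather than a shell string avoids quoting problems with passwords.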
In total, 4096 slots have to be taken from 7001, 7002 and 7003 and given to the new master 7007:
# redis-cli -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n --cluster reshard 192.168.31.33:7007 --cluster-from 06f1a071ffac504d14f6448fd08bc56a12ee4c2e,3b2010b088f81db2dbae5f674f133aeef293d2e8,331628fcc89591eafd378d5c4bf15c67be4a60d7 --cluster-to 2c545c46c6c3d148bf6500e06b1fdf5887416c40 --cluster-slots 4096
......
Moving slot 12280 from 192.168.31.33:7003 to 192.168.31.33:7007: ..............
Moving slot 12281 from 192.168.31.33:7003 to 192.168.31.33:7007: ..............
Moving slot 12282 from 192.168.31.33:7003 to 192.168.31.33:7007: ..............
......
View cluster nodes and cluster info:
Notes:
- the slot ranges of the three original masters 7001, 7002 and 7003 were 0-5460, 5461-10922 and 10923-16383 respectively;
- 4096 slots are taken from the three original masters, so each has to give up 4096/3=1365.333... on average; therefore two masters give up 1365 slots and one master gives up 1366 slots;
- node 7007 received 0-1364 (1365 slots) from 7001, 5461-6826 (1366 slots) from 7002, and 10923-12287 (1365 slots) from 7003.
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -p 7007
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7007> cluster nodes
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676601317000 2 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676601317680 3 connected 12288-16383
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676601315617 2 connected 6827-10922
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676601318705 1 connected
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 myself,master - 0 1676601316000 4 connected 0-1364 5461-6826 10923-12287
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676601315000 1 connected 1365-5460
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676601316000 3 connected
127.0.0.1:7007> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:7
cluster_size:4
cluster_current_epoch:4
cluster_my_epoch:4
cluster_stats_messages_ping_sent:84778
cluster_stats_messages_pong_sent:104809
cluster_stats_messages_meet_sent:4
cluster_stats_messages_fail_sent:5
cluster_stats_messages_update_sent:47
cluster_stats_messages_sent:189643
cluster_stats_messages_ping_received:80233
cluster_stats_messages_pong_received:84781
cluster_stats_messages_received:165014
total_cluster_links_buffer_limit_exceeded:0
127.0.0.1:7007>
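The 1365/1366/1365 split can be reproduced with a proportional rule: each source gives up slots in proportion to what it owns, the largest source rounds up and the rest round down. This is a sketch under the assumption that redis-cli uses such a rule; it matches the numbers observed above but is not taken from the redis-cli source, and `split_slots` is a hypothetical name. With other inputs the parts of such a rule are not guaranteed to sum exactly to the requested total.

```python
import math
from typing import Dict


def split_slots(owned: Dict[str, int], requested: int) -> Dict[str, int]:
    """Proportional split: largest source rounds up, the others round down."""
    total_owned = sum(owned.values())
    # Sort sources by slot count, largest first (stable for ties).
    ordered = sorted(owned.items(), key=lambda item: item[1], reverse=True)
    plan: Dict[str, int] = {}
    for index, (name, count) in enumerate(ordered):
        share = requested * count / total_owned
        plan[name] = math.ceil(share) if index == 0 else math.floor(share)
    return plan


# Slot counts before the reshard, from the add-node check output.
before = {"7001": 5461, "7002": 5462, "7003": 5461}
print(split_slots(before, 4096))
```

Here 7002 owns one slot more than the others (5462), so its share of 1365.5 rounds up to 1366 while the other two round 1365.25 down to 1365, which agrees with the ranges 7007 received.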
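Add a replica node to a specified master:
- redis-cli --cluster add-node new_ip:new_port ip:port --cluster-slave  When no master is specified, the new replica is attached to the master with the fewest replicas (if several masters tie for the fewest, one of them is chosen at random)
- redis-cli --cluster add-node new_ip:new_port ip:port --cluster-slave --cluster-master-id master_node_id  Explicitly attaches the new replica to the specified master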
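The selection rule described above (fewest replicas, random among ties) can be sketched as follows. This only illustrates the rule as stated in these notes, working on `cluster nodes` text; it is not redis-cli's own code, and both function names are hypothetical.

```python
import random
from collections import Counter
from typing import List


def masters_with_fewest_replicas(cluster_nodes_text: str) -> List[str]:
    """Return the node IDs of all masters tied for the fewest replicas."""
    masters: List[str] = []
    replica_count: Counter = Counter()
    for line in cluster_nodes_text.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue                       # skip prompts and blank lines
        flags = parts[2].split(",")
        if "master" in flags:
            masters.append(parts[0])
        elif "slave" in flags and parts[3] != "-":
            replica_count[parts[3]] += 1   # parts[3] is the master's node ID
    if not masters:
        return []
    fewest = min(replica_count[m] for m in masters)
    return [m for m in masters if replica_count[m] == fewest]


def pick_master(cluster_nodes_text: str) -> str:
    """Pick one candidate at random; raises IndexError if there is no master."""
    return random.choice(masters_with_fewest_replicas(cluster_nodes_text))
```

Passing the chosen ID to --cluster-master-id makes the placement explicit instead of leaving it to the tool.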
# Add 7004 as a replica under master 7001
[work@a8-dba-cloud-db00.wh redis_7001]$ ./bin/redis-cli -a 'jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n' --cluster add-node 192.168.31.33:7004 192.168.31.33:7001 --cluster-slave --cluster-master-id 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.31.33:7004 to cluster 192.168.31.33:7001
>>> Performing Cluster Check (using node 192.168.31.33:7001)
M: 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001
   slots:[0-5460] (5461 slots) master
M: 3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002
   slots:[5461-10922] (5462 slots) master
M: 331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003
   slots:[10923-16383] (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.31.33:7004 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 192.168.31.33:7001.
[OK] New node added correctly.
# Add 7005 as a replica under master 7002
[work@a8-dba-cloud-db00.wh redis_7001]$ ./bin/redis-cli -a 'jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n' --cluster add-node 192.168.31.33:7005 192.168.31.33:7002 --cluster-slave --cluster-master-id 3b2010b088f81db2dbae5f674f133aeef293d2e8
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.31.33:7005 to cluster 192.168.31.33:7002
>>> Performing Cluster Check (using node 192.168.31.33:7002)
M: 3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002
   slots:[5461-10922] (5462 slots) master
S: 930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004
   slots: (0 slots) slave
   replicates 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
M: 331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003
   slots:[10923-16383] (5461 slots) master
M: 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.31.33:7005 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 192.168.31.33:7002.
[OK] New node added correctly.
# Add 7006 as a replica under master 7003
[work@a8-dba-cloud-db00.wh redis_7001]$ ./bin/redis-cli -a 'jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n' --cluster add-node 192.168.31.33:7006 192.168.31.33:7003 --cluster-slave --cluster-master-id 331628fcc89591eafd378d5c4bf15c67be4a60d7
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.31.33:7006 to cluster 192.168.31.33:7003
>>> Performing Cluster Check (using node 192.168.31.33:7003)
M: 331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003
   slots:[10923-16383] (5461 slots) master
M: 3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005
   slots: (0 slots) slave
   replicates 3b2010b088f81db2dbae5f674f133aeef293d2e8
M: 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004
   slots: (0 slots) slave
   replicates 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.31.33:7006 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 192.168.31.33:7003.
[OK] New node added correctly.
[work@a8-dba-cloud-db00.wh redis_7001]$
View cluster nodes and cluster info:
127.0.0.1:7001> cluster nodes
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676509675000 1 connected
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676509676153 2 connected
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676509674133 3 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676509673122 2 connected 5461-10922
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 myself,master - 0 1676509674000 1 connected 0-5460
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676509675143 3 connected 10923-16383
127.0.0.1:7001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:1
cluster_stats_messages_ping_sent:1924
cluster_stats_messages_pong_sent:1939
cluster_stats_messages_sent:3863
cluster_stats_messages_ping_received:1936
cluster_stats_messages_pong_received:1924
cluster_stats_messages_meet_received:3
cluster_stats_messages_received:3863
total_cluster_links_buffer_limit_exceeded:0
Change which master a replica belongs to: any master can be specified.
- cluster replicate <node_id of the target master>
Example: replica 7007 currently has 7003 as its master (node_id 331628fcc89591eafd378d5c4bf15c67be4a60d7). To make 7002 (node_id 3b2010b088f81db2dbae5f674f133aeef293d2e8) its new master, log in to 7007 with redis-cli -c and run cluster replicate 3b2010b088f81db2dbae5f674f133aeef293d2e8
192.168.31.33:7007> cluster nodes
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 myself,slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676638277000 7 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676638279607 7 connected 12288-16383
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676638276000 2 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676638275000 2 connected 6827-10922
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676638275000 7 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676638277000 9 connected
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676638278601 9 connected 0-6826 10923-12287
192.168.31.33:7007> cluster replicate 3b2010b088f81db2dbae5f674f133aeef293d2e8
OK
192.168.31.33:7007> cluster nodes
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 myself,slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676638306000 2 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676638309000 7 connected 12288-16383
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676638309998 2 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676638311004 2 connected 6827-10922
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676638308000 7 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676638309000 9 connected
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676638307000 9 connected 0-6826 10923-12287
192.168.31.33:7007>
Delete a replica node: redis-cli --cluster del-node ip:port node_id
Notes:
- ip:port can be any member of the cluster that the replica belongs to;
- node_id is the node_id of the replica to delete
# Get the node_id of the replica to delete:
[work@a8-dba-cloud-db00.wh cdrom]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -p 7007 -h 192.168.31.33
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.31.33:7007> cluster nodes
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676566699000 3 connected 10923-16383
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676566699501 2 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676566700534 1 connected
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 myself,slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676566697000 2 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676566697443 2 connected 5461-10922
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676566698471 1 connected 0-5460
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676566697000 3 connected
192.168.31.33:7007>
# Delete the replica node:
[work@a8-dba-cloud-db00.wh cdrom]$ ./redis_7001/bin/redis-cli -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -h 192.168.31.33 -p 7007 --cluster del-node 192.168.31.33:7003 2c545c46c6c3d148bf6500e06b1fdf5887416c40
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Removing node 2c545c46c6c3d148bf6500e06b1fdf5887416c40 from cluster 192.168.31.33:7003
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.
[work@a8-dba-cloud-db00.wh cdrom]$
# Log in to another node of the cluster and view the cluster nodes:
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -p 7006
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7006> cluster nodes
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676568002000 2 connected 5461-10922
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676568004000 2 connected
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 myself,slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676568002000 3 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676568002984 1 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676568004010 3 connected 10923-16383
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676568005032 1 connected 0-5460
127.0.0.1:7006>
The cluster reset command:
After a replica node has been deleted, if you want it to join the cluster again, log in to that node with redis-cli -c and run cluster reset before adding it back:
# Run cluster reset on the deleted node 7007:
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -p 7007
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7007> cluster reset
OK
# Add node 7007 back to the original cluster as a replica:
[work@a8-dba-cloud-db00.wh cdrom]$ ./redis_7001/bin/redis-cli -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n --cluster add-node 192.168.31.33:7007 192.168.31.33:7001 --cluster-slave
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 192.168.31.33:7007 to cluster 192.168.31.33:7001
>>> Performing Cluster Check (using node 192.168.31.33:7001)
M: 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004
   slots: (0 slots) slave
   replicates 06f1a071ffac504d14f6448fd08bc56a12ee4c2e
S: 3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005
   slots: (0 slots) slave
   replicates 3b2010b088f81db2dbae5f674f133aeef293d2e8
S: 3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006
   slots: (0 slots) slave
   replicates 331628fcc89591eafd378d5c4bf15c67be4a60d7
M: 3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Automatically selected master 192.168.31.33:7001
>>> Send CLUSTER MEET to node 192.168.31.33:7007 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 192.168.31.33:7001.
[OK] New node added correctly.
[work@a8-dba-cloud-db00.wh cdrom]$
# View cluster nodes and cluster info:
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -p 7007
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7007> cluster nodes
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676570820000 2 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676570821052 3 connected 10923-16383
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676570820022 1 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676570823111 2 connected 5461-10922
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 myself,slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676570819000 1 connected
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676570820000 1 connected 0-5460
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676570822082 3 connected
127.0.0.1:7007> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:7
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:1
cluster_stats_messages_ping_sent:55041
cluster_stats_messages_pong_sent:52405
cluster_stats_messages_meet_sent:3
cluster_stats_messages_fail_sent:5
cluster_stats_messages_sent:107454
cluster_stats_messages_ping_received:52405
cluster_stats_messages_pong_received:55043
cluster_stats_messages_received:107448
total_cluster_links_buffer_limit_exceeded:0
127.0.0.1:7007>
Delete a master node: redis-cli --cluster del-node ip:port node_id
(1) Before deleting a master, first delete all the replicas under it.
(2) Reshard: migrate all slots on this master to other master nodes. Each reshard run can migrate slots to only one target master, so if the slots have to be spread over several other masters, run reshard once per target.
# redis-cli --cluster reshard master_ip:master_port --cluster-from master_node_id --cluster-to desc_node_id --cluster-slots N --cluster-yes
- master_ip:master_port is the ip and port of the master to be deleted;
- --cluster-from master_node_id is the node ID of that master;
- --cluster-to desc_node_id is the node ID of one other target master;
- --cluster-slots N is the number of slots to migrate;
- --cluster-yes answers yes automatically, so no interaction is needed.
Example: migrate all 4096 slots of 7007 to 7001.
# redis-cli -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n --cluster reshard 192.168.31.33:7007 --cluster-from 2c545c46c6c3d148bf6500e06b1fdf5887416c40 --cluster-to 06f1a071ffac504d14f6448fd08bc56a12ee4c2e --cluster-slots 4096 --cluster-yes
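Because each reshard run accepts a single destination, spreading a master's slots over several targets means one run per target. The sketch below generates such a sequence of commands with an even split; `plan_evacuation` is a hypothetical helper, the even split is a design choice of this example rather than anything redis-cli prescribes, and nothing is executed.

```python
from typing import List


def plan_evacuation(entry_node: str, source_id: str, slot_count: int,
                    target_ids: List[str]) -> List[List[str]]:
    """One reshard argv per target master, splitting slot_count evenly."""
    if not target_ids:
        raise ValueError("at least one target master is required")
    base, extra = divmod(slot_count, len(target_ids))
    commands: List[List[str]] = []
    for index, target_id in enumerate(target_ids):
        # The first `extra` targets take one slot more to absorb the remainder.
        slots = base + (1 if index < extra else 0)
        if slots == 0:
            continue
        commands.append([
            "redis-cli", "--cluster", "reshard", entry_node,
            "--cluster-from", source_id,
            "--cluster-to", target_id,
            "--cluster-slots", str(slots),
            "--cluster-yes",
        ])
    return commands


# Hypothetical use: empty 7007 (4096 slots) into the three other masters.
for argv in plan_evacuation(
    "192.168.31.33:7007",
    "2c545c46c6c3d148bf6500e06b1fdf5887416c40",
    4096,
    ["06f1a071ffac504d14f6448fd08bc56a12ee4c2e",
     "3b2010b088f81db2dbae5f674f133aeef293d2e8",
     "331628fcc89591eafd378d5c4bf15c67be4a60d7"],
):
    print(" ".join(argv))
```

The commands have to run one after another, since each run changes the slot layout the next one starts from.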
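(3) Delete the master node. del-node refuses to remove a master that still owns slots, so do this only after step (2) has emptied it:
# redis-cli --cluster del-node ip:port master_node_id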
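Manual failover:
- The node information shows that master 7003 has only one replica, 7006. The cluster failover command can only be executed on a replica of the master being failed over.
- After cluster failover is run on the replica, 7006 is promoted to master and 7003 is demoted to a replica of 7006.
- Compared with a failover triggered by an actual master failure, a manual failover is special and safer. It is carried out in a way that avoids data loss: clients are switched from the original master to the new master only once the system is sure that the new master has processed the entire replication stream of the old one.
- Use it when upgrading a master, or when a master has a problem that calls for a master/replica switch. For example, to upgrade master 7003, run cluster failover on any replica of 7003.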
[work@a8-dba-cloud-db00.wh ~]$ ./redis_7001/bin/redis-cli -c -a jJAV0kTokNb8iZvwfqniCxmFZEsbOH5n -h 192.168.31.33 -p 7006
192.168.31.33:7006> cluster nodes
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676613260878 5 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676613258000 2 connected 6827-10922
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676613258000 2 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676613258000 5 connected
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 myself,slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676613256000 3 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676613259858 3 connected 12288-16383
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676613259000 5 connected 0-6826 10923-12287
192.168.31.33:7006> cluster failover
OK
192.168.31.33:7006> cluster nodes
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676613588468 5 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676613589000 2 connected 6827-10922
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676613589480 2 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 slave 06f1a071ffac504d14f6448fd08bc56a12ee4c2e 0 1676613588000 5 connected
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 myself,master - 0 1676613586000 6 connected 12288-16383
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 slave 3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 0 1676613588000 6 connected
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 master - 0 1676613587457 5 connected 0-6826 10923-12287
192.168.31.33:7006>
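Since cluster failover must be issued on a replica of the master in question, a script can first look up which nodes qualify. This is a minimal sketch working on `cluster nodes` text; `failover_candidates` is a hypothetical name, and the function only reads the output, it does not contact Redis.

```python
from typing import List


def failover_candidates(cluster_nodes_text: str, master_addr: str) -> List[str]:
    """Return ip:port of every replica of the master at master_addr."""
    rows = [line.split() for line in cluster_nodes_text.splitlines()]
    rows = [r for r in rows if len(r) >= 8]          # drop prompts, blank lines
    master_id = None
    for row in rows:
        addr = row[1].split("@")[0]
        if addr == master_addr and "master" in row[2].split(","):
            master_id = row[0]
            break
    if master_id is None:
        return []                                    # not a master, or unknown
    return [
        row[1].split("@")[0]
        for row in rows
        if row[3] == master_id and "slave" in row[2].split(",")
    ]
```

With the first `cluster nodes` output of the session above, asking for 192.168.31.33:7003 returns only 192.168.31.33:7006, which is the node where cluster failover was run.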
cluster failover returns (error) LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676634996971 7 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 master - 0 1676634996993 8 connected 0-6826 10923-12287
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 slave 930b0fb946cd67aaae5642d4cf306638e11913a3 0 1676634996977 8 connected
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676634997917 2 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676634997000 2 connected 6827-10922
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 myself,slave 930b0fb946cd67aaae5642d4cf306638e11913a3 0 1676634996000 8 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676634996971 7 connected 12288-16383
192.168.31.33:7001> cluster failover
(error) LOADING Redis is loading the dataset in memory
(4.76s)
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
LOADING Redis is loading the dataset in memory
192.168.31.33:7001> cluster nodes
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676635036000 7 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 master - 0 1676635036000 8 connected 0-6826 10923-12287
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 slave 930b0fb946cd67aaae5642d4cf306638e11913a3 0 1676635038000 8 connected
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676635037000 2 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676635038000 2 connected 6827-10922
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 myself,slave 930b0fb946cd67aaae5642d4cf306638e11913a3 0 1676635036000 8 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676635037000 7 connected 12288-16383
192.168.31.33:7001> cluster nodes
3c2b55d4b06fc23510bfe94d803763e6f1c72d2a 192.168.31.33:7006@17006 slave 331628fcc89591eafd378d5c4bf15c67be4a60d7 0 1676635048557 7 connected
930b0fb946cd67aaae5642d4cf306638e11913a3 192.168.31.33:7004@17004 master - 0 1676635051573 8 connected 0-6826 10923-12287
2c545c46c6c3d148bf6500e06b1fdf5887416c40 192.168.31.33:7007@17007 slave 930b0fb946cd67aaae5642d4cf306638e11913a3 0 1676635050567 8 connected
3fd90616dada7055d874dc92b6c5ef15fc060b8e 192.168.31.33:7005@17005 slave 3b2010b088f81db2dbae5f674f133aeef293d2e8 0 1676635050000 2 connected
3b2010b088f81db2dbae5f674f133aeef293d2e8 192.168.31.33:7002@17002 master - 0 1676635049562 2 connected 6827-10922
06f1a071ffac504d14f6448fd08bc56a12ee4c2e 192.168.31.33:7001@17001 myself,slave 930b0fb946cd67aaae5642d4cf306638e11913a3 0 1676635047000 8 connected
331628fcc89591eafd378d5c4bf15c67be4a60d7 192.168.31.33:7003@17003 master - 0 1676635048000 7 connected 12288-16383
192.168.31.33:7001> cluster failover
OK
192.168.31.33:7001>
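Redis error: (error) LOADING Redis is loading the dataset in memory
Problem analysis:
Redis returns this error while it is loading persisted data back into memory; once loading has finished, the node can be accessed normally again.
This happens at startup when dump.rdb is large and slow to load, during which Redis is unavailable. A replica also loads a dataset when it receives a full copy from its master, which is presumably what happened to 7001 (a replica at the time) in the session above.
maxmemory is unlikely to be the cause. On 64-bit builds maxmemory defaults to 0 (no limit); an implicit limit of 3 GB applies only to 32-bit builds. Exceeding maxmemory produces an OOM error on writes, not a LOADING error.
Summary: the dataset is large, and a Redis node that has just started (or just resynchronized) has to load the persisted data again; once loading completes it can be accessed normally.
Solution:
Wait for loading to finish and retry the command, as in the session above where cluster failover succeeded after the node had finished loading.
Deleting dump.rdb makes the error disappear only because there is nothing left to load, and it discards the persisted data, so do it only when the data is disposable. Raise maxmemory in redis.conf only if the host really has memory to spare for the dataset.
Also consider whether the business is storing too much data in Redis.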
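Rather than retrying by hand, a script can poll until loading has finished. The sketch below relies on the `loading` field of `INFO persistence` (1 while the dataset is being loaded, 0 otherwise). How the INFO text is fetched is left to the caller through the hypothetical `fetch_info` callable, for example a function that runs `redis-cli info persistence`; `is_loading` and `wait_until_loaded` are likewise names made up for this example.

```python
import time
from typing import Callable


def is_loading(info_persistence_text: str) -> bool:
    """True if the INFO persistence output contains `loading:1`."""
    for line in info_persistence_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "loading":
            return value == "1"
    return False                      # field absent: assume not loading


def wait_until_loaded(fetch_info: Callable[[], str],
                      timeout_s: float = 120.0,
                      interval_s: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll fetch_info() until loading is 0; False if timeout_s elapses first."""
    deadline = clock() + timeout_s
    while True:
        if not is_loading(fetch_info()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Once `wait_until_loaded` returns True, commands such as cluster nodes and cluster failover should no longer be rejected with LOADING. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.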
#################################