Redis 3.x Cluster Installation and Configuration


Official documentation:
phpredis extension

Environment:
CentOS6.5 x64
redis-3.0.6
master node1: 192.168.192.10
master node2: 192.168.192.11
master node3: 192.168.192.12
slave node1: 192.168.192.20
slave node2: 192.168.192.21
slave node3: 192.168.192.22
3 masters, 3 slaves
If you are interested, see my earlier post on Redis-sentinel clusters.

1. Install build dependencies
yum -y install gcc gcc-c++ make tcl-devel 

2. Install
tar -xvf redis-3.0.6.tar.gz -C /usr/local/src
cd /usr/local/src/redis-3.0.6
make -j4 && make PREFIX=/opt/redis install
cp /usr/local/src/redis-3.0.6/src/redis-trib.rb /opt/redis/bin
echo 'export PATH=$PATH:/opt/redis/bin' >>/etc/profile
source /etc/profile

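3. SysV init script
The interactive install_server.sh tool under utils in the source tree can generate one, although the result still needs a few tweaks.
a. Interactive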
root@master:redis-3.0.6#/usr/local/src/redis-3.0.6/utils/install_server.sh
Welcome to the redis service installer
This script will help you easily set up a running redis server

Please select the redis port for this instance: [6379]
Selecting default: 6379
Please select the redis config file name [/etc/redis/6379.conf] /opt/redis/conf/redis.conf
Please select the redis log file name [/var/log/redis_6379.log] /opt/redis/log/redis.log
Please select the data directory for this instance [/var/lib/redis/6379] /opt/redis/data
Please select the redis executable path [] /opt/redis/bin/redis-server
Selected config:
Port           : 6379
Config file    : /opt/redis/conf/redis.conf
Log file       : /opt/redis/log/redis.log
Data dir       : /opt/redis/data
Executable     : /opt/redis/bin/redis-server
Cli Executable : /opt/redis/bin/redis-cli
Is this ok? Then press ENTER to go on or Ctrl-C to abort.
Copied /tmp/6379.conf => /etc/init.d/redis_6379
Installing service...
Successfully added to chkconfig!
Successfully added to runlevels 345!
Starting Redis server...
Installation successful!
As shown, the script sets up the relevant paths and files for us automatically.
b. Non-interactive (silent)
bash /usr/local/src/redis-3.0.6/utils/install_server.sh <<EOF
6379
/opt/redis/conf/redis.conf
/opt/redis/log/redis.log
/opt/redis/data
/opt/redis/bin/redis-server
EOF

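The installer does not let us choose the name of the init script, though. If that bothers you, add a symlink with the name you prefer: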
ln -s /etc/init.d/redis_6379 /etc/init.d/redis


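4. Kernel parameter tuning
# Raise the listen backlog ceiling to match Redis's default tcp-backlog of 511
echo 'net.core.somaxconn = 511' >> /etc/sysctl.conf
# Allow memory overcommit so the fork used for background saves does not fail
echo 'vm.overcommit_memory = 1'  >> /etc/sysctl.conf
# Apply the sysctl settings now instead of waiting for a reboot
sysctl -p
# Disable transparent huge pages on every boot
cat >>/etc/rc.d/rc.local <<HERE
echo never > /sys/kernel/mm/transparent_hugepage/enabled
HERE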


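5. Configure the cluster
Overview
The Redis Cluster key space is split into 16384 (2^14) hash slots, so the maximum number of nodes in a cluster is also 16384 (the recommended maximum is around 1000 nodes). By the same logic, each master node can serve anywhere from 1 to 16384 slots.
Every node in the cluster is identified by a unique ID: a 160-bit random number written in hexadecimal, generated from /dev/urandom the first time the node starts. The node saves its ID in its cluster node configuration file (the file named by cluster-config-file) and keeps using that ID as long as the file is not deleted. A node can change its IP address and port without changing its node ID. The cluster detects the IP/port change automatically and spreads the new address to the other nodes through the gossip protocol.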
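To make the slot mapping concrete, here is a minimal Python sketch of how a key ends up in one of the 16384 slots: CRC16 (XMODEM variant) of the key, modulo 16384, hashing only the hash-tag part when the key contains a non-empty {...} section. The function names crc16_xmodem and key_slot are mine, not part of Redis; on a live cluster, cluster keyslot gives the authoritative answer.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to a slot in 0..16383, honoring {hash tags}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    for name in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(name, key_slot(name))
```

Keys that share a hash tag, such as the two {user1000} keys above, land in the same slot, which is what makes multi-key operations on them possible in cluster mode.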

Preparation
master node1: 192.168.192.10
master node2: 192.168.192.11
master node3: 192.168.192.12
slave node1: 192.168.192.20
slave node2: 192.168.192.21
slave node3: 192.168.192.22
Make sure Redis has been installed successfully on all of the nodes above, using the same procedure as before.
Note that the minimal cluster that works as expected requires to contain at least three master nodes. For your first tests it is strongly suggested to start a six nodes cluster with three masters and three slaves.

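A. Cluster configuration file
Master main configuration file
To enable cluster mode, add (or modify) the following lines in the existing configuration file. Comments must sit on their own lines, because redis.conf does not support trailing comments after a directive.
# Enable cluster mode
cluster-enabled yes
# Cluster node configuration file; generated automatically and rewritten by each node as it learns cluster state through gossip
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
cluster-slave-validity-factor 10
cluster-migration-barrier 1
cluster-require-full-coverage yes
Slave main configuration file
Reuse the master's main configuration file as-is; each slave is manually assigned to a specific master later.
Note: Redis Cluster only works when every node is running in cluster mode.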
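Redis only treats a line as a comment when it starts with #, so a note placed after a directive on the same line is read as an extra argument and the server refuses to start. The following Python sketch checks a config fragment for that mistake and for cluster-enabled. It is a simplified reader (plain whitespace splitting, no quoting rules), and the names parse_redis_conf and cluster_problems are hypothetical helpers of mine.

```python
def parse_redis_conf(text):
    """Return {directive: [args, ...]} for redis.conf-style text (simplified)."""
    directives = {}
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with '#' are ignored, as in Redis.
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        directives[name.lower()] = args
    return directives


def cluster_problems(directives):
    """List problems that would keep a node from running in cluster mode."""
    problems = []
    if directives.get("cluster-enabled") != ["yes"]:
        problems.append("cluster-enabled must be exactly 'yes'")
    for name, args in directives.items():
        if any(arg.startswith("#") for arg in args):
            problems.append(name + ": inline comment would be parsed as an argument")
    return problems


GOOD = """
# Enable cluster mode
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
"""

if __name__ == "__main__":
    print(cluster_problems(parse_redis_conf(GOOD)))
    print(cluster_problems(parse_redis_conf("cluster-enabled yes #on\n")))
```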

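B. Add the master nodes
Run the following with redis-cli connected to one of the nodes, so that it introduces itself to the others: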
redis-cli cluster meet 192.168.192.10 6379
redis-cli cluster meet 192.168.192.11 6379
redis-cli cluster meet 192.168.192.12 6379
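Note: in this 3-node cluster every node is a master by default. The nodes have joined successfully, but the cluster is still in the fail state because no slots have been assigned yet.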


C. Enable the cluster
master node1: 192.168.192.10
redis-cli cluster addslots {0..5500}
master node2: 192.168.192.11
redis-cli cluster addslots {5501..11000}
master node3: 192.168.192.12
redis-cli cluster addslots {11001..16383}
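The three ranges above cover all 16384 slots but are slightly uneven (5501, 5500, and 5383 slots). Also note that {0..5500} relies on bash brace expansion, so the commands need to be run from bash. As a rough sketch, this is one way to compute an even split for any number of masters; split_slots and addslots_command are hypothetical helper names of mine, and the boundaries differ by a slot or so from what other tools may pick.

```python
def split_slots(n_masters, total=16384):
    """Split slots 0..total-1 into n_masters contiguous, near-equal ranges."""
    base, extra = divmod(total, n_masters)
    ranges = []
    start = 0
    for i in range(n_masters):
        # The first `extra` masters take one additional slot each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def addslots_command(first, last):
    """Build the bash command line that assigns an inclusive slot range."""
    return "redis-cli cluster addslots {%d..%d}" % (first, last)


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(addslots_command(first, last))
```

Each generated command still has to be run on the master that should own that range, exactly as in the steps above.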
Meaning of each field in the cluster nodes output:
node id, address:port, flags, master id (- for a master), last ping sent, last pong received, configuration epoch, link state, slots
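For scripting, a line of cluster nodes output can be split into those fields. Note that the real output also carries a master-id column between flags and last ping sent (a single - when the node is itself a master). The sketch below is a minimal parser under that layout; the sample line reuses a node ID from this post, but its timestamps and epoch are made up for illustration.

```python
from typing import List, NamedTuple


class ClusterNode(NamedTuple):
    node_id: str
    address: str
    flags: List[str]
    master_id: str
    ping_sent: int
    pong_received: int
    config_epoch: int
    link_state: str
    slots: List[str]


def parse_cluster_nodes_line(line: str) -> ClusterNode:
    """Split one whitespace-separated line of `cluster nodes` output."""
    parts = line.split()
    return ClusterNode(
        node_id=parts[0],
        address=parts[1],
        flags=parts[2].split(","),
        master_id=parts[3],
        ping_sent=int(parts[4]),
        pong_received=int(parts[5]),
        config_epoch=int(parts[6]),
        link_state=parts[7],
        slots=parts[8:],  # empty for slaves and for masters without slots
    )


# Made-up sample line (values after the flags are illustrative only).
SAMPLE = ("19bcc3b19b1325fac5c7647c1431f46299609079 192.168.192.10:6379 "
          "myself,master - 0 0 1 connected 0-5500")

if __name__ == "__main__":
    print(parse_cluster_nodes_line(SAMPLE))
```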
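The cluster state is written to the node configuration file named in the main configuration (/opt/redis/data/nodes-6379.conf). It is therefore also possible to edit the slot ranges directly in that file, sync it to every node, and restart redis, although the stock redis.conf describes this file as not intended to be edited by hand.
Note: prerequisites for the cluster to be available
1. Slots have been assigned to nodes
2. After the election, the cluster members agree that the cluster is available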
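Both conditions can be checked from the output of cluster info. Here is a small Python sketch that parses the key:value lines and reports whether the cluster is usable; it assumes cluster-require-full-coverage yes as configured earlier, so all 16384 slots must be assigned. The helper names and the sample text are mine, trimmed to three of the fields the command prints.

```python
def parse_cluster_info(text):
    """Turn `cluster info` output (key:value lines) into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            info[key] = value
    return info


def cluster_is_ready(info, total_slots=16384):
    """True when the state is ok and every slot is assigned to some node."""
    return (info.get("cluster_state") == "ok"
            and int(info.get("cluster_slots_assigned", 0)) == total_slots)


# Trimmed, illustrative sample of the command's output.
SAMPLE = "cluster_state:ok\r\ncluster_slots_assigned:16384\r\ncluster_known_nodes:6\r\n"

if __name__ == "__main__":
    print(cluster_is_ready(parse_cluster_info(SAMPLE)))
```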


D. Online resharding of slots
This requires the Ruby utility redis-trib.rb
yum -y install ruby rubygems
gem install redis
redis-trib.rb reshard --from ca5fb0605fa3efbf62d1c8367489101cccfe0883 --to c16ba7f364b038e532c398d92b24d35ad5e23369 --slots 5 --yes 192.168.192.10:6379
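Without detailed arguments, redis-trib.rb asks interactively what to move and how to move it. As a test, I moved only 5 slots from 192.168.192.11 to 192.168.192.12, which is clearly visible in the slots column of the cluster nodes output.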
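Under the hood, moving a slot comes down to a short sequence of cluster setslot and MIGRATE commands. The sketch below only builds that sequence as text so the order is easy to see; it sends nothing to Redis. The function name, the slot number 5501, the key names, and the 5000 ms timeout are illustrative assumptions of mine, and in practice the key list would come from cluster getkeysinslot on the source node.

```python
def slot_migration_plan(slot, source_id, target_id, target_host, target_port,
                        keys=(), timeout_ms=5000):
    """Return (node, command) pairs for moving one slot from source to target."""
    steps = [
        # 1. The target announces that it is importing the slot.
        ("target", "CLUSTER SETSLOT %d IMPORTING %s" % (slot, source_id)),
        # 2. The source marks the slot as migrating.
        ("source", "CLUSTER SETSLOT %d MIGRATING %s" % (slot, target_id)),
    ]
    # 3. Every key in the slot is moved to the target (database 0 in cluster mode).
    for key in keys:
        steps.append(("source", "MIGRATE %s %d %s 0 %d"
                      % (target_host, target_port, key, timeout_ms)))
    # 4. Ownership is recorded, on the target first and then on the source.
    steps.append(("target", "CLUSTER SETSLOT %d NODE %s" % (slot, target_id)))
    steps.append(("source", "CLUSTER SETSLOT %d NODE %s" % (slot, target_id)))
    return steps


if __name__ == "__main__":
    plan = slot_migration_plan(
        5501,
        "ca5fb0605fa3efbf62d1c8367489101cccfe0883",
        "c16ba7f364b038e532c398d92b24d35ad5e23369",
        "192.168.192.12", 6379, keys=["key1", "key2"])
    for node, command in plan:
        print(node, command)
```

redis-trib.rb reshard automates this for each slot it moves, which is why using the tool is the safer route.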

E. Add the slave nodes
redis-trib.rb add-node --slave --master-id 19bcc3b19b1325fac5c7647c1431f46299609079 192.168.192.20:6379 192.168.192.10:6379
redis-trib.rb add-node --slave --master-id ca5fb0605fa3efbf62d1c8367489101cccfe0883 192.168.192.21:6379 192.168.192.11:6379
redis-trib.rb add-node --slave --master-id c16ba7f364b038e532c398d92b24d35ad5e23369 192.168.192.22:6379 192.168.192.12:6379
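To make sure each slave follows the intended master, run the matching command on each slave node:
redis-cli -h 192.168.192.20 cluster replicate 19bcc3b19b1325fac5c7647c1431f46299609079
redis-cli -h 192.168.192.21 cluster replicate ca5fb0605fa3efbf62d1c8367489101cccfe0883
redis-cli -h 192.168.192.22 cluster replicate c16ba7f364b038e532c398d92b24d35ad5e23369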


F. Failover test
1. One master goes down
redis-cli -h 192.168.192.12 -p 6379 debug segfault
redis-cli cluster info
redis-cli cluster nodes
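As shown, when one master went down, its slave promoted itself to master and took over all of the old master's slots. In the log below this takes roughly 16 seconds from the lost connection, which is in line with cluster-node-timeout 15000.
Below is the slave's detailed log of the switch to master: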
2782:S 25 Dec 11:29:29.384 # Connection with master lost.
2782:S 25 Dec 11:29:29.384 * Caching the disconnected master state.
2782:S 25 Dec 11:29:29.384 * Discarding previously cached master state.
2782:S 25 Dec 11:29:29.649 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:29:29.650 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:29:29.650 * Non blocking connect for SYNC fired the event.
2782:S 25 Dec 11:29:29.650 * Master replied to PING, replication can continue...
2782:S 25 Dec 11:29:29.651 * Partial resynchronization not possible (no cached master)
2782:S 25 Dec 11:29:29.651 * Full resync from master: cbb2f04d66e95d799f8dabbeaa90c3a293ca9e28:1121
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: receiving 18 bytes from master
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: Flushing old data
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: Loading DB in memory
2782:S 25 Dec 11:29:29.758 * MASTER <-> SLAVE sync: Finished with success
2782:S 25 Dec 11:30:08.012 # Connection with master lost.
2782:S 25 Dec 11:30:08.012 * Caching the disconnected master state.
2782:S 25 Dec 11:30:08.193 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:08.193 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:08.193 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:09.212 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:09.212 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:09.212 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:10.222 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:10.222 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:10.222 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:11.232 * Connecting to MASTER 192.168.192.12:6379
... ...
2782:S 25 Dec 11:30:15.284 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:16.296 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:16.296 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:19.329 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:20.349 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:20.349 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:20.350 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:21.370 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:21.370 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:21.371 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:22.396 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:22.396 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:22.396 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:23.411 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:23.412 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:23.412 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:23.674 * FAIL message received from ca5fb0605fa3efbf62d1c8367489101cccfe0883 about c16ba7f364b038e532c398d92b24d35ad5e23369
2782:S 25 Dec 11:30:23.674 # Cluster state changed: fail
2782:S 25 Dec 11:30:23.715 # Start of election delayed for 690 milliseconds (rank #0, offset 1163).
2782:S 25 Dec 11:30:24.425 * Connecting to MASTER 192.168.192.12:6379
2782:S 25 Dec 11:30:24.425 * MASTER <-> SLAVE sync started
2782:S 25 Dec 11:30:24.425 # Starting a failover election for epoch 7.
2782:S 25 Dec 11:30:24.469 # Error condition on socket for SYNC: Connection refused
2782:S 25 Dec 11:30:24.470 # Failover election won: I'm the new master.
2782:S 25 Dec 11:30:24.471 # configEpoch set to 7 after successful failover
2782:M 25 Dec 11:30:24.471 * Discarding previously cached master state.
2782:M 25 Dec 11:30:24.471 # Cluster state changed: ok
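The timing in this log is worth a quick calculation. The Python sketch below pulls the time-of-day field out of three of the log lines above and measures the gaps; log_time and seconds_between are helper names of mine, and the parsing assumes the Redis 3.0 log layout shown here (time in the fourth whitespace-separated field, all events on the same day).

```python
from datetime import datetime


def log_time(line):
    """Parse the HH:MM:SS.mmm field (4th token) of a Redis 3.0 log line."""
    return datetime.strptime(line.split()[3], "%H:%M:%S.%f")


def seconds_between(first, second):
    """Elapsed seconds from one log line to a later one on the same day."""
    return (log_time(second) - log_time(first)).total_seconds()


LOST = "2782:S 25 Dec 11:30:08.012 # Connection with master lost."
FAIL = ("2782:S 25 Dec 11:30:23.674 * FAIL message received from "
        "ca5fb0605fa3efbf62d1c8367489101cccfe0883 about "
        "c16ba7f364b038e532c398d92b24d35ad5e23369")
WON = "2782:S 25 Dec 11:30:24.470 # Failover election won: I'm the new master."

if __name__ == "__main__":
    print("lost -> FAIL:", round(seconds_between(LOST, FAIL), 3), "s")
    print("FAIL -> won :", round(seconds_between(FAIL, WON), 3), "s")
    print("lost -> won :", round(seconds_between(LOST, WON), 3), "s")
```

Almost all of the wait is failure detection: the FAIL message arrives about 15.7 seconds after the connection drops, close to the configured cluster-node-timeout of 15000 ms. The election itself finishes in under a second, and the 690 ms delay in the log fits the documented delay of 500 ms plus a random 0 to 500 ms plus 1000 ms per rank for a rank #0 slave.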
2. Restart the failed master after it has been repaired
Here the cluster automatically turns the old master (192.168.192.12) into a slave of the new master (192.168.192.22).
2411:M 25 Dec 11:44:44.139 # Server started, Redis version 3.0.6
2411:M 25 Dec 11:44:44.139 * DB loaded from disk: 0.000 seconds
2411:M 25 Dec 11:44:44.140 * The server is now ready to accept connections on port 6379
2411:M 25 Dec 11:44:44.194 # Configuration change detected. Reconfiguring myself as a replica of e3bacaf98d2eee3259275c8751cd4757f8ca0b64
2411:S 25 Dec 11:44:44.195 # Cluster state changed: ok
2411:S 25 Dec 11:44:45.215 * Connecting to MASTER 192.168.192.22:6379
2411:S 25 Dec 11:44:45.216 * MASTER <-> SLAVE sync started
2411:S 25 Dec 11:44:45.216 * Non blocking connect for SYNC fired the event.
2411:S 25 Dec 11:44:45.216 * Master replied to PING, replication can continue...
2411:S 25 Dec 11:44:45.217 * Partial resynchronization not possible (no cached master)
2411:S 25 Dec 11:44:45.217 * Full resync from master: 163f91176e281346fefb0935221343b174b40843:1
2411:S 25 Dec 11:44:45.239 * MASTER <-> SLAVE sync: receiving 18 bytes from master
2411:S 25 Dec 11:44:45.239 * MASTER <-> SLAVE sync: Flushing old data
2411:S 25 Dec 11:44:45.240 * MASTER <-> SLAVE sync: Loading DB in memory
2411:S 25 Dec 11:44:45.240 * MASTER <-> SLAVE sync: Finished with success


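Appendix: common management commands
Cluster
cluster info  Print information about the cluster.
cluster nodes  List all nodes currently known to the cluster, along with their details.
Node
cluster meet <ip> <port>  Add the node at the given IP and port to the cluster, making it a member.
cluster forget <node_id>  Remove the node identified by node_id from the cluster.
cluster replicate <node_id>  Make the current node a slave of the node identified by node_id.
cluster saveconfig  Save the node's cluster configuration file to disk.
Slot
cluster addslots <slot> [slot ...]  Assign one or more slots to the current node.
cluster delslots <slot> [slot ...]  Remove the assignment of one or more slots from the current node.
cluster flushslots  Remove every slot assigned to the current node, leaving it with no slots at all.
cluster setslot <slot> node <node_id>  Assign the slot to the node identified by node_id. If the slot is already assigned to another node, that node drops it first and the assignment is then made.
cluster setslot <slot> migrating <node_id>  Mark the slot on this node as migrating to the node identified by node_id.
cluster setslot <slot> importing <node_id>  Mark the slot as being imported into this node from the node identified by node_id.
cluster setslot <slot> stable  Cancel an import or migration in progress for the slot.
Key
cluster keyslot <key>  Compute which slot the key belongs to.
cluster countkeysinslot <slot>  Return the number of key-value pairs currently held in the slot.
cluster getkeysinslot <slot> <count>  Return up to count keys from the slot.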
posted @ 2015-12-25 10:25  李庆喜