RabbitMQ-3.11.2 Cluster Deployment
I. Initial Configuration
CentOS 8.3 XXX Cluster Initialization Configuration: https://www.cnblogs.com/huaxiayuyi/p/16862622.html
II. Installing RabbitMQ
Installing RabbitMQ-3.11.2 on CentOS 8.3: https://www.cnblogs.com/huaxiayuyi/p/16866645.html
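III. Setting Up the Cluster
1 Copy the cookie contents
.erlang.cookie is the file Erlang requires for distributed operation. Every node in the cluster must hold an identical .erlang.cookie, and the file's permissions must be 400; otherwise the nodes cannot communicate with each other.
Copy the contents of .erlang.cookie from one server to the other machines. Copying the contents rather than the file itself is preferable, because wrong file permissions cause problems. Since the file is read-only, force the write with :wq! when saving and quitting the editor.
Alternatively, transfer the file with scp: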
scp /root/.erlang.cookie root@192.168.80.32:/root
scp /root/.erlang.cookie root@192.168.80.33:/root
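After copying, it can help to confirm that each node ends up with the same cookie and the 400 mode described above. The following is a minimal sketch, assuming GNU `stat` and `sha256sum` as shipped with CentOS; the function names `cookie_mode_ok` and `cookie_digest` are hypothetical helpers, not part of RabbitMQ or Erlang.

```shell
#!/bin/sh
# Hypothetical helpers for checking an Erlang cookie file.

# Succeeds only when the file's permission bits are exactly 400.
cookie_mode_ok() {
    [ "$(stat -c '%a' "$1")" = "400" ]
}

# Prints the SHA-256 digest of the cookie, so nodes can be compared
# without displaying the secret itself.
cookie_digest() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Example (run on every node; the digests should be identical):
#   chmod 400 /root/.erlang.cookie
#   cookie_mode_ok /root/.erlang.cookie && cookie_digest /root/.erlang.cookie
```

Comparing digests rather than printing the cookie keeps the shared secret out of terminal scrollback.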
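2 Enable the management plugin on every node
# List the plugins
rabbitmq-plugins list
# Enable the web management plugin
rabbitmq-plugins enable rabbitmq_management
3 Join the cluster
Use master01 as the primary node of the cluster. Run the following commands on slave01 and on slave02 to join each of them to the cluster.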
[root@slave01 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@slave01 ...
[root@slave01 ~]# rabbitmqctl join_cluster rabbit@master01
Clustering node rabbit@slave01 with rabbit@master01
[root@slave01 ~]# rabbitmqctl start_app
Starting node rabbit@slave01 ...
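The same three commands have to be repeated on slave02. As a rough sketch, the sequence can be generated once and piped to each host; the helper name `join_commands` is hypothetical, and the commented loop assumes root SSH access (as used for scp earlier) and that `rabbitmqctl` is on the PATH of non-interactive SSH sessions.

```shell
#!/bin/sh
# Hypothetical helper: prints the stop_app / join_cluster / start_app
# sequence for the given seed node, one command per line.
# It only prints the commands; it does not execute them.
join_commands() {
    seed_node="$1"
    printf '%s\n' \
        "rabbitmqctl stop_app" \
        "rabbitmqctl join_cluster ${seed_node}" \
        "rabbitmqctl start_app"
}

# Example (assumption: SSH as root works and rabbitmqctl is on the PATH):
#   for host in slave01 slave02; do
#       join_commands rabbit@master01 | ssh "root@${host}" sh -e
#   done
```

Running the remote shell with `-e` stops at the first failing command, so `start_app` is not attempted if `join_cluster` fails.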
4 Check the cluster status
[root@master01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@master01 ...
Basics
Cluster name: rabbit@master01
Disk Nodes
rabbit@master01
rabbit@slave01
rabbit@slave02
Running Nodes
rabbit@master01
rabbit@slave01
rabbit@slave02
Versions
rabbit@master01: RabbitMQ 3.11.2 on Erlang 25.1
rabbit@slave01: RabbitMQ 3.11.2 on Erlang 25.1
rabbit@slave02: RabbitMQ 3.11.2 on Erlang 25.1
Maintenance status
Node: rabbit@master01, status: not under maintenance
Node: rabbit@slave01, status: not under maintenance
Node: rabbit@slave02, status: not under maintenance
Alarms
(none)
Network Partitions
(none)
Listeners
Node: rabbit@master01, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@master01, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@master01, interface: [::], port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@slave01, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@slave01, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@slave01, interface: [::], port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@slave02, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@slave02, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@slave02, interface: [::], port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Feature flags
Flag: classic_mirrored_queue_version, state: enabled
Flag: classic_queue_type_delivery_support, state: enabled
Flag: direct_exchange_routing_v2, state: enabled
Flag: drop_unroutable_metric, state: disabled
Flag: empty_basic_get_metric, state: disabled
Flag: feature_flags_v2, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: listener_records_in_ets, state: enabled
Flag: maintenance_mode_status, state: enabled
Flag: quorum_queue, state: enabled
Flag: stream_queue, state: enabled
Flag: stream_single_active_consumer, state: enabled
Flag: tracking_records_in_ets, state: enabled
Flag: user_limits, state: enabled
Flag: virtual_host_metadata, state: enabled
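For a quick scripted check, the "Running Nodes" section of this output can be counted and compared with the expected cluster size. This is a minimal sketch that assumes the plain-text layout shown above (a "Running Nodes" heading followed by one node name per line); the function name `count_running_nodes` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `rabbitmqctl cluster_status` text on stdin
# and prints how many node names are listed under "Running Nodes".
count_running_nodes() {
    awk '
        /^Running Nodes/ { in_section = 1; next }
        # Blank lines inside the section are ignored.
        in_section && /^[[:space:]]*$/ { next }
        # A line consisting of a single name@host token counts as a node.
        in_section && /^[[:space:]]*[^[:space:]]+@[^[:space:]]+[[:space:]]*$/ { n++; next }
        # Any other line (the next heading) ends the section.
        in_section { in_section = 0 }
        END { print n + 0 }
    '
}

# Example: expect 3 running nodes in this cluster.
#   [ "$(rabbitmqctl cluster_status | count_running_nodes)" -eq 3 ] \
#       || echo "cluster is not fully up" >&2
```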
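5 Open the web UI
http://<server address>:15672
Log in with iyuyixyz/iyuyixyz.
IV. Cluster Operations
1 Removing a node from the cluster
# Before removing a node, stop the node that is going to be removed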
[root@slave02 ~]# rabbitmqctl -n rabbit@slave02 stop_app
Stopping rabbit application on node rabbit@slave02 ...
# A member node can also be stopped remotely from master01
[root@master01 ~]# rabbitmqctl -n rabbit@slave01 stop_app
Stopping rabbit application on node rabbit@slave01 ...
# On a node that stays in the cluster, remove the node being taken offline
[root@master01 ~]# rabbitmqctl forget_cluster_node rabbit@slave02
Removing node rabbit@slave02 from the cluster
# Reset the removed node's data and restore its default state (optional)
[root@slave02 ~]# rabbitmqctl reset
Resetting node rabbit@slave02 ...
2 Specifying the node type when joining the cluster (run on the joining node)
rabbitmqctl stop_app
rabbitmqctl join_cluster --ram rabbit@master01
rabbitmqctl start_app
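# --ram joins as a RAM node, which keeps its internal database (cluster metadata) in memory only
# --disc joins as a disc node (the default)
3 Changing the node type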
[root@slave02 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@slave02 ...
[root@slave02 ~]# rabbitmqctl change_cluster_node_type disc
Turning rabbit@slave02 into a disc node
[root@slave02 ~]# rabbitmqctl start_app
Starting node rabbit@slave02 ...
N. Errors
1 ERROR: could not bind to distribution port 25672, it is in use by another node: rabbit@slave01
[root@slave01 ~]# rabbitmq-server
2022-11-08 17:48:11.788563+08:00 [error] <0.132.0>
2022-11-08 17:48:11.788563+08:00 [error] <0.132.0> BOOT FAILED
2022-11-08 17:48:11.788563+08:00 [error] <0.132.0> ===========
2022-11-08 17:48:11.788563+08:00 [error] <0.132.0> ERROR: could not bind to distribution port 25672, it is in use by another node: rabbit@slave01
2022-11-08 17:48:11.788563+08:00 [error] <0.132.0>
BOOT FAILED
===========
ERROR: could not bind to distribution port 25672, it is in use by another node: rabbit@slave01
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> supervisor: {local,rabbit_prelaunch_sup}
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> errorContext: start_error
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> reason: {dist_port_already_used,25672,"rabbit","slave01"}
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> offender: [{pid,undefined},
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> {id,prelaunch},
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> {mfargs,{rabbit_prelaunch,run_prelaunch_first_phase,[]}},
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> {restart_type,transient},
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> {significant,false},
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> {shutdown,5000},
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0> {child_type,worker}]
2022-11-08 17:48:12.852432+08:00 [error] <0.132.0>
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> crasher:
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> initial call: application_master:init/4
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> pid: <0.130.0>
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> registered_name: []
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> exception exit: {{shutdown,
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> {failed_to_start_child,prelaunch,
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> {dist_port_already_used,25672,"rabbit",
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> "slave01"}}},
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> {rabbit_prelaunch_app,start,[normal,[]]}}
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> in function application_master:init/4 (application_master.erl, line 142)
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> ancestors: [<0.129.0>]
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> message_queue_len: 1
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> messages: [{'EXIT',<0.131.0>,normal}]
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> links: [<0.129.0>,<0.44.0>]
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> dictionary: []
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> trap_exit: true
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> status: running
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> heap_size: 376
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> stack_size: 28
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> reductions: 169
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0> neighbours:
2022-11-08 17:48:12.853536+08:00 [error] <0.130.0>
2022-11-08 17:48:12.905594+08:00 [notice] <0.44.0> Application rabbitmq_prelaunch exited with reason: {{shutdown,{failed_to_start_child,prelaunch,{dist_port_already_used,25672,"rabbit","slave01"}}},{rabbit_prelaunch_app,start,[normal,[]]}}
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{dist_port_already_used,25672,\"rabbit\",\"slave01\"}}},{rabbit_prelaunch_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{dist_port_already_used,25672,"rabbit","slave01"}}},{rabbit_prelaunch_app,start,[normal,[]]}}})
Crash dump is being written to: erl_crash.dump...done
Solution: kill the process that is already holding the port, or reboot the machine.
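The error indicates that a node named rabbit@slave01 is already running on this host and owns port 25672. One way to find the process to kill is to filter the output of `ss -ltnp`. The sketch below assumes the iproute2 `ss` column layout (local address in the fourth column, process details in a `users:(...)` field); the function name `pids_on_port` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltnp` output on stdin and prints the
# PID(s) of the process(es) listening on the given TCP port.
pids_on_port() {
    port="$1"
    awk -v port="$port" '
        # Column 4 is "Local Address:Port"; match the exact port suffix.
        $4 ~ (":" port "$") {
            s = $0
            # Extract every pid=NNN entry from the users:(...) field.
            while (match(s, /pid=[0-9]+/)) {
                print substr(s, RSTART + 4, RLENGTH - 4)
                s = substr(s, RSTART + RLENGTH)
            }
        }
    ' | sort -u
}

# Example (run as root so that ss can show process information):
#   ss -ltnp | pids_on_port 25672
#   kill <pid printed above>
```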
2 Leftover RabbitMQ cluster data prevents the service from starting.
error:{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}}
[root@slave02 ~]# rabbitmq-server
2022-11-08 23:20:19.122434+08:00 [notice] <0.44.0> Application syslog exited with reason: stopped
2022-11-08 23:20:19.131099+08:00 [notice] <0.229.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
BOOT FAILED
===========
Exception during startup:
error:{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}}
rabbit:run_prelaunch_second_phase/0, line 380
rabbit:start/2, line 849
application_master:start_it_old/4, line 293
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},{rabbit,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},{rabbit,start,[normal,[]]}}})
Crash dump is being written to: /opt/rabbitmq_server-3.11.2/var/log/rabbitmq/erl_crash.dump...done
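Solution: move the mnesia directory to /tmp, or delete it. Either way the node starts with an empty database and has to rejoin the cluster.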
[root@slave02 mnesia]# pwd
/opt/rabbitmq_server-3.11.2/var/lib/rabbitmq/mnesia
[root@slave02 mnesia]# ll
total 12
drwxr-xr-x 5 root root 4096 Nov 8 23:32 rabbit@slave02
-rw-r--r-- 1 root root 358 Nov 8 23:32 rabbit@slave02-feature_flags
-rw-r--r-- 1 root root 4 Nov 8 23:31 rabbit@slave02.pid
drwxr-xr-x 2 root root 6 Nov 8 23:31 rabbit@slave02-plugins-expand
[root@slave02 mnesia]# rm -rf *
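Moving the directory aside instead of deleting it keeps the old data available for inspection. The following is a minimal sketch of that variant; the function name `backup_mnesia` and the `mnesia-backup-<timestamp>` naming are assumptions made for illustration, and the node should be stopped before running it.

```shell
#!/bin/sh
# Hypothetical helper: moves a mnesia directory to a timestamped backup
# under the given destination root (default /tmp) and prints the new path.
backup_mnesia() {
    src="$1"
    dest_root="${2:-/tmp}"
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    dest="${dest_root}/mnesia-backup-$(date +%Y%m%d%H%M%S)"
    mv "$src" "$dest" && echo "$dest"
}

# Example, using the path shown in the transcript above:
#   backup_mnesia /opt/rabbitmq_server-3.11.2/var/lib/rabbitmq/mnesia /tmp
```

Once the node is confirmed healthy and back in the cluster, the backup can be removed by hand.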