corosync v1 + pacemaker High-Availability Cluster Deployment (Part 1): Base Installation
corosync v1 + pacemaker
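Corosync: an open cluster engine project that grew out of OpenAIS once it reached the Wilson release. It provides heartbeat messaging and membership management.
Pacemaker: a cluster resource manager. It uses the messaging and membership capabilities of the underlying cluster infrastructure (OpenAIS, heartbeat, or corosync) to detect and recover from node-level and resource-level failures, maximizing the availability of cluster services (also called resources).
In short, Corosync forms the cluster and Pacemaker manages the resources.
This lab uses a Linux 6.8 system and a Filesystem resource server, with NA1 as node 1, NA2 as node 2, and the VIP 192.168.94.222.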
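Contents:
1. Basic configuration
2. Software installation (corosync, pacemaker)
3. Configuring the corosync configuration file
4. Starting the corosync service
5. Checking the logs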
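1. Basic configuration
1. Change the hostnames and set up hostname resolution
2. Establish SSH mutual trust
3. Synchronize time with NTP
These steps are not demonstrated here; see
HeartBeat Basic Configuration (Web Service Two-Node Hot Standby)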
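The linked post is not reproduced here. As a rough sketch of the name-resolution step only, the helper below appends a hosts entry when it is not already present. The function name `add_host_entry` is made up for illustration. The hostnames and NA1's address 192.168.94.129 are taken from the log output in section 5, while 192.168.94.130 for NA2 is purely an assumed placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not from the referenced post): append "IP FQDN SHORT"
# to a hosts file only if the FQDN is not already listed there.
add_host_entry() {
    hosts_file=$1; ip=$2; fqdn=$3; short=$4
    # -q: quiet, -w: whole-word match, -F: fixed string (dots are literal)
    if ! grep -qwF "$fqdn" "$hosts_file" 2>/dev/null; then
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
    fi
}

# Example, run as root on both nodes.
# 192.168.94.130 is an ASSUMED address for NA2; substitute the real one.
# add_host_entry /etc/hosts 192.168.94.129 na1.server.com na1
# add_host_entry /etc/hosts 192.168.94.130 na2.server.com na2
```

Because the helper checks before appending, running it twice leaves a single entry per host.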
2. Software installation (corosync, pacemaker)
I install with yum using the EPEL 6 repository. The installed versions are:
corosync-1.4.7-6.el6.x86_64
pacemaker-1.1.18-3.el6.x86_64
NA1 & NA2
Install the packages with yum:
yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
yum install -y 'corosync*'
yum install -y 'pacemaker*'
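To confirm on each node that the packages actually landed, something like the following could be used. It is only a sketch: `check_installed` is a hypothetical helper name, and it relies on nothing beyond `rpm -q`, which prints the full package name and exits non-zero when a package is absent.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named package is installed.
# Returns 0 if all are present, 1 if at least one is missing.
check_installed() {
    missing=0
    for pkg in "$@"; do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "installed: $(rpm -q "$pkg")"
        else
            echo "missing: $pkg"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_installed corosync pacemaker
```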
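3. Configuring the corosync configuration file
NA1
cd /etc/corosync/
There is no configuration file by default. Copy the sample configuration, stripping its comment lines:
grep -v '#' corosync.conf.example > corosync.conf
1. Edit the configuration file
vim /etc/corosync/corosync.conf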
compatibility: whitetank

totem {
    version: 2
    # Authentication; we enable it
    secauth: on
    threads: 0
    # Interface used to pass cluster messages
    interface {
        ringnumber: 0
        # Network of the heartbeat link
        bindnetaddr: 192.168.94.0
        # Heartbeat messages are sent by multicast
        mcastaddr: 239.255.1.1
        mcastport: 5405
        ttl: 1
    }
}

# Logging
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    # The /var/log/cluster directory does not exist by default and must be created manually
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: off
    # Record timestamps
    timestamp: on
    # Subsystem settings
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

# Define the service: use pacemaker
service {
    ver: 0
    name: pacemaker
}

# User and group corosync runs as; here the administrator, root in the root group
aisexec {
    user: root
    group: root
}
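The comment in the logging section notes that the log directory has to be created by hand. One way to do that without retyping the path is sketched below. `ensure_log_dir` is a hypothetical helper, and it assumes the setting is written as `logfile:` followed by a space and a path without spaces.

```shell
#!/bin/sh
# Hypothetical helper: read the "logfile:" value from a corosync.conf-style
# file and create its parent directory if it is missing.
ensure_log_dir() {
    conf=$1
    # First line whose first field is exactly "logfile:"; this skips
    # "to_logfile:" and comment lines. The second field is the path.
    logfile=$(awk '$1 == "logfile:" { print $2; exit }' "$conf")
    [ -n "$logfile" ] || { echo "no logfile setting in $conf" >&2; return 1; }
    mkdir -p "$(dirname "$logfile")"
}

# Example, as root: ensure_log_dir /etc/corosync/corosync.conf
```

The plain equivalent for this configuration is `mkdir -p /var/log/cluster`.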
2. Generate the key file
corosync-keygen
By default corosync-keygen reads from the /dev/random device. If there is not enough activity on the system to supply entropy, it can wait for a long time.
After running corosync-keygen, open another graphical terminal window and type random keys; the key is then generated quickly.
[root@na1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 48).
Press keys on your keyboard to generate entropy (bits = 1008).
Writing corosync key to /etc/corosync/authkey.
This writes the authkey key file to /etc/corosync, which is the current directory here.
[root@na1 corosync]# ls
authkey        corosync.conf.example       service.d
corosync.conf  corosync.conf.example.udpu  uidgid.d
[root@na1 corosync]#
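A quick sanity check of the generated key might look like the sketch below. The 128-byte size follows from the 1024 bits reported by corosync-keygen. The owner-read-only mode 400 is what I would expect corosync-keygen to set, but treat that as an assumption. `check_authkey` is a hypothetical helper and uses GNU `stat` as shipped on Linux.

```shell
#!/bin/sh
# Hypothetical helper: check that a key file is 128 bytes (1024 bits) and
# has mode 400 (ASSUMED expected mode: readable by the owner only).
check_authkey() {
    key=$1
    size=$(stat -c '%s' "$key") || return 1   # size in bytes
    mode=$(stat -c '%a' "$key") || return 1   # permissions in octal
    [ "$size" -eq 128 ] || { echo "unexpected size: $size bytes" >&2; return 1; }
    [ "$mode" = "400" ] || { echo "unexpected mode: $mode" >&2; return 1; }
    echo "ok: $key"
}

# Example, as root: check_authkey /etc/corosync/authkey
```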
NA2
With NA1 configured, copy its key file and configuration file to NA2.
[root@na1 corosync]# scp -p authkey corosync.conf na2:/etc/corosync/
authkey                                       100%  128     0.1KB/s   00:00
corosync.conf                                 100%  447     0.4KB/s   00:00
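Since both nodes must hold the same key and configuration, it can be worth comparing checksums after the copy. The helpers below are a minimal sketch with made-up names (`digest_of`, `same_digest`); the remote half relies on the SSH trust set up in section 1.

```shell
#!/bin/sh
# Hypothetical helper: print only the MD5 digest of a file.
digest_of() {
    md5sum "$1" | awk '{ print $1 }'
}

# Hypothetical helper: succeed only if both digests are non-empty and equal.
same_digest() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example, run as root on NA1 from /etc/corosync:
# remote=$(ssh na2 md5sum /etc/corosync/authkey | awk '{ print $1 }')
# same_digest "$(digest_of authkey)" "$remote" && echo "authkey matches on both nodes"
```

The same comparison applies to corosync.conf.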
4. Starting the corosync service
NA1 & NA2
The [确定] in the output is the Chinese-locale rendering of [ OK ].
[root@na1 corosync]# service corosync start
Starting Corosync Cluster Engine (corosync): [确定]
[root@na1 corosync]# ssh na2 'service corosync start'
Starting Corosync Cluster Engine (corosync): [确定]
[root@na1 corosync]#
5. Checking the logs
Check that the corosync engine started correctly
[root@na1 corosync]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/messages
May 24 19:11:18 study corosync[5935]: [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
May 24 19:11:18 study corosync[5935]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
[root@na1 corosync]#
Check that the initial membership notifications were sent
[root@na1 corosync]# grep TOTEM /var/log/messages
May 24 19:11:18 study corosync[5935]: [TOTEM ] Initializing transport (UDP/IP Multicast).
May 24 19:11:18 study corosync[5935]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 24 19:11:18 study corosync[5935]: [TOTEM ] The network interface [192.168.94.129] is now up.
May 24 19:11:18 study corosync[5935]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 24 19:11:40 study corosync[5935]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
[root@na1 corosync]#
Check whether any errors occurred during startup
[root@na1 corosync]# grep ERROR: /var/log/cluster/corosync.log
May 24 19:11:18 corosync [pcmk ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
May 24 19:11:18 corosync [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
[root@na1 corosync]#
These errors say that Pacemaker will soon no longer run as a corosync plugin, and they therefore recommend CMAN as the cluster infrastructure service. They can be safely ignored here.
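Because those two plugin messages are expected on every start, it may help to filter them out so that any other error stands out. A minimal sketch, with `unexpected_errors` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print ERROR lines from a corosync log, leaving out the
# known plugin-deprecation messages logged by process_ais_conf.
unexpected_errors() {
    # "|| true": an empty result (nothing unexpected) is not a failure
    grep 'ERROR:' "$1" | grep -v 'ERROR: process_ais_conf:' || true
}

# Example: unexpected_errors /var/log/cluster/corosync.log
```

With the log shown above, this would print nothing.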
Check that pacemaker started correctly
[root@na1 corosync]# grep pcmk_startup /var/log/messages
May 24 19:11:18 study corosync[5935]: [pcmk ] info: pcmk_startup: CRM: Initialized
May 24 19:11:18 study corosync[5935]: [pcmk ] Logging: Initialized pcmk_startup
May 24 19:11:18 study corosync[5935]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 24 19:11:18 study corosync[5935]: [pcmk ] info: pcmk_startup: Service: 9
May 24 19:11:18 study corosync[5935]: [pcmk ] info: pcmk_startup: Local hostname: na1.server.com
[root@na1 corosync]#
Check the cluster status
2 nodes online, 0 resources
[root@na1 corosync]# crm_mon --one-shot
Stack: classic openais (with plugin)
Current DC: na1.server.com (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Sun May 24 19:24:13 2020
Last change: Sun May 24 19:07:27 2020 by hacluster via crmd on na1.server.com
2 nodes configured (2 expected votes)
0 resources configured
Online: [ na1.server.com na2.server.com ]
No active resources
[root@na1 corosync]#
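For a scripted version of this check, the number of online nodes can be pulled out of the crm_mon output. This is only a sketch: `count_online` is a hypothetical helper, and it assumes the line has the form `Online: [ node1 node2 ]` at the start of a line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `crm_mon --one-shot` output on stdin and print
# the number of node names on the "Online: [ ... ]" line (0 if there is none).
count_online() {
    awk '/^Online:/ {
        n = 0
        # Fields look like: Online: [ na1 na2 ]
        # Skip the label (field 1) and the two brackets.
        for (i = 2; i <= NF; i++) if ($i != "[" && $i != "]") n++
        print n
        found = 1
        exit
    }
    END { if (!found) print 0 }'
}

# Example: [ "$(crm_mon --one-shot | count_online)" -eq 2 ] && echo "both nodes online"
```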
That completes the environment setup. The next post covers resource management with pacemaker.