KingbaseES Cluster Typical Case --- One-Click Cluster Deployment via the securecmdd Tool
Case description:
Deploying a KingbaseES V8R6 cluster normally requires passwordless ssh trust between nodes for the kingbase and root users. If the production environment disables root ssh login or forbids ssh trust between nodes, deployment over ssh fails. In that case, the one-click deployment (GUI tool or script) can instead use the database's bundled securecmdd tool to establish trusted connections between nodes and carry out the cluster deployment.
Database version:
KingbaseES V8R6
Procedure:
- Deploy the securecmdd tool on all cluster nodes and test the securecmdd trusted connections between nodes.
- Edit the deployment script configuration and distribute/deploy the cluster over securecmdd.
- After deployment, check the cluster status and test cluster start/stop and switchover.
I. Deploy the securecmdd tool on the nodes (all nodes)
sys_securecmdd is a tool bundled with the cluster; cluster monitoring and management operations execute commands securely through it.
sys_securecmdd consists mainly of the following files:
- sys_securecmdd: the server binary. Every cluster node runs a sys_securecmdd process, which listens on port 8890 by default, accepts connections from sys_securecmd, and executes the requested commands.
- sys_securecmd: the client binary. The cluster sends commands to the server for execution through sys_securecmd.
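Conceptually the client/server pair works like ssh: the client contacts the server on port 8890 and hands it a command line to run. A minimal dry-run sketch (the `run_remote` helper and the node address are illustrative assumptions, not part of the product):

```shell
# Hypothetical wrapper showing the shape of a sys_securecmd invocation.
# It only prints the command line that would be run (dry run); in a real
# cluster the cluster tools execute: sys_securecmd root@<node> <command>
run_remote() {
    local node="$1"; shift
    echo sys_securecmd "root@${node}" "$@"
}

run_remote 192.168.8.201 hostname
# → sys_securecmd root@192.168.8.201 hostname
```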
1. Check the database installation package (which bundles the securecmdd tool)
[kingbase@node1 zip]$ pwd
/opt/Kingbase/ES/V8R6_054/ClientTools/guitools/DeployTools/zip
[kingbase@node1 zip]$ ls -lh
total 341M
-rw-rw-r--. 1 kingbase kingbase 338M Apr 7 16:18 db.zip
-rw-rw-r--. 1 kingbase kingbase 9.7K Apr 7 16:18 install.conf
-rw-rw-r--. 1 kingbase kingbase 2.1M Apr 7 16:18 securecmdd.zip
......
2. Copy securecmdd.zip to /home/kingbase/cluster (directory of your choice)
# Unzip the package
[kingbase@node1 zip]$ cp securecmdd.zip /home/kingbase/cluster/
[kingbase@node1 cluster]$ unzip securecmdd.zip
[root@node2 ~]# cd /home/kingbase/cluster/securecmdd/bin
# List the executable files
[root@node2 bin]# ls -lh
total 2.0M
-rwxr-xr-x 1 kingbase kingbase 34K Apr 7 16:18 sys_HAscmdd.sh
-rwxr-xr-x 1 kingbase kingbase 856K Apr 7 16:18 sys_securecmd
-rwxr-xr-x 1 kingbase kingbase 938K Apr 7 16:18 sys_securecmdd
-rwxr-xr-x 1 kingbase kingbase 149K Apr 7 16:18 sys_secureftp
3. Initialize securecmdd
[root@node2 bin]# sh sys_HAscmdd.sh init
successfully initialized the sys_securecmdd, please use "sys_HAscmdd.sh start" to start the sys_securecmdd
4. Start the securecmdd service
As shown below, the securecmdd service started successfully:
[root@node2 bin]# sh sys_HAscmdd.sh start
[root@node2 bin]# ps -ef |grep secure
root 30443 1 0 15:23 ? 00:00:00 sys_securecmdd: /home/kingbase/cluster/securecmdd/bin/sys_securecmdd -f /etc/.kes/securecmdd_config [listener] 0 of 128-256 startups
[root@node2 bin]# netstat -antlp |grep 8890
tcp 0 0 0.0.0.0:8890 0.0.0.0:* LISTEN 30443/sys_securecmd
tcp6 0 0 :::8890 :::* LISTEN 30443/sys_securecmd
# Test the securecmdd connection
[root@node102 bin]# ./sys_securecmd root@127.0.0.1
......
--- If the client can connect to the server without a password, the securecmdd deployment succeeded.
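The listening check shown above can be scripted when there are many nodes. A small sketch of the check logic, fed with the same kind of `netstat -antlp` output as above (the embedded sample line is just test data; on a real node you would pipe the live `netstat` output into the function):

```shell
# Report whether a netstat-style listing contains a LISTEN socket on the
# given port. Reads the listing from stdin and prints "yes" or "no".
listens_on() {
    local port="$1"
    grep -Eq "[:.]${port}[[:space:]].*LISTEN" && echo yes || echo no
}

# Sample line mirroring the netstat output shown earlier in this section.
sample='tcp 0 0 0.0.0.0:8890 0.0.0.0:* LISTEN 30443/sys_securecmd'
printf '%s\n' "$sample" | listens_on 8890
# → yes
```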
II. Preparation before cluster deployment
1. After installing the database software, obtain the deployment files from the corresponding directory
[kingbase@node1 zip]$ pwd
/opt/Kingbase/ES/V8R6_054/ClientTools/guitools/DeployTools/zip
[kingbase@node1 zip]$ ls -lh
-rwxrwxr-x 1 kingbase kingbase 176K Sep 2 2023 cluster_install.sh
-rw-rw-r-- 1 kingbase kingbase 308M Sep 2 2023 db.zip
-rw-rw-r-- 1 kingbase kingbase 15K Sep 2 2023 install.conf
-rw-rw-r-- 1 kingbase kingbase 2.5M Sep 2 2023 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 7.3K Sep 2 2023 trust_cluster.sh
2. Gather all required deployment files, including the license, into one directory
[kingbase@node1 r6_install]$ ls -lh
-rwxrwxr-x 1 kingbase kingbase 176K Sep 2 2023 cluster_install.sh
-rw-rw-r-- 1 kingbase kingbase 308M Sep 2 2023 db.zip
-rw-rw-r-- 1 kingbase kingbase 15K Sep 2 2023 install.conf
-rw-rw-r-- 1 kingbase kingbase 2.5M Sep 2 2023 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 7.3K Sep 2 2023 trust_cluster.sh
-rw-r--r--. 1 kingbase kingbase 3.4K Mar 1 2023 license.dat
3. Create the cluster deployment directory (all nodes)
[kingbase@node1 r6_install]$ mkdir -p /home/kingbase/cluster/R6P/R6H/kingbase
4. Upload db.zip to the cluster installation directory on every node and unzip it (all nodes)
[kingbase@node2 kingbase]$ pwd
/home/kingbase/cluster/R6P/R6H/kingbase
[kingbase@node2 kingbase]$ unzip db.zip
[kingbase@node2 kingbase]$ ls -lh
total 338M
drwxr-xr-x 2 kingbase kingbase 4.0K Apr 7 16:16 bin
-rw-rw-r-- 1 kingbase kingbase 338M May 23 16:57 db.zip
drwxrwxr-x 5 kingbase kingbase 8.0K Apr 7 16:17 lib
drwxrwxr-x 8 kingbase kingbase 4.0K Apr 7 16:17 share
5. Upload the license file to the bin directory of the cluster installation on every node (all nodes)
[kingbase@node1 r6_install]$ cp license.dat /home/kingbase/cluster/R6P/R6H/kingbase/bin/license.dat
[kingbase@node1 r6_install]$ scp license.dat node2:/home/kingbase/cluster/R6P/R6H/kingbase/bin/license.dat
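With more than two nodes, the staging in steps 4 and 5 is worth scripting. A dry-run sketch that only prints the copy commands (the node list and destination are this walkthrough's values; swap `echo` for the real transfer command used in your environment, e.g. `scp` as the kingbase user):

```shell
# Print the commands that would stage db.zip and license.dat on each
# remote node. Dry run only: remove the `echo` to actually execute.
NODES="node2"                                   # remaining nodes besides the local one
DEST=/home/kingbase/cluster/R6P/R6H/kingbase    # cluster installation directory
for node in $NODES; do
    echo scp db.zip "${node}:${DEST}/db.zip"
    echo scp license.dat "${node}:${DEST}/bin/license.dat"
done
```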
III. Edit the install.conf configuration file
The script deployment configuration file is shown below:
[kingbase@node1 r6_install]$ cat install.conf |grep -v ^$|grep -v ^#
[install]
on_bmj=0
all_ip=(192.168.8.200 192.168.8.201)
witness_ip=""
production_ip=()
local_disaster_recovery_ip=()
remote_disaster_recovery_ip=()
install_dir="/home/kingbase/cluster/R6P/R6H"
zip_package="/home/kingbase/r6_install/db.zip"
license_file=(license.dat)
db_user="system" # the user name of database
db_port="54321" # the port of database, defaults is 54321
db_mode="oracle" # database mode: pg, oracle
db_auth="scram-sha-256" # database authority: scram-sha-256, md5, default is scram-sha-256
db_case_sensitive="yes" # database case sensitive settings: yes, no. default is yes - case sensitive; no - case insensitive (NOTE. cannot set to 'no' when db_mode="pg").
trusted_servers="192.168.8.1"
data_directory="/home/kingbase/cluster/R6P/R6H/kingbase/data"
virtual_ip="192.168.8.240/24"
net_device=(enp0s3 enp0s3)
net_device_ip=(192.168.8.200 192.168.8.201)
ipaddr_path="/sbin"
arping_path="/opt/Kingbase/ES/V8R6_054/Server/bin/"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
deploy_by_sshd=0 # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1 # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
reconnect_attempts="10" # the number of retries in the event of an error
reconnect_interval="6" # retry interval
recovery="standby" # the way of cluster recovery: standby/automatic/manual
ssh_port="22" # the port of ssh, default is 22
scmd_port="8890" # the port of sys_securecmdd, default is 8890
auto_cluster_recovery_level='1'
use_check_disk='off'
synchronous='quorum'
......
Tips:
Setting deploy_by_sshd=0 and use_scmd=1 selects securecmdd rather than ssh for package distribution and cluster operation:
deploy_by_sshd=0 # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1 # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
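Before running the deployment script it is cheap to verify that these two parameters are set as intended. A small sketch (the `check_conf` helper is illustrative; the temporary file stands in for the real install.conf, whose lines also carry trailing comments as shown above):

```shell
# Sanity-check that an install.conf selects securecmdd instead of sshd.
# Pass the path to your install.conf; prints "config OK ..." or "config ERROR ...".
check_conf() {
    local conf="$1" sshd scmd
    # Extract the numeric values, ignoring any trailing "# ..." comment.
    sshd=$(sed -n 's/^deploy_by_sshd=\([0-9]*\).*/\1/p' "$conf")
    scmd=$(sed -n 's/^use_scmd=\([0-9]*\).*/\1/p' "$conf")
    if [ "$sshd" = 0 ] && [ "$scmd" = 1 ]; then
        echo "config OK: deploy and run over sys_securecmdd"
    else
        echo "config ERROR: set deploy_by_sshd=0 and use_scmd=1"
    fi
}

# Demo against a stand-in file; point check_conf at your real install.conf.
CONF=$(mktemp)
printf 'deploy_by_sshd=0 # comment\nuse_scmd=1 # comment\n' > "$CONF"
check_conf "$CONF"
# → config OK: deploy and run over sys_securecmdd
rm -f "$CONF"
```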
IV. Run the deployment script
[kingbase@node1 r6_install]$ sh cluster_install.sh
[CONFIG_CHECK] will deploy the cluster of DG
[CONFIG_CHECK] check if the virtual ip "192.168.8.240" already exist ...
[CONFIG_CHECK] there is no "192.168.8.240" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "192.168.8.200" ..... OK
.......
2022-05-23 17:12:02 repmgrd on "[192.168.8.201]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 27062 | no | n/a
2 | node2 | standby | running | node1 | running | 16079 | no | 1 second(s) ago
[2022-05-23 17:12:15] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6P/R6H/kingbase/log/kbha.log"
[2022-05-23 17:12:29] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6P/R6H/kingbase/log/kbha.log"
2022-05-23 17:12:30 Done.
[INSTALL] start up the whole cluster ... OK
V. Check the cluster status
As shown below, the cluster was deployed successfully:
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | host=192.168.8.200 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | standby | running | node1 | default | 100 | 1 | host=192.168.8.201 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
VI. Summary
1. When the production environment forbids root ssh login, the cluster can be deployed with the manual script, but the securecmdd service must be deployed and started on all nodes in advance.
2. Then configure install.conf to select securecmdd for deployment.
3. Testing after deployment shows that, even with root unable to log in over ssh, cluster switchover, startup, and shutdown are unaffected.