KingbaseES Cluster Typical Case --- One-Click Cluster Deployment with the securecmdd Tool

Case description:
When deploying a KingbaseES V8R6 cluster, passwordless ssh trust between nodes must normally be established for both the kingbase and root users. If the production environment disables ssh login for root, or forbids ssh trust between nodes, deployment over ssh will fail. When deploying with the graphical tool or the one-click script, the securecmdd tool shipped with the database can be used instead to establish trusted connections between nodes and carry out the cluster deployment.

Database version:
KingbaseES V8R6

Procedure:

  1. Deploy the securecmdd tool on the cluster nodes and test the trusted securecmdd connections between nodes.
  2. Edit the deployment configuration and deploy the cluster, distributing the packages through securecmdd.
  3. After deployment, check the cluster status and run start/stop, switchover, and other tests.

I. Deploy the securecmdd tool on the nodes (all nodes)

sys_securecmdd is a tool bundled with the cluster; cluster monitoring and management execute commands securely through sys_securecmdd.
sys_securecmdd mainly consists of the following files:

  • sys_securecmdd: the server-side binary. A sys_securecmdd process runs on every node of the cluster; it listens on port 8890 by default, accepts connections from sys_securecmd, and executes the requested commands.
  • sys_securecmd: the client-side binary. The cluster sends commands to the server side through sys_securecmd for execution (see the usage sketch after this list).
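
For reference, the client follows an ssh-like calling convention. A minimal sketch (the target address is a placeholder, and passing a command as the trailing argument is an assumption based on the tool's ssh-like behavior):

# run a command on a node through its sys_securecmdd service (default port 8890)
[root@node1 ~]# /home/kingbase/cluster/securecmdd/bin/sys_securecmd root@<node_ip> hostname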

1. Check the database software installation package (the securecmdd tool is bundled with it)

[kingbase@node1 zip]$ pwd
/opt/Kingbase/ES/V8R6_054/ClientTools/guitools/DeployTools/zip

[kingbase@node1 zip]$ ls -lh
total 341M
-rw-rw-r--. 1 kingbase kingbase 338M Apr  7 16:18 db.zip
-rw-rw-r--. 1 kingbase kingbase 9.7K Apr  7 16:18 install.conf
-rw-rw-r--. 1 kingbase kingbase 2.1M Apr  7 16:18 securecmdd.zip
......

2. Copy securecmdd.zip to /home/kingbase/cluster (any directory can be used)

# Copy and unpack the package
[kingbase@node1 zip]$ cp securecmdd.zip /home/kingbase/cluster/
[kingbase@node1 cluster]$ unzip securecmdd.zip 
[root@node2 ~]# cd /home/kingbase/cluster/securecmdd/bin

# List the executable files
[root@node2 bin]# ls -lh
total 2.0M
-rwxr-xr-x 1 kingbase kingbase  34K Apr  7 16:18 sys_HAscmdd.sh
-rwxr-xr-x 1 kingbase kingbase 856K Apr  7 16:18 sys_securecmd
-rwxr-xr-x 1 kingbase kingbase 938K Apr  7 16:18 sys_securecmdd
-rwxr-xr-x 1 kingbase kingbase 149K Apr  7 16:18 sys_secureftp
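
Since the tool has to be present on all nodes, the copy and unzip above, as well as the init and start in the next two steps, must be repeated on every host. A hedged sketch of distributing the package from node1, assuming the kingbase user can still copy files between nodes (as the scp used later for the license suggests); init and start are then run locally on each node as root:

# copy the package to the peer node (hostname is illustrative)
[kingbase@node1 cluster]$ scp securecmdd.zip node2:/home/kingbase/cluster/
# then unpack it on node2 itself
[kingbase@node2 cluster]$ unzip securecmdd.zip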

3. Initialize securecmdd

[root@node2 bin]# sh sys_HAscmdd.sh  init
successfully initialized the sys_securecmdd, please use "sys_HAscmdd.sh start" to start the sys_securecmdd

4. Start the securecmdd service

As shown below, the securecmdd service starts successfully:

[root@node2 bin]# sh sys_HAscmdd.sh start
[root@node2 bin]# ps -ef |grep secure
root     30443     1  0 15:23 ?        00:00:00 sys_securecmdd: /home/kingbase/cluster/securecmdd/bin/sys_securecmdd -f /etc/.kes/securecmdd_config [listener] 0 of 128-256 startups

[root@node2 bin]# netstat -antlp |grep 8890
tcp        0      0 0.0.0.0:8890            0.0.0.0:*               LISTEN      30443/sys_securecmd 
tcp6       0      0 :::8890                 :::*                    LISTEN      30443/sys_securecmd 

# Test the securecmdd connection

[root@node102 bin]# ./sys_securecmd  root@127.0.0.1 
......

--- If the client can connect to the server without a password, securecmdd has been deployed successfully.
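
The same test can also be run between nodes rather than against the loopback address. A hedged sketch, assuming the cluster IPs used in the configuration below (192.168.8.200/201) and that the service is already running on the peer:

# from node1, connect to node2's sys_securecmdd and run a command without a password
[root@node1 bin]# ./sys_securecmd root@192.168.8.201 hostname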

II. Preparation before cluster deployment

1. After installing the database software, obtain the deployment files from the corresponding directory

[kingbase@node1 zip]$ pwd
/opt/Kingbase/ES/V8R6_054/ClientTools/guitools/DeployTools/zip

[kingbase@node1 zip]$ ls -lh

-rwxrwxr-x 1 kingbase kingbase 176K Sep  2  2023 cluster_install.sh
-rw-rw-r-- 1 kingbase kingbase 308M Sep  2  2023 db.zip
-rw-rw-r-- 1 kingbase kingbase  15K Sep  2  2023 install.conf
-rw-rw-r-- 1 kingbase kingbase 2.5M Sep  2  2023 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 7.3K Sep  2  2023 trust_cluster.sh

2. Place all the required deployment files, including the license, in one directory

[kingbase@node1 r6_install]$ ls -lh

-rwxrwxr-x 1 kingbase kingbase 176K Sep  2  2023 cluster_install.sh
-rw-rw-r-- 1 kingbase kingbase 308M Sep  2  2023 db.zip
-rw-rw-r-- 1 kingbase kingbase  15K Sep  2  2023 install.conf
-rw-rw-r-- 1 kingbase kingbase 2.5M Sep  2  2023 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 7.3K Sep  2  2023 trust_cluster.sh
-rw-r--r--. 1 kingbase kingbase 3.4K Mar  1  2023 license.dat

3. Create the cluster deployment directory (all nodes)

[kingbase@node1 r6_install]$ mkdir -p /home/kingbase/cluster/R6P/R6H/kingbase

4. Upload the db.zip package to the cluster installation directory on every node and unpack it (all nodes)

[kingbase@node2 kingbase]$ pwd
/home/kingbase/cluster/R6P/R6H/kingbase

[kingbase@node2 kingbase]$ unzip db.zip

[kingbase@node2 kingbase]$ ls -lh
total 338M
drwxr-xr-x 2 kingbase kingbase 4.0K Apr  7 16:16 bin
-rw-rw-r-- 1 kingbase kingbase 338M May 23 16:57 db.zip
drwxrwxr-x 5 kingbase kingbase 8.0K Apr  7 16:17 lib
drwxrwxr-x 8 kingbase kingbase 4.0K Apr  7 16:17 share

5. Upload the license file to the bin directory under the cluster installation directory on every node (all nodes)

[kingbase@node1 r6_install]$ cp license.dat /home/kingbase/cluster/R6P/R6H/kingbase/bin/license.dat
[kingbase@node1 r6_install]$ scp license.dat node2:/home/kingbase/cluster/R6P/R6H/kingbase/bin/license.dat
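
When the cluster has more nodes, the distribution of db.zip and license.dat can be looped over the remote hosts. A hedged sketch, assuming the kingbase user can scp between nodes as in the commands above (hostnames other than node2 would be illustrative):

# distribute the package and the license to every remote node's installation directory
for h in node2; do
  scp db.zip      $h:/home/kingbase/cluster/R6P/R6H/kingbase/
  scp license.dat $h:/home/kingbase/cluster/R6P/R6H/kingbase/bin/
done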

III. Edit the install.conf configuration file

The configuration file used by the deployment script is shown below:

[kingbase@node1 r6_install]$ cat install.conf |grep -v ^$|grep -v ^#
[install]
on_bmj=0
all_ip=(192.168.8.200 192.168.8.201)
witness_ip=""
production_ip=()
local_disaster_recovery_ip=()
remote_disaster_recovery_ip=()
install_dir="/home/kingbase/cluster/R6P/R6H"
zip_package="/home/kingbase/r6_install/db.zip"
license_file=(license.dat)
db_user="system"                 # the user name of database
db_port="54321"                  # the port of database, defaults is 54321
db_mode="oracle"                 # database mode: pg, oracle
db_auth="scram-sha-256"          # database authority: scram-sha-256, md5, default is scram-sha-256
db_case_sensitive="yes"          # database case sensitive settings: yes, no. default is yes - case sensitive; no - case insensitive (NOTE. cannot set to 'no' when db_mode="pg").
trusted_servers="192.168.8.1"
data_directory="/home/kingbase/cluster/R6P/R6H/kingbase/data"
virtual_ip="192.168.8.240/24"
net_device=(enp0s3 enp0s3)
net_device_ip=(192.168.8.200 192.168.8.201)
ipaddr_path="/sbin"
arping_path="/opt/Kingbase/ES/V8R6_054/Server/bin/"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
deploy_by_sshd=0                # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1                       # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
reconnect_attempts="10"          # the number of retries in the event of an error
reconnect_interval="6"           # retry interval
recovery="standby"               # the way of cluster recovery: standby/automatic/manual
ssh_port="22"                    # the port of ssh, default is 22
scmd_port="8890"                 # the port of sys_securecmdd, default is 8890
auto_cluster_recovery_level='1'
use_check_disk='off'
synchronous='quorum'
......

Tips:
Setting the parameters deploy_by_sshd=0 and use_scmd=1 makes the deployment distribute the packages through securecmdd instead of ssh.

deploy_by_sshd=0                # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1                       # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
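
Before running the deployment script, these two values can be double-checked in install.conf with a plain grep (nothing specific to the cluster tooling is assumed here):

[kingbase@node1 r6_install]$ grep -E '^(deploy_by_sshd|use_scmd)' install.conf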

IV. Run the deployment script

[kingbase@node1 r6_install]$ sh cluster_install.sh
[CONFIG_CHECK] will deploy the cluster of DG
[CONFIG_CHECK] check if the virtual ip "192.168.8.240" already exist ...
[CONFIG_CHECK] there is no "192.168.8.240" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "192.168.8.200" ..... OK
.......
2022-05-23 17:12:02 repmgrd on "[192.168.8.201]" start success.
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 27062 | no      | n/a                
 2  | node2 | standby |   running | node1    | running | 16079 | no      | 1 second(s) ago    
[2022-05-23 17:12:15] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6P/R6H/kingbase/log/kbha.log"

[2022-05-23 17:12:29] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6P/R6H/kingbase/log/kbha.log"

2022-05-23 17:12:30 Done.
[INSTALL] start up the whole cluster ... OK
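
Before moving on to the repmgr status check in the next section, a quick process-level sanity check can be run on each node (the process names are taken from the log output above; this is only a sketch, not part of the official procedure):

# confirm the database, repmgrd and kbha processes are running
[kingbase@node1 bin]$ ps -ef | grep -E 'kingbase|repmgrd|kbha' | grep -v grep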

V. Check the cluster status

As shown below, the cluster has been deployed successfully:

[kingbase@node1 bin]$ ./repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=192.168.8.200 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | host=192.168.8.201 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
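
Step 3 of the procedure also calls for start/stop and switchover tests. A hedged sketch of such a test, run from the bin directory: the switchover syntax assumes the bundled repmgr follows the standard repmgr CLI (as the cluster show call above suggests), and the sys_monitor.sh start/stop helper is an assumption about the usual KingbaseES R6 cluster layout:

# switchover: run on the standby to promote it and demote the old primary (assumed standard repmgr syntax)
[kingbase@node2 bin]$ ./repmgr standby switchover
[kingbase@node2 bin]$ ./repmgr cluster show
# stop and start the whole cluster (sys_monitor.sh is an assumed helper script in bin)
[kingbase@node1 bin]$ ./sys_monitor.sh stop
[kingbase@node1 bin]$ ./sys_monitor.sh start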

VI. Summary

1. When the production environment does not allow ssh login for the root user, the cluster can still be deployed with the manual script, but the securecmdd service must be deployed and started on all nodes in advance.
2. install.conf must then be configured to select securecmdd for the deployment.
3. After deployment, testing shows that with the root user unable to log in over ssh, cluster switchover, startup, and shutdown are not affected.
