KingbaseES V8R3 Cluster Deployment Case Study: Script-Based Cluster Deployment on General-Purpose Machines Without ssh

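Case description:
Some production environments on general-purpose machines do not allow hosts to communicate over ssh, or do not allow the root user to set up ssh trust or log in. By default, deploying a KingbaseES V8R3 cluster on general-purpose machines requires ssh mutual trust between the cluster nodes for both the database user and the root user. If the production environment does not permit this, the es_server tool bundled with the cluster can provide the inter-node communication needed to deploy the cluster.
Applicable versions:
KingbaseES V8R3 (this case uses a relatively recent build, V008R003C002B0370; earlier builds do not support this method.)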

Cluster node information:
node101  192.168.1.101  (primary)
node102  192.168.1.102  (standby)

I. Deploy the cluster software environment

1. Install the database software (on any one cluster node)

2. Check the files required for cluster installation

# The archives required for cluster deployment are shown below
[kingbase@node101 Lin64]$ pwd
/opt/Kingbase/ES/V8R3_370/DeployTools/zip/Lin64
[kingbase@node101 Lin64]$ ls
db.zip  kingbasecluster.zip
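
Before distributing the files, it can help to confirm that both archives are actually present. The following is a minimal sketch, not part of the KingbaseES tooling: `check_packages` is a hypothetical helper name, and the two file names are taken from the listing above.

```shell
# Hypothetical helper (not shipped with KingbaseES): verify that a
# directory contains the two archives the deployment needs.
check_packages() {
    pkg_dir="$1"
    pkg_missing=0
    for pkg in db.zip kingbasecluster.zip; do
        if [ ! -f "$pkg_dir/$pkg" ]; then
            echo "missing: $pkg_dir/$pkg" >&2
            pkg_missing=1
        fi
    done
    # 0 when both archives exist, 1 otherwise
    return "$pkg_missing"
}
```

For example, `check_packages /opt/Kingbase/ES/V8R3_370/DeployTools/zip/Lin64 && echo "packages OK"`.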

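3. Create the cluster installation directory and distribute the required files (all nodes)

# Create the cluster installation directory
[kingbase@node102 ~]$ mkdir -p /home/kingbase/cluster/HAR3/
# Copy the installation packages into the cluster installation directory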
[kingbase@node101 Lin64]$ cp *.zip /home/kingbase/cluster/HAR3/
[kingbase@node101 Lin64]$ cp /data/soft/license_V8R3_2022-01-04-365.dat /home/kingbase/cluster/HAR3/license.dat

[kingbase@node101 r3_install]$ ls -lh /home/kingbase/cluster/HAR3/
total 37M
-rw-r--r-- 1 kingbase kingbase  32M Sep 28 13:13 db.zip
-rw-r--r-- 1 kingbase kingbase 5.3M Sep 28 13:13 kingbasecluster.zip
-rw-r--r-- 1 kingbase kingbase 3.1K Sep 28 13:14 license.dat

4. Extract the cluster deployment archives

[kingbase@node101 HAR3]$ unzip db.zip
[kingbase@node101 HAR3]$ unzip kingbasecluster.zip

[kingbase@node101 HAR3]$ ls
db  db.zip  kingbasecluster  kingbasecluster.zip  license.dat

[kingbase@node101 db]$ ls
bin  data  es_server  etc  kb_scripts  lib  share

5. Run the NEWHA.sh script (it updates the cluster deployment scripts so that they work with es_server)

[kingbase@node101 es_server]$ sh NEWHA.sh
[CHECK] check old files in /home/kingbase/cluster/HAR3/db/bin ...
[INFO] /home/kingbase/cluster/HAR3/db/bin/install.conf exist, will rename it
......
[UPDATE] update files in /home/kingbase/cluster/HAR3/kingbasecluster ... OK
[UPDATE] DONE

II. Deploy and configure the es_server service environment (as root)

1. View the es_server configuration file (es_server listens on port 8890 by default; edit this file to change the port)

[kingbase@node101 share]$ cat esHA.conf
# it can be 'systemd' or 'crontab'
# systemd: start the es_server by service of systemctl
# crontab: start the es_server by crontab
start_method=systemd

# the port of es_server
# if it is null, will be default 8890
es_port=8890
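
The comment in esHA.conf says an empty es_port falls back to 8890. A small sketch of reading the effective port from such a file, assuming the plain `key=value` layout shown above; `read_es_port` is a hypothetical helper, not part of the product:

```shell
# Hypothetical helper: print the es_port set in an esHA.conf-style file,
# falling back to 8890 when the key is absent or has no numeric value.
read_es_port() {
    es_conf="$1"
    es_val=$(sed -n 's/^es_port=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$es_conf" | tail -n 1)
    echo "${es_val:-8890}"
}
```

For example, `read_es_port /home/kingbase/cluster/HAR3/db/share/esHA.conf` should print the port that the later netstat check is expected to show.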

2. Initialize and start the es_server service

[root@node101 bin]# sh esHAservice.sh --help
Usage: esHAservice.sh { init | start | stop | status }
# Initialize the es_server service environment
[root@node101 bin]# sh esHAservice.sh init
successfully initialized the es_server, please use "esHAservice.sh start" to start the es_server
[root@node101 bin]# sh esHAservice.sh start
Created symlink from /etc/systemd/system/multi-user.target.wants/es_server.service to /etc/systemd/system/es_server.service.

[root@node101 bin]# ps -ef |grep es_server
root     20196     1  0 13:28 ?        00:00:00 /home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf

[root@node101 bin]# netstat -antlp |grep 8890
tcp        0      0 0.0.0.0:8890            0.0.0.0:*               LISTEN      20196/es_server
[root@node101 bin]#

3. Manage es_server with systemctl

# Stop the es_server service
[root@node101 ~]# systemctl stop es_server
[root@node101 ~]# systemctl status es_server
● es_server.service - KingbaseES - es_server daemon
   Loaded: loaded (/etc/systemd/system/es_server.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2022-09-28 14:12:02 CST; 2s ago
  Process: 20795 ExecStart=/home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf (code=exited, status=0/SUCCESS)
 Main PID: 20795 (code=exited, status=0/SUCCESS)

Sep 28 14:11:50 node101 systemd[1]: Started KingbaseES - es_server daemon.
Sep 28 14:12:02 node101 systemd[1]: Stopping KingbaseES - es_server daemon...
Sep 28 14:12:02 node101 systemd[1]: Stopped KingbaseES - es_server daemon.

# Start the es_server service
[root@node101 ~]# systemctl start es_server
[root@node101 ~]# systemctl status es_server
● es_server.service - KingbaseES - es_server daemon
   Loaded: loaded (/etc/systemd/system/es_server.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2022-09-28 14:12:10 CST; 3s ago
 Main PID: 20819 (es_server)
    Tasks: 1
   CGroup: /system.slice/es_server.service
           └─20819 /home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf

Sep 28 14:12:10 node101 systemd[1]: Started KingbaseES - es_server daemon.

4. Test communication between nodes

[root@node102 es_server]# ./es_client  root@192.168.1.101 'hostname'
node101
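
With more than two nodes, the same check can be looped over every address. The sketch below reuses only the `es_client root@IP 'hostname'` invocation shown above; `check_nodes` is a hypothetical helper, and it assumes that es_client exits with a non-zero status when the remote command cannot be run, which the original does not confirm.

```shell
# Hypothetical helper: run 'hostname' on each node through es_client.
# $1 is the es_client command (a path such as ./es_client); the remaining
# arguments are node IPs.
# Assumption: es_client returns non-zero when the node is unreachable.
check_nodes() {
    es_cmd="$1"
    shift
    nodes_failed=0
    for node_ip in "$@"; do
        if node_name=$("$es_cmd" "root@$node_ip" 'hostname' 2>/dev/null); then
            echo "$node_ip -> $node_name"
        else
            echo "$node_ip -> unreachable" >&2
            nodes_failed=1
        fi
    done
    return "$nodes_failed"
}
```

For example, `check_nodes ./es_client 192.168.1.101 192.168.1.102`, run from the es_server directory on each node in turn.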

5. View the es_server service configuration

# systemd unit configuration
[root@node101 bin]# cat /etc/systemd/system/es_server.service
[Unit]
Description=KingbaseES - es_server daemon
After=network.target

[Service]
Type=simple
ExecStart=/home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always
RestartSec=10s

[Install]
WantedBy=multi-user.target

# User key authentication configuration
[root@node101 ~]# ls .es/ -lh
total 8.0K
-rw------- 1 root root  381 Sep 28 13:28 accept_hosts
-rw------- 1 root root 1.7K Sep 28 13:28 key_file

[root@node101 ~]# ls -lh /home/kingbase/.es
total 8.0K
-rw------- 1 kingbase kingbase  381 Sep 28 13:28 accept_hosts
-rw------- 1 kingbase kingbase 1.7K Sep 28 13:28 key_file

III. Run the script to deploy the cluster

1. Deployment configuration file

[kingbase@node101 bin]$ cat install.conf |grep -v ^$|grep -v ^#
on_bmj=0
all_node_ip=(192.168.1.101 192.168.1.102)
cluster_path="/home/kingbase/cluster/HAR3"
db_package="/home/kingbase/cluster/HAR3/db.zip"
cluster_package="/home/kingbase/cluster/HAR3/kingbasecluster.zip"
license_file=(license.dat)
db_user="SYSTEM"                 # the user name of database
db_password="123456"             # the password of database, since the R3 has a special feature that password cannot be stored in clear text, please delete the password after the cluster deployment is complete.
db_port="54321"                  # the port of database, defaults is 54321
trust_ip="192.168.1.1"
db_vip="192.168.1.204"
cluster_vip="192.168.1.205"
net_device=(enp0s3 enp0s3)
ipaddr_path="/sbin"
arping_path="/home/kingbase/cluster/HAR3/db/bin/"
super_user="root"
cluster_user="kingbase"
use_sshd=0                       # choose whether to use sshd service, 0 means not to use, 1 means to use, default value is 0
wd_deadtime="30"                 # cluster heartbeats timeout, unit: seconds
check_retries="6"                # number of detection retries in case of database failure
check_delay="10"                 # detection retry interval in case of database failure
connect_timeout="10000"          # timeout value in milliseconds before giving up to connect to backend
auto_primary_recovery="0"        # automatic recovery parameter of cluster primary host, default value is 0
ssh_port="22"                    # the port of sshd [if on_bmj=1 or use_sshd=0, you do not need to configure this parameter]
es_port="8890"                   # the port of es_server
case_sensitive="on"              # select whether database is case sensitive, off means case insensitive, on means case sensitive,the default value is on
max_available_level="1"          # when all databases are down, should the cluster be automatically started. 1 means yes, 0 means no, default value is 1

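As the configuration above shows, use_sshd=0 disables ssh-based distribution, so the deployment goes through es_server instead.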
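
A quick pre-flight check of install.conf can catch a leftover use_sshd=1 before the deployment script runs. This is a sketch under stated assumptions: `conf_value` and `check_no_ssh_conf` are hypothetical helpers, and the parsing handles only scalar keys (such as use_sshd and es_port), not array values like all_node_ip.

```shell
# Hypothetical helper: print the value of scalar key $1 from file $2,
# with surrounding double quotes and any trailing comment removed.
conf_value() {
    sed -n "s/^$1=\"\{0,1\}\([^\"#[:space:]]*\).*/\1/p" "$2" | tail -n 1
}

# Hypothetical helper: succeed only if the file is set up for an
# es_server (no-ssh) deployment.
check_no_ssh_conf() {
    if [ "$(conf_value use_sshd "$1")" != "0" ]; then
        echo "use_sshd must be 0 for a deployment without ssh" >&2
        return 1
    fi
    if [ -z "$(conf_value es_port "$1")" ]; then
        echo "es_port is not set" >&2
        return 1
    fi
    return 0
}
```

For example, `check_no_ssh_conf install.conf && sh V8R3_cluster_install.sh`.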

2. Run the deployment script

[kingbase@node101 bin]$ sh V8R3_cluster_install.sh
[INFO]-Check if the cluster_vip "192.168.1.205" is already exist ...
.......
[INSTALL] start up the slave on "192.168.1.102" ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.101
 SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
 (slot_node1,)
(1 row)

[INSTALL] Create physical_replication_slot on 192.168.1.101 ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.101
 SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
 (slot_node2,)
(1 row)

[INSTALL] Create physical_replication_slot on 192.168.1.101 ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.102
 SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
 (slot_node1,)
(1 row)

[INSTALL] Create physical_replication_slot on 192.168.1.102 ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.102
 SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
 (slot_node2,)
(1 row)

[INSTALL] Create physical_replication_slot on 192.168.1.102 ... OK
[INSTALL] start up the whole cluster ...
-----------------------------------------------------------------------
2022-09-28 14:26:38 KingbaseES automation beging...
......
2022-09-28 14:26:48 Del kingbase VIP [192.168.1.204/24] ...
DEL VIP NOW AT 2022-09-28 14:26:46 ON enp0s3
No VIP on my dev, nothing to do.
2022-09-28 14:26:48 Done...
......................
all stop..
ping trust ip 192.168.1.1 success ping times :[3], success times:[2]
......
Redirecting to /bin/systemctl restart crond.service
Redirecting to /bin/systemctl restart crond.service
......................
all started..
...
now we check again
=======================================================================
|             ip |                       program|              [status]
[  192.168.1.101]|             [kingbasecluster]|              [active]
[  192.168.1.102]|             [kingbasecluster]|              [active]
[  192.168.1.101]|                    [kingbase]|              [active]
[  192.168.1.102]|                    [kingbase]|              [active]
=======================================================================
[INSTALL] start up the whole cluster ... OK

As shown above, the cluster was deployed successfully.
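
When the installer output is saved to a log, the final status table can be checked mechanically rather than by eye. A minimal sketch, assuming the three-column `ip | program | [status]` layout printed above; `all_active` is a hypothetical helper name.

```shell
# Hypothetical helper: read installer output on stdin and succeed only if
# the status table has at least one row and every row reports [active].
# Table rows are recognised by a leading "[" followed by an IP address,
# which skips log lines such as "[INSTALL] ...".
all_active() {
    awk -F'|' '
        /^\[ *[0-9]+\./ { rows++; if ($3 !~ /\[active\]/) bad++ }
        END { exit !(rows > 0 && bad == 0) }
    '
}
```

For example, `sh V8R3_cluster_install.sh | tee install.log | all_active && echo "all services active"`.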

IV. Verify the cluster

1. Check the cluster node status

[kingbase@node101 bin]$ ./ksql -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0370)
Type "help" for help.

TEST=# show pool_nodes;
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.1.101 | 54321 | up     | 0.500000  | primary | 0          | false             | 0
 1       | 192.168.1.102 | 54321 | up     | 0.500000  | standby | 0          | true              | 0
(2 rows)
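
For routine checks, the `show pool_nodes` result can be filtered for nodes whose status is not `up`. The sketch below works on saved output only and assumes the column order shown above (node_id, hostname, port, status); `nodes_down` is a hypothetical helper name.

```shell
# Hypothetical helper: read "show pool_nodes" output on stdin and print
# the hostname of every node whose status column is not "up".
# Data rows are recognised by a numeric node_id in the first column.
nodes_down() {
    awk -F'|' '
        $1 ~ /^ *[0-9]+ *$/ {
            gsub(/ /, "", $2)
            gsub(/ /, "", $4)
            if ($4 != "up") print $2
        }
    '
}
```

An empty result means every listed node is up; for example, `nodes_down < pool_nodes.txt`, where pool_nodes.txt is a file holding the query output.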

2. Check the streaming replication status

[kingbase@node101 bin]$ ./ksql -U SYSTEM -W 123456 TEST
ksql (V008R003C002B0370)
Type "help" for help.

TEST=# select * from sys_stat_replication;
  PID  | USESYSID | USENAME | APPLICATION_NAME |  CLIENT_ADDR  | CLIENT_HOSTNAME | CLIENT_PORT |         BACKEND_START         |BACKEND_XMIN |   STATE   | SENT_LOCATION | WRITE_LOCATION | FLUSH_LOCATION | REPLAY_LOCATION | SYNC_PRIORITY | SYNC_STATE
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
 27905 |       10 | SYSTEM  | node2            | 192.168.1.102 |                 |       59898 | 2022-09-28 14:26:58.204189+08 |             | streaming | 0/30000D0     | 0/30000D0      | 0/30000D0      | 0/30000D0       |             2 | sync
(1 row)

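V. Summary
With es_server, a cluster can be deployed conveniently on general-purpose machines without ssh. Production environments with strict security requirements can follow this case to deploy their clusters.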

posted @ 2022-12-12 17:25  KINGBASE研究院