KingbaseES RAC Deployment Case: Building RAC on SAN Storage

Case description:
Build a KingbaseES RAC cluster that uses iSCSI shared storage as the database storage file system.

Applicable version:
KingbaseES V008R006C008M030B0010

Operating system version:

[root@node201 KingbaseHA]# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)

Cluster architecture:
node201 and node202 are the cluster (database) nodes; node203 provides the iSCSI shared storage:

Node information:

[root@node201 KingbaseHA]# vi /etc/hosts
192.168.1.201 node201
192.168.1.202 node202
192.168.1.203 node203    iscsi_Srv

Cluster software:

[root@node201 data]# rpm -qa |egrep 'corosync|pacemaker'
corosynclib-2.4.5-7.el7_9.2.x86_64
pacemaker-1.1.23-1.el7_9.1.x86_64
pacemaker-libs-1.1.23-1.el7_9.1.x86_64
pacemaker-doc-1.1.23-1.el7_9.1.x86_64
corosync-qdevice-2.4.5-7.el7_9.2.x86_64
pacemaker-cluster-libs-1.1.23-1.el7_9.1.x86_64
pacemaker-cli-1.1.23-1.el7_9.1.x86_64
corosync-2.4.5-7.el7_9.2.x86_64

I. Configure the iSCSI Share

1. Install software on the server
[root@node201 ~]# yum install targetd targetcli -y

2. Start the targetd service

[root@node201 ~]# systemctl start targetd
[root@node201 ~]# systemctl enable targetd
Created symlink from /etc/systemd/system/multi-user.target.wants/targetd.service to /usr/lib/systemd/system/targetd.service.

3. Configure the iSCSI share
As shown below, the iSCSI share is configured with the targetcli tool:

1) targetcli help information

[root@node203 ~]# targetcli

AVAILABLE COMMANDS
==================
The following commands are available in the
current path:

  - bookmarks action [bookmark]
  - cd [path]
  - clearconfig [confirm]
  - exit
  - get [group] [parameter...]
  - help [topic]
  - ls [path] [depth]
  - pwd
  - refresh
  - restoreconfig [savefile] [clear_existing] [target] [storage_object]
  - saveconfig [savefile]
  - sessions [action] [sid]
  - set [group] [parameter=value...]
  - status
  - version
/> exit
Global pref auto_save_on_exit=true
Configuration saved to /etc/target/saveconfig.json

2) Create the iSCSI disks

/> /backstores/block create idisk1 /dev/sdb
Created block storage object idisk1 using /dev/sdb.
/> /backstores/block create idisk2 /dev/sdc
Created block storage object idisk2 using /dev/sdc.

/> ls
o- / ........................................................................................... [...]
  o- backstores ................................................................................ [...]
  | o- block .................................................................... [Storage Objects: 2]
  | | o- idisk1 .......................................... [/dev/sdb (10.7GiB) write-thru deactivated]
  | | | o- alua ..................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | | o- idisk2 ......................................... [/dev/sdc (512.0MiB) write-thru deactivated]
  | |   o- alua ..................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................... [Storage Objects: 0]
  | o- pscsi .................................................................... [Storage Objects: 0]
  | o- ramdisk .................................................................. [Storage Objects: 0]
  o- iscsi .............................................................................. [Targets: 0]
  o- loopback ........................................................................... [Targets: 0]

3) Create the iSCSI target (server side)

/> iscsi/ create iqn.2024-08.pip.cc:server
Created target iqn.2024-08.pip.cc:server.
Created TPG 1.
Global pref auto_add_default_portal=true
Created default portal listening on all IPs (0.0.0.0), port 3260.
/> ls
o- / ........................................................................................... [...]
  o- backstores ................................................................................ [...]
  | o- block .................................................................... [Storage Objects: 2]
  | | o- idisk1 .......................................... [/dev/sdb (10.7GiB) write-thru deactivated]
  | | | o- alua ..................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | | o- idisk2 ......................................... [/dev/sdc (512.0MiB) write-thru deactivated]
  | |   o- alua ..................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................... [Storage Objects: 0]
  | o- pscsi .................................................................... [Storage Objects: 0]
  | o- ramdisk .................................................................. [Storage Objects: 0]
  o- iscsi .............................................................................. [Targets: 1]
  | o- iqn.2024-08.pip.cc:server ........................................................... [TPGs: 1]
  |   o- tpg1 ................................................................. [no-gen-acls, no-auth]
  |     o- acls ............................................................................ [ACLs: 0]
  |     o- luns ............................................................................ [LUNs: 0]
  |     o- portals ...................................................................... [Portals: 1]
  |       o- 0.0.0.0:3260 ....................................................................... [OK]
  o- loopback ........................................................................... [Targets: 0]

4) Create the LUNs (shared volumes)

 /> cd iscsi/iqn.2024-08.pip.cc:server/tpg1/
/iscsi/iqn.20...c:server/tpg1> luns/ create /backstores/block/idisk1
Created LUN 0.
/iscsi/iqn.20...c:server/tpg1> luns/ create /backstores/block/idisk2
Created LUN 1.
/iscsi/iqn.20...c:server/tpg1> ls
o- tpg1 ....................................................................... [no-gen-acls, no-auth]
  o- acls .................................................................................. [ACLs: 0]
  o- luns .................................................................................. [LUNs: 2]
  | o- lun0 ............................................. [block/idisk1 (/dev/sdb) (default_tg_pt_gp)]
  | o- lun1 ............................................. [block/idisk2 (/dev/sdc) (default_tg_pt_gp)]
  o- portals ............................................................................ [Portals: 1]
    o- 0.0.0.0:3260 ............................................................................. [OK]

5) Create the client ACL and CHAP authentication

/iscsi/iqn.20...c:server/tpg1>  acls/ create iqn.2024-08.pip.cc:client
Created Node ACL for iqn.2024-08.pip.cc:client
Created mapped LUN 1.
Created mapped LUN 0.
/iscsi/iqn.20...c:server/tpg1>  cd acls/iqn.2024-08.pip.cc:client/
/iscsi/iqn.20...pip.cc:client> set auth userid=root
Parameter userid is now 'root'.
/iscsi/iqn.20...pip.cc:client> set auth password=123456
Parameter password is now '123456'.
/iscsi/iqn.20...pip.cc:client> info
chap_password: 123456
chap_userid: root
wwns:
iqn.2024-08.pip.cc:client

6) Create the portal

/> cd iscsi/iqn.2024-08.pip.cc:server/tpg1/
/iscsi/iqn.20...c:server/tpg1> ls
o- tpg1 ....................................................................... [no-gen-acls, no-auth]
  o- acls .................................................................................. [ACLs: 1]
  | o- iqn.2024-08.pip.cc:client .................................................... [Mapped LUNs: 2]
  |   o- mapped_lun0 ........................................................ [lun0 block/idisk1 (rw)]
  |   o- mapped_lun1 ........................................................ [lun1 block/idisk2 (rw)]
  o- luns .................................................................................. [LUNs: 2]
  | o- lun0 ............................................. [block/idisk1 (/dev/sdb) (default_tg_pt_gp)]
  | o- lun1 ............................................. [block/idisk2 (/dev/sdc) (default_tg_pt_gp)]
  o- portals ............................................................................ [Portals: 1]
    o- 0.0.0.0:3260 ............................................................................. [OK]
/iscsi/iqn.20...c:server/tpg1> cd portals
/iscsi/iqn.20.../tpg1/portals>  delete 0.0.0.0 3260
Deleted network portal 0.0.0.0:3260
/iscsi/iqn.20.../tpg1/portals> create 192.168.1.203 3260
Using default IP port 3260
Created network portal 192.168.1.203:3260.
/iscsi/iqn.20.../tpg1/portals> ls
o- portals .............................................................................. [Portals: 1]
  o- 192.168.1.203:3260 ......................................................................... [OK]

7) View the iSCSI server configuration

/iscsi/iqn.20...pip.cc:client> cd /
/> ls
o- / ........................................................................................... [...]
  o- backstores ................................................................................ [...]
  | o- block .................................................................... [Storage Objects: 2]
  | | o- idisk1 ............................................ [/dev/sdb (10.7GiB) write-thru activated]
  | | | o- alua ..................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | | o- idisk2 ........................................... [/dev/sdc (512.0MiB) write-thru activated]
  | |   o- alua ..................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................... [Storage Objects: 0]
  | o- pscsi .................................................................... [Storage Objects: 0]
  | o- ramdisk .................................................................. [Storage Objects: 0]
  o- iscsi .............................................................................. [Targets: 1]
  | o- iqn.2024-08.pip.cc:server ........................................................... [TPGs: 1]
  |   o- tpg1 ................................................................. [no-gen-acls, no-auth]
  |     o- acls ............................................................................ [ACLs: 1]
  |     | o- iqn.2024-08.pip.cc:client .............................................. [Mapped LUNs: 2]
  |     |   o- mapped_lun0 .................................................. [lun0 block/idisk1 (rw)]
  |     |   o- mapped_lun1 .................................................. [lun1 block/idisk2 (rw)]
  |     o- luns ............................................................................ [LUNs: 2]
  |     | o- lun0 ....................................... [block/idisk1 (/dev/sdb) (default_tg_pt_gp)]
  |     | o- lun1 ....................................... [block/idisk2 (/dev/sdc) (default_tg_pt_gp)]
  |     o- portals ...................................................................... [Portals: 1]
  |       o- 192.168.1.203:3260 ................................................................. [OK]
  o- loopback ........................................................................... [Targets: 0]
/> saveconfig
Last 10 configs saved in /etc/target/backup/.
Configuration saved to /etc/target/saveconfig.json
/> exit
Global pref auto_save_on_exit=true
Configuration saved to /etc/target/saveconfig.json

4. Restart the target service

[root@node203 ~]# systemctl restart target
[root@node203 ~]# systemctl status target
● target.service - Restore LIO kernel target configuration
   Loaded: loaded (/usr/lib/systemd/system/target.service; disabled; vendor preset: disabled)
   Active: active (exited) since Fri 2024-08-02 10:50:45 CST; 4s ago
  Process: 18476 ExecStop=/usr/bin/targetctl clear (code=exited, status=0/SUCCESS)
  .......
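
The target.service unit restores the LIO configuration saved in /etc/target/saveconfig.json at boot, so it is usually worth enabling it permanently as well (a minimal sketch, not part of the original run):

[root@node203 ~]# systemctl enable target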

5. Configure client access to the iSCSI share (all nodes)
1) Install software on the clients
[root@node201 ~]# yum install iscsi-initiator-utils-iscsiuio -y

2) Configure the iSCSI client files
As shown below, the initiator name and CHAP credentials for accessing the target are configured on the clients (see the note after the listing on where each setting normally lives):

[root@node201 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2024-08.pip.cc:client
node.session.auth.authmethod = CHAP
node.session.auth.username = root
node.session.auth.password = 123456
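
Note that /etc/iscsi/initiatorname.iscsi normally carries only the InitiatorName line, while the node.session.auth.* CHAP parameters are read from /etc/iscsi/iscsid.conf. A minimal sketch of the usual placement (the credentials mirror the ACL created on the target):

# /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2024-08.pip.cc:client

# /etc/iscsi/iscsid.conf (CHAP settings)
node.session.auth.authmethod = CHAP
node.session.auth.username = root
node.session.auth.password = 123456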

Start the service:
[root@node201 ~]# systemctl restart iscsid
[root@node201 ~]# systemctl enable iscsid

3) Access the iSCSI share from the clients

# Discover the iSCSI share
[root@node201 ~]# iscsiadm -m discovery -t st -p 192.168.1.203
192.168.1.203:3260,1 iqn.2024-08.pip.cc:server

# Log in to the iSCSI target
[root@node201 ~]# iscsiadm -m node -T iqn.2024-08.pip.cc:server -p 192.168.1.203 --login
Logging in to [iface: default, target: iqn.2024-08.pip.cc:server, portal: 192.168.1.203,3260] (multiple)
Login to [iface: default, target: iqn.2024-08.pip.cc:server, portal: 192.168.1.203,3260] successful.
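
# To re-establish the session automatically after a reboot, the node record can be
# switched to automatic startup (a sketch, not part of the original run):
[root@node201 ~]# iscsiadm -m node -T iqn.2024-08.pip.cc:server -p 192.168.1.203 --op update -n node.startup -v automatic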

# View the shared storage devices
[root@node201 ~]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 102.9G  0 disk
├─sda1            8:1    0   500M  0 part /boot
└─sda2            8:2    0 102.4G  0 part
  ├─centos-root 253:0    0    50G  0 lvm  /
  ├─centos-swap 253:1    0     3G  0 lvm  [SWAP]
  └─centos-home 253:2    0  49.3G  0 lvm  /home
sdb               8:16   0  10.7G  0 disk
sdc               8:32   0   512M  0 disk
sr0              11:0    1  1024M  0 rom

# As shown below, the shared storage is now visible on the clients as local disks
[root@node202 iscsi]# fdisk -l
Disk /dev/sdb: 11.5 GB, 11499421696 bytes, 22459808 sectors
......
Disk /dev/sdc: 536 MB, 536870912 bytes, 1048576 sectors
......
# At this point, the shared storage configuration is complete.

4) Bind the iSCSI devices with udev

# Check the disk IDs on each node
[root@node201 soft]# /usr/lib/udev/scsi_id -g -u /dev/sdb
360014052894c914c81040b4a87e59fb2
[root@node201 soft]# /usr/lib/udev/scsi_id -g -u /dev/sdc
36001405bcd67f428faf49eb9fc8c80dd

[root@node202 ~]# /usr/lib/udev/scsi_id -g -u /dev/sdb
360014052894c914c81040b4a87e59fb2
[root@node202 ~]# /usr/lib/udev/scsi_id -g -u /dev/sdc
36001405bcd67f428faf49eb9fc8c80dd

# Apply the udev binding
[root@node202 ~]# cat /etc/udev/rules.d/75-persist-iscsi.rules
KERNEL=="sd*",ENV{ID_SERIAL}=="360014052894c914c81040b4a87e59fb2",NAME:="qdisk",MODE:="0644"
KERNEL=="sd*",ENV{ID_SERIAL}=="36001405bcd67f428faf49eb9fc8c80dd",NAME:="kdata",MODE:="0644"

II. Deploy and Configure RAC
1. Prepare the system environment
1) Disable the system firewall (all nodes)

[root@node201 KingbaseHA]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

2) Configure SELinux

[root@node201 KingbaseHA]# cat /etc/sysconfig/selinux |grep -v ^$|grep -v ^#
SELINUXTYPE=targeted
SELINUX=disabled
# Disable SELinux enforcement for the current session and check its status
[root@node201 KingbaseHA]# setenforce 0 ; getenforce
setenforce: SELinux is disabled
Disabled
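
# To make the change persistent across reboots, set SELINUX=disabled in the config file
# (a minimal sketch; the setting only takes full effect after a reboot):
[root@node201 ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux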

3) Configure NTP time synchronization (cluster node clocks should be kept in sync)

# NTP server configuration
[root@node201 KingbaseHA]# cat /etc/ntp.conf
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 10
restrict 192.168.1.0 255.255.255.0

[root@node202 KingbaseHA]# systemctl start ntpd

# NTP client configuration
[root@node202 KingbaseHA]# cat /etc/ntp.conf
server 192.168.1.201
fudge 192.168.1.202 stratum 10

[root@node202 KingbaseHA]# systemctl start ntpd

# Check the time synchronization status
[root@node202 KingbaseHA]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp5.flashdance 194.58.202.20    2 u   30 1024   17  151.904   32.166   8.276
*time.neu.edu.cn .PTP.            1 u 1025 1024    7   56.031   25.537   3.294
+117.80.112.205  133.243.238.163  2 u 1029 1024    7   30.364   42.776   2.418
+111.230.189.174 100.122.36.196   2 u  883 1024    7   40.486   40.348   0.661
-node201         LOCAL(0)        11 u  507 1024  377    0.231   -7.570   7.571

4) Create the database user

[kingbase@node201 bin]$ id
uid=200(kingbase) gid=1001(kingbase) groups=1001(kingbase)

[kingbase@node202 bin]$ id
uid=200(kingbase) gid=1001(kingbase) groups=1001(kingbase)
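
# If the kingbase user does not exist yet, it can be created on every node with the same
# UID/GID shown above (a sketch, assuming uid 200 and gid 1001 from the id output):
[root@node201 ~]# groupadd -g 1001 kingbase
[root@node201 ~]# useradd -u 200 -g kingbase kingbase
[root@node201 ~]# passwd kingbase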

2. Deploy the RAC cluster
1) Install the database software (all nodes)

[root@node201 soft]# mount -o loop KingbaseES_V008R006C008M030B0010_Lin64_install.iso /mnt
mount: /dev/loop0 is write-protected, mounting read-only

[kingbase@node201 mnt]$ sh setup.sh
Now launch installer...
Choose the server type
----------------------
Please choose the server type :
  ->1- default
    2- rac

  Default Install Folder: /opt/Kingbase/ES/V8

2) Create the cluster deployment directory (all nodes)
As shown below, go to the script directory under the database installation and run the clusterware deployment script, which creates the "/opt/KingbaseHA" directory by default:

[root@node201 script]# pwd
/opt/Kingbase/ES/V8/install/script
[root@node201 script]# ls -lh
total 32K
-rwxr-xr-x 1 kingbase kingbase  321 Jul 18 14:17 consoleCloud-uninstall.sh
-rwxr-x--- 1 kingbase kingbase 3.6K Jul 18 14:17 initcluster.sh
-rwxr-x--- 1 kingbase kingbase  289 Jul 18 14:17 javatools.sh
-rwxr-xr-x 1 kingbase kingbase  553 Jul 18 14:17 rootDeployClusterware.sh
-rwxr-x--- 1 kingbase kingbase  767 Jul 18 14:17 root.sh
-rwxr-x--- 1 kingbase kingbase  627 Jul 18 14:17 rootuninstall.sh
-rwxr-x--- 1 kingbase kingbase 3.7K Jul 18 14:17 startupcfg.sh
-rwxr-x--- 1 kingbase kingbase  252 Jul 18 14:17 stopserver.sh

# Run the script
[root@node201 script]# sh rootDeployClusterware.sh
cp: cannot stat ‘@@INSTALL_DIR@@/KingbaseHA/*’: No such file or directory

# Fix the script variable (the @@INSTALL_DIR@@ placeholder was not substituted; set INSTALLDIR to the actual install path)
[root@node201 V8]# head  install/script/rootDeployClusterware.sh
#!/bin/sh
# copy KingbaseHA to /opt/KingbaseHA
ROOT_UID=0
#INSTALLDIR=@@INSTALL_DIR@@
INSTALLDIR=/opt/Kingbase/ES/V8/KESRealPro/V008R006C008M030B0010

# Run the script again (creates /opt/KingbaseHA)
[root@node201 V8]# sh install/script/rootDeployClusterware.sh
/opt/KingbaseHA has existed. Do you want to override it?(y/n)y
y
[root@node201 V8]# ls -lh /opt/KingbaseHA/
total 64K
-rw-r--r--  1 root root 3.8K Jul 30 17:38 cluster_manager.conf
-rwxr-xr-x  1 root root  54K Jul 30 17:38 cluster_manager.sh
drwxr-xr-x  9 root root  121 Jul 30 17:38 corosync
drwxr-xr-x  7 root root  122 Jul 30 17:38 corosync-qdevice
drwxr-xr-x  8 root root   68 Jul 30 17:38 crmsh
drwxr-xr-x  7 root root   65 Jul 30 17:38 dlm-dlm
drwxr-xr-x  5 root root   39 Jul 30 17:38 fence_agents
drwxr-xr-x  5 root root   60 Jul 30 17:38 gfs2
drwxr-xr-x  6 root root   53 Jul 30 17:38 gfs2-utils
drwxr-xr-x  5 root root   39 Jul 30 17:38 ipmi_tool
drwxr-xr-x  7 root root   84 Jul 30 17:38 kingbasefs
drwxr-xr-x  5 root root   42 Jul 30 17:38 kronosnet
drwxr-xr-x  2 root root 4.0K Jul 30 17:38 lib
drwxr-xr-x  2 root root   28 Jul 30 17:38 lib64
drwxr-xr-x  7 root root   63 Jul 30 17:38 libqb
drwxr-xr-x 10 root root  136 Jul 30 17:38 pacemaker
drwxr-xr-x  6 root root   52 Jul 30 17:38 python2.7

3) Configure cluster_manager.conf (all nodes)
As shown below, a SAN-based deployment needs the quorum/voting disk and the shared data disk initialized (both on iSCSI shared storage):

[root@node202 KingbaseHA]# cat cluster_manager.conf |grep -v ^$|grep -v ^#
cluster_name=krac
node_name=(node201 node202)    # must match /etc/hosts
node_ip=(192.168.1.201 192.168.1.202)
enable_qdisk=1                     # use the quorum/voting disk
votingdisk=/dev/sdc
sharedata_disk=/dev/sdb       # shared storage disk for the database
sharedata_dir=/sharedata/data_gfs2        # data directory location (on shared storage)
install_dir=/opt/KingbaseHA
env_bash_file=/root/.bashrc
pacemaker_daemon_group=haclient
pacemaker_daemon_user=hacluster
kingbaseowner=kingbase
kingbasegroup=kingbase
kingbase_install_dir=/opt/Kingbase/ES/V8/Server
database="test"
username="system"
password="123456"
initdb_options="-A trust -U $username"
enable_fence=1                     # enable fencing
enable_qdisk_fence=1
install_rac=1
rac_port=54321
rac_lms_port=53444
rac_lms_count=7
.......

4) Initialize the quorum disk (any one node)

[root@node201 KingbaseHA]# ./cluster_manager.sh --qdisk_init
qdisk init start
Writing new quorum disk label 'krac' to /dev/sdc.
WARNING: About to destroy all data on /dev/sdc; proceed? (Y/N):
y
/dev/block/8:32:
/dev/disk/by-id/scsi-360014050da191d8d53b4d04a277aa8f5:
/dev/disk/by-id/wwn-0x60014050da191d8d53b4d04a277aa8f5:
/dev/disk/by-path/ip-192.168.1.203:3260-iscsi-iqn.2024-08.pip.cc:server-lun-1:
/dev/sdc:
        Magic:                eb7a62c2
        Label:                krac
        Created:              Fri Aug  2 11:37:36 2024
        Host:                 node201
        Kernel Sector Size:   512
        Recorded Sector Size: 512

qdisk init success

5) Initialize the data disk (any one node)

[root@node201 KingbaseHA]# ./cluster_manager.sh --cluster_disk_init
rac disk init start
It appears to contain a partition table (dos).
This will destroy any data on /dev/sdb
Are you sure you want to proceed? (Y/N): y
Adding journals: Done
Building resource groups: Done
Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/sdb
Block size:                4096
Device size:               10.71 GB (2807476 blocks)
Filesystem size:           10.71 GB (2807475 blocks)
Journals:                  3
Journal size:              32MB
Resource groups:           46
Locking protocol:          "lock_dlm"
Lock table:                "krac:gfs2"
UUID:                      3e934629-a2b8-4b7d-a153-ded2dbec7a28
rac disk init success

6) Initialize the base components (all nodes)
Run the following command on all nodes to initialize the base components such as corosync, pacemaker, and corosync-qdevice.

[root@node201 KingbaseHA]# ./cluster_manager.sh --base_configure_init
init kernel soft watchdog start
init kernel soft watchdog success
config host start
Host entry 192.168.1.201 node201 found, skiping...
config host success
add env varaible in /root/.bashrc
add env variable success
config corosync.conf start
config corosync.conf success
Starting Corosync Cluster Engine (corosync): [WARNING]
add pacemaker daemon user start
add pacemaker daemon user success
config pacemaker success
Starting Pacemaker Cluster Manager[  OK  ]
config qdevice start
config qdevice success
Starting Qdisk Fenced daemon (qdisk-fenced): [  OK  ]
Starting Corosync Qdevice daemon (corosync-qdevice): [  OK  ]
Please note the configuration: superuser(system) and port(36321) for database(test) of resource(DB0)
Please note the configuration: superuser(system) and port(36321) for database(test) of resource(DB1)
config kingbase rac start
config kingbase rac success
add_udev_rule start
add_udev_rule success
insmod dlm.ko success
check and mknod for dlm start
check and mknod for dlm success

# Apply the environment variables
[root@node201 data]# cat /root/.bashrc
export install_dir=/opt/KingbaseHA
export PATH=/opt/KingbaseHA/python2.7/bin:/opt/KingbaseHA/pacemaker/sbin/:$PATH
export PATH=/opt/KingbaseHA/crmsh/bin:/opt/KingbaseHA/pacemaker/libexec/pacemaker/:$PATH
export PATH=/opt/KingbaseHA/corosync/sbin:/opt/KingbaseHA/corosync-qdevice/sbin:$PATH
export PYTHONPATH=/opt/KingbaseHA/python2.7/lib/python2.7/site-packages/:/opt/KingbaseHA/crmsh/lib/python2.7/site-packages:$PYTHONPATH
export COROSYNC_MAIN_CONFIG_FILE=/opt/KingbaseHA/corosync/etc/corosync/corosync.conf
export CRM_CONFIG_FILE=/opt/KingbaseHA/crmsh/etc/crm/crm.conf
export OCF_ROOT=/opt/KingbaseHA/pacemaker/ocf
export HA_SBIN_DIR=/opt/KingbaseHA/pacemaker/sbin/
export QDEVICE_SBIN_DIR=/opt/KingbaseHA/corosync-qdevice/sbin/
export LD_LIBRARY_PATH=/opt/KingbaseHA/lib64/:$LD_LIBRARY_PATH
export HA_INSTALL_PATH=/opt/KingbaseHA
export PATH=/opt/KingbaseHA/dlm-dlm/sbin:/opt/KingbaseHA/gfs2-utils/sbin:$PATH
export LD_LIBRARY_PATH=/opt/KingbaseHA/corosync/lib/:$LD_LIBRARY_PATH

[root@node201 KingbaseHA]# source /root/.bashrc
# Check the corosync and pacemaker processes
[root@node201 KingbaseHA]# ps -ef |grep corosync
root     10779     1  0 11:39 ?        00:00:00 corosync -c /opt/KingbaseHA/corosync/etc/corosync/corosync.conf -p /opt/KingbaseHA/corosync/var/
root     10930     1  0 11:40 ?        00:00:00 corosync-qdevice -p /opt/KingbaseHA/corosync-qdevice/var/run/corosync-qdevice/corosync-qdevice.sock
root     10931     1  0 11:40 ?        00:00:00 corosync-qdevice -p /opt/KingbaseHA/corosync-qdevice/var/run/corosync-qdevice/corosync-qdevice.sock

[root@node201 KingbaseHA]# ps -ef |grep pacemaker
root     10819     1  0 11:39 pts/0    00:00:00 pacemakerd -d /opt/KingbaseHA/pacemaker
haclust+ 10821 10819  0 11:39 ?        00:00:00 /opt/KingbaseHA/pacemaker/libexec/pacemaker/pacemaker-based -d /opt/KingbaseHA/pacemaker
root     10822 10819  0 11:39 ?        00:00:00 /opt/KingbaseHA/pacemaker/libexec/pacemaker/pacemaker-fenced -d /opt/KingbaseHA/pacemaker
root     10823 10819  0 11:39 ?        00:00:00 /opt/KingbaseHA/pacemaker/libexec/pacemaker/pacemaker-execd -d /opt/KingbaseHA/pacemaker
haclust+ 10824 10819  0 11:39 ?        00:00:00 /opt/KingbaseHA/pacemaker/libexec/pacemaker/pacemaker-attrd
haclust+ 10825 10819  0 11:39 ?        00:00:00 /opt/KingbaseHA/pacemaker/libexec/pacemaker/pacemaker-schedulerd -d /opt/KingbaseHA/pacemaker
haclust+ 10826 10819  0 11:39 ?        00:00:00 /opt/KingbaseHA/pacemaker/libexec/pacemaker/pacemaker-controld -d /opt/KingbaseHA/pacemaker

Check the cluster resource status:

[root@node201 KingbaseHA]# crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: node201 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Mon Aug 12 15:31:41 2024
  * Last change:  Mon Aug 12 15:31:31 2024 by root via cibadmin on node201
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ node201 node202 ]
  
Full List of Resources:
  * No resources

7) Initialize the gfs2-related resources (all nodes)
As shown below, update the system gfs2 kernel module:

[root@node201 KingbaseHA]# ./cluster_manager.sh --init_gfs2
init gfs2 start
current OS kernel version does not support updating gfs2, please confirm whether to continue? (Y/N):
y
init the OS native gfs2 success

8) Configure the cluster resources (fence, dlm, and gfs2) (any one node)

[root@node201 KingbaseHA]# ./cluster_manager.sh --config_gfs2_resource
config dlm and gfs2 resource start
3e934629-a2b8-4b7d-a153-ded2dbec7a28
config dlm and gfs2 resource success

Check the cluster resource status:
As shown below, the dlm and gfs2 resources have been added:

[root@node201 1]# crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: node201 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Fri Aug  9 18:28:28 2024
  * Last change:  Fri Aug  9 18:17:02 2024 by root via cibadmin on node201
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ node201 node202 ]

Full List of Resources:
  * fence_qdisk_0       (stonith:fence_qdisk):   Started node202
  * fence_qdisk_1       (stonith:fence_qdisk):   Started node201
  * Clone Set: clone-dlm [dlm]:
    * Started: [ node201 node202 ]
  * Clone Set: clone-gfs2 [gfs2]:
    * Started: [ node201 node202 ]

9) Create the RAC database instance

[root@node201 KingbaseHA]# ./cluster_manager.sh --init_rac
init KingbaseES RAC start
.......
Success. You can now start the database server with:
    ./sys_ctl -D /sharedata/data_gfs2/kingbase/data -l logfile start
init KingbaseES RAC success

10) Configure the database resources (initialize the PINGD, FIP, and DB resources)

[root@node201 KingbaseHA]# ./cluster_manager.sh --config_rac_resource
crm configure DB resource start
crm configure DB resource end

# View the resource configuration

[root@node201 KingbaseHA]# crm config show
node 1: node201
node 2: node202
primitive DB ocf:kingbase:kingbase \
        params sys_ctl="/opt/Kingbase/ES/V8/Server/bin/sys_ctl" ksql="/opt/Kingbase/ES/V8/Server/bin/ksql" sys_isready="/opt/Kingbase/ES/V8/Server/bin/sys_isready" kb_data="/sharedata/data_gfs2/kingbase/data" kb_dba=kingbase kb_host=0.0.0.0 kb_user=system kb_port=55321 kb_db=template1 logfile="/home/kingbase/log/kingbase1.log" \
        op start interval=0 timeout=120 \
        op stop interval=0 timeout=120 \
        op monitor interval=9s timeout=30 on-fail=stop \
        meta failure-timeout=5min
primitive dlm ocf:pacemaker:controld \
        params daemon="/opt/KingbaseHA/dlm-dlm/sbin/dlm_controld" dlm_tool="/opt/KingbaseHA/dlm-dlm/sbin/dlm_tool" args="-s 0 -f 0" allow_stonith_disabled=true \
        op start interval=0 \
        op stop interval=0 \
        op monitor interval=60 timeout=60
primitive gfs2 Filesystem \
        params device="-U 3e934629-a2b8-4b7d-a153-ded2dbec7a28" directory="/sharedata/data_gfs2" fstype=gfs2 \
        op start interval=0 timeout=60 \
        op stop interval=0 timeout=60 \
        op monitor interval=30s timeout=60 OCF_CHECK_LEVEL=20 \
        meta failure-timeout=5min
clone clone-DB DB \
        meta target-role=Started
clone clone-dlm dlm \
        meta interleave=true target-role=Started
clone clone-gfs2 gfs2 \
        meta interleave=true target-role=Started
colocation cluster-colo1 inf: clone-gfs2 clone-dlm
order cluster-order1 clone-dlm clone-gfs2
order cluster-order2 clone-dlm clone-gfs2
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=2.0.3-4b1f869f0f \
        cluster-infrastructure=corosync \
        cluster-name=krac \
        no-quorum-policy=freeze \
        stonith-enabled=false

III. Access the RAC Database

1. Stop and start the database service manually

# Stop the database service
[kingbase@node201 bin]$ ./sys_ctl stop -D /sharedata/data_gfs2/kingbase/data/ -m immediate
waiting for server to shut down.... done
server stopped

# Start the database service
[kingbase@node201 bin]$ ./sys_ctl start -D /sharedata/data_gfs2/kingbase/data/  -t 180 -w
waiting for server to start....2024-08-02 11:55:41.478 CST [20694] LOG:  please configure a valid archive command for WAL file archiving as soon as possible
......
2024-08-02 11:55:41.586 CST [20694] HINT:  future log output will appear in directory "sys_log/1".
.......................................................... done
server started

# Database server pid files
[root@node202 KingbaseHA]# cd /sharedata/data_gfs2/kingbase/data
[root@node202 data]# ls -lh *.pid
-rw------- 1 kingbase kingbase 100 Aug  2 11:46 kingbase_1.pid
-rw------- 1 kingbase kingbase 100 Aug  2 11:46 kingbase_2.pid

# Access the database
[kingbase@node201 bin]$ ./ksql -U system test -p 55321
Type "help" for help.

test=# create database prod;
CREATE DATABASE

test=# \c prod
You are now connected to database "prod" as userName "system".

prod=# create table t1 (id int,name varchar(20));
CREATE TABLE
prod=# insert into t1 values (generate_series(1,1000),'usr'||generate_series(1,1000));
INSERT 0 1000
prod=# select count(*) from t1;
 count
-------
  1000
(1 row)
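
# Since both instances open the same data directory on gfs2, data written on node201
# should be immediately visible from node202 (a sketch, assuming the same port on node202):
[kingbase@node202 bin]$ ./ksql -U system prod -p 55321 -c "select count(*) from t1;"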

2. Start and stop the database service through the cluster

[root@node201 data]# crm resource status clone-DB
resource clone-DB is running on: node201
resource clone-DB is running on: node202

[root@node201 data]# crm resource stop clone-DB

[root@node201 data]# crm resource status clone-DB
resource clone-DB is NOT running
resource clone-DB is NOT running

[root@node201 data]# crm resource start clone-DB
[root@node201 data]# crm resource status clone-DB
resource clone-DB is running on: node201
resource clone-DB is running on: node202
# Check the database server processes
[root@node201 data]# ps -ef |grep kingbase
kingbase 23066     1  0 15:14 ?        00:00:00 /opt/Kingbase/ES/V8/KESRealPro/V008R006C008M030B0010/Server/bin/kingbase -D /sharedata/data_gfs2/kingbase/data -c config_file=/sharedata/data_gfs2/kingbase/data/kingbase.conf -c log_directory=sys_log -h 0.0.0.0
kingbase 23236 23066  0 15:14 ?        00:00:00 kingbase: logger
kingbase 23241 23066  0 15:14 ?        00:00:00 kingbase: lmon
kingbase 23245 23066  0 15:14 ?        00:00:00 kingbase: lms   1
kingbase 23246 23066  0 15:14 ?        00:00:00 kingbase: lms   2
kingbase 23247 23066  0 15:14 ?        00:00:00 kingbase: lms   3
kingbase 23248 23066  0 15:14 ?        00:00:00 kingbase: lms   4
kingbase 23249 23066  0 15:14 ?        00:00:00 kingbase: lms   5
kingbase 23250 23066  0 15:14 ?        00:00:00 kingbase: lms   6
kingbase 23251 23066  0 15:14 ?        00:00:00 kingbase: lms   7
kingbase 23893 23066  0 15:15 ?        00:00:00 kingbase: checkpointer
kingbase 23894 23066  0 15:15 ?        00:00:00 kingbase: background writer
kingbase 23895 23066  0 15:15 ?        00:00:00 kingbase: global deadlock checker
kingbase 23896 23066  0 15:15 ?        00:00:00 kingbase: transaction syncer
kingbase 23897 23066  0 15:15 ?        00:00:00 kingbase: walwriter
kingbase 23898 23066  0 15:15 ?        00:00:00 kingbase: autovacuum launcher
kingbase 23899 23066  0 15:15 ?        00:00:00 kingbase: archiver
kingbase 23900 23066  0 15:15 ?        00:00:00 kingbase: archiver for node 3
kingbase 23901 23066  0 15:15 ?        00:00:00 kingbase: archiver for node 4
kingbase 23902 23066  0 15:15 ?        00:00:00 kingbase: stats collector
kingbase 23903 23066  0 15:15 ?        00:00:00 kingbase: kwr collector
kingbase 23904 23066  0 15:15 ?        00:00:00 kingbase: ksh writer
kingbase 23905 23066  0 15:15 ?        00:00:00 kingbase: ksh collector
kingbase 23906 23066  0 15:15 ?        00:00:00 kingbase: logical replication launcher

IV. Appendix
1. Data storage file system
As shown below, the shared storage for the database data directory uses the gfs2 shared file system:

[root@node202 data]# df -h
Filesystem               Size  Used Avail Use% Mounted on
........
/dev/sdb                  11G  365M   11G   4% /sharedata/data_gfs2

# As shown below, the database storage uses the gfs2 file system
[root@node202 data]# mount -v |grep sharedata
/dev/sdb on /sharedata/data_gfs2 type gfs2 (rw,relatime)
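
# The gfs2 superblock (lock table, UUID, journal count) can also be inspected directly on the
# shared device with tunegfs2 from the bundled gfs2-utils (a sketch; values should match the
# output of --cluster_disk_init above):
[root@node202 data]# /opt/KingbaseHA/gfs2-utils/sbin/tunegfs2 -l /dev/sdb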

2. Check the cluster membership and quorum status

[root@node202 data]#  pcs status corosync
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1     A,V,MW node201
         2          1    A,NV,MW node202 (local)
         0          0            Qdevice (votes 1)

[root@node202 data]# pcs status quorum
Quorum information
------------------
Date:             Fri Aug  2 14:00:22 2024
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          2
Ring ID:          1/9
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1     A,V,MW node201
         2          1    A,NV,MW node202 (local)
         0          0            Qdevice (votes 1)
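
# The same membership and quorum information can be read with the bundled corosync tools
# (a sketch; both binaries are on PATH after sourcing /root/.bashrc):
[root@node202 data]# corosync-quorumtool -s
[root@node202 data]# corosync-qdevice-tool -s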

3. View the cluster CIB configuration

[root@node202 data]# pcs cluster cib
......
 <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="krac"/>
        <nvpair name="load-threshold" value="0%" id="cib-bootstrap-options-load-threshold"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1" uname="node201"/>
      <node id="2" uname="node202"/>
    </nodes>
    <resources>
      <primitive id="fence_qdisk_0" class="stonith" type="fence_qdisk">
        <instance_attributes id="fence_qdisk_0-instance_attributes">
          <nvpair name="qdisk_path" value="/dev/sdc" id="fence_qdisk_0-instance_attributes-qdisk_path"/>
          <nvpair name="qdisk_fence_tool" value="/opt/KingbaseHA/corosync-qdevice/sbin/qdisk-fence-tool" id="fence_qdisk_0-instance_attributes-qdisk_fence_tool"/>
          <nvpair name="pcmk_host_list" value="node201" id="fence_qdisk_0-instance_attributes-pcmk_host_list"/>
        </instance_attributes>
        <operations>
......

V. Summary
This case walks through the process of building KingbaseES RAC in a SAN environment that uses iSCSI shared storage for the data nodes, and can serve as a reference for building a RAC test environment.
