KingbaseES V8R3 Cluster Operations Case: SSL Configuration Test

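Case description:
This test was run in a non-production environment. Because there is no explicit official statement that KingbaseCluster supports SSL, this configuration should be used in test environments only and should not be applied directly in production.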

Database version:

TEST=# select version();
                                                      version
--------------------------------------------------------------------------------------------------------------------
 Kingbase V008R003C002B0061 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

Test environment:

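kingbasecluster SSL test summary:
1. Client access to the database (port 54321) supports SSL-authenticated connections (client certificate method).
2. When the cluster (kingbasecluster) checks the health of the backend databases on port 9999, it cannot connect using SSL authentication.
3. sys_hba.conf must be configured so that the users SUPERMANAGER_V8ADMIN and SYSTEM can reach the database without SSL.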

sys_hba.conf configuration:

The test procedure is as follows:

1. Generate the server certificates

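=Copy the server certificates shipped with the product into the data directory and set their permissions (on both the primary and the standby).=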

[kingbase@srv1 soft]$ ls -lh bmjcert
总用量 32K
-rw-r--r-- 1 kingbase kingbase  944 11月  6 16:18 kingbase.crt
-rw-r--r-- 1 kingbase kingbase  891 11月  6 16:18 kingbase.key
-rw-r--r-- 1 kingbase kingbase  637 11月  6 16:18 kingbase.pk8
-rw-r--r-- 1 kingbase kingbase 4.2K 11月  6 16:18 root.crt
-rw-r--r-- 1 kingbase kingbase 4.2K 11月  6 16:18 server.crt
-rw-r--r-- 1 kingbase kingbase 1.7K 11月  6 16:18 server.key

Server certificates on the primary:

[kingbase@srv1 soft]$ cd /home/kingbase/cluster/kdb/db/data/
[kingbase@srv1 data]$ chmod 400 server.* 
[kingbase@srv1 data]$ chmod 400 root.crt
[kingbase@srv1 data]$ ls -lh server.* root.crt
-r-------- 1 kingbase kingbase 4.2K 3月  25 10:20 root.crt
-r-------- 1 kingbase kingbase 4.2K 3月  25 10:20 server.crt
-r-------- 1 kingbase kingbase 1.7K 3月  25 10:21 server.key

Server certificates on the standby:

[kingbase@srv2 cluster]$ cd kdb/db/data
[kingbase@srv2 data]$ ls -lh server.crt server.key root.crt
-r-------- 1 kingbase kingbase 4.2K 3月  25 10:21 root.crt
-r-------- 1 kingbase kingbase 4.2K 3月  25 10:21 server.crt
-r-------- 1 kingbase kingbase 1.7K 3月  25 10:21 server.key
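
The permission step above can be re-checked on each node with a small script. This is a minimal sketch, not part of the original test: `check_cert_perms` is a hypothetical helper name, it assumes GNU `stat` (for `-c %a`), and it only checks the three file names and the mode 400 used in this walkthrough.

```shell
#!/usr/bin/env bash
# check_cert_perms DIR
# Report whether root.crt, server.crt and server.key exist in DIR with
# mode 400, matching the chmod commands used in this test.
# Hypothetical helper; assumes GNU stat.
check_cert_perms() {
    local dir=$1 rc=0 f mode
    for f in root.crt server.crt server.key; do
        if [ ! -f "$dir/$f" ]; then
            echo "MISSING $f"
            rc=1
            continue
        fi
        mode=$(stat -c %a "$dir/$f")   # octal permission bits, e.g. 400
        if [ "$mode" = "400" ]; then
            echo "OK $f"
        else
            echo "BAD $f (mode $mode)"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `check_cert_perms /home/kingbase/cluster/kdb/db/data` could be run on both srv1 and srv2 before restarting the database; a non-zero return status means at least one file is missing or has a different mode.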

2. Configure the database to enable SSL (primary and standby)
1) Configure kingbase.conf

[kingbase@srv1 data]$ cat kingbase.conf |grep ssl
ssl = on                                # (change requires restart)
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
#ssl_prefer_server_ciphers = on         # (change requires restart)
#ssl_ecdh_curve = 'prime256v1'          # (change requires restart)
ssl_cert_file = 'server.crt'            # (change requires restart)
ssl_key_file = 'server.key'             # (change requires restart)
ssl_ca_file = 'root.crt'                        # (change requires restart)
#ssl_crl_file = ''                      # (change requires restart)
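
Because `grep ssl` also prints the commented-out defaults, it can help to list only the settings that are actually active. The following is a minimal sketch under that assumption; `active_ssl_settings` is a hypothetical helper name, and the parsing is deliberately simple (it does not handle a `#` inside a quoted value).

```shell
#!/usr/bin/env bash
# active_ssl_settings FILE
# Print the uncommented ssl* parameters of a kingbase.conf-style file
# as name=value, with the trailing "# ..." comment removed.
# Hypothetical helper; values containing '#' are not supported.
active_ssl_settings() {
    sed -n -E \
        's/^[[:space:]]*(ssl[a-z_]*)[[:space:]]*=[[:space:]]*([^#]*[^#[:space:]])[[:space:]]*(#.*)?$/\1=\2/p' \
        "$1"
}
```

Run against the file shown above, this should list `ssl`, `ssl_cert_file`, `ssl_key_file` and `ssl_ca_file`, and skip the commented `ssl_ciphers`, `ssl_prefer_server_ciphers`, `ssl_ecdh_curve` and `ssl_crl_file` lines.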

2) Enable hostssl authentication in sys_hba.conf

# "local" is for Unix domain socket connections only
local   all             all                                     md5
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
#host    all             all             0.0.0.0/0               md5
hostssl    all             all             0.0.0.0/0            md5 clientcert=1

3. Configure the client SSL certificates (primary and standby)

[kingbase@srv1 .kingbase]$ ls -lh
总用量 20K
drw------- 3 kingbase kingbase   43 8月  11 2020 deploy
-rw------- 1 kingbase kingbase  944 3月  25 10:58 kingbase.crt
-rw------- 1 kingbase kingbase  891 3月  25 10:58 kingbase.key
-rw------- 1 kingbase kingbase  637 3月  25 10:58 kingbase.pk8
-r-------- 1 kingbase kingbase 4.2K 3月  25 10:58 root.crt

4. Restart the database service (primary and standby)
[kingbase@srv1 data]$ sys_ctl restart -D ../data

5. Client connection test:

Accessing the service on port 54321:

[kingbase@srv1 data]$ ksql -h 192.168.2.2 -U system -W 123456 prod
ksql (V008R003C002B0061)
SSL connection (protocol: TLSv1, cipher: DHE-RSA-AES256-SHA, bits: 256, compression: on)
Type "help" for help.

prod=#

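Accessing the service on port 9999 (fails; the error says there is no sys_hba.conf entry for this host, user and database with SSL off):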

[kingbase@srv1 data]$ ksql -h 192.168.2.253 -U SYSTEM -W 123456 prod -p 9999
ksql:
致命错误:  没有用于主机 "192.168.2.2", 用户 "SYSTEM", 数据库 "prod", SSL 关闭 的 sys_hba.conf 记录

6. One-command cluster restart test with kingbase_monitor.sh

1) Restart the cluster

[kingbase@srv1 data]$ cd ../bin
[kingbase@srv1 bin]$ ./kingbase_monitor.sh restart
-----------------------------------------------------------------------
2021-03-24 15:30:24 KingbaseES automation beging...
2021-03-24 15:30:24 stop kingbasecluster [192.168.2.2] ...
DEL VIP NOW AT 2021-03-24 15:30:25 ON enp0s8
No VIP on my dev, nothing to do.
......................
all started..
...
now we check again
=======================================================================
|             ip |                       program|              [status]
[    192.168.2.2]|             [kingbasecluster]|              [active]
[    192.168.2.3]|             [kingbasecluster]|              [active]
[    192.168.2.2]|                    [kingbase]|              [active]
[    192.168.2.3]|                    [kingbase]|              [active]
=======================================================================

=The cluster starts normally.=

2) Check the primary/standby streaming replication status (no standby is found)

[kingbase@srv1 bin]$ ksql -h 192.168.2.2 -U system -W 123456 TEST
ksql (V008R003C002B0061)
Type "help" for help.

TEST=# select version();
                                                      version
--------------------------------------------------------------------------------------------------------------------
 Kingbase V008R003C002B0061 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

TEST=# select * from sys_stat_replication;
  pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start
  | backend_xmin |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_s
te
-------+----------+---------+------------------+-------------+-----------------+-------------+-----------------------


(0 row)

3) Check the logs

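Cluster log (cluster.log); the authentication errors report that there is no sys_hba.conf entry for user SUPERMANAGER_V8ADMIN on database TEMPLATE2 with SSL off: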

---- 2021年 03月 25日 星期四 10:26:38 CST monitor up ----
2021-03-25 10:26:38: pid 2138: LOG:  Backend status file /home/kingbase/cluster/kdb/run/kingbasecluster/kingbasecluster_status does not exist
2021-03-25 10:26:38: pid 2138: LOG:  waiting for watchdog to initialize
2021-03-25 10:26:38: pid 2183: LOG:  setting the local watchdog node name to "192.168.2.3:9999 Linux srv2"
2021-03-25 10:26:38: pid 2183: LOG:  watchdog cluster is configured with 1 remote nodes
2021-03-25 10:26:38: pid 2183: LOG:  watchdog remote node:0 on 192.168.2.2:9000
2021-03-25 10:26:38: pid 2183: LOG:  interface monitoring is disabled in watchdog
2021-03-25 10:26:38: pid 2183: LOG:  watchdog is configured to use authentication, but kingbasecluster is built without SSL support
2021-03-25 10:26:38: pid 2183: DETAIL:  The authentication method used by kingbasecluster without the SSL support is known to be weak

2021-03-25 10:27:22: pid 26671: ERROR:  failed to authenticate
2021-03-25 10:27:22: pid 26671: DETAIL:  没有用于主机 "192.168.2.2", 用户 "SUPERMANAGER_V8ADMIN", 数据库 "TEMPLATE2", SSL 关闭 的 sys_hba.conf 记录
2021-03-25 10:27:22: pid 26671: ERROR:  failed to authenticate
2021-03-25 10:27:22: pid 26671: DETAIL:  没有用于主机 "192.168.2.2", 用户 "SUPERMANAGER_V8ADMIN", 数据库 "TEMPLATE2", SSL 关闭 的 sys_hba.conf 记录

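Database log (sys_log); the same "no sys_hba.conf entry, SSL off" rejections appear for SUPERMANAGER_V8ADMIN and system, along with one failed SSL handshake ("tlsv1 alert unknown ca"):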

2021-03-25 10:27:04.423 CST,"SUPERMANAGER_V8ADMIN","TEST",27257,"192.168.2.2:25561",605bf4f8.6a79,1,"authentication",2021-03-25 10:27:04 CST,4/44,0,致命错误,28000,"没有用于主机 ""192.168.2.2"", 用户 ""SUPERMANAGER_V8ADMIN"", 数据库 ""TEST"", SSL 关闭 的 sys_hba.conf 记录",,,,,,,,,""
2021-03-25 10:27:05.442 CST,"SUPERMANAGER_V8ADMIN","TEMPLATE2",27300,"192.168.2.2:25565",605bf4f9.6aa4,1,"authentication",2021-03-25 10:27:05 CST,4/45,0,致命错误,28000,"没有用于主机 ""192.168.2.2"", 用户 ""SUPERMANAGER_V8ADMIN"", 数据库 ""TEMPLATE2"", SSL 关闭 的 sys_hba.conf 记录",,,,,,,,,""
2021-03-25 10:27:22.687 CST,,,27525,"192.168.2.2:25609",605bf50a.6b85,1,"",2021-03-25 10:27:22 CST,,0,日志,08P01,"无法访问 SSL 联接: tlsv1 alert unknown ca",,,,,,,,,""
2021-03-25 10:27:22.693 CST,"system","TEST",27526,"192.168.2.2:25611",605bf50a.6b86,1,"authentication",2021-03-25 10:27:22 CST,3/24,0,致命错误,28000,"没有用于主机 ""192.168.2.2"", 用户 ""system"", 数据库 ""TEST"", SSL 关闭 的 sys_hba.conf 记录",,,,,,,,,""
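
The log entries above identify which user and database combinations were rejected for connecting without SSL. A small filter can collect them from either log. This is a minimal sketch: `rejected_nonssl_pairs` is a hypothetical helper name, and it matches the zh_CN message text exactly as it appears in these logs, so it would need adjusting for a server running with another message locale.

```shell
#!/usr/bin/env bash
# rejected_nonssl_pairs < LOGFILE
# Read cluster.log or sys_log text on stdin and print the unique
# "DATABASE USER" pairs that were rejected with "SSL off".
# Hypothetical helper; tied to the zh_CN wording shown in this post.
rejected_nonssl_pairs() {
    LC_ALL=C grep -F 'sys_hba.conf' |
        # "+ accepts both "name" (cluster.log) and ""name"" (CSV sys_log);
        # >? tolerates a terminal line-wrap marker before the word.
        LC_ALL=C sed -n -E \
            's/.*用户 "+([^"]+)"+, 数据库 "+([^"]+)"+, SSL >?关闭.*/\2 \1/p' |
        LC_ALL=C sort -u
}
```

Each printed pair is a candidate for a non-SSL `host` rule in sys_hba.conf; whether to add such a rule is still a manual decision.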

7. Configure sys_hba.conf to bypass SSL authentication

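=Connections that reach the database through port 9999 cannot use SSL authentication, so add rules that exempt the SUPERMANAGER_V8ADMIN and SYSTEM users from SSL authentication when they connect through port 9999.=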

1) Edit sys_hba.conf

[kingbase@srv1 data]$ cat sys_hba.conf


# "local" is for Unix domain socket connections only
local   all             all                                     md5
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
#host    all             all             0.0.0.0/0               md5

host     TEMPLATE2     SUPERMANAGER_V8ADMIN  0.0.0.0/0            md5
host     TEST          SUPERMANAGER_V8ADMIN  0.0.0.0/0            md5
host     TEST          SYSTEM             0.0.0.0/0              md5

hostssl all             all             0.0.0.0/0               md5 clientcert=1
......
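
The workaround relies on rule order: the plain `host` lines sit above the `hostssl` line, so a non-SSL connection from the listed users matches them first. The following sketch simulates that first-match lookup. It is an illustration only, assuming the file follows the PostgreSQL pg_hba.conf format it resembles: `hba_first_match` is a hypothetical helper name, it handles only IPv4 CIDR addresses, compares database and user names exactly, and ignores `local` lines and options such as `clientcert`.

```shell
#!/usr/bin/env bash
# hba_first_match FILE CONN IP DB USER
# CONN is "ssl" or "nossl". Prints the first rule in FILE that would match
# the connection; returns 1 if none does (the "no sys_hba.conf entry" case).
# Hypothetical helper; simplified matching, IPv4 CIDR only.
hba_first_match() {
    awk -v conn="$2" -v ip="$3" -v db="$4" -v user="$5" '
        function ip2int(a,   p) {
            split(a, p, ".")
            return ((p[1] * 256 + p[2]) * 256 + p[3]) * 256 + p[4]
        }
        function in_cidr(a, cidr,   c, size) {
            split(cidr, c, "/")
            size = 2 ^ (32 - c[2])          # number of addresses in the block
            return int(ip2int(a) / size) == int(ip2int(c[1]) / size)
        }
        /^[ \t]*(#|$)/ { next }             # skip comments and blank lines
        ($1 == "host") || ($1 == "hostssl" && conn == "ssl") ||
        ($1 == "hostnossl" && conn == "nossl") {
            if (($2 == "all" || $2 == db) && ($3 == "all" || $3 == user) &&
                in_cidr(ip, $4)) {
                print; found = 1; exit
            }
        }
        END { exit !found }
    ' "$1"
}
```

With the file above, `hba_first_match sys_hba.conf nossl 192.168.2.2 TEMPLATE2 SUPERMANAGER_V8ADMIN` should print the `host TEMPLATE2 SUPERMANAGER_V8ADMIN` rule, while the same call against the step 2 configuration finds no rule, which corresponds to the rejection seen in the logs.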

2) Test communication on port 9999

[kingbase@srv1 data]$ ksql -h 192.168.2.2 -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0061)
Type "help" for help.
TEST=# 

[kingbase@srv2 etc]$ ksql -h 192.168.2.2 -U  SUPERMANAGER_V8ADMIN -W KINGBASEADMIN TEMPLATE2 -p 9999
ksql (V008R003C002B0061)
Type "help" for help.

TEMPLATE2=#

8. Restart the kingbasecluster service and test

1) Cluster restart test

[kingbase@srv1 bin]$ ./kingbase_monitor.sh restart
-----------------------------------------------------------------------
........
start crontab kingbase position : [1]
Redirecting to /bin/systemctl restart crond.service
ADD VIP NOW AT 2021-03-25 14:17:03 ON enp0s8
execute: [/sbin/ip addr add 192.168.2.254/24 dev enp0s8 label enp0s8:2]
execute: /sbin/arping -U 192.168.2.254 -I enp0s8 -w 1
ARPING 192.168.2.254 from 192.168.2.254 enp0s8
Sent 1 probes (1 broadcast(s))
Received 0 response(s)
start crontab kingbase position : [1]
Redirecting to /bin/systemctl restart crond.service
wait kingbase recovery 5 sec...
start crontab kingbasecluster line number: [2]
Redirecting to /bin/systemctl restart crond.service
start crontab kingbasecluster line number: [2]
Redirecting to /bin/systemctl restart crond.service
......................
all started..
...
now we check again
=======================================================================
|             ip |                       program|              [status]
[    192.168.2.2]|             [kingbasecluster]|              [active]
[    192.168.2.3]|             [kingbasecluster]|              [active]
[    192.168.2.2]|                    [kingbase]|              [active]
[    192.168.2.3]|                    [kingbase]|              [active]
=======================================================================

2) Check the primary/standby streaming replication status (replication is normal)

[kingbase@srv1 kdb]$ ksql -h 192.168.2.2 -U  SUPERMANAGER_V8ADMIN -W KINGBASEADMIN TEMPLATE2 -p 9999
ksql (V008R003C002B0061)
Type "help" for help.

TEMPLATE2=# show pool_nodes;
 node_id |  hostname   | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.2.2 | 54321 | up     | 0.500000  | primary | 0          | false             | 0
 1       | 192.168.2.3 | 54321 | up     | 0.500000  | standby | 0          | true              | 0
(2 rows)

TEMPLATE2=# select * from sys_stat_replication;
  pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start
    | backend_xmin |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync
_state
-------+----------+---------+------------------+-------------+-----------------+-------------+-----------------------
 14018 |       10 | SYSTEM  | node2            | 192.168.2.3 |                 |       36880 | 2021-03-25 14:30:37.096040
+08 |              | streaming | 0/57000A70    | 0/57000A70     | 0/57000A70     | 0/57000A70      |             0 | asyn
c
(1 row)

3) Client database connection test (connecting to the VIP, with SSL authentication enabled)

[kingbase@srv1 data]$ ksql -h 192.168.2.254 -U system -W 123456 prod
ksql (V008R003C002B0061)
SSL connection (protocol: TLSv1, cipher: DHE-RSA-AES256-SHA, bits: 256, compression: on)
Type "help" for help.

prod=# \d
            List of relations
 Schema |    Name    |   Type   | Owner
--------+------------+----------+--------
 PUBLIC | a          | table    | SYSTEM
 PUBLIC | sys_log    | table    | SYSTEM
 PUBLIC | t1         | table    | SYSTEM
......
(11 rows)

prod=# select * from t1 limit 10;
 id |   name
----+----------
 10 | tom
......
 80 | ellen
(10 rows)

Accessing port 9999 through the cluster VIP:

[kingbase@srv1 data]$ ksql -h 192.168.2.253 -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0061)
Type "help" for help.

TEST=# show pool_nodes;
 node_id |  hostname   | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.2.2 | 54321 | up     | 0.500000  | primary | 1          | false             | 0
 1       | 192.168.2.3 | 54321 | up     | 0.500000  | standby | 0          | true              | 0
(2 rows)

9. Cluster switchover test

1) Kill the database service on the primary

[kingbase@srv1 data]$ ps -ef |grep kingbase

kingbase 13805     1  0 14:30 ?        00:00:02 /home/kingbase/cluster/kdb/db/bin/kingbase -D /home/kingbase/cluster/kdb/db/data
kingbase 13806 13805  0 14:30 ?        00:00:00 kingbase: logger process
kingbase 13808 13805  0 14:30 ?        00:00:00 kingbase: checkpointer process
kingbase 13809 13805  0 14:30 ?        00:00:00 kingbase: writer process
kingbase 13810 13805  0 14:30 ?        00:00:00 kingbase: wal writer process
kingbase 13811 13805  0 14:30 ?        00:00:00 kingbase: autovacuum launcher process
kingbase 13812 13805  0 14:30 ?        00:00:00 kingbase: archiver process
kingbase 13813 13805  0 14:30 ?        00:00:00 kingbase: stats collector process
kingbase 13814 13805  0 14:30 ?        00:00:00 kingbase: bgworker: syslogical supervisor
kingbase 14018 13805  0 14:30 ?        00:00:00 kingbase: wal sender process SYSTEM 192.168.2.3(36880) streaming 0/57000A70

[kingbase@srv1 data]$ kill -9 13805

2) Check the switchover result

[kingbase@srv2 data]$ ksql -h 192.168.2.3 -U system -W 123456 prod
ksql (V008R003C002B0061)
SSL connection (protocol: TLSv1, cipher: DHE-RSA-AES256-SHA, bits: 256, compression: on)
Type "help" for help.

prod=# select sys_is_in_recovery();
 sys_is_in_recovery
--------------------
 f
(1 row)


[kingbase@srv2 log]$ ksql -h 192.168.2.3 -U SYSTEM -W 123456 -p 9999 TEST
ksql (V008R003C002B0061)
Type "help" for help.

TEST=# show pool_nodes;
 node_id |  hostname   | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.2.2 | 54321 | down   | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.2.3 | 54321 | up     | 0.500000  | primary | 0          | true              | 0
(2 rows)

=The output above shows that the primary/standby switchover succeeded.=
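
The `show pool_nodes` check used throughout this test can be condensed into a single line for before/after comparison. This is a minimal sketch, assuming the nine-column table layout shown in this post; `pool_summary` is a hypothetical helper name and it reads saved `show pool_nodes` output from stdin rather than connecting to the cluster itself.

```shell
#!/usr/bin/env bash
# pool_summary < POOL_NODES_OUTPUT
# Summarize "show pool_nodes" output: which host is the live primary and
# how many nodes are up or down.
# Hypothetical helper; assumes the 9-column layout shown in this post.
pool_summary() {
    awk -F'|' '
        # Data rows have 9 pipe-separated columns and a numeric node_id;
        # the header and separator lines do not.
        NF == 9 && $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            gsub(/[ \t]/, "", $2)   # hostname
            gsub(/[ \t]/, "", $4)   # status
            gsub(/[ \t]/, "", $6)   # role
            if ($4 == "up") up++; else down++
            if ($6 == "primary" && $4 == "up") primary = $2
        }
        END { printf "primary=%s up=%d down=%d\n", primary, up, down }
    '
}
```

Feeding it the table captured after the kill should report 192.168.2.3 as primary with one node up and one down, whereas the table from step 8 reports 192.168.2.2 as primary with both nodes up.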

Appendix: the following error appears in cluster.log

posted @ 2022-04-20 14:54  天涯客1224