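KingbaseES R3 Cluster: Configuring SSL
Case description:
This test was run in a non-production environment. The vendor has not explicitly stated that KingbaseCluster supports SSL, so this configuration should be used only in test environments and should not be applied directly in production.
Database version: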
TEST=# select version();
version
--------------------------------------------------------------------------------------------------------------------
Kingbase V008R003C002B0061 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
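Test environment:
kingbasecluster SSL test summary:
1. Client access to the database (port 54321) supports SSL-authenticated connections (using client certificates).
2. When the cluster (kingbasecluster) health-checks the backend databases through port 9999, it cannot connect with SSL authentication.
3. sys_hba.conf must be configured so that the SUPERMANAGER_V8ADMIN and SYSTEM users are exempted from SSL when they access the database.
sys_hba.conf configuration:
The test procedure is as follows:
1. Generate the server certificates
=Copy the server certificates provided with the product into the data directory and set their permissions (on both the primary and the standby).=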
[kingbase@srv1 soft]$ ls -lh bmjcert
总用量 32K
-rw-r--r-- 1 kingbase kingbase 944 11月 6 16:18 kingbase.crt
-rw-r--r-- 1 kingbase kingbase 891 11月 6 16:18 kingbase.key
-rw-r--r-- 1 kingbase kingbase 637 11月 6 16:18 kingbase.pk8
-rw-r--r-- 1 kingbase kingbase 4.2K 11月 6 16:18 root.crt
-rw-r--r-- 1 kingbase kingbase 4.2K 11月 6 16:18 server.crt
-rw-r--r-- 1 kingbase kingbase 1.7K 11月 6 16:18 server.key
Server certificate files on the primary:
[kingbase@srv1 soft]$ cd /home/kingbase/cluster/kdb/db/data/
[kingbase@srv1 data]$ chmod 400 server.*
[kingbase@srv1 data]$ chmod 400 root.crt
[kingbase@srv1 data]$ ls -lh server.* root.crt
-r-------- 1 kingbase kingbase 4.2K 3月 25 10:20 root.crt
-r-------- 1 kingbase kingbase 4.2K 3月 25 10:20 server.crt
-r-------- 1 kingbase kingbase 1.7K 3月 25 10:21 server.key
Server certificate files on the standby:
[kingbase@srv2 cluster]$ cd kdb/db/data
[kingbase@srv2 data]$ ls -lh server.crt server.key root.crt
-r-------- 1 kingbase kingbase 4.2K 3月 25 10:21 root.crt
-r-------- 1 kingbase kingbase 4.2K 3月 25 10:21 server.crt
-r-------- 1 kingbase kingbase 1.7K 3月 25 10:21 server.key
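The transcripts above show only the chmod step, not the copy itself. A minimal sketch of the copy-and-restrict step follows; the helper name `install_server_certs` and the throwaway demo directories are assumptions made for illustration, not part of KingbaseES.

```shell
#!/usr/bin/env bash
set -eu

# install_server_certs SRC_DIR DATA_DIR
# Hypothetical helper (not a KingbaseES tool): copies the three server-side
# files into the data directory and makes them read-only for the owner.
install_server_certs() {
    local src=$1 data=$2 f
    for f in server.crt server.key root.crt; do
        # install(1) copies the file and sets its mode in one step.
        install -m 0400 "$src/$f" "$data/$f"
    done
}

# Demo on throwaway directories with placeholder files, so nothing real is touched.
src_dir=$(mktemp -d)
data_dir=$(mktemp -d)
for f in server.crt server.key root.crt; do
    echo "placeholder" > "$src_dir/$f"
done

install_server_certs "$src_dir" "$data_dir"

# stat -c is the GNU coreutils form (Linux); it prints the octal mode.
key_mode=$(stat -c '%a' "$data_dir/server.key")
echo "server.key mode: $key_mode"
```

On a real node the two arguments would be the bmjcert directory and the data directory /home/kingbase/cluster/kdb/db/data, run once on the primary and once on the standby.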
2. Enable SSL in the database (primary and standby)
1) Configure kingbase.conf
[kingbase@srv1 data]$ cat kingbase.conf |grep ssl
ssl = on # (change requires restart)
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
#ssl_prefer_server_ciphers = on # (change requires restart)
#ssl_ecdh_curve = 'prime256v1' # (change requires restart)
ssl_cert_file = 'server.crt' # (change requires restart)
ssl_key_file = 'server.key' # (change requires restart)
ssl_ca_file = 'root.crt' # (change requires restart)
#ssl_crl_file = '' # (change requires restart)
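Because the grep above also prints commented-out lines, it can be useful to confirm which SSL settings are actually active. The sketch below uses a hypothetical helper, `conf_value`, and a sample file copied from the excerpt; it assumes the simple `name = value # comment` layout shown above and is not a full parser for kingbase.conf.

```shell
#!/usr/bin/env bash

# conf_value FILE NAME
# Hypothetical helper: print the value of the last uncommented "NAME = value"
# line in FILE, with surrounding single quotes removed. Prints nothing if the
# setting is absent or commented out.
conf_value() {
    sed -n -E "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*'?([^'#[:space:]]*)'?.*/\1/p" "$1" | tail -n 1
}

# Sample lines copied from the kingbase.conf excerpt.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
ssl = on # (change requires restart)
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
ssl_cert_file = 'server.crt' # (change requires restart)
ssl_key_file = 'server.key' # (change requires restart)
ssl_ca_file = 'root.crt' # (change requires restart)
EOF

for name in ssl ssl_cert_file ssl_key_file ssl_ca_file ssl_ciphers; do
    printf '%s -> %s\n' "$name" "$(conf_value "$sample_conf" "$name")"
done
```

In this sample, `ssl_ciphers` comes back empty because its line is commented out, which is how an inactive setting can be told apart from an active one.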
2) Enable hostssl authentication in sys_hba.conf
# "local" is for Unix domain socket connections only
local all all md5
# IPv4 local connections:
host all all 127.0.0.1/32 md5
#host all all 0.0.0.0/0 md5
hostssl all all 0.0.0.0/0 md5 clientcert=1
3. Configure the client SSL certificates (primary and standby)
[kingbase@srv1 .kingbase]$ ls -lh
总用量 20K
drw------- 3 kingbase kingbase 43 8月 11 2020 deploy
-rw------- 1 kingbase kingbase 944 3月 25 10:58 kingbase.crt
-rw------- 1 kingbase kingbase 891 3月 25 10:58 kingbase.key
-rw------- 1 kingbase kingbase 637 3月 25 10:58 kingbase.pk8
-r-------- 1 kingbase kingbase 4.2K 3月 25 10:58 root.crt
4. Restart the database service (primary and standby)
[kingbase@srv1 data]$ sys_ctl restart -D ../data
5. Client connection test:
Access the service on port 54321:
[kingbase@srv1 data]$ ksql -h 192.168.2.2 -U system -W 123456 prod
ksql (V008R003C002B0061)
SSL connection (protocol: TLSv1, cipher: DHE-RSA-AES256-SHA, bits: 256, compression: on)
Type "help" for help.
prod=#
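Access the service on port 9999 (fails: the server reports that no sys_hba.conf entry matches this host, user, and database with SSL off):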
[kingbase@srv1 data]$ ksql -h 192.168.2.253 -U SYSTEM -W 123456 prod -p 9999
ksql:
致命错误: 没有用于主机 "192.168.2.2", 用户 "SYSTEM", 数据库 "prod", SSL 关闭 的 sys_hba.conf 记录
6. One-step cluster restart test with kingbase_monitor.sh
1) Restart the cluster
[kingbase@srv1 data]$ cd ../bin
[kingbase@srv1 bin]$ ./kingbase_monitor.sh restart
-----------------------------------------------------------------------
2021-03-24 15:30:24 KingbaseES automation beging...
2021-03-24 15:30:24 stop kingbasecluster [192.168.2.2] ...
DEL VIP NOW AT 2021-03-24 15:30:25 ON enp0s8
No VIP on my dev, nothing to do.
......................
all started..
...
now we check again
=======================================================================
| ip | program| [status]
[ 192.168.2.2]| [kingbasecluster]| [active]
[ 192.168.2.3]| [kingbasecluster]| [active]
[ 192.168.2.2]| [kingbase]| [active]
[ 192.168.2.3]| [kingbase]| [active]
=======================================================================
=The cluster started normally.=
2) Check the primary/standby streaming replication status (no standby is found)
[kingbase@srv1 bin]$ ksql -h 192.168.2.2 -U system -W 123456 TEST
ksql (V008R003C002B0061)
Type "help" for help.
TEST=# select version();
version
--------------------------------------------------------------------------------------------------------------------
Kingbase V008R003C002B0061 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
TEST=# select * from sys_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
----+----------+---------+------------------+-------------+-----------------+-------------+---------------+--------------+-------+---------------+----------------+----------------+-----------------+---------------+-----------
(0 row)
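3) Check the logs
Cluster log (cluster.log); the DETAIL lines report that no sys_hba.conf entry matches the host, user, and database with SSL off: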
---- 2021年 03月 25日 星期四 10:26:38 CST monitor up ----
2021-03-25 10:26:38: pid 2138: LOG: Backend status file /home/kingbase/cluster/kdb/run/kingbasecluster/kingbasecluster_status does not exist
2021-03-25 10:26:38: pid 2138: LOG: waiting for watchdog to initialize
2021-03-25 10:26:38: pid 2183: LOG: setting the local watchdog node name to "192.168.2.3:9999 Linux srv2"
2021-03-25 10:26:38: pid 2183: LOG: watchdog cluster is configured with 1 remote nodes
2021-03-25 10:26:38: pid 2183: LOG: watchdog remote node:0 on 192.168.2.2:9000
2021-03-25 10:26:38: pid 2183: LOG: interface monitoring is disabled in watchdog
2021-03-25 10:26:38: pid 2183: LOG: watchdog is configured to use authentication, but kingbasecluster is built without SSL support
2021-03-25 10:26:38: pid 2183: DETAIL: The authentication method used by kingbasecluster without the SSL support is known to be weak
2021-03-25 10:27:22: pid 26671: ERROR: failed to authenticate
2021-03-25 10:27:22: pid 26671: DETAIL: 没有用于主机 "192.168.2.2", 用户 "SUPERMANAGER_V8ADMIN", 数据库 "TEMPLATE2", SSL 关闭 的 sys_hba.conf 记录
2021-03-25 10:27:22: pid 26671: ERROR: failed to authenticate
2021-03-25 10:27:22: pid 26671: DETAIL: 没有用于主机 "192.168.2.2", 用户 "SUPERMANAGER_V8ADMIN", 数据库 "TEMPLATE2", SSL 关闭 的 sys_hba.conf 记录
Database log (sys_log), showing the same rejections plus one failed SSL handshake (tlsv1 alert unknown ca):
2021-03-25 10:27:04.423 CST,"SUPERMANAGER_V8ADMIN","TEST",27257,"192.168.2.2:25561",605bf4f8.6a79,1,"authentication",2021-03-25 10:27:04 CST,4/44,0,致命错误,28000,"没有用于主机 ""192.168.2.2"", 用户 ""SUPERMANAGER_V8ADMIN"", 数据库 ""TEST"", SSL 关闭 的 sys_hba.conf 记录",,,,,,,,,""
2021-03-25 10:27:05.442 CST,"SUPERMANAGER_V8ADMIN","TEMPLATE2",27300,"192.168.2.2:25565",605bf4f9.6aa4,1,"authentication",2021-03-25 10:27:05 CST,4/45,0,致命错误,28000,"没有用于主机 ""192.168.2.2"", 用户 ""SUPERMANAGER_V8ADMIN"", 数据库 ""TEMPLATE2"", SSL 关闭 的 sys_hba.conf 记录",,,,,,,,,""
2021-03-25 10:27:22.687 CST,,,27525,"192.168.2.2:25609",605bf50a.6b85,1,"",2021-03-25 10:27:22 CST,,0,日志,08P01,"无法访问 SSL 联接: tlsv1 alert unknown ca",,,,,,,,,""
2021-03-25 10:27:22.693 CST,"system","TEST",27526,"192.168.2.2:25611",605bf50a.6b86,1,"authentication",2021-03-25 10:27:22 CST,3/24,0,致命错误,28000,"没有用于主机 ""192.168.2.2"", 用户 ""system"", 数据库 ""TEST"", SSL 关闭 的 sys_hba.conf 记录",,,,,,,,,""
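To decide which bypass rules are needed, it helps to reduce the repeated rejections in cluster.log to the distinct database, user, and host combinations. The following is a rough sketch under the assumption that the DETAIL lines keep the format shown in the cluster.log excerpt above, with host, user, and database as the first three double-quoted fields; `rejected_triples` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# rejected_triples LOGFILE
# Hypothetical helper: print each distinct "database user host" combination
# that was rejected for lack of a sys_hba.conf entry.
rejected_triples() {
    # Splitting on double quotes puts host in $2, user in $4, database in $6.
    grep 'sys_hba.conf' "$1" \
        | awk -F'"' 'NF >= 7 { print $6, $4, $2 }' \
        | sort -u
}

# Demo on lines copied from the cluster.log excerpt.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2021-03-25 10:27:22: pid 26671: ERROR: failed to authenticate
2021-03-25 10:27:22: pid 26671: DETAIL: 没有用于主机 "192.168.2.2", 用户 "SUPERMANAGER_V8ADMIN", 数据库 "TEMPLATE2", SSL 关闭 的 sys_hba.conf 记录
2021-03-25 10:27:22: pid 26671: ERROR: failed to authenticate
2021-03-25 10:27:22: pid 26671: DETAIL: 没有用于主机 "192.168.2.2", 用户 "SUPERMANAGER_V8ADMIN", 数据库 "TEMPLATE2", SSL 关闭 的 sys_hba.conf 记录
EOF

result=$(rejected_triples "$sample_log")
echo "$result"
rm -f "$sample_log"
```

The sys_log lines use a CSV layout with doubled quotes, so this field positions would not apply to them without adjustment.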
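7. Configure sys_hba.conf to bypass SSL authentication
=Connections that go through port 9999 cannot use SSL authentication when they reach the database, so add rules that exempt the SUPERMANAGER_V8ADMIN and SYSTEM users from SSL authentication for access through port 9999.=
1) Edit sys_hba.conf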
[kingbase@srv1 data]$ cat sys_hba.conf
# "local" is for Unix domain socket connections only
local all all md5
# IPv4 local connections:
host all all 127.0.0.1/32 md5
#host all all 0.0.0.0/0 md5
host TEMPLATE2 SUPERMANAGER_V8ADMIN 0.0.0.0/0 md5
host TEST SUPERMANAGER_V8ADMIN 0.0.0.0/0 md5
host TEST SYSTEM 0.0.0.0/0 md5
hostssl all all 0.0.0.0/0 md5 clientcert=1
......
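The bypass rules above accept the exempted users from any address (0.0.0.0/0). If sys_hba.conf follows PostgreSQL's pg_hba.conf semantics, a `host` record matches connections with or without SSL, so these rules also let any client connect as those users without a client certificate. A possibly tighter variant, not tested in this case, restricts the bypass to the cluster nodes. It assumes kingbasecluster connects to the databases from 192.168.2.2 and 192.168.2.3, as the host in the log errors suggests; if connections arrive from a VIP address, that address would need entries as well.

```
# Untested sketch: same bypass rules, limited to the two cluster node addresses.
host TEMPLATE2 SUPERMANAGER_V8ADMIN 192.168.2.2/32 md5
host TEMPLATE2 SUPERMANAGER_V8ADMIN 192.168.2.3/32 md5
host TEST SUPERMANAGER_V8ADMIN 192.168.2.2/32 md5
host TEST SUPERMANAGER_V8ADMIN 192.168.2.3/32 md5
host TEST SYSTEM 192.168.2.2/32 md5
host TEST SYSTEM 192.168.2.3/32 md5
hostssl all all 0.0.0.0/0 md5 clientcert=1
```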
2) Test communication on port 9999
[kingbase@srv1 data]$ ksql -h 192.168.2.2 -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0061)
Type "help" for help.
TEST=#
[kingbase@srv2 etc]$ ksql -h 192.168.2.2 -U SUPERMANAGER_V8ADMIN -W XXX TEMPLATE2 -p 9999
ksql (V008R003C002B0061)
Type "help" for help.
TEMPLATE2=#
8. Restart the kingbasecluster service and test
1) Restart the cluster
[kingbase@srv1 bin]$ ./kingbase_monitor.sh restart
-----------------------------------------------------------------------
........
start crontab kingbase position : [1]
Redirecting to /bin/systemctl restart crond.service
ADD VIP NOW AT 2021-03-25 14:17:03 ON enp0s8
execute: [/sbin/ip addr add 192.168.2.254/24 dev enp0s8 label enp0s8:2]
execute: /sbin/arping -U 192.168.2.254 -I enp0s8 -w 1
ARPING 192.168.2.254 from 192.168.2.254 enp0s8
Sent 1 probes (1 broadcast(s))
Received 0 response(s)
start crontab kingbase position : [1]
Redirecting to /bin/systemctl restart crond.service
wait kingbase recovery 5 sec...
start crontab kingbasecluster line number: [2]
Redirecting to /bin/systemctl restart crond.service
start crontab kingbasecluster line number: [2]
Redirecting to /bin/systemctl restart crond.service
......................
all started..
...
now we check again
=======================================================================
| ip | program| [status]
[ 192.168.2.2]| [kingbasecluster]| [active]
[ 192.168.2.3]| [kingbasecluster]| [active]
[ 192.168.2.2]| [kingbase]| [active]
[ 192.168.2.3]| [kingbase]| [active]
=======================================================================
2) Check the primary/standby streaming replication status (replication is normal)
[kingbase@srv1 kdb]$ ksql -h 192.168.2.2 -U SUPERMANAGER_V8ADMIN -W XXX TEMPLATE2 -p 9999
ksql (V008R003C002B0061)
Type "help" for help.
TEMPLATE2=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.2.2 | 54321 | up | 0.500000 | primary | 0 | false | 0
1 | 192.168.2.3 | 54321 | up | 0.500000 | standby | 0 | true | 0
(2 rows)
TEMPLATE2=# select * from sys_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+-----------
14018 | 10 | SYSTEM | node2 | 192.168.2.3 | | 36880 | 2021-03-25 14:30:37.096040+08 | | streaming | 0/57000A70 | 0/57000A70 | 0/57000A70 | 0/57000A70 | 0 | async
(1 row)
3) Client connection test (connecting to the VIP with SSL authentication enabled)
[kingbase@srv1 data]$ ksql -h 192.168.2.254 -U system -W 123456 prod
ksql (V008R003C002B0061)
SSL connection (protocol: TLSv1, cipher: DHE-RSA-AES256-SHA, bits: 256, compression: on)
Type "help" for help.
prod=# \d
List of relations
Schema | Name | Type | Owner
--------+------------+----------+--------
PUBLIC | a | table | SYSTEM
PUBLIC | sys_log | table | SYSTEM
PUBLIC | t1 | table | SYSTEM
......
(11 rows)
prod=# select * from t1 limit 10;
id | name
----+----------
10 | tom
......
80 | ellen
(10 rows)
Access port 9999 through the cluster VIP:
[kingbase@srv1 data]$ ksql -h 192.168.2.253 -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0061)
Type "help" for help.
TEST=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.2.2 | 54321 | up | 0.500000 | primary | 1 | false | 0
1 | 192.168.2.3 | 54321 | up | 0.500000 | standby | 0 | true | 0
(2 rows)
9. Cluster failover test
1) Kill the database service on the primary
[kingbase@srv1 data]$ ps -ef |grep kingbase
kingbase 13805 1 0 14:30 ? 00:00:02 /home/kingbase/cluster/kdb/db/bin/kingbase -D /home/kingbase/cluster/kdb/db/data
kingbase 13806 13805 0 14:30 ? 00:00:00 kingbase: logger process
kingbase 13808 13805 0 14:30 ? 00:00:00 kingbase: checkpointer process
kingbase 13809 13805 0 14:30 ? 00:00:00 kingbase: writer process
kingbase 13810 13805 0 14:30 ? 00:00:00 kingbase: wal writer process
kingbase 13811 13805 0 14:30 ? 00:00:00 kingbase: autovacuum launcher process
kingbase 13812 13805 0 14:30 ? 00:00:00 kingbase: archiver process
kingbase 13813 13805 0 14:30 ? 00:00:00 kingbase: stats collector process
kingbase 13814 13805 0 14:30 ? 00:00:00 kingbase: bgworker: syslogical supervisor
kingbase 14018 13805 0 14:30 ? 00:00:00 kingbase: wal sender process SYSTEM 192.168.2.3(36880) streaming 0/57000A70
[kingbase@srv1 data]$ kill -9 13805
2) Check the failover result
[kingbase@srv2 data]$ ksql -h 192.168.2.3 -U system -W 123456 prod
ksql (V008R003C002B0061)
SSL connection (protocol: TLSv1, cipher: DHE-RSA-AES256-SHA, bits: 256, compression: on)
Type "help" for help.
prod=# select sys_is_in_recovery();
sys_is_in_recovery
--------------------
f
(1 row)
[kingbase@srv2 log]$ ksql -h 192.168.2.3 -U SYSTEM -W 123456 -p 9999 TEST
ksql (V008R003C002B0061)
Type "help" for help.
TEST=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.2.2 | 54321 | down | 0.500000 | standby | 0 | false | 0
1 | 192.168.2.3 | 54321 | up | 0.500000 | primary | 0 | true | 0
(2 rows)
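When the failover check is repeated, the primary can be picked out of saved `show pool_nodes` output with a small filter. This is only a sketch: `primary_node` is a hypothetical helper, and it assumes the column order shown above (node_id, hostname, port, status, lb_weight, role).

```shell
#!/usr/bin/env bash

# primary_node
# Hypothetical helper: read "show pool_nodes" text on stdin and print the
# hostname of every node whose status is "up" and whose role is "primary".
primary_node() {
    awk -F'|' '
        NF >= 6 {
            # Trim the padding around each column before comparing.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($4 == "up" && $6 == "primary") print $2
        }'
}

# Demo on the post-failover output copied from the transcript.
after_failover=$(primary_node <<'EOF'
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.2.2 | 54321 | down | 0.500000 | standby | 0 | false | 0
1 | 192.168.2.3 | 54321 | up | 0.500000 | primary | 0 | true | 0
EOF
)
echo "primary after failover: $after_failover"
```

The header and separator lines are skipped naturally: the header's status column is not "up", and the separator contains no `|` characters.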
=The output above shows that the primary/standby failover succeeded.=
Appendix: the following errors appear in cluster.log