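KingbaseES V8R6 Cluster Operations Case Study: How Blocking su to root for Ordinary Users Affects Cluster Management
Case description:
Cluster management uses root privileges for some operations (for example, the ip and arping commands). For security reasons, some production environments forbid ordinary users from switching to root with su. This case tests how that restriction affects cluster management.
Cluster node information: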
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+------+---------+--------------------
1 | node200 | primary | * running | | running | 4459 | no | n/a
2 | node201 | standby | running | node200 | running | 3106 | no | 0 second(s) ago
Cluster status information:
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node200 | primary | * running | | default | 100 | 17 | host=192.168.8.200 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node201 | standby | running | node200 | default | 100 | 17 | host=192.168.8.201 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
I. Configure the system to block su to root
[kingbase@node1 bin]$ cat /etc/pam.d/su |grep use_uid
#auth sufficient pam_wheel.so trust use_uid
auth required pam_wheel.so use_uid
account sufficient pam_succeed_if.so uid = 0 use_uid quiet
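The check above can also be scripted. The following is a minimal sketch, assuming the restriction is expressed as an uncommented `auth required pam_wheel.so ... use_uid` line as in the listing above; the function name `su_restricted` is hypothetical and not part of KingbaseES or PAM.

```shell
# Hypothetical helper: succeed if the given PAM file contains an
# uncommented "auth required pam_wheel.so ... use_uid" line, which
# limits su to members of the wheel group.
su_restricted() {
    # $1: path to the PAM configuration file (defaults to /etc/pam.d/su)
    pam_file="${1:-/etc/pam.d/su}"
    grep -Eq '^[[:space:]]*auth[[:space:]]+required[[:space:]]+pam_wheel\.so[[:space:]].*use_uid' "$pam_file"
}
```

Running `su_restricted && echo "su is limited to the wheel group"` on a node reports whether the rule is active. Commented-out lines, such as the `sufficient ... trust` line above, are ignored.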
su switch test:
[kingbase@node1 bin]$ su -
Password:
su: Permission denied
II. Cluster management tests
1. Cluster stop test
[kingbase@node1 bin]$ ./sys_monitor.sh stop
2022-12-05 11:37:53 Ready to stop all DB ...
.......
2022-12-05 11:38:07 Done.
# After the cluster stops, the scheduled task in the KINGBASECRON file is commented out automatically
[kingbase@node1 bin]$ cat /etc/cron.d/KINGBASECRON
#*/1 * * * * kingbase . /etc/profile;/home/kingbase/cluster/R6C/R6HA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
2. Cluster start test
[kingbase@node1 bin]$ ./sys_monitor.sh start
2022-12-05 11:38:43 Ready to start all DB ...
......
2022-12-05 11:39:19 repmgrd on "[192.168.8.200]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+------+---------+--------------------
1 | node200 | primary | * running | | running | 4459 | no | n/a
2 | node201 | standby | running | node200 | running | 3106 | no | 0 second(s) ago
[2022-12-05 11:39:34] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6C/R6HA/kingbase/log/kbha.log"
[2022-12-05 11:39:27] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6C/R6HA/kingbase/log/kbha.log"
2022-12-05 11:39:29 Done.
# After the cluster starts, the KINGBASECRON scheduled task is enabled again
[kingbase@node1 bin]$ cat /etc/cron.d/KINGBASECRON
*/1 * * * * kingbase . /etc/profile;/home/kingbase/cluster/R6C/R6HA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
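The two KINGBASECRON listings differ only in the leading `#`. A minimal sketch of a scripted check follows, assuming the entry always invokes `kbha -A daemon` as shown above; `kbha_cron_enabled` is a hypothetical name, not a KingbaseES tool.

```shell
# Hypothetical helper: succeed if the cron file contains an uncommented
# entry that launches "kbha -A daemon".
kbha_cron_enabled() {
    # $1: path to the cron file (defaults to /etc/cron.d/KINGBASECRON)
    cron_file="${1:-/etc/cron.d/KINGBASECRON}"
    # The first non-blank character must not be '#'.
    grep -Eq '^[[:space:]]*[^#[:space:]].*kbha[[:space:]]+-A[[:space:]]+daemon' "$cron_file"
}
```

After `./sys_monitor.sh stop` the function should return non-zero, and after `./sys_monitor.sh start` it should return zero, which matches the two listings.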
3. Primary/standby switchover test
---As shown below, the primary/standby switchover completes normally.
[kingbase@node2 bin]$ ./repmgr standby switchover -h 192.168.8.200 -U esrep -d esrep
WARNING: following problems with command line parameters detected:
database connection parameters not required when executing UNKNOWN ACTION
NOTICE: executing switchover on node "node201" (ID: 2)
.......
INFO: unpause node "node200" (ID 1) successfully
INFO: unpausing repmgrd on node "node201" (ID 2)
INFO: unpause node "node201" (ID 2) successfully
NOTICE: STANDBY SWITCHOVER has completed successfully
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node200 | standby | running | node201 | default | 100 | 17 | host=192.168.8.200 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node201 | primary | * running | | default | 100 | 18 | host=192.168.8.201 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
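To confirm the result of a switchover without reading the table by eye, the Role column can be extracted from the `repmgr cluster show` output. This is a sketch that assumes the pipe-separated layout shown above (ID, Name, Role, ...); `primary_node` is a hypothetical helper name.

```shell
# Hypothetical helper: read "repmgr cluster show" output on stdin and
# print the Name of every node whose Role column is "primary".
primary_node() {
    awk -F'|' '
        { role = $3; gsub(/[[:space:]]/, "", role) }
        role == "primary" { name = $2; gsub(/[[:space:]]/, "", name); print name }
    '
}
```

For example, `./repmgr cluster show | primary_node` run against the table above would print node201.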
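4. Primary/standby failover test
---As shown below, after the primary database is stopped with sys_ctl stop, the primary/standby failover succeeds.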
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node200 | standby | running | node201 | default | 100 | 17 | host=192.168.8.200 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node201 | primary | * running | | default | 100 | 18 | host=192.168.8.201 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
[kingbase@node2 bin]$ ./sys_ctl stop -D ../data
waiting for server to shut down...... done
server stopped
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node200 | primary | * running | | default | 100 | 19 | host=192.168.8.200 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=10 keepalives_idle=10 keepalives_interval=10 keepalives_count=3
2 | node201 | standby | running | node200 | default | 100 | 19 | host=192.168.8.201 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=10 keepalives_idle=10 keepalives_interval=10 keepalives_count=3
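5. repmgrd process management
---As shown below, when the repmgrd process on a node exits abnormally, it is restarted automatically by the kbha process, which is launched by the scheduled task in KINGBASECRON.
# Check the repmgr processes on the node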
[kingbase@node2 sys_log]$ ps -ef |grep repmgr
kingbase 3106 1 0 11:39 ? 00:00:59 /home/kingbase/cluster/R6C/R6HA/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
kingbase 3610 1 0 11:39 ? 00:00:16 /home/kingbase/cluster/R6C/R6HA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
# Simulate an abnormal exit by killing both the repmgrd and kbha processes
[kingbase@node2 sys_log]$ kill -9 3106 3610
# The repmgr processes have been restarted
[kingbase@node2 sys_log]$ ps -ef |grep repmgr
kingbase 14254 1 0 14:28 ? 00:00:00 /home/kingbase/cluster/R6C/R6HA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
kingbase 14878 1 0 14:28 ? 00:00:00 /home/kingbase/cluster/R6C/R6HA/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
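The manual `ps` check can be turned into a small filter. The sketch below assumes both daemons appear in the process list with a full path ending in `/repmgrd` or `/kbha`, as in the output above; `missing_daemons` is a hypothetical name.

```shell
# Hypothetical helper: read a process listing (for example from
# "ps -eo args") on stdin and print each expected daemon that is absent.
missing_daemons() {
    listing=$(cat)
    for daemon in repmgrd kbha; do
        # Match the binary name at the end of a path, followed by a
        # space or the end of the line, so "repmgr.conf" does not count.
        printf '%s\n' "$listing" | grep -Eq "/${daemon}([[:space:]]|\$)" || echo "$daemon"
    done
}
```

Running `ps -eo args | missing_daemons` right after the `kill -9` should list both names, and it should print nothing once the cron task has brought the daemons back.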
6. Physical backup test
---As shown below, the backup initialization with sys_backup.sh init succeeds on the primary.
[kingbase@node1 bin]$ ./sys_backup.sh init
# generate single sys_rman.conf...DONE
# update single archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
# create stanza and check...DONE
# initial first full backup...(maybe several minutes)
# initial first full backup...DONE
# Initial sys_rman OK.
'sys_backup.sh start' should be executed when need back-rest feature.
# Create the scheduled tasks for physical backups
[kingbase@node1 bin]$ ./sys_backup.sh start
Enable some sys_rman in crontab-daemon
Set full-backup in 7 days
Set incr-backup in 1 days
0 2 */7 * * kingbase /home/kingbase/cluster/R6C/R6HA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr6_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=full backup >> /home/kingbase/cluster/R6C/R6HA/kingbase/log/sys_rman_backup_full.log 2>&1
0 4 */1 * * kingbase /home/kingbase/cluster/R6C/R6HA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr6_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=incr backup >> /home/kingbase/cluster/R6C/R6HA/kingbase/log/sys_rman_backup_incr.log 2>&1
# View the scheduled tasks
[kingbase@node1 bin]$ cat /etc/cron.d/KINGBASECRON
*/1 * * * * kingbase . /etc/profile;/home/kingbase/cluster/R6C/R6HA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6C/R6HA/kingbase/bin/../etc/repmgr.conf
0 2 */7 * * kingbase /home/kingbase/cluster/R6C/R6HA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr6_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=full backup >> /home/kingbase/cluster/R6C/R6HA/kingbase/log/sys_rman_backup_full.log 2>&1
0 4 */1 * * kingbase /home/kingbase/cluster/R6C/R6HA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr6_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=incr backup >> /home/kingbase/cluster/R6C/R6HA/kingbase/log/sys_rman_backup_incr.log 2>&1
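A note on the full-backup schedule: in standard cron syntax, a step such as `*/7` in the day-of-month field counts from day 1 within each month, so the interval is not exactly 7 days across a month boundary. The sketch below only illustrates that arithmetic; `cron_step_days` is a hypothetical helper and does not read the cron file.

```shell
# Hypothetical helper: print the days of the month (1-31) selected by a
# cron day-of-month step expression "*/STEP".
cron_step_days() {
    step="$1"
    day=1
    days=""
    while [ "$day" -le 31 ]; do
        days="${days:+$days }$day"
        day=$((day + step))
    done
    echo "$days"
}
```

`cron_step_days 7` prints `1 8 15 22 29`, so the full backup at 02:00 would run on those days of each month, with a shorter gap between day 29 and day 1 of the next month.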
Test the automatic backup run by the scheduled task:
The automatic backup completed:
[kingbase@node1 bin]$ /home/kingbase/cluster/R6C/R6HA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr6_repo/sys_rman.conf --stanza=kingbase info
stanza: kingbase
status: ok
cipher: none
db (current)
wal archive min/max (V008R006C005B0023-1): 000000110000000200000032/000000130000000200000038
full backup: 20221205-113404F
timestamp start/stop: 2022-12-05 11:34:04 / 2022-12-05 11:35:58
wal start/stop: 000000110000000200000033 / 000000110000000200000033
database size: 710.9MB, backup size: 710.9MB
repository size: 54.9MB, repository backup size: 54.9MB
full backup: 20221205-144102F
timestamp start/stop: 2022-12-05 14:41:02 / 2022-12-05 14:42:38
wal start/stop: 000000130000000200000038 / 000000130000000200000038
database size: 807MB, backup size: 807MB
repository size: 61MB, repository backup size: 61MB
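To verify from a script that a new full backup was recorded, the labels can be pulled out of the `info` output. This is a sketch that assumes the `full backup: <label>` line format shown above; `full_backup_labels` is a hypothetical name.

```shell
# Hypothetical helper: read "sys_rman ... info" output on stdin and
# print the label of each full backup, one per line.
full_backup_labels() {
    awk '$1 == "full" && $2 == "backup:" { print $3 }'
}
```

Piping the `sys_rman ... info` command above into `full_backup_labels` would list 20221205-113404F and 20221205-144102F. Comparing the line count before and after a scheduled run shows whether a backup was added.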
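III. Summary
The tests above show that after the system blocks ordinary users from switching to root with su, routine cluster management (stop and start, switchover, failover, repmgrd recovery, and physical backup) is not affected.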