0.环境检查

数据守护系统中各实例使用的 DM 服务器版本应一致,同时还应注意各实例所在主机的操作系统位数、大小端模式、时区及时间设置都应一致。

系统 版本
服务器 阿里云
操作系统 Debian 9
数据库系统 DM Database 64 V8 03134283890-20220525-161267-10045
数据守护 V4.0

1.数据准备

--主库
SQL> BACKUP DATABASE BACKUPSET '/dm8/data/BACKUP_FILE_01';
executed successfully
used time: 00:00:03.785. Execute id is 55404.
SQL> EXIT
dmdba@lxm2:~$ cd /dm8/data
dmdba@lxm2:/dm8/data$ ls
BACKUP_FILE_01    dm_arch  DMDB                                                                        
dmdba@lxm2:/dm8/data$ scp -r BACKUP_FILE_01 dmdba@39.99.248.41:/dm8/data
The authenticity of host '39.99.248.41 (39.99.248.41)' can't be established.
ECDSA key fingerprint is SHA256:6ewGXHplgu8H5fnCMwoI8dcE8PLgR4c74ra1kldues4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '39.99.248.41' (ECDSA) to the list of known hosts.
dmdba@39.99.248.41's password: 
BACKUP_FILE_01.bak                                                                                           
BACKUP_FILE_01.meta                                                                                          
BACKUP_FILE_01_1.bak               


--备库
root@lxm:/dm8/data# ls -lrth
total 12K
drwxr-xr-x 6 dmdba dinstall 4.0K Jul  1 14:09 DMDB
drwxr-xr-x 2 dmdba dinstall 4.0K Jul  1 14:42 dm_arch
drwxr-xr-x 2 dmdba dinstall 4.0K Jul  1 15:19 BACKUP_FILE_01
root@lxm:/dm8/data# systemctl stop DmServiceDMSERVER.service
root@lxm:/dm8/data# su - dmdba
dmdba@lxm:~$ ls
COL.LOG  DB_DMDB_FULL_2022_06_24_11_54_00  DB_DMDB_FULL_2022_06_24_11_56_00  DB_DMDB_FULL_2022_07_01_11_56_02  dm_arch  dmdbms
dmdba@lxm:~$ rlwrap dmrman
dmrman V8
RMAN> RESTORE DATABASE '/dm8/data/DMDB/dm.ini' FROM BACKUPSET '/dm8/data/BACKUP_FILE_01';
RESTORE DATABASE '/dm8/data/DMDB/dm.ini' FROM BACKUPSET '/dm8/data/BACKUP_FILE_01';
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
[Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00]
restore successfully.
time used: 00:00:02.756
RMAN> RECOVER DATABASE '/dm8/data/DMDB/dm.ini' FROM BACKUPSET '/dm8/data/BACKUP_FILE_01';
RECOVER DATABASE '/dm8/data/DMDB/dm.ini' FROM BACKUPSET '/dm8/data/BACKUP_FILE_01';
Database mode = 0, oguid = 0
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
EP[0]'s cur_lsn[40343], file_lsn[40343]
[Percent:100.00%][Speed:0.00PKG/s][Cost:00:00:00][Remaining:00:00:00]
recover successfully!
time used: 00:00:02.576
RMAN> RECOVER DATABASE '/dm8/data/DMDB/dm.ini' UPDATE DB_MAGIC;
RECOVER DATABASE '/dm8/data/DMDB/dm.ini' UPDATE DB_MAGIC;
Database mode = 0, oguid = 0
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
EP[0]'s cur_lsn[42046], file_lsn[42046]
recover successfully!
time used: 00:00:01.055
RMAN>

2.配置主库

配置文件

--dm.ini
INSTANCE_NAME = GRP1_RWW_01
PORT_NUM = 5236 #数据库实例监听端口
DW_INACTIVE_INTERVAL = 60 #接收守护进程消息超时时间
ALTER_MODE_STATUS = 0 #不允许手工方式修改实例模式/状态/OGUID
ENABLE_OFFLINE_TS = 2 #不允许备库 OFFLINE 表空间
MAL_INI = 1 #打开 MAL 系统
ARCH_INI = 1 #打开归档配置
RLOG_SEND_APPLY_MON = 64 #统计最近 64 次的日志发送信息

--dmmal.ini
MAL_CHECK_INTERVAL = 5 #MAL 链路检测时间间隔
MAL_CONN_FAIL_INTERVAL = 5 #判定 MAL 链路断开的时间
[MAL_INST1]
MAL_INST_NAME = GRP1_RWW_01 #实例名,和 dm.ini 中的 INSTANCE_NAME 一致
MAL_HOST = 172.21.215.74 #MAL 系统监听 TCP 连接的 IP 地址
MAL_PORT = 61141 #MAL 系统监听 TCP 连接的端口
MAL_INST_HOST = 8.142.141.128  #实例的对外服务 IP 地址
MAL_INST_PORT = 5236 #实例的对外服务端口,和 dm.ini 中的 PORT_NUM 一致
MAL_DW_PORT = 52141 #实例对应的守护进程监听 TCP 连接的端口
MAL_INST_DW_PORT = 33141 #实例监听守护进程 TCP 连接的端口
[MAL_INST2]
MAL_INST_NAME = GRP1_RWW_02
MAL_HOST = 172.17.44.74
MAL_PORT = 61142
MAL_INST_HOST = 39.103.139.80
MAL_INST_PORT = 5236
MAL_DW_PORT = 52142
MAL_INST_DW_PORT = 33142
[MAL_INST3]
MAL_INST_NAME = GRP1_RWW_03
MAL_HOST = 172.21.215.76
MAL_PORT = 61143
MAL_INST_HOST = 39.99.248.41
MAL_INST_PORT = 5236
MAL_DW_PORT = 52143
MAL_INST_DW_PORT = 33143

--dm_arch.ini
[ARCHIVE_TIMELY1] 
ARCH_TYPE = TIMELY #即时归档类型
ARCH_DEST = GRP1_RWW_02 #即时归档目标实例名
[ARCHIVE_TIMELY2] 
ARCH_TYPE = TIMELY #即时归档类型
ARCH_DEST = GRP1_RWW_03 #即时归档目标实例名

--dmwatcher.ini
[GRP1]
DW_TYPE = GLOBAL #全局守护类型
DW_MODE = AUTO #自动切换模式
DW_ERROR_TIME = 10 #远程守护进程故障认定时间
INST_RECOVER_TIME = 60 #主库守护进程启动恢复的间隔时间
INST_ERROR_TIME = 10 #本地实例故障认定时间
INST_OGUID = 453332 #守护系统唯一 OGUID 值
INST_INI = /dm8/data/DMDB/dm.ini #dm.ini 配置文件路径
INST_AUTO_RESTART = 1 #打开实例的自动启动功能
INST_STARTUP_CMD = /home/dmdba/dmdbms/bin/dmserver #命令行方式启动
RLOG_SEND_THRESHOLD = 0 #指定主库发送日志到备库的时间阈值,默认关闭
RLOG_APPLY_THRESHOLD = 0 #指定备库重演日志的时间阈值,默认关闭

启动主库

dmdba@lxm2:~$ dmserver /dm8/data/DMDB/dm.ini mount &
[1] 2611
dmdba@lxm2:~$ file dm.key not found, use default license!
version info: develop
DM Database Server 64 V8 03134283890-20220525-161267-10045 startup...
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
Database mode = 0, oguid = 0
License will expire on 2023-05-25
begin redo pwr log collect, last ckpt lsn: 41498 ...
redo pwr log collect finished
main rfil[/dm8/data/DMDB/DMDB01.log]'s grp collect 0 valid pwr record, discard 0 invalid pwr record
EP[0]'s cur_lsn[41498], file_lsn[41498]
begin redo log recover, last ckpt lsn: 41498 ...
redo log recover finished
ndct db load finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct fill fast pool finished
nsvr_startup end.
aud sys init success.
aud rt sys init success.
systables desc init success.
ndct_db_load_info success.
SYSTEM IS READY.

设置OGUID

dmdba@lxm2:~$ rlwrap disql SYSDBA/SYSDBA

Server[LOCALHOST:5236]:mode is normal, state is mount
login used time : 1.846(ms)
disql V8
SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);
sp_set_oguid(453332);
DMSQL executed successfully
used time: 9.607(ms). Execute id is 0.
SQL> DMSQL executed successfully
used time: 8.003(ms). Execute id is 1.
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
DMSQL executed successfully
used time: 5.090(ms). Execute id is 2.

修改数据库模式

SQL> alter database primary;
executed successfully
used time: 9.360(ms). Execute id is 0.

3.配置备库01

配置文件

INSTANCE_NAME = GRP1_RWW_02
PORT_NUM = 5236 #数据库实例监听端口
DW_INACTIVE_INTERVAL = 60 #接收守护进程消息超时时间
ALTER_MODE_STATUS = 0 #不允许手工方式修改实例模式/状态/OGUID
ENABLE_OFFLINE_TS = 2 #不允许备库 OFFLINE 表空间
MAL_INI = 1 #打开 MAL 系统
ARCH_INI = 1 #打开归档配置
RLOG_SEND_APPLY_MON = 64 #统计最近 64 次的日志重演信息

MAL_CHECK_INTERVL = 5 #MAL 链路检测时间间隔
MAL_CONN_FAIL_INTERVAL = 5 #判定 MAL 链路断开的时间
[MAL_INST1]
MAL_INST_NAME = GRP1_RWW_01 #实例名,和 dm.ini 中的 INSTANCE_NAME 一致
MAL_HOST = 172.21.215.74 #MAL 系统监听 TCP 连接的 IP 地址
MAL_PORT = 61141 #MAL 系统监听 TCP 连接的端口
MAL_INST_HOST = 8.142.141.128  #实例的对外服务 IP 地址
MAL_INST_PORT = 5236 #实例的对外服务端口,和 dm.ini 中的 PORT_NUM 一致
MAL_DW_PORT = 52141 #实例对应的守护进程监听 TCP 连接的端口
MAL_INST_DW_PORT = 33141 #实例监听守护进程 TCP 连接的端口
[MAL_INST2]
MAL_INST_NAME = GRP1_RWW_02
MAL_HOST = 172.17.44.74
MAL_PORT = 61142
MAL_INST_HOST = 39.103.139.80
MAL_INST_PORT = 5236
MAL_DW_PORT = 52142
MAL_INST_DW_PORT = 33142
[MAL_INST3]
MAL_INST_NAME = GRP1_RWW_03
MAL_HOST = 172.21.215.76
MAL_PORT = 61143
MAL_INST_HOST = 39.99.248.41
MAL_INST_PORT = 5236
MAL_DW_PORT = 52143
MAL_INST_DW_PORT = 33143

[ARCHIVE_TIMELY1] 
ARCH_TYPE = TIMELY #即时归档类型
ARCH_DEST = GRP1_RWW_01 #即时归档目标实例名
[ARCHIVE_TIMELY2] 
ARCH_TYPE = TIMELY #即时归档类型
ARCH_DEST = GRP1_RWW_03 #即时归档目标实例名

[GRP1]
DW_TYPE =GLOBAL #全局守护类型
DW_MODE = AUTO #自动切换模式
DW_ERROR_TIME = 10 #远程守护进程故障认定时间
INST_RECOVER_TIME = 60 #主库守护进程启动恢复的间隔时间
INST_ERROR_TIME = 10 #本地实例故障认定时间
INST_OGUID = 453332 #守护系统唯一 OGUID 值
INST_INI = /dm8/data/DMDB/dm.ini #dm.ini 配置文件路径
INST_AUTO_RESTART = 1 #打开实例的自动启动功能
INST_STARTUP_CMD = /home/dmdba/dmdbms/bin/dmserver #命令行方式启动
RLOG_SEND_THRESHOLD = 0 #指定主库发送日志到备库的时间阈值,默认关闭
RLOG_APPLY_THRESHOLD = 0 #指定备库重演日志的时间阈值,默认关闭

启动备库

dmdba@lxm:~$ dmserver /dm8/data/DMDB/dm.ini mount &
[1] 3553
dmdba@lxm:~$ file dm.key not found, use default license!
version info: develop
DM Database Server 64 V8 03134283890-20220525-161267-10045 startup...
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
Database mode = 0, oguid = 0
License will expire on 2023-05-25
file lsn: 39229
ndct db load finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct fill fast pool finished
nsvr_startup end.
aud sys init success.
aud rt sys init success.
systables desc init success.
ndct_db_load_info success.
SYSTEM IS READY.

设置OGUID

dmdba@lxm:~$ rlwrap disql SYSDBA/SYSDBA

Server[LOCALHOST:5236]:mode is normal, state is mount
login used time : 1.686(ms)
disql V8
SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);
DMSQL executed successfully
used time: 6.989(ms). Execute id is 0.
SQL> sp_set_oguid(453332);
DMSQL executed successfully
used time: 3.644(ms). Execute id is 1.
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
DMSQL executed successfully
used time: 4.848(ms). Execute id is 2.
SQL> 

修改数据库模式

SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);  
DMSQL executed successfully
used time: 4.853(ms). Execute id is 3.
SQL> alter database standby; 
executed successfully
used time: 6.120(ms). Execute id is 0.
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0); 
DMSQL executed successfully
used time: 4.636(ms). Execute id is 4.

4.配置备库02

配置文件

INSTANCE_NAME = GRP1_RWW_03
PORT_NUM = 5236 #数据库实例监听端口
DW_INACTIVE_INTERVAL = 60 #接收守护进程消息超时时间
ALTER_MODE_STATUS = 0 #不允许手工方式修改实例模式/状态/OGUID
ENABLE_OFFLINE_TS = 2 #不允许备库 OFFLINE 表空间
MAL_INI = 1 #打开 MAL 系统
ARCH_INI = 1 #打开归档配置
RLOG_SEND_APPLY_MON = 64 #统计最近 64 次的日志重演信息

MAL_CHECK_INTERVAL = 5 #MAL 链路检测时间间隔
MAL_CONN_FAIL_INTERVAL = 5 #判定 MAL 链路断开的时间
[MAL_INST1]
MAL_INST_NAME = GRP1_RWW_01 #实例名,和 dm.ini 中的 INSTANCE_NAME 一致
MAL_HOST = 172.21.215.74 #MAL 系统监听 TCP 连接的 IP 地址
MAL_PORT = 61141 #MAL 系统监听 TCP 连接的端口
MAL_INST_HOST = 8.142.141.128  #实例的对外服务 IP 地址
MAL_INST_PORT = 5236 #实例的对外服务端口,和 dm.ini 中的 PORT_NUM 一致
MAL_DW_PORT = 52141 #实例对应的守护进程监听 TCP 连接的端口
MAL_INST_DW_PORT = 33141 #实例监听守护进程 TCP 连接的端口
[MAL_INST2]
MAL_INST_NAME = GRP1_RWW_02
MAL_HOST = 172.17.44.74
MAL_PORT = 61142
MAL_INST_HOST = 39.103.139.80
MAL_INST_PORT = 5236
MAL_DW_PORT = 52142
MAL_INST_DW_PORT = 33142
[MAL_INST3]
MAL_INST_NAME = GRP1_RWW_03
MAL_HOST = 172.21.215.76
MAL_PORT = 61143
MAL_INST_HOST = 39.99.248.41
MAL_INST_PORT = 5236
MAL_DW_PORT = 52143
MAL_INST_DW_PORT = 33143

[ARCHIVE_TIMELY1] 
ARCH_TYPE = TIMELY #即时归档类型
ARCH_DEST = GRP1_RWW_01 #即时归档目标实例名
[ARCHIVE_TIMELY2] 
ARCH_TYPE = TIMELY #即时归档类型
ARCH_DEST = GRP1_RWW_02 #即时归档目标实例名

[GRP1]
DW_TYPE = GLOBAL #全局守护类型
DW_MODE = AUTO #自动切换模式
DW_ERROR_TIME = 10 #远程守护进程故障认定时间
INST_RECOVER_TIME = 60 #主库守护进程启动恢复的间隔时间
INST_ERROR_TIME = 10 #本地实例故障认定时间
INST_OGUID = 453332 #守护系统唯一 OGUID 值
INST_INI = /dm8/data/DMDB/dm.ini #dm.ini 配置文件路径
INST_AUTO_RESTART = 1 #打开实例的自动启动功能
INST_STARTUP_CMD = /home/dmdba/dmdbms/bin/dmserver #命令行方式启动
RLOG_SEND_THRESHOLD = 0 #指定主库发送日志到备库的时间阈值,默认关闭
RLOG_APPLY_THRESHOLD = 0 #指定备库重演日志的时间阈值,默认关闭

启动备库

dmdba@lxm3:~$ dmserver /dm8/data/DMDB/dm.ini mount &
[1] 19937
dmdba@lxm3:~$ file dm.key not found, use default license!
version info: develop
DM Database Server 64 V8 03134283890-20220525-161267-10045 startup...
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
Database mode = 0, oguid = 0
License will expire on 2023-05-25
file lsn: 39229
ndct db load finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct fill fast pool finished
nsvr_startup end.
aud sys init success.
aud rt sys init success.
systables desc init success.
ndct_db_load_info success.
SYSTEM IS READY.

设置OGUID

SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);
sp_set_oguid(453332);
SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);

修改数据库模式

SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);  
alter database standby; 
SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0); 

5.配置单实例监视器

MON_DW_CONFIRM = 1 #确认监视器模式
MON_LOG_PATH = C:\dmdbms #监视器日志文件存放路径
MON_LOG_INTERVAL = 60 #每隔 60s 定时记录系统信息到日志文件
MON_LOG_FILE_SIZE = 32 #每个日志文件最大 32M
MON_LOG_SPACE_LIMIT = 0 #不限定日志文件总占用空间
[GRP1]
MON_INST_OGUID = 453332 #组 GRP1 的唯一 OGUID 值
#以下配置为监视器到组 GRP1 的守护进程的连接信息,以―IP:PORT‖的形式配置
#IP 对应 dmmal.ini 中的 MAL_HOST,PORT 对应 dmmal.ini 中的 MAL_DW_PORT
MON_DW_IP = 8.142.141.128:52141
MON_DW_IP = 39.103.139.80:52142
MON_DW_IP = 39.99.248.41:52143

6.启动主备库守护进程

dmdba@lxm2:~$ dmwatcher /dm8/data/DMDB/dmwatcher.ini
DMWATCHER[4.0] V8
DMWATCHER[4.0] IS READY
SHOW
2022-07-04 21:51:30
---------------------------------------------------------------------------
GROUP_NAME       TYPE      MODE      OGUID       MPP_FLAG  AUTO_RESTART DW_STATUS       DW_SUB_STATUS   DW_CTL_STATUS   
GRP1             GLOBAL    AUTO      453332      FALSE     TRUE         OPEN            SUB_STATE_START VALID           

INST_OK   NAME             SVR_MODE  SYS_STATUS   RTYPE     FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG 
OK        GRP1_RWW_01      PRIMARY   OPEN         TIMELY    7708            43849           7708            43850           0            
---------------------------------------------------------------------------


--备库略

7.启动监视器

 C:\dmdbms\bin>dmmonitor.exe C:\dmdbms\data\dmmonitor.ini
[monitor]         2022-07-04 21:54:43: DMMONITOR[4.0] V8
[monitor]         2022-07-04 21:54:43: DMMONITOR[4.0] IS READY.

[monitor]         2022-07-04 21:54:43: 收到守护进程(GRP1_RWW_03)消息
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 21:54:43  OPEN           OK        GRP1_RWW_03      OPEN        STANDBY   NULL     3        43914           43914

[monitor]         2022-07-04 21:54:43: 收到守护进程(GRP1_RWW_01)消息
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 21:54:43  OPEN           OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    3        43914           43914

[monitor]         2022-07-04 21:54:43: 收到守护进程(GRP1_RWW_02)消息
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 21:54:43  OPEN           OK        GRP1_RWW_02      OPEN        STANDBY   VALID    3        43914           43914

8.测试验证

完成读写分离集群的部署,并测试建表和插入数据备机是否能及时同步,并可查询到数据。

--主库建表
dmdba@lxm2:~$ rlwrap disql SYSDBA/SYSDBA

Server[LOCALHOST:5236]:mode is primary, state is open
login used time : 1.731(ms)
disql V8
SQL> CREATE TABLE T2(COL1 INT,COL2 INT,COL3 VARCHAR);
executed successfully
used time: 16.185(ms). Execute id is 600.
SQL> INSERT INTO T2 VALUES(1002,1002,'数据守护集群');
affect rows 1

used time: 0.801(ms). Execute id is 601.
SQL> COMMIT;
executed successfully
used time: 2.178(ms). Execute id is 602.
--备库查询
SQL> 
SQL> SELECT * FROM T2;

LINEID     COL1        COL2        COL3              
---------- ----------- ----------- ------------------
1          1002        1002        数据守护集群

used time: 0.235(ms). Execute id is 102.
SQL> 

模拟场景故障处理

(a)制造主机故障,测试备机可接管并持续提供服务。
(b)恢复故障主机,将故障主机加入集群。
(c)手动切换主备集群,恢复原主备集群的位置。
(d)测试读事务自动分发到备机的过程。

制造主机故障

root@lxm2:/home/dmdba/dmdbms/script/root# shutdown
Shutdown scheduled for Mon 2022-07-04 22:08:17 CST, use 'shutdown -c' to cancel.

备机自动接管

[monitor]         2022-07-04 21:59:22: 接收守护进程(GRP1_RWW_02)消息超时
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 21:59:11  ERROR          OK        GRP1_RWW_02      OPEN        STANDBY   VALID    3        44003           44003

[monitor]         2022-07-04 21:59:23: 守护进程(GRP1_RWW_02)状态切换 [NONE-->STARTUP]
[monitor]         2022-07-04 21:59:24: 守护进程(GRP1_RWW_02)状态切换 [STARTUP-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 21:59:24  OPEN           OK        GRP1_RWW_02      OPEN        STANDBY   VALID    3        44007           44007

[monitor]         2022-07-04 21:59:42: 守护进程(GRP1_RWW_01)状态切换 [NONE-->STARTUP]
[monitor]         2022-07-04 21:59:44: 守护进程(GRP1_RWW_01)状态切换 [STARTUP-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 21:59:44  OPEN           OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    3        44014           44014

[monitor]         2022-07-04 22:07:54: 守护进程(GRP1_RWW_02)状态切换 [NONE-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:07:54  OPEN           OK        GRP1_RWW_02      OPEN        STANDBY   VALID    3        44229           44229

[monitor]         2022-07-04 22:07:54: 守护进程(GRP1_RWW_01)状态切换 [NONE-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:07:54  OPEN           OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    3        44229           44230

[monitor]         2022-07-04 22:08:27: 接收守护进程(GRP1_RWW_01)消息超时
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:08:16  ERROR          OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    3        44237           44237

[monitor]         2022-07-04 22:08:27: 检测到PRIMARY实例故障,开始对组(GRP1)执行自动接管

[monitor]         2022-07-04 22:08:27: 通知组(GRP1)当前活动的守护进程设置MID
[monitor]         2022-07-04 22:08:27: 通知组(GRP1)当前活动的守护进程设置MID成功
[monitor]         2022-07-04 22:08:27: 开始使用实例GRP1_RWW_02接管
[monitor]         2022-07-04 22:08:27: 通知守护进程GRP1_RWW_02切换TAKEOVER状态
[monitor]         2022-07-04 22:08:27: 守护进程(GRP1_RWW_02)状态切换 [OPEN-->TAKEOVER]
[monitor]         2022-07-04 22:08:27: 切换守护进程GRP1_RWW_02为TAKEOVER状态成功
[monitor]         2022-07-04 22:08:27: 实例GRP1_RWW_02开始执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句
[monitor]         2022-07-04 22:08:27: 实例GRP1_RWW_02执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句成功
[monitor]         2022-07-04 22:08:27: 实例GRP1_RWW_02开始执行ALTER DATABASE MOUNT语句
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02执行ALTER DATABASE MOUNT语句成功
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02开始执行ALTER DATABASE PRIMARY语句
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02执行ALTER DATABASE PRIMARY语句成功
[monitor]         2022-07-04 22:08:28: 通知实例GRP1_RWW_02修改所有归档状态无效
[monitor]         2022-07-04 22:08:28: 修改所有实例归档为无效状态成功
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02开始执行ALTER DATABASE OPEN FORCE语句
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02执行ALTER DATABASE OPEN FORCE语句成功
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02开始执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句
[monitor]         2022-07-04 22:08:28: 实例GRP1_RWW_02执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句成功
[monitor]         2022-07-04 22:08:28: 通知守护进程GRP1_RWW_02切换OPEN状态
[monitor]         2022-07-04 22:08:28: 守护进程(GRP1_RWW_02)状态切换 [TAKEOVER-->OPEN]
[monitor]         2022-07-04 22:08:29: 切换守护进程GRP1_RWW_02为OPEN状态成功
[monitor]         2022-07-04 22:08:30: 通知组(GRP1)的守护进程执行清理操作
[monitor]         2022-07-04 22:08:30: 清理守护进程(GRP1_RWW_02)请求成功
[monitor]         2022-07-04 22:08:30: 清理守护进程(GRP1_RWW_03)请求成功
[monitor]         2022-07-04 22:08:30: 使用实例GRP1_RWW_02接管成功

[monitor]         2022-07-04 22:08:30: 组(GRP1)使用实例GRP1_RWW_02自动接管成功

[monitor]         2022-07-04 22:08:31: 守护进程(GRP1_RWW_02)状态切换 [OPEN-->RECOVERY]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:08:31  RECOVERY       OK        GRP1_RWW_02      OPEN        PRIMARY   VALID    4        46709           46719

[monitor]         2022-07-04 22:08:33: 守护进程(GRP1_RWW_02)状态切换 [RECOVERY-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:08:33  OPEN           OK        GRP1_RWW_02      OPEN        PRIMARY   VALID    4        46725           46726



恢复故障主机

#================================================================================#

[monitor]         2022-07-04 22:15:30: 守护进程(GRP1_RWW_01)状态切换 [NONE-->STARTUP]
[monitor]         2022-07-04 22:15:32: 守护进程(GRP1_RWW_01)状态切换 [STARTUP-->UNIFY EP]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:15:32  UNIFY EP       OK        GRP1_RWW_01      MOUNT       PRIMARY   VALID    3        44237           44237

[monitor]         2022-07-04 22:15:32: 守护进程(GRP1_RWW_01)状态切换 [UNIFY EP-->STARTUP]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:15:32  STARTUP        OK        GRP1_RWW_01      MOUNT       STANDBY   INVALID  3        44237           44237

[monitor]         2022-07-04 22:15:32: 守护进程(GRP1_RWW_01)状态切换 [STARTUP-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:15:32  OPEN           OK        GRP1_RWW_01      OPEN        STANDBY   INVALID  3        44237           44237

[monitor]         2022-07-04 22:15:32: 守护进程(GRP1_RWW_02)状态切换 [OPEN-->RECOVERY]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:15:32  RECOVERY       OK        GRP1_RWW_02      OPEN        PRIMARY   VALID    4        51023           51023

[monitor]         2022-07-04 22:15:35: 守护进程(GRP1_RWW_02)状态切换 [RECOVERY-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:15:35  OPEN           OK        GRP1_RWW_02      OPEN        PRIMARY   VALID    4        51050           51050

show
2022-07-04 22:15:54
#================================================================================#
GROUP            OGUID       MON_CONFIRM     MODE            MPP_FLAG
GRP1             453332      TRUE            AUTO            FALSE


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.103.139.80       52142        2022-07-04 22:15:53  GLOBAL    VALID     OPEN           GRP1_RWW_02      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.103.139.80       5236       OK        GRP1_RWW_02      OPEN        PRIMARY   0          0            TIMELY    VALID    8515            51231           8515            51231           NONE

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
8.142.141.128       52141        2022-07-04 22:15:53  GLOBAL    VALID     OPEN           GRP1_RWW_01      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
8.142.141.128       5236       OK        GRP1_RWW_01      OPEN        STANDBY   0          0            TIMELY    VALID    8059            51231           8059            51231           NONE

DATABASE(GRP1_RWW_01) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[8515, 8515, 8515], (RLSN, SLSN, KLSN)[51231, 51231, 51231], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (51231)


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.99.248.41        52143        2022-07-04 22:15:53  GLOBAL    VALID     OPEN           GRP1_RWW_03      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.99.248.41        5236       OK        GRP1_RWW_03      OPEN        STANDBY   0          0            TIMELY    VALID    5645            51221           5645            51221           NONE

DATABASE(GRP1_RWW_03) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[8514, 8514, 8514], (RLSN, SLSN, KLSN)[51221, 51221, 51221], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (51221)


#================================================================================#

恢复原主备位置

switchover GRP1_RWW_01
[monitor]         2022-07-04 22:26:53: 存在SLSN更大的活动实例GRP1_RWW_03[STANDBY, OPEN, ISTAT_SAME:TRUE]

LOGIN
用户名:SYSDBA
密码:
[monitor]         2022-07-04 22:27:11: 登录监视器成功!

switchover GRP1_RWW_01
[monitor]         2022-07-04 22:27:17: 存在SLSN更大的活动实例GRP1_RWW_03[STANDBY, OPEN, ISTAT_SAME:TRUE]

SHOW
2022-07-04 22:28:32
#================================================================================#
GROUP            OGUID       MON_CONFIRM     MODE            MPP_FLAG
GRP1             453332      TRUE            AUTO            FALSE


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.103.139.80       52142        2022-07-04 22:28:31  GLOBAL    VALID     OPEN           GRP1_RWW_02      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.103.139.80       5236       OK        GRP1_RWW_02      OPEN        PRIMARY   0          0            TIMELY    VALID    9409            58993           9409            58994           NONE

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
8.142.141.128       52141        2022-07-04 22:28:31  GLOBAL    VALID     OPEN           GRP1_RWW_01      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
8.142.141.128       5236       OK        GRP1_RWW_01      OPEN        STANDBY   0          0            TIMELY    VALID    8059            58983           8059            58983           NONE

DATABASE(GRP1_RWW_01) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[9408, 9408, 9408], (RLSN, SLSN, KLSN)[58983, 58983, 58983], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (58983)


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.99.248.41        52143        2022-07-04 22:28:31  GLOBAL    VALID     OPEN           GRP1_RWW_03      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.99.248.41        5236       OK        GRP1_RWW_03      OPEN        STANDBY   0          0            TIMELY    VALID    5645            58993           5645            58993           NONE

DATABASE(GRP1_RWW_03) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[9409, 9409, 9409], (RLSN, SLSN, KLSN)[58993, 58993, 58993], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (58993)


#================================================================================#

SHOW
2022-07-04 22:33:04
#================================================================================#
GROUP            OGUID       MON_CONFIRM     MODE            MPP_FLAG
GRP1             453332      TRUE            AUTO            FALSE


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.103.139.80       52142        2022-07-04 22:33:04  GLOBAL    VALID     OPEN           GRP1_RWW_02      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.103.139.80       5236       OK        GRP1_RWW_02      OPEN        PRIMARY   0          0            TIMELY    VALID    9716            61791           9716            61791           NONE

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
8.142.141.128       52141        2022-07-04 22:33:04  GLOBAL    VALID     OPEN           GRP1_RWW_01      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
8.142.141.128       5236       OK        GRP1_RWW_01      OPEN        STANDBY   0          0            TIMELY    VALID    8059            61780           8059            61780           NONE

DATABASE(GRP1_RWW_01) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[9714, 9714, 9714], (RLSN, SLSN, KLSN)[61780, 61780, 61780], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (61780)


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.99.248.41        52143        2022-07-04 22:33:04  GLOBAL    VALID     OPEN           GRP1_RWW_03      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.99.248.41        5236       OK        GRP1_RWW_03      OPEN        STANDBY   0          0            TIMELY    VALID    5645            61780           5645            61780           NONE

DATABASE(GRP1_RWW_03) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[9714, 9714, 9714], (RLSN, SLSN, KLSN)[61780, 61780, 61780], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (61780)


#================================================================================#

switchover GRP1_RWW_01
[monitor]         2022-07-04 22:33:17: 开始切换实例GRP1_RWW_01
[monitor]         2022-07-04 22:33:17: 通知守护进程GRP1_RWW_02切换SWITCHOVER状态
[monitor]         2022-07-04 22:33:17: 守护进程(GRP1_RWW_02)状态切换 [OPEN-->SWITCHOVER]
[monitor]         2022-07-04 22:33:18: 切换守护进程GRP1_RWW_02为SWITCHOVER状态成功
[monitor]         2022-07-04 22:33:18: 通知守护进程GRP1_RWW_01切换SWITCHOVER状态
[monitor]         2022-07-04 22:33:18: 守护进程(GRP1_RWW_01)状态切换 [OPEN-->SWITCHOVER]
[monitor]         2022-07-04 22:33:18: 切换守护进程GRP1_RWW_01为SWITCHOVER状态成功
[monitor]         2022-07-04 22:33:18: 实例GRP1_RWW_02开始执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句
[monitor]         2022-07-04 22:33:18: 实例GRP1_RWW_02执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句成功
[monitor]         2022-07-04 22:33:18: 实例GRP1_RWW_01开始执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_01执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句成功
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_02开始执行ALTER DATABASE MOUNT语句
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_02执行ALTER DATABASE MOUNT语句成功
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_01开始执行ALTER DATABASE MOUNT语句
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_01执行ALTER DATABASE MOUNT语句成功
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_02开始执行ALTER DATABASE STANDBY语句
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_02执行ALTER DATABASE STANDBY语句成功
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_01开始执行ALTER DATABASE PRIMARY语句
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_01执行ALTER DATABASE PRIMARY语句成功
[monitor]         2022-07-04 22:33:19: 通知实例GRP1_RWW_01修改所有归档状态无效
[monitor]         2022-07-04 22:33:19: 修改所有实例归档为无效状态成功
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_02开始执行ALTER DATABASE OPEN FORCE语句
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_02执行ALTER DATABASE OPEN FORCE语句成功
[monitor]         2022-07-04 22:33:19: 实例GRP1_RWW_01开始执行ALTER DATABASE OPEN FORCE语句
[monitor]         2022-07-04 22:33:20: 实例GRP1_RWW_01执行ALTER DATABASE OPEN FORCE语句成功
[monitor]         2022-07-04 22:33:20: 实例GRP1_RWW_02开始执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句
[monitor]         2022-07-04 22:33:20: 实例GRP1_RWW_02执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句成功
[monitor]         2022-07-04 22:33:20: 实例GRP1_RWW_01开始执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句
[monitor]         2022-07-04 22:33:20: 实例GRP1_RWW_01执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句成功
[monitor]         2022-07-04 22:33:20: 通知守护进程GRP1_RWW_02切换OPEN状态
[monitor]         2022-07-04 22:33:20: 守护进程(GRP1_RWW_02)状态切换 [SWITCHOVER-->OPEN]
[monitor]         2022-07-04 22:33:20: 切换守护进程GRP1_RWW_02为OPEN状态成功
[monitor]         2022-07-04 22:33:20: 通知守护进程GRP1_RWW_01切换OPEN状态
[monitor]         2022-07-04 22:33:20: 守护进程(GRP1_RWW_01)状态切换 [SWITCHOVER-->OPEN]
[monitor]         2022-07-04 22:33:21: 切换守护进程GRP1_RWW_01为OPEN状态成功
[monitor]         2022-07-04 22:33:21: 通知组(GRP1)的守护进程执行清理操作
[monitor]         2022-07-04 22:33:21: 清理守护进程(GRP1_RWW_01)请求成功
[monitor]         2022-07-04 22:33:21: 清理守护进程(GRP1_RWW_02)请求成功
[monitor]         2022-07-04 22:33:21: 清理守护进程(GRP1_RWW_03)请求成功
[monitor]         2022-07-04 22:33:22: 实例GRP1_RWW_01切换成功

2022-07-04 22:33:22
#================================================================================#
GROUP            OGUID       MON_CONFIRM     MODE            MPP_FLAG
GRP1             453332      TRUE            AUTO            FALSE


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
8.142.141.128       52141        2022-07-04 22:33:21  GLOBAL    VALID     OPEN           GRP1_RWW_01      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
8.142.141.128       5236       OK        GRP1_RWW_01      OPEN        PRIMARY   0          0            TIMELY    VALID    9740            64394           9740            64394           NONE

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.103.139.80       52142        2022-07-04 22:33:21  GLOBAL    VALID     OPEN           GRP1_RWW_02      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    INVALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.103.139.80       5236       OK        GRP1_RWW_02      OPEN        STANDBY   0          0            TIMELY    INVALID  9737            61947           9737            61947           NONE

DATABASE(GRP1_RWW_02) APPLY INFO FROM (GRP1_RWW_01), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[9737, 9737, 9737], (RLSN, SLSN, KLSN)[61947, 61947, 61947], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (61947)


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT
39.99.248.41        52143        2022-07-04 22:33:21  GLOBAL    VALID     OPEN           GRP1_RWW_03      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    INVALID

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG
39.99.248.41        5236       OK        GRP1_RWW_03      OPEN        STANDBY   0          0            TIMELY    INVALID  5645            61947           5645            61947           NONE

DATABASE(GRP1_RWW_03) APPLY INFO FROM (GRP1_RWW_02), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[9737, 9737, 9737], (RLSN, SLSN, KLSN)[61947, 61947, 61947], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (61947)


#================================================================================#

[monitor]         2022-07-04 22:33:23: 守护进程(GRP1_RWW_01)状态切换 [OPEN-->RECOVERY]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:33:23  RECOVERY       OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    5        64394           64395

[monitor]         2022-07-04 22:33:25: 守护进程(GRP1_RWW_01)状态切换 [RECOVERY-->OPEN]
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN
                  2022-07-04 22:33:25  OPEN           OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    5        64395           64395


测试读事务自动分发到备机

配置dm_svc.conf

TIME_ZONE=(480)
LANGUAGE=(cn)
DW_SVC=(8.142.141.128:5236,39.103.139.80:5236,39.99.248.41:5236)

[DW_SVC]
LOGIN_MODE=(1)
RW_SEPARATE=(1)
RW_PERCENT=(30)
SWITCH_TIME=6000
SWITCH_INTERVAL=500
TRACE=DEBUG

搭建开发环境,使用springboot项目测试主备集群数据库

server.port= 8080
spring.datasource.url= jdbc:dm://DW_SVC?rwSeparate=1&rwPercent=30&logLevel=all
spring.datasource.driver-class-name= dm.jdbc.driver.DmDriver
spring.datasource.password= SYSDBA
spring.datasource.username= SYSDBA

spring.datasource.druid.initial-size=1
spring.datasource.druid.min-idle=1
spring.datasource.druid.max-active=20
spring.datasource.druid.test-on-borrow=true
spring.datasource.druid.stat-view-servlet.allow=true
package com.example.config;

import com.alibaba.druid.pool.DruidDataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.sql.DataSource;

@Configuration
public class JdbcConfig {

    @Bean
    @ConfigurationProperties(prefix = "jdbc")
    public DataSource dataSource(){
        DruidDataSource dataSource= new DruidDataSource();
        return dataSource;
    }
}

接口查询代码

package com.example.demo;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;

import javax.annotation.Resource;
import java.util.List;
import java.util.Map;

@Controller
@RequestMapping("/jdbc")
public class JdbcController {
    @Resource
    private JdbcTemplate jdbcTemplate;

    @RequestMapping("/info")
    @ResponseBody
    public List<Map<String, Object>> list() {
        String sql = "SELECT * FROM T2";
        List<Map<String, Object>> infoList = jdbcTemplate.queryForList(sql);
        for (Map<String, Object> info : infoList) {
            System.out.println(info.toString());
        }
        return infoList;
    }
}

使用jemeter测试工具100并发压测数据库,日志如下。

[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { DmDriver@5a6fd36d } connect(DmProperties): conn-1007, sessionID-0xdc17c40;  [PARAMS]: {switchinterval=500, rwseparate=1, lastloginip=::ffff:223.71.97.46, svrstat=4, language=cn, rwpercent=30, globalserverseries=7, lastlogintime=2022-07-05 23:16:53, svrmode=1, dsccontrol=true, , backslashescape=false, clientuser=SYSDBA, localtimezone=480, host=8.142.141.128, switchtimes=6000, loginmode=1, loginwarningid=2, epgroup=DW_SVC=([8.142.141.128:5236 (PRIMARY, OPEN, DSC CONTROL), sessions: 0, 39.103.139.80:5236 (STANDBY, OPEN, DSC CONTROL), sessions: 0, 39.99.248.41:5236 (STANDBY, OPEN, DSC CONTROL), sessions: 0]), lifetimeremainder=-1, gracetimeremainder=0, failedattempts=0, serverversion=8.1.2.114, clienthostname=8.142.141.128, url=jdbc:dm://DW_SVC?rwSeparate=1&rwPercent=30&logLevel=all, applicationname=, instancename=GRP1_RWW_01, dbname=DMDB, port=5236, loglevel=all, guid=16EE621216EE621247FBC7757841ABD7, user=SYSDBA};
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1007 } isClosed(): false;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1007 } clearWarnings();
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] try connect loop 0
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_STARTUP
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_LOGIN
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] try connect fail [39.103.139.80:5236 (STANDBY, OPEN, DSC CONTROL), sessions: 0] 服务器模式不匹配
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_STARTUP
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_LOGIN
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] try connect fail [39.99.248.41:5236 (STANDBY, OPEN, DSC CONTROL), sessions: 0] 服务器模式不匹配
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_STARTUP
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_LOGIN
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] try connect success [8.142.141.128:5236]
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_STMT_ALLOCATE
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_PREPARE
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] try connect loop 0
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1010 } accessStandby();  CMD_STARTUP
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1010 } accessStandby();  CMD_LOGIN
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] try connect success [39.103.139.80:5236]
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { DmDriver@5a6fd36d } connect(DmProperties): conn-1009, sessionID-0x7f1610525b70;  [PARAMS]: {switchinterval=500, rwseparate=1, lastloginip=::ffff:223.71.97.46, svrstat=4, language=cn, rwpercent=30, globalserverseries=7, lastlogintime=2022-07-05 23:16:53, svrmode=1, dsccontrol=true, , backslashescape=false, clientuser=SYSDBA, localtimezone=480, host=8.142.141.128, switchtimes=6000, loginmode=1, loginwarningid=2, epgroup=DW_SVC=([8.142.141.128:5236 (PRIMARY, OPEN, DSC CONTROL), sessions: 0, 39.103.139.80:5236 (STANDBY, OPEN, DSC CONTROL), sessions: 0, 39.99.248.41:5236 (STANDBY, OPEN, DSC CONTROL), sessions: 0]), lifetimeremainder=-1, gracetimeremainder=0, failedattempts=0, serverversion=8.1.2.114, clienthostname=8.142.141.128, url=jdbc:dm://DW_SVC?rwSeparate=1&rwPercent=30&logLevel=all, applicationname=, instancename=GRP1_RWW_01, dbname=DMDB, port=5236, loglevel=all, guid=16EE621216EE621247FBC7757841ABD7, user=SYSDBA};
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } isClosed(): false;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } clearWarnings();
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } getAutoCommit(): true;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } getTransactionIsolation(): 2;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } getTransactionIsolation(): 2;
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1010 } accessStandby();  CMD_STMT_ALLOCATE
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } createStatement(): stmt-19, handle-0;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19 } setQueryTimeout(Integer);  [PARAMS]: 10;
[DEBUG - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009 } access();  CMD_PREPARE
[SQL   - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19 } executeQuery(String): rs-10;  [PARAMS]: "select * from T2;"; [USED TIME]: 11.4807ms;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } getMetaData(): rsmd-5;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { rsmd-5 } getColumnCount(): 3;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { rsmd-5 } getColumnLabel(Integer): "COL1";  [PARAMS]: 1;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { rsmd-5 } getColumnLabel(Integer): "COL2";  [PARAMS]: 2;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { rsmd-5 } getColumnLabel(Integer): "COL3";  [PARAMS]: 3;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } next(): true;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } getObject(Integer): 1002;  [PARAMS]: 1;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } getObject(Integer): 1002;  [PARAMS]: 2;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } getObject(Integer): "数据守护集群";  [PARAMS]: 3;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } next(): false;
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19, rs-10 } close();
[INFO  - 2022-07-05 23:16:52] tid:457 - [Thread Group 1-1] { conn-1009, stmt-19 } close();

中间有出现服务器模式不匹配,是集群节点选主的过程。

监控备机

压测期间查看备机库sql运行情况, 主备读写分离生效。

dmdba@lxm2:~$ rlwrap disql SYSDBA/SYSDBA

Server[LOCALHOST:5236]:mode is standby, state is open
login used time : 1.649(ms)
disql V8
SQL> alter system set 'ENABLE_MONITOR'=1;
DMSQL executed successfully
used time: 16.890(ms). Execute id is 294200.

SQL> SELECT TOP_SQL_TEXT,START_TIME FROM V$SQL_HISTORY ORDER BY START_TIME DESC LIMIT 10;

LINEID     TOP_SQL_TEXT     START_TIME                
---------- ---------------- --------------------------
1          SELECT * FROM T2 2022-07-05 22:17:37.000000
2          SELECT * FROM T2 2022-07-05 22:17:37.000000
3          SELECT * FROM T2 2022-07-05 22:17:37.000000
4          SELECT * FROM T2 2022-07-05 22:17:37.000000
5          SELECT * FROM T2 2022-07-05 22:17:37.000000
6          SELECT * FROM T2 2022-07-05 22:17:37.000000
7          SELECT * FROM T2 2022-07-05 22:17:37.000000
8          SELECT * FROM T2 2022-07-05 22:17:37.000000
9          SELECT * FROM T2 2022-07-05 22:17:37.000000
10         SELECT * FROM T2 2022-07-05 22:17:37.000000

10 rows got

used time: 0.996(ms). Execute id is 3327.

更多内容请参见达梦社区:https://eco.dameng.com