代码改变世界

【OGG 故障处理】 丢失归档恢复

2019-05-30 16:50  那个,我  阅读(2097)  评论(0编辑  收藏  举报

OGG 有两天由于某种原因没有启动,而这段时间的备份文件缺失了一部分归档。恢复过程记录如下:

GGSCI (xxxx) 6> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DP_XXXX3    00:00:00      00:00:08    
EXTRACT     RUNNING     DP_XXXX4    00:00:00      00:00:05       
EXTRACT     ABENDED     EX_XXXX3    00:00:01      65:09:29    
EXTRACT     ABENDED     EX_XXXX4    00:00:01      65:09:35    


GGSCI (xxxx) 6> view report EX_XXXX1
2019-05-27 14:58:34  ERROR   OGG-00446  Could not find archived log for sequence 76437 thread 1 under default destinations SQL <SELECT  name    FROM v$archived_log   WHERE sequence# = :ora_seq_no AND    
     thread# = :ora_thread AND         resetlogs_id = :ora_resetlog_id AND         archived = 'YES' AND         deleted = 'NO'         AND standby_dest = 'NO'         order by name DESC>, error retrievin
g redo file name for sequence 76437, archived = 1, use_alternate = 0 Not able to establish initial position for sequence 76437, rba 6905872.

2019-05-27 14:58:34  ERROR   OGG-01668  PROCESS ABENDING.

检查确认:
set line 200 pages 200
col name for a100
SELECT sequence#,name,thread#,resetlogs_id,archived,deleted FROM v$archived_log WHERE sequence# >= 76437;

 SEQUENCE# NAME                                                                                                    THREAD# RESETLOGS_ID ARC DEL
---------- ---------------------------------------------------------------------------------------------------- ---------- ------------ --- ---
     76437                                                                                                               1    898097516 YES YES
     76438                                                                                                               1    898097516 YES YES
     76439                                                                                                               1    898097516 YES YES
     76440                                                                                                               1    898097516 YES YES
     76441                                                                                                               1    898097516 YES YES
     76442                                                                                                               1    898097516 YES YES
     76443                                                                                                               1    898097516 YES YES
     76444                                                                                                               1    898097516 YES YES
     76445                                                                                                               1    898097516 YES YES
     76446                                                                                                               1    898097516 YES YES
     76447                                                                                                               1    898097516 YES YES
     76448                                                                                                               1    898097516 YES YES
     76449                                                                                                               1    898097516 YES YES
     76450                                                                                                               1    898097516 YES YES
     76451                                                                                                               1    898097516 YES YES
     76452                                                                                                               1    898097516 YES YES
     76453                                                                                                               1    898097516 YES YES
     76454                                                                                                               1    898097516 YES YES
     76455                                                                                                               1    898097516 YES YES
     76456                                                                                                               1    898097516 YES YES
     76457                                                                                                               1    898097516 YES YES
     76458                                                                                                               1    898097516 YES YES
     76459                                                                                                               1    898097516 YES YES
     76460                                                                                                               1    898097516 YES YES
     76461                                                                                                               1    898097516 YES YES
     76462                                                                                                               1    898097516 YES YES
     76463                                                                                                               1    898097516 YES YES
     76464                                                                                                               1    898097516 YES YES
     76465                                                                                                               1    898097516 YES YES

29 rows selected.

看不到归档路径,DELETED='YES' ,可见确实归档丢失。

如果归档没有丢失,将归档恢复到原来位置,重新启动抽取进程,即可完成恢复。大概流程如下:

--1.检查备份归档是否存在
rman target /
RMAN > list backup of archivelog sequence between 76437 and 76501;

--2.恢复归档
RMAN > restore archivelog from sequence 76437;

--3.确认归档恢复到了默认归档位置
--4.重新启动抽取进程
./ggsci
start ex_xxxx3
info all
stats ex_xxxx3
stats dp_xxxx3
比对数据,确认恢复

再次确认后,RAC 两个节点的归档都有丢失。

thread2: 52036 到 52041断档
thread1: 76495 到 76501断档

其中

进程3 对应的表只有insert操作,我们采用如下方式处理的:
比对数据将这段时间的数据插入进去(可以采用tns/exp where/plsql where/expdp where等方式),然后跳过这段时间进行同步

进程4 包含了更新插入删除操作,只能通过重新初始化的方式恢复

 

进程3  补数据恢复

--1.获取断档故障时间范围
SQL> alter session set nls_date_format='YYYYMMDD HH24:MI:ss';
Session altered.
SQL> select THREAD#,SEQUENCE#,FIRST_TIME,NEXT_TIME from v$archived_log where  SEQUENCE# between 52036 and 52041;

 SEQUENCE# FIRST_TIME        NEXT_TIME
---------- ----------------- -----------------
     52036 20190526 20:05:42 20190526 21:05:41
     52037 20190526 21:05:41 20190526 21:34:56
     52038 20190526 21:34:56 20190526 22:34:54
     52039 20190526 22:34:54 20190526 23:34:54
     52040 20190526 23:34:54 20190527 00:34:55
     52041 20190527 00:34:55 20190527 01:34:55

6 rows selected.
SQL> select THREAD#,SEQUENCE#,FIRST_TIME,NEXT_TIME from v$archived_log where  SEQUENCE# between 76495 and 76501;

   THREAD#  SEQUENCE# FIRST_TIME        NEXT_TIME
---------- ---------- ----------------- -----------------
         1      76495 20190526 20:05:29 20190526 20:05:42
         1      76496 20190526 20:05:42 20190526 20:56:07
         1      76497 20190526 20:56:07 20190526 21:34:54
         1      76498 20190526 21:34:54 20190526 22:24:17
         1      76499 20190526 22:24:17 20190526 23:24:15
         1      76500 20190526 23:24:15 20190527 00:24:15
         1      76501 20190527 00:24:15 20190527 01:24:17

--2.将这段时间数据插入过去
insert into target_table select * from source_table where createtime>=to_date('20190526 20:05:29','YYYYMMDD HH24:MI:SS') AND createtime<=to_date('20190527 01:34:55','YYYYMMDD HH24:MI:SS');
--3.跳过这段数据同步
GGSCI (xxxx) 2> ALTER EXTRACT EX_XXXX3, THREAD 2, BEGIN 2019-05-27 01:34:56
EXTRACT altered.
GGSCI (xxxx) 2> ALTER EXTRACT EX_XXXX3, THREAD 1, BEGIN 2019-05-27 01:24:18
EXTRACT altered.
GGSCI (xxxx) 14> ALTER EXTRACT EX_XXXX3, ETROLLOVER

2019-05-28 17:21:15  INFO    OGG-01520  Rollover performed.  For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file;  it will not happen automatically.
EXTRACT altered.

GGSCI (xxxx) 4> start EX_XXXX3


Sending START request to MANAGER ...
EXTRACT EX_XXXX3 starting

GGSCI (xxxx) 4> info all  --检查状态
GGSCI (xxxx) 4> info exttrail ./dirdat/lc

       Extract Trail: ./dirdat/lc
             Extract: EX_XXXX3
               Seqno: 2831          --tracefile SEQNO已经跳转到2831 
                 RBA: 78985035
           File Size: 100M        
GGSCI (xxxx) 7> info DP_XXXX3

EXTRACT    DP_XXXX3  Last Started 2019-05-27 14:47   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:04 ago)
Log Read Checkpoint  File ./dirdat/lc002830             --DB进程的trace文件不能自动跳转
                     2019-05-28 17:12:21.854829  RBA 1084
GGSCI (xxxx) 7> ALTER EXTRACT DP_XXXX3, EXTSEQNO 2831, EXTRBA 0
GGSCI (xxxx) 7> START DP_XXXX3
目标端
GGSCI (xxxx) 1> start RP_XXXX3
GGSCI (xxxx) 1> stats RP_XXXX3  --检查是否有数据,完成恢复

进程4 重新初始化恢复


源端:

1.根据SCN 导出数据
sqlplus ggadmin/ggadmin
select to_char(current_scn) from v$database; --15555721660
expdp userid=scott/tiger directory=expdir tables=testtable1 dumpfile=testtable1.dmp logfile=testtable1.log cluster=n CONTENT=data_only FLASHBACK_SCN=15555721660
scp 到目标端


2.目标端导入
impdp target/target directory=expdir remap_schema=scott:target dumpfile=testtable1.dmp table_exists_action=truncate


3. 修复OGG
随便找个不断档的时间点恢复即可

STOP EX_XXXX4
ALTER EXTRACT EX_XXXX4, THREAD 2, BEGIN 2019-05-28 16:00:00
ALTER EXTRACT EX_XXXX4, THREAD 1, BEGIN 2019-05-28 16:00:00
ALTER EXTRACT EX_XXXX4, ETROLLOVER
2019-05-28 17:21:15  INFO    OGG-01520  Rollover performed.  For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file;  it will not happen automatically.
EXTRACT altered.
start EX_XXXX4
info all

GGSCI (xxxx) 4> info exttrail ./dirdat/ld

       Extract Trail: ./dirdat/lc
             Extract: EX_XXXX3
               Seqno: 2638
                 RBA: 45765133
           File Size: 100M
STOP DP_XXXX4
ALTER EXTRACT DP_XXXX4, EXTSEQNO 2638, EXTRBA 0
START DP_XXXX4

目标端:
START REPLICAT RP_XXXX4,ATCSN 15555721660
完成恢复。

以上过程是后期梳理,并没有保留详细日志,所以存在一些出入,但大概操作步骤如此。