惜分飞

博客园 首页 联系 订阅 管理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:200T 数据库非归档无备份恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一套近200T的,6个节点的RAC,由于存储管线链路不稳定,导致服务器经常性掉盘,引起asm 磁盘组频繁dismount/mount,数据库集群节点不停的重启,修复好链路问题之后,数据库启动报ORA-01113,ORA-01110
ORA-01113-ORA-01110


通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本检测,发现有10个数据文件异常,无法正常恢复
20240814155122

该库比较大,有近200T,因此恢复需要各位谨慎(无法做现场备份,另外客户要求2天时间必须恢复好)
200t

由于数据库是非归档模式,该库无法通过应用归档日志来实现对这些文件进行恢复,对于这种情况,直接使用dbms_diskgroup把数据文件头拷贝到文件系统中,类似操作

 

SQL> @dbms_diskgroup_get_block.sql  +DATA/xifenfei.dbf 1 1 /tmp/xff/xifenfei.dbf.header
 
Parameter 1:
ASM_file_name (required)
 
 
Parameter 2:
block_to_extract (required)
 
 
Parameter 3
number_of_blocks_to_extract (required)
 
 
Parameter 4:
FileSystem_File_Name (required)
 
old  14:  v_AsmFilename := '&ASM_File_Name';
new  14:  v_AsmFilename := '+DATA/xifenfei.dbf';
old  15:  v_offstart := '&block_to_extract';
new  15:  v_offstart := '1';
old  16:  v_numblks := '&number_of_blocks_to_extract';
new  16:  v_numblks := '1';
old  17:  v_FsFilename := '&FileSystem_File_Name';
new  17:  v_FsFilename := '/tmp/xff/xifenfei.dbf.header';
File: +DATA/xifenfei.dbf
Type: 2 Data File
Size (in logical blocks): 3978880
Logical Block Size: 16384
Physical Block Size: 512
 
PL/SQL procedure successfully completed.

然后通过bbed修改相关scn

BBED> set filename 'xifenfei.dbf.header'
    FILENAME        xifenfei.dbf.header
 
BBED> set blocksize 16384
    BLOCKSIZE       16384
 
BBED> map
 File: xifenfei.dbf.header (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header
 
 struct kcvfh, 860 bytes                    @0      
 
 ub4 tailchk                                @16380  
 
 
BBED> p kcvfh.kcvfhckp.kcvcpscn
struct kcvcpscn, 8 bytes                    @484    
   ub4 kscnbas                              @484      0xa8061324
   ub2 kscnwrp                              @488      0x0081
 
BBED> assign file 295 block 1 kcvfh.kcvfhckp.kcvcpscn = file 1 block 1 kcvfh.kcvfhckp.kcvcpscn;
struct kcvcpscn, 8 bytes                    @484    
   ub4 kscnbas                              @484      0xa8133e2b
   ub2 kscnwrp                              @488      0x0081

然后把修改的数据文件头写回到asm中

SQL> @dbms_diskgroup_cp_block_to_asm.sql  /tmp/xff/xifenfei.dbf.header  +DATA/xifenfei.dbf 1 1
 
Parameter 1:
v_FsFileName (required)
 
 
Parameter 2:
v_AsmFileName (required)
 
 
Parameter 3
v_offstart (required)
 
 
Parameter 4
v_numblks (required)
 
old  16: v_FsFileName := '&v_FsFileName';
new  16: v_FsFileName := '/tmp/xff/xifenfei.dbf.header';
old  17: v_AsmFileName := '&v_AsmFileName';
new  17: v_AsmFileName := '+DATA/xifenfei.dbf';
old  18: v_offstart := '&v_offstart';
new  18: v_offstart := '1';
old  19:  v_numblks := '&v_numblks';
new  19:  v_numblks := '1';
File: +DATA/xifenfei.dbf
Type: 2 Data File
Size (in logical blocks): 3978880
Logical Block Size: 16384
 
PL/SQL procedure successfully completed.

查询文件头是否修改成功

[oracle@xff1 xff]$ sqlplus / as sysdba
 
SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 10 16:45:02 2024
 
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
 
SQL> set numw 16
SQL> select CHECKPOINT_CHANGE# from v$datafile_header where file# in (1,295);
 
CHECKPOINT_CHANGE#
------------------
      556870614571
      556870614571
 
SQL> recover datafile 295;
Media recovery complete.

通过上述操作,确认bbed修改文件头成功,后续类似方法对其他9个文件进行修改,并打开数据库

SQL> recover database;
Media recovery complete.
SQL> alter database open;
 
Database altered.

alert日志提示

Sat Aug 10 16:46:11 2024
ALTER DATABASE RECOVER  datafile 295 
Media Recovery Start
Serial Media Recovery started
WARNING! Recovering data file 295 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Media Recovery Complete (xff1)
Completed: ALTER DATABASE RECOVER  datafile 295 
Sat Aug 10 16:46:39 2024
ALTER DATABASE RECOVER  database 
Media Recovery Start
 started logmerger process
Sat Aug 10 16:46:51 2024
WARNING! Recovering data file 1139 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1140 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1601 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1803 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1827 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1931 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 2185 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 2473 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 2616 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Sat Aug 10 16:46:54 2024
Parallel Media Recovery started with 64 slaves
Media Recovery Complete (xff1)
Completed: ALTER DATABASE RECOVER  database 
Sat Aug 10 17:19:58 2024
alter database open
This instance was first to open
Sat Aug 10 17:19:58 2024
SUCCESS: diskgroup DATA was mounted
Sat Aug 10 17:19:58 2024
NOTE: dependency between database xff and diskgroup resource ora.DATA.dg is established
Sat Aug 10 17:20:10 2024
Picked broadcast on commit scheme to generate SCNs
Sat Aug 10 17:20:10 2024
SUCCESS: diskgroup REDO was mounted
Sat Aug 10 17:20:10 2024
NOTE: dependency between database xff and diskgroup resource ora.REDO.dg is established
Thread 1 opened at log sequence 124958
  Current log# 14 seq# 124958 mem# 0: +REDO/xff/log2.ora
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Aug 10 17:20:14 2024
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
[33770] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:261099864 end:261100854 diff:990 (9 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Sat Aug 10 17:20:16 2024
minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:33650 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Starting background process GTX0
Sat Aug 10 17:20:16 2024
GTX0 started with pid=45, OS id=34119
Starting background process RCBG
Sat Aug 10 17:20:16 2024
RCBG started with pid=46, OS id=34121
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Aug 10 17:20:16 2024
QMNC started with pid=47, OS id=34134
Starting background process SMCO
Completed: alter database open

其他集群其他节点数据库,一切正常
20240814162201


检查数据字典一致性

 

SQL> @hcheck.sql
HCheck Version 07MAY18 on 10-AUG-2024 18:24:49
----------------------------------------------
Catalog Version 11.2.0.3.0 (1102000300)
db_name: XFF
 
                   Catalog   Fixed
Procedure Name             Version    Vs Release    Timestamp
Result
------------------------------ ... ---------- -- ---------- --------------
------
.- LobNotInObj             ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
.- MissingOIDOnObjCol          ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
.- SourceNotInObj          ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
.- OversizedFiles          ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- PoorDefaultStorage          ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- PoorStorage             ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- TabPartCountMismatch        ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- OrphanedTabComPart          ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- MissingSum$             ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- MissingDir$             ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- DuplicateDataobj        ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- ObjSynMissing           ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- ObjSeqMissing           ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedUndo            ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedIndex           ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedIndexPartition      ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedIndexSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedTable           ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedTablePartition      ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedTableSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- MissingPartCol          ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedSeg$            ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedIndPartObj#         ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- DuplicateBlockUse           ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- FetUet              ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- Uet0Check               ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- SeglessUET              ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- BadInd$             ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- BadTab$             ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadIcolDepCnt           ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- ObjIndDobj              ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- TrgAfterUpgrade         ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- ObjType0            ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadOwner            ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- StmtAuditOnCommit           ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadPublicObjects        ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadSegFreelist          ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadDepends              ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- CheckDual               ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- ObjectNames             ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadCboHiLo              ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- ChkIotTs            ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- NoSegmentIndex          ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- BadNextObject           ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- DroppedROTS             ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- FilBlkZero              ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- DbmsSchemaCopy          ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- OrphanedObjError        ... 1102000300 >  1102000000 08/10 18:24:54 PASS
.- ObjNotLob               ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- MaxControlfSeq          ... 1102000300 <=  *All Rel* 08/10 18:24:55 PASS
.- SegNotInDeferredStg         ... 1102000300 >  1102000000 08/10 18:25:18 PASS
.- SystemNotRfile1         ... 1102000300 >   902000000 08/10 18:25:18 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
.- OrphanTrigger           ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
.- ObjNotTrigger           ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
---------------------------------------
10-AUG-2024 18:25:18  Elapsed: 29 secs
---------------------------------------
Found 0 potential problem(s) and 0 warning(s)
 
PL/SQL procedure successfully completed.
 
Statement processed.
 
Complete output is in trace file:
/u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_71148_HCHECK.trc

运气不错,数据字典本身没有损坏,业务直接运行,一切正常(主要原因是在光纤链路不稳定的情况下,客户已经没有往库中写入数据)

posted on   惜分飞  阅读(34)  评论(0编辑  收藏  举报
相关博文:
阅读排行:
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· 单线程的Redis速度为什么快?
点击右上角即可分享
微信分享提示