解决SMON_SCN_TO_TIME_AUX表损坏故障
同事在给客户做数据库巡检的过程中,发现其中一个数据库的alert日志中报了一个坏块的错误信息,具体如下:
Reading datafile '+DATA_DW/xtdw/datafile/sysaux.295.819217697' for corruption at rdba: 0x0081140e (file 2, block 70670) Read datafile mirror 'DATA_DW_CD_05_DWCEL02' (file 2, block 70670) found same corrupt data (no logical check) Read datafile mirror 'DATA_DW_CD_09_DWCEL01' (file 2, block 70670) found same corrupt data (no logical check) Errors in file /u01/app/oracle/diag/rdbms/xtdw/xtdw1/trace/xtdw1_smon_12209.trc (incident=128756): ORA-01578: ORACLE 数据块损坏 (文件号 2, 块号 70670) ORA-01110: 数据文件 2: '+DATA_DW/xtdw/datafile/sysaux.295.819217697' Errors in file /u01/app/oracle/diag/rdbms/xtdw/xtdw1/trace/xtdw1_smon_12209.trc: ORA-00604: 递归 SQL 级别 1 出现错误 ORA-01578: ORACLE 数据块损坏 (文件号 2, 块号 70670) ORA-01110: 数据文件 2: '+DATA_DW/xtdw/datafile/sysaux.295.819217697' Fri Jan 26 17:10:41 2018 Sweep [inc][128756]: completed Corrupt Block Found TSN = 1, TSNAME = SYSAUX RFN = 2, BLK = 70670, RDBA = 8459278 OBJN = 270, OBJD = 268, OBJECT = SMON_SCN_TO_TIME_AUX, SUBOBJECT = SEGMENT OWNER = SYS, SEGMENT TYPE = Cluster Segment |
从错误日志可以看出是SMON_SCN_TO_TIME_AUX表出现了坏块。为了谨慎起见,还是在测试环境中模拟了整个故障,并做了恢复测试。
1、模拟故障
[oracle@ggdb02 ~]$ bbed parfile=bbed.par Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Feb 24 13:22:38 2018
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set dba 2,411 DBA 0x0080019b (8389019 2,411)
BBED> corrupt Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y Block marked media corrupt.
BBED> sum Check value for File 2, Block 411: current = 0xda6e, required = 0xda6e
BBED> exit |
使用BBED工具故意损坏(2,411)数据块。
SQL> startup force ORACLE instance started.
Total System Global Area 835104768 bytes Fixed Size 2257840 bytes Variable Size 322964560 bytes Database Buffers 507510784 bytes Redo Buffers 2371584 bytes Database mounted. Database opened.
SQL> select * from SMON_SCN_TIME; select * from SMON_SCN_TIME * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 2, block # 411) ORA-01110: data file 2: '/u01/app/oracle/oradata/ggdb/sysaux01.dbf'
SQL>select tablespace_name, segment_type, owner, segment_name from dba_extents where file_id = 2 and 411 between block_id and block_id + blocks - 1;
TABLESPACE_NAME SEGMENT_TYPE OWNER SEGMENT_NAME --------------- --------------- ---------- ------------------------------ SYSAUX CLUSTER SYS SMON_SCN_TO_TIME_AUX
SQL> |
可见,故障已经重现,SMON_SCN_TO_TIME_AUX表已经无法访问。
2、故障处理方案
(1).重启数据库,并设置12500 event SHUTDOWN IMMEDIATE STARTUP RESTRICT ALTER SYSTEM SET EVENTS '12500 TRACE NAME CONTEXT FOREVER, LEVEL 10';
(2).获取SMON_SCN_TIME 和SMON_SCN_TO_TIME_AUX的表结构和相关索引结构 SET LONG 1000 SET LINESIZE 500 SELECT DBMS_METADATA.GET_DDL('CLUSTER', 'SMON_SCN_TO_TIME_AUX') FROM DUAL; SELECT DBMS_METADATA.GET_DDL('TABLE', 'SMON_SCN_TIME') FROM DUAL; SELECT DBMS_METADATA.GET_DDL('INDEX', A.INDEX_NAME) FROM DBA_INDEXES A WHERE TABLE_NAME IN('SMON_SCN_TO_TIME_AUX', 'SMON_SCN_TIME');
(3).删除相关的表 DROP TABLE SMON_SCN_TIME; DROP CLUSTER SMON_SCN_TO_TIME_AUX;
(4).重新执行步骤2中获取的表结构语句,重新创建SMON_SCN_TO_TIME_AUX和SMON_SCN_TIME表(注意:要先创建SMON_SCN_TO_TIME_AUX表)
(5).禁用12500 event并重启数据库 ALTER SYSTEM SET EVENTS '12500 TRACE NAME CONTEXT OFF'; SHUTDOWN IMMEDIATE STARTUP |
3、处理过程中遇到的问题
SQL> DROP TABLE SMON_SCN_TIME; DROP TABLE SMON_SCN_TIME * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 2, block # 411) ORA-01110: data file 2: '/u01/app/oracle/oradata/ggdb/sysaux01.dbf'
SQL> DROP CLUSTER SMON_SCN_TO_TIME_AUX; DROP CLUSTER SMON_SCN_TO_TIME_AUX * ERROR at line 1: ORA-00951: cluster not empty |
在执行方案的第3步时,无法成功删除SMON_SCN_TIME表,此时,尝试使用truncate cluster smon_scn_to_time_aux的方式来修复该故障,但仍然报错:
SQL> truncate TABLE SMON_SCN_TIME; truncate TABLE SMON_SCN_TIME * ERROR at line 1: ORA-03292: Table to be truncated is part of a cluster
SQL> truncate CLUSTER SMON_SCN_TO_TIME_AUX; truncate CLUSTER SMON_SCN_TO_TIME_AUX * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 2, block # 411) ORA-01110: data file 2: '/u01/app/oracle/oradata/ggdb/sysaux01.dbf' |
最终,只能再次借用了删除数据的非常规方法,直接修改了基表信息,成功删除了SMON_SCN_TIME表,删除了SMON_SCN_TIME表之外,可正常删除SMON_SCN_TO_TIME_AUX。
SQL> select object_id, data_object_id , object_name from dba_objects where object_name='SMON_SCN_TIME';
OBJECT_ID DATA_OBJECT_ID OBJECT_NAME ---------- -------------- ----------------------------------- 87932 87895 SMON_SCN_TIME
SQL> delete from tab$ where obj#=87932; 1 row deleted.
SQL> commit; Commit complete.
SQL> SQL> update obj$ set type#=10 where obj#=87932; 1 row updated.
SQL> commit; Commit complete.
SQL> alter system flush buffer_cache; System altered.
SQL> alter system flush shared_pool; System altered.
SQL> DROP CLUSTER SMON_SCN_TO_TIME_AUX; Cluster dropped.
SQL> select object_id, data_object_id , object_name from dba_objects where object_name like '%SMON%'; no rows selected
SQL> |
重建完相关表和索引后,坏块故障成功解决。
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 【自荐】一款简洁、开源的在线白板工具 Drawnix
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· C#/.NET/.NET Core优秀项目和框架2025年2月简报
· DeepSeek在M芯片Mac上本地化部署