[20190531]ORA-600 kokasgi1故障模拟与恢复(后续).txt
[20190531]ORA-600 kokasgi1故障模拟与恢复(后续).txt
--//http://blog.itpub.net/267265/viewspace-2646340/=>[20190531]ORA-600 kokasgi1故障模拟与恢复.txt
--//后续有一些恢复没做,补充测试看看.
--//先更正链接http://blog.itpub.net/267265/viewspace-2646340/的一些错误:
--//1.前面做坏块恢复时,少写了执行步骤:.
BBED> assign kcbh.seq_kcbh = 0x01
--//2.使用system用户登录时:
SYS@book> connect system/oracle
ERROR:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [209], [6110], [], [], [], [], [], [], [], []
ORA-01017: invalid username/password; logon denied
Warning: You are no longer connected to ORACLE.
--//主要原因是口令不对(注:我恢复最原始的口令),导致要更新user$相应记录,而块没有完全恢复,所以报错.如果口令正确,不会出现以
--//上错误.
1.环境:
SYS@book> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
SYS@book> @ hide _db_always_check_system_ts
NAME DESCRIPTION DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE
-------------------------- ------------------------------------------------------------- ------------- ------------- ------------
_db_always_check_system_ts Always perform block check and checksum for System tablespace FALSE FALSE FALSE
2.修复块dba=1,209看看.
--//user$的用户SYS,SYSTEM改名.修正回来后dba 1,209存在问题.
BBED> set dba 1,209
DBA 0x004000d1 (4194513 1,209)
BBED> verify
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 209
Block Checking: DBA = 4194513, Block Type = KTB-managed data block
data header at 0x7fa51592825c
kdbchk: the amount of space used is not equal to block size
used=2548 fsc=1 avsp=5515 dtl=8096
Block 209 failed with check code 6110
--//修改口令看看.
SYS@book> password system
Changing password for system
New password:
Retype new password:
Password changed
--//这也证明_db_always_check_system_ts=false,块有一些瑕疵不会报错.
SYS@book> alter system checkpoint ;
System altered.
SYS@book> alter system checkpoint ;
System altered.
SYS@book> alter system checkpoint ;
System altered.
BBED> verify dba 1,209
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 209
Block Checking: DBA = 4194513, Block Type = KTB-managed data block
data header at 0x7ffe7f88e25c
kdbchk: the amount of space used is not equal to block size
used=2548 fsc=0 avsp=5516 dtl=8096
Block 209 failed with check code 6110
--//还是存在.连续执行2次sqlplus system/aaa,这样里面不成功登录计数改变,会导致记录长度变化,看看.
BBED> x /rcncnnttttncnnnnnccnnncct *kdbr[27]
rowdata[0] @2113
----------
flag@2113: 0x6c (KDRHFL, KDRHFF, KDRHFH, KDRHFC)
lock@2114: 0x02
cols@2115: 22
ckix@2116: 5
col 0[6] @2117: SYSTEM
col 1[2] @2124: 1
col 2[16] @2127: 2D594E86F93B17A1
col 3[1] @2144: 0
col 4[2] @2146: 3
col 5[7] @2149: 2013-08-24 11:37:40
col 6[7] @2157: 2019-05-31 11:43:38
col 7[0] @2165: *NULL*
col 8[0] @2166: *NULL*
col 9[1] @2167: 0
col 10[0] @2169: *NULL*
col 11[2] @2170: 1
col 12[0] @2173: *NULL*
col 13[0] @2174: *NULL*
col 14[1] @2175: 0
col 15[2] @2177: 2
--// 2次不成功登录.
col 16[22] @2180: DEFAULT_CONSUMER_GROUP
col 17[0] @2203: *NULL*
col 18[1] @2204: 0
col 19[0] @2206: *NULL*
col 20[0] @2207: *NULL*
col 21[62] @2208: S:0CDF21806AF97030971BEB57BB609CA72A6DB6B1989178BC1CBF5A82C39C
BBED> verify dba 1,209
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 209
Block Checking: DBA = 4194513, Block Type = KTB-managed data block
data header at 0x7f4cc456b25c
kdbchk: the amount of space used is not equal to block size
used=2549 fsc=0 avsp=5515 dtl=8096
Block 209 failed with check code 6110
--//对比前面fsc=0已经发现变化.开始修复:
--//公式 dtl= used+fsc+avsp
--//dtl-used-fsc= 8096 - 2549 - 0 = 5547
BBED> p dba 1,209 kdbh
struct kdbh, 14 bytes @92
ub1 kdbhflag @92 0x00 (NONE)
sb1 kdbhntab @93 2
sb2 kdbhnrow @94 43
sb2 kdbhfrre @96 -1
sb2 kdbhfsbo @98 108
sb2 kdbhfseo @100 2021
sb2 kdbhavsp @102 5515
sb2 kdbhtosp @104 5515
--//assign kdbh.kdbhavsp=5547
BBED> assign kdbh.kdbhavsp=5547
sb2 kdbhavsp @102 5547
BBED> sum apply
Check value for File 1, Block 209:
current = 0x3ccd, required = 0x3ccd
BBED> verify dba 1,209
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 209
Block Checking: DBA = 4194513, Block Type = KTB-managed data block
data header at 0x18e105c
kdbchk: avsp(5547) > tosp(5515)
Block 209 failed with check code 6128
--//assign kdbh.kdbhtosp = kdbh.kdbhavsp
BBED> assign kdbh.kdbhtosp = kdbh.kdbhavsp
sb2 kdbhtosp @104 5547
BBED> sum apply
Check value for File 1, Block 209:
current = 0x3ced, required = 0x3ced
BBED> verify dba 1,209
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 209
--//OK dba=1,209已经恢复.
2.修复索引I_user1看看.
--//根据前面的修复,该索引在dba = 1,417 .
--//相关信息可以参考:http://blog.itpub.net/267265/viewspace-2637037/=>[20190226]删除tab$记录的恢复6.txt
--//参考:https://www.cnblogs.com/lfree/p/10438177.html=>[20190226]测试使用bbed恢复索引.txt
SYS@book> @ bbvi 1 417
BVI_COMMAND
----------------------------------------------------------------------------------------------------
bvi -b 3416064 -s 8192 /mnt/ramdisk/book/system01.dbf
xxd -c16 -g 2 -s 3416064 -l 8192 /mnt/ramdisk/book/system01.dbf
dd if=/mnt/ramdisk/book/system01.dbf bs=8192 skip=417 count=1 of=1_417.dd conv=notrunc 2>/dev/null
od -j 3416064 -N 8192 -t x1 -v /mnt/ramdisk/book/system01.dbf
hexdump -s 3416064 -n 8192 -C -v /mnt/ramdisk/book/system01.dbf
alter system dump datafile '/mnt/ramdisk/book/system01.dbf' block 417;
alter session set events 'immediate trace name set_tsn_p1 level 1';
alter session set events 'immediate trace name buffer level 4194721';
9 rows selected.
$ dd if=/mnt/ramdisk/book/system01.dbf bs=8192 skip=417 count=1 of=1_417.dd conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000106025 seconds, 77.3 MB/s
$ strings -td -3 1_417.dd | egrep SYS
5864 SYSDW
5878 SYSTEMDW
6005 JAVASYSPRIV
6375 OWBSYS_AUDIT
6396 OWBSYS
6544 SYSMAN
6780 OLAPSYS
6835 MDSYS
6911 ORDSYS
7148 CTXSYS
7163 EXFSYS
7275 JAVASYSPRIV
7357 WMSYS
7370 APPQOSSYS
7627 GATHER_SYSTEM_STATISTICS
8027 SYSTEM
8102 SYS
--//也就是修复指向原来的位置. 减去-9 对应 kd_off的偏移.
--//8102-9 = 8093
--//8027-9 = 8018
BBED> x /rc *kd_off[86]
rowdata[2242] @8093
-------------
flag@8093: 0x01 (KDXRDEL)
lock@8094: 0x02
keydata[6]: 0x00 0x40 0x00 0xd1 0x00 0x01
data key:
col 0[3] @8102: SYS
BBED> x /rc *kd_off[87]
rowdata[4] @5855
----------
flag@5855: 0x00 (NONE)
lock@5856: 0x02
keydata[6]: 0x00 0x40 0x00 0xd1 0x00 0x01
data key:
col 0[5] @5864: SYSDW
BBED> x /rc *kd_off[88]
rowdata[684] @6535
------------
flag@6535: 0x00 (NONE)
lock@6536: 0x00
keydata[6]: 0x00 0x40 0x00 0xd4 0x00 0x0b
data key:
col 0[6] @6544: SYSMAN
BBED> x /rc *kd_off[89]
rowdata[2167] @8018
-------------
flag@8018: 0x01 (KDXRDEL)
lock@8019: 0x02
keydata[6]: 0x00 0x40 0x00 0xd1 0x00 0x06
data key:
col 0[6] @8027: SYSTEM
BBED> x /rc *kd_off[90]
rowdata[18] @5869
-----------
flag@5869: 0x00 (NONE)
lock@5870: 0x02
keydata[6]: 0x00 0x40 0x00 0xd1 0x00 0x06
data key:
col 0[8] @5878: SYSTEMDW
--//注意看flag标识,删除标识是0x01 (KDXRDEL).对应rowid并没有变化(keydata).
--//另外注意索引的特点是块内无序,块间有序.可以看出修改并不改变行目录的顺序.
--//注意键值一定按照行目录排序的,oracle插入索引键值应该通过行目录二分法定位,然后行目录对应记录后移。
assign offset 8093 = 0x0;
assign offset 5855 = 0x1;
assign offset 8018 = 0x0;
assign offset 5869 = 0x1;
BBED> sum apply
Check value for File 1, Block 417:
current = 0x049d, required = 0x049d
BBED> verify
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 417
Block Checking: DBA = 4194721, Block Type = KTB-managed data block
**** actual free space credit for itl 2 = 35 != # in trans. hdr = 31
---- end index block validation
Block 417 failed with check code 6401
BBED> p /d ktbbh.ktbbhitl[1]._ktbitun
union _ktbitun, 2 bytes @86
sb2 _ktbitfsc @86 31
ub2 _ktbitwrp @86 31
BBED> assign ktbbh.ktbbhitl[1]._ktbitun._ktbitfsc=35
sb2 _ktbitfsc @86 35
BBED> p /d ktbbh.ktbbhitl[1]._ktbitun
union _ktbitun, 2 bytes @86
sb2 _ktbitfsc @86 35
ub2 _ktbitwrp @86 35
BBED> sum apply
Check value for File 1, Block 417:
current = 0x04a1, required = 0x04a1
--//OK.
3.修复禁用的索引.
BBED> set dba 1,522
DBA 0x0040020a (4194826 1,522)
BBED> x /rnnc *kdbr[19]
rowdata[228] @1754
------------
flag@1754: 0x3c (KDRHFL, KDRHFF, KDRHFD, KDRHFH)
lock@1755: 0x01
cols@1756: 0
BBED> assign offset 1754=0x2c;
ub1 rowdata[0] @1754 0x2c
BBED> x /rnnc *kdbr[19]
rowdata[228] @1754
------------
flag@1754: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@1755: 0x01
cols@1756: 3
col 0[2] @1757: 46
col 1[2] @1760: 46
col 2[197] @1763: CREATE UNIQUE INDEX I_USER1 ON USER$(NAME) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE ( INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS
2147483645 PCTINCREASE 0 OBJNO 46 EXTENTS (FILE 1 BLOCK 416))
BBED> sum apply
Check value for File 1, Block 522:
current = 0x1ee8, required = 0x1ee8
BBED> verify
DBVERIFY - Verification starting
FILE = /mnt/ramdisk/book/system01.dbf
BLOCK = 522
--//OK,现在全部修复.重启数据库
SYS@book> startup [150/10484]
ORACLE instance started.
Total System Global Area 643084288 bytes
Fixed Size 2255872 bytes
Variable Size 205521920 bytes
Database Buffers 427819008 bytes
Redo Buffers 7487488 bytes
Database mounted.
Database opened.
SYS@book> select /*+ full(a) */ name from user$ a minus select name from user$ ;
no rows selected
SYS@book> select name from user$ minus select /*+ full(a) */ name from user$ a;
no rows selected
SYS@book> validate index i_user1;
Index analyzed.
--//ok,已经全部修复.
4.收尾:
--//修改参数_db_always_check_system_ts=true.
--//alter system reset "_db_always_check_system_ts";
--//重启数据库略.