Oracle Block浅析1:Data block 的物理结构
一.Data Block 物理结构图:
二.一次对block的dump过程
2.1.获取t表行所在的rowid,File id,Block number,Slot number in block
1 SQL> select 2 2 rowid, 3 3 dbms_rowid.rowid_relative_fno(rowid) REL_FNO,--File id 4 4 dbms_rowid.rowid_block_number(rowid) BLOCK_NO,--Block number 5 5 dbms_rowid.rowid_row_number(rowid) ROW_NO, 6 6 id 7 7 from t; 8 9 ROWID REL_FNO BLOCK_NO ROW_NO ID 10 ------------------ ---------- ---------- ---------- ---------- 11 AAA+Q4AABAAAhMqAAA 1 135978 0 1 12 AAA+Q4AABAAAhMqAAB 1 135978 1 2 13 AAA+Q4AABAAAhMqAAC 1 135978 2 3 14 15 SQL>
2.2获取trace文件
1 --把这个block dump到trace: 2 SQL> alter system dump datafile 1 block 135978; 3 System altered. 4 SQL> 5 6 --当前的trace 文件位置: 7 SQL> oradebug setmypid; 8 Statement processed. 9 SQL> oradebug tracefile_name; 10 /opt/oracle/diag/rdbms/nwom/nwom/trace/nwom_ora_12056.trc 11 SQL> 12 13 --trace文件 14 SQL> !cat /opt/oracle/diag/rdbms/nwom/nwom/trace/nwom_ora_12056.trc 15 *** 2013-03-07 13:46:19.935 16 Start dump data blocks tsn: 0 file#:1 minblk 135978 maxblk 135978 17 Block dump from cache: 18 Dump of buffer cache at level 4 for tsn=0 rdba=4330282 -- 19 BH (0xb9f77848) file#: 1 rdba: 0x0042132a (1/135978) class: 1 ba: 0xb91a4000 20 set: 9 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,25 21 dbwrid: 0 obj: 255032 objn: 255032 tsn: 0 afn: 1 hint: f 22 hash: [0xdd5bd6b0,0xdd5bd6b0] lru: [0x79f68100,0x87f7d830] 23 ckptq: [NULL] fileq: [NULL] objq: [0xd79df028,0xd79df028] objaq: [0xd79df018,0xd79df018] 24 st: XCURRENT md: NULL fpin: 'ktswh23: ktsfbkl' tch: 6 25 flags: block_written_once redo_since_read 26 LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [1] 27 Block dump from disk: 28 buffer tsn: 0 rdba: 0x0042132a (1/135978) 29 scn: 0x06ce.614c643d seq: 0x03 flg: 0x06 tail: 0x643d0603 30 frmt: 0x02 chkval: 0x5507 type: 0x06=trans data 31 Hex dump of block: st=0, typ_found=1 32 Dump of memory from 0x00002AFE97E08A00 to 0x00002AFE97E0AA00 33 2AFE97E08A00 0000A206 0042132A 614C643D 060306CE [....*.B.=dLa....] 34 2AFE97E08A10 00005507 00000001 0003E438 614C6421 [.U......8...!dLa] 35 2AFE97E08A20 000006CE 00030002 00000000 00020007 [................] 36 2AFE97E08A30 0001F2C9 008004BE 001839A3 00002001 [.........9... ..] 37 2AFE97E08A40 614C6430 00170008 000208F9 00800541 [0dLa........A...] 38 2AFE97E08A50 001C365C 00002002 614C643D 00030100 [\6... ..=dLa....] 39 2AFE97E08A60 0018FFFF 1F371F4F 00001F37 1F850003 [....O.7.7.......] 40 2AFE97E08A70 1F4F1F6A 00000000 00000000 03819839 [j.O.........9...] 41 2AFE97E08A80 03819839 00000000 00000000 00000000 [9...............] 42 2AFE97E08A90 00000000 00000000 00000000 00000000 [................] 43 Repeat 3 times 44 2AFE97E08AD0 00000001 00002000 00000000 00001434 [..... ......4...] 45 2AFE97E08AE0 00000000 0381983A 00000001 03819839 [....:.......9...] 46 2AFE97E08AF0 0381983A 00000000 00000000 00000000 [:...............] 47 2AFE97E08B00 00000000 00000000 00000001 00000000 [................] 48 2AFE97E08B10 00029073 10000000 03819839 00000008 [s.......9.......] 49 2AFE97E08B20 00000000 00000000 00000000 00000000 [................] 50 Repeat 152 times 51 2AFE97E094B0 03819839 0381983C 00000000 00000000 [9...<...........] 52 2AFE97E094C0 00000000 00000000 00000000 00000000 [................] 53 Repeat 151 times 54 2AFE97E09E40 00000000 00000000 0381983A 00000000 [........:.......] 55 2AFE97E09E50 00000000 00000000 00000000 00000000 [................] 56 Repeat 180 times 57 2AFE97E0A9A0 00000000 00000000 2C000000 C1020202 [...........,....] 58 2AFE97E0A9B0 726F1404 656C6361 20202020 20202020 [..oracle ] 59 2AFE97E0A9C0 20202020 022C2020 03C10202 61726F14 [ ,......ora] 60 2AFE97E0A9D0 20656C63 20202020 20202020 20202020 [cle ] 61 2AFE97E0A9E0 02012C20 1402C102 6361726F 2020656C [ ,......oracle ] 62 2AFE97E0A9F0 20202020 20202020 20202020 643D0603 [ ..=d] 63 Block header dump: 0x0042132a 64 Object id on Block? Y 65 seg/obj: 0x3e438 csc: 0x6ce.614c6421 itc: 2 flg: O typ: 1 - DATA 66 fsl: 0 fnx: 0x0 ver: 0x01 67 68 Itl Xid Uba Flag Lck Scn/Fsc 69 0x01 0x0007.002.0001f2c9 0x008004be.39a3.18 --U- 1 fsc 0x0000.614c6430 70 0x02 0x0008.017.000208f9 0x00800541.365c.1c --U- 2 fsc 0x0000.614c643d 71 bdba: 0x0042132a 72 data_block_dump,data header at 0x2afe97e08a5c 73 =============== 74 tsiz: 0x1fa0 75 hsiz: 0x18 76 pbl: 0x2afe97e08a5c 77 76543210 78 flag=-------- 79 ntab=1 80 nrow=3 81 frre=-1 82 fsbo=0x18 83 fseo=0x1f4f 84 avsp=0x1f37 85 tosp=0x1f37 86 0xe:pti[0] nrow=3 offs=0 87 0x12:pri[0] offs=0x1f85 88 0x14:pri[1] offs=0x1f6a 89 0x16:pri[2] offs=0x1f4f 90 block_row_dump: 91 tab 0, row 0, @0x1f85 92 tl: 27 fb: --H-FL-- lb: 0x1 cc: 2 93 col 0: [ 2] c1 02 94 col 1: [20] 6f 72 61 63 6c 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 95 tab 0, row 1, @0x1f6a 96 tl: 27 fb: --H-FL-- lb: 0x2 cc: 2 97 col 0: [ 2] c1 03 98 col 1: [20] 6f 72 61 63 6c 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 99 tab 0, row 2, @0x1f4f 100 tl: 27 fb: --H-FL-- lb: 0x2 cc: 2 101 col 0: [ 2] c1 04 102 col 1: [20] 6f 72 61 63 6c 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 103 end_of_block_dump 104 End dump data blocks tsn: 0 file#: 1 minblk 135978 maxblk 135978 105 106 SQL>
三. 先介绍数据块中包括的3种头信息:
首先,数据块是在data block buffer cache中完成读和写操作的,所以它提供了20bytes的Cache Header和4bytes的 Tail给Cache,用来读取和管理。
其次,是为Transaction层提供的Header信息。它一共占据了48bytes,包括24bytes的控制信息,和一系列的Interested Transaction Slot (ITS)
剩下的部分都叫Data Area,用来存储用户数据。Data Area也包括data header,和row data及剩余空间。但是Cluster blocks,table blocks, index block的data header,row data结构是不相同的,这里主要介绍table blocks.
四.结合trace文件中的信息,详细介绍:
1) The Cache Header and Tail: Block dump from disk: buffer tsn: 0 rdba: 0x0042132a (1/135978) scn: 0x06ce.614c643d seq: 0x03 flg: 0x06 tail: 0x643d0603 frmt: 0x02 chkval: 0x5507 type: 0x06=trans data --buffer tsn: 0 --该块对应的表空间号,这里是0号表空间(占用4 bytes) --rdba: 0x0042132a (1/135978):--相对数据块地址,表示该块为1号数据文件第135978个块,用4个字节32位来表示,前10位为相对数据文件号,后22位为块号。 --SCN: 占用6bytes,表示最后变化的scn。包括2bytes的高位字节(SCN wrap),和4bytes的低位字节(SCN base) --Sequence: 占用1byte,用途不明确,可能是辅助SCN的变化Tail:包括了SCN中SCN base的低位(low-order)2bytes,然后是block type,还有Sequence number。每当block 被读的时候,都要检查Tail与 block header 是否一致,保证了这个block不是损坏的(corrupted)
--Flag:占用1byte --Format: 占用1byte,应该是用来区分版本。Oracle 8之前值为1,之后为2。 --Checksum: 占用2byte,跟db_block_checksum 参数有关系。 --Block type: 占用1byte,经常用到的有:1=undo segment header block; 2=undo data block; 5= data segment header block; 6=data block
以上为Cache Header,正好20字节。
2) The Transaction Header: 一共占据48bytes,包括24bytes的控制信息,和一系列的Interested Transaction Slot (ITS)。这些ITS组合在一起称为Interested Transaction List (ITL)。初始的ITL slot 数量由 INITRANS 决定(index branch block 只有1个slot)。如果有足够的剩余空间,oracle会根据需要动态的分配这些slot,直到受到空间限制或者达到了MAXTRANS。 Block header dump: 0x0042132a Object id on Block? Y seg/obj: 0x3e438 csc: 0x6ce.614c6421 itc: 2 flg: O typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0007.002.0001f2c9 0x008004be.39a3.18 --U- 1 fsc 0x0000.614c6430 0x02 0x0008.017.000208f9 0x00800541.365c.1c --U- 2 fsc 0x0000.614c643d 24bytes的控制信息包括: Object number(seg/obj): 占用4bytes,指在OBJ$中记录的segment的 object number(0x3e438) Cleanout SCN(csc): 占用6bytes,表示最后一次块清除(Block CleanOut)时候的SCN。 ITL count(itc): 占用1byte,ITL 的slot数量。 Flag: 占用2bytes。O表示这个block在freelist 上。否则flag为”-” Block type: 占用1byte。1=data; 2=index ITL freelist slot(fsl): 占用1byte。Index to the first slot on the ITL freelist Next freelist block(fnx): 占用4byte。Segment freelist中下一个block的RDBA Version: 1 byte Unused: 4bytes,用来前后兼容。 每个ITL entry包括以下的内容: Transaction id(Xid): 8bytes。其中包括rollback segment number, transaction table中的slot number等。 Undo block address(Uba): 8bytes。其中包括rollback segment block的DBA,sequence number等。 Flags: 1nibble。---- = transaction is active, or committed pending cleanout --说明该事物是活动的。 C--- = transaction has been committed and locks cleaned out --事物已经提交,锁已经被清除 -B-- = this undo record contains the undo for this ITL entry --U- = transaction committed (maybe long ago); SCN is an upper bound --事物已经提交,但是锁还没有清除 ---T = transaction was still active at block cleanout SCN -块清除的SCN被记录时,该事务仍然是活动的 Locks: 3nibbles. 也就是所谓的行级锁(row-level locks);表示这个事务所影响的行数 Scn/Fsc:Commit SCN或者快速提交(Fast Commit Fsc)的SCN
3)Data Area 包括14bytes的data header,4bytes/table的table dictionary,2bytes/row的row dictionary。table dictionary主要用于 cluster block中,只不过table block中的table dictionary只有一个table。 data_block_dump,data header at 0x2afe97e08a5c =============== tsiz: 0x1fa0 ==> total data area size hsiz: 0x18 ==> data header size (14+ntabs*4 + nrows*2) pbl: 0x2afe97e08a5c ==> pointer to buffer holding the block 76543210 ==> block dba / rdba flag=-------- ==> n=pctfree hit (clusters),f=don’t put on freelist, k=flushable ntab=1 ==> number of tables (>1 so this is a cluster) nrow=3 ==> number of rows frre=-1 fsbo=0x18 ==> free space begin offset fseo=0x1f4f ==> free space end offset avsp=0x1f37 ==> available space in the block tosp=0x1f37 ==> total available space when all transactions commit 0xe:pti[0] nrow=3 offs=0 ==> row directroy 记录该块中每一条记录的地址信息 0x12:pri[0] offs=0x1f85 0x14:pri[1] offs=0x1f6a 0x16:pri[2] offs=0x1f4f block_row_dump: tab 0, row 0, @0x1f85 ==> 3bytes row header,第一条记录 tl: 27 fb: --H-FL-- lb: 0x1 cc: 2 col 0: [ 2] c1 02 ==> 第一列的值为'cl02' col 1: [20] 6f 72 61 63 6c 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ==>第二列的值 tab 0, row 1, @0x1f6a ==> 3bytes row header,第二条记录 tl: 27 fb: --H-FL-- lb: 0x2 cc: 2 col 0: [ 2] c1 03 col 1: [20] 6f 72 61 63 6c 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 tab 0, row 2, @0x1f4f tl: 27 fb: --H-FL-- lb: 0x2 cc: 2 ==>lb: 0x2说明事物在该数据行上的锁还没清除,并且该锁指向02号事物槽。(此前对改行进行了更新,并且未提交) col 0: [ 2] c1 04 col 1: [20] 6f 72 61 63 6c 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 end_of_block_dump End dump data blocks tsn: 0 file#: 1 minblk 135978 maxblk 135978 1) fb: K = Cluster Key(Flags may change meaning if this is set to show HASH cluster) C = Cluster table member H = Head piece of row D = Deleted row F = First data piece L = Last data piece P = First column continues from previous piece N = Last column continues in next piece 2) lb: 和上面的 ITL 的lck相对应表示这行是否被lock 了 3) cc: 有几列数据 这里只能表示255列 超过了就会有链接行