Ivo落班

记录着自己非专业的起步

  博客园  :: 首页  :: 新随笔  :: 联系 :: 订阅 订阅  :: 管理

转载自love wife & love life —Roger 提供oracle技术支持服务

本文链接地址: intra blcok chain

最近在一个优化项目中,通过awr报告发现table fetch continued row 指标很高,怀疑是行迁移/链接比较严重。
后来经过沟通发现,原来客户的数据库中存在几个table,其column 数目超过255. 针对超过255列的行数据.如下:

 
Statistic Total per Second per Trans
dirty buffers inspected 7,532 4.17 0.25
free buffer inspected 17,409,018 9,633.69 575.56
free buffer requested 17,115,682 9,471.36 565.86
table fetch by rowid 98,848,588 54,700.18 3,268.05
table fetch continued row 97,797,107 54,118.32 3,233.28
oracle会将其每行数据都分成2个row piece. 实际上如果超过510个列,那么会被分成3个row piece存放在同一个
block中。 10gR2的官方文档是这样描述的:
 
"When a table has more than 255 columns, rows that have data after the 255th column are likely to
be chained within the same block. This is called intra-block chaining. A chained row's pieces are
chained together using the rowids of the pieces. With intra-block chaining, users receive all the
data in the same block. If the row fits in the block, users do not see an effect in I/O
performance, because no extra I/O operation is required to retrieve the rest of the row."
创建超过255列的测试表:
 
----tab1 256 columns
SQL> declare
2 v_sql varchar2(32767) ;
3 begin
4 v_sql := 'create table t_chain1 ( ' ;
5 for i in 1..256 loop
6 v_sql := v_sql || 'id'||i||' number,' ;
7 end loop ;
8 v_sql := rtrim(v_sql, ',') || ')';
9 execute immediate v_sql;
10 end ;
11 /
 
PL/SQL procedure successfully completed.
 
SQL> select count(1) from user_tab_columns where table_name='T_CHAIN1';
 
COUNT(1)
----------
256
SQL> create sequence t_chain_seq
2 minvalue 1
3 nomaxvalue
4 start with 1
5 increment by 1
6 nocycle
7 cache 10000;
 
Sequence created.
 
SQL>
SQL> insert into t_chain1(id256) values(t_chain_seq.NEXTVAL);
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select count(1) from t_chain1;
 
COUNT(1)
----------
3
 
SQL>

再创建一个255列的测试表进行对比:

 
----tab2 255 columns
SQL> declare
2 v_sql varchar2(32767) ;
3 begin
4 v_sql := 'create table t_chain2 ( ' ;
5 for i in 1..255 loop
6 v_sql := v_sql || 'id'||i||' number,' ;
7 end loop ;
8 v_sql := rtrim(v_sql, ',') || ')';
9 execute immediate v_sql;
10 end ;
11 /
 
PL/SQL procedure successfully completed.
 
SQL> select count(1) from user_tab_columns where table_name='T_CHAIN2';
 
COUNT(1)
----------
255
 
SQL> create sequence t_chain_seq2
2 minvalue 1
3 nomaxvalue
4 start with 1
5 increment by 1
6 nocycle
7 cache 10000;
 
Sequence created.
 
SQL> insert into t_chain2(id255) values(t_chain_seq2.NEXTVAL);
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> /
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select count(1) from t_chain2;
 
COUNT(1)
----------
3

通过前面的测试表,我们来对比观察下这2个表的block信息的差别:
 
----使用dbms_rowid定位到block 号
SQL> select dbms_rowid.rowid_relative_fno(t.rowid) as "file#" ,
2 dbms_rowid.rowid_block_number(t.rowid) as "block#" from t_chain1 t ;
 
file# block#
---------- ----------
2 73688
2 73688
2 73688
 
SQL> select dbms_rowid.rowid_relative_fno(t.rowid) as "file#" ,
2 dbms_rowid.rowid_block_number(t.rowid) as "block#" from t_chain2 t ;
 
file# block#
---------- ----------
2 73696
2 73696
2 73696

我们先来看下这2种情况下,block内信息的差异.

 
---tab1 256 column dump
data_block_dump,data header at 0xd0da87c
===============
tsiz: 0x1f80
hsiz: 0x1e
pbl: 0x0d0da87c
bdba: 0x00811fd8
76543210
flag=--------
ntab=1
nrow=6 ---注意这里是6(实际上测试表的数据只有3条)
frre=-1
fsbo=0x1e
fseo=0x1c56
avsp=0x1c38
tosp=0x1c38
0xe:pti[0] nrow=6 offs=0
0x12:pri[0] offs=0x1e7c
0x14:pri[1] offs=0x1e72
0x16:pri[2] offs=0x1d6e
0x18:pri[3] offs=0x1d64
0x1a:pri[4] offs=0x1c60
0x1c:pri[5] offs=0x1c56
block_row_dump:
tab 0, row 0, @0x1e7c 我们可以看到,oracle这样讲第1行数据分成了2个row piece.这里是第一个row piece的offset
tl: 260 fb: -----L-- lb: 0x1 cc: 255
col 0: *NULL*
col 1: *NULL*
col 2: *NULL*
col 3: *NULL*
.......
col 253: *NULL*
col 254: [ 2] c1 02 从这里我们可以看到,oracle将有数据的列存放到第一个row piece了.
tab 0, row 1, @0x1e72 这里是第一行数据的第2个row piece.
tl: 10 fb: --H-F--- lb: 0x1 cc: 1
nrid: 0x00811fd8.0 这里的nrid相当于rdba,转换后即为我们的file 2 block 73688地址.
col 0: *NULL*
tab 0, row 2, @0x1d6e
tl: 260 fb: -----L-- lb: 0x1 cc: 255
col 0: *NULL*
col 1: *NULL*
col 2: *NULL*
col 3: *NULL*
......
col 252: *NULL*
col 253: *NULL*
col 254: [ 2] c1 03
tab 0, row 3, @0x1d64
tl: 10 fb: --H-F--- lb: 0x1 cc: 1
nrid: 0x00811fd8.2
col 0: *NULL*
tab 0, row 4, @0x1c60
tl: 260 fb: -----L-- lb: 0x1 cc: 255
col 0: *NULL*
col 1: *NULL*
col 2: *NULL*
col 3: *NULL*
......
col 252: *NULL*
col 253: *NULL*
col 254: [ 2] c1 04
tab 0, row 5, @0x1c56
tl: 10 fb: --H-F--- lb: 0x1 cc: 1
nrid: 0x00811fd8.4
col 0: *NULL*
end_of_block_dump
对于超过255列的表,oracle会将其在block内的一行数据分成2个row piece来存放。但是会将有数据的列存放在
第一个row piece中,无数的列存放的在第2个row piece中.

如下是tab2 255 columns的dump:

 
data_block_dump,data header at 0xd0da864
===============
tsiz: 0x1f98
hsiz: 0x18
pbl: 0x0d0da864
bdba: 0x00811fe0
76543210
flag=--------
ntab=1
nrow=3
frre=-1
fsbo=0x18
fseo=0x1c8c
avsp=0x1c74
tosp=0x1c74
0xe:pti[0] nrow=3 offs=0
0x12:pri[0] offs=0x1e94
0x14:pri[1] offs=0x1d90
0x16:pri[2] offs=0x1c8c
block_row_dump:
tab 0, row 0, @0x1e94
tl: 260 fb: --H-FL-- lb: 0x1 cc: 255
col 0: *NULL*
col 1: *NULL*
col 2: *NULL*
......
col 253: *NULL*
col 254: [ 2] c1 02
tab 0, row 1, @0x1d90
tl: 260 fb: --H-FL-- lb: 0x1 cc: 255
col 0: *NULL*
col 1: *NULL*
col 2: *NULL*
.......
col 253: *NULL*
col 254: [ 2] c1 03
tab 0, row 2, @0x1c8c
tl: 260 fb: --H-FL-- lb: 0x1 cc: 255
col 0: *NULL*
col 1: *NULL*
col 2: *NULL*
col 3: *NULL*
........
col 253: *NULL*
col 254: [ 2] c1 04
end_of_block_dump

对于未超过255列的表.没有什么特别的,正是大家平时所看到的这样.
这里给大家补充一点,为什么oracle这里最大允许255个列呢 ? 如下:
 

 
BBED> x /rnnnnnnnnnnnnnnnnnnnn
rowdata[6686] @7924
-------------
flag@7924: 0x04 (KDRHFL)
lock@7925: 0x00
cols@7926: 255
.......
 
BBED> d /v offset 7924 count 10
File: /home/ora10g/oradata/roger01.dbf (2)
Block: 73688 Offsets: 7924 to 7933 Dba:0x00811fd8
-------------------------------------------------------
d80000ff 0401ffff ffff l ........
 
<16 bytes per line>

可以看到oracle用了一个byte来存放column 的count值。简称cc.
我们知道,1个byte等于8个bit.  一个bit最大表示的数目是power(2,1).以此类推,那么一个byte
所能表示的最大数目都是power(2,8),即使256.  所以oracle这里一个row piece最大允许255个column.
超过255列即为分到另外一个row piece中.
虽然数据被划分到了其他的row piece,然而,一行数据仍然是存在同一个block中. 同时,每行都存在一个rowid.
 
 
SQL> select rowid from t_chain1;
 
ROWID
------------------
AAAPJfAACAAAR/YAAB
AAAPJfAACAAAR/YAAD
AAAPJfAACAAAR/YAAF
 
SQL> select dbms_rowid.rowid_object(rowid) obj#,
2 dbms_rowid.rowid_relative_fno(rowid) rfile#,
3 dbms_rowid.rowid_block_number(rowid) block#,
4 dbms_rowid.rowid_row_number(rowid) row#
5 from t_chain1 ;
 
OBJ# RFILE# BLOCK# ROW#
---------- ---------- ---------- ----------
62047 2 73688 1
62047 2 73688 3
62047 2 73688 5
我们可以看到,虽然该block存放了6行数据(3条数据,6个row piece),然而其rowid只有3个. 从这点也可以看出,其实虽然一条数据
被分成2个row piece,然而其rowid却只有一个.所有如果是所有通过rowid访问,那么是可以指定返回整条数据的,不需要产生额外的IO。
那么,针对这样的行内迁移(intra block chain),到底是否会产生多余的IO消耗呢 ?我们通过创建通过如下的SQL来简单测试下:

+++++++首先测试不存在行迁移的表,观察逻辑读消耗

 
SQL> conn roger/roger
Connected.
SQL> select a.NAME, b.VALUE
2 from v$statname a, v$mystat b
3 where a.STATISTIC# = b.STATISTIC#
4 and lower(a.NAME) = 'table fetch continued row';
 
NAME VALUE
---------------------------------------------------------------- ----------
table fetch continued row 0
 
SQL> alter system flush shared_pool;
 
System altered.
 
SQL> alter system flush BUFFER_CACHE;
 
System altered.
 
SQL> set autot on
SQL> set lines 200
SQL> select count(1) from t_chain2 where id255=2;
 
COUNT(1)
----------
1
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1667017148
 
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T_CHAIN2 | 1 | 3 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
2 - filter("ID255"=2)
 
Statistics
----------------------------------------------------------
476 recursive calls
0 db block gets
41 consistent gets
17 physical reads
0 redo size
411 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed
++++++++ 测试intra block chaining的表

 
SQL> select count(1) from t_chain1 where id256=2;
 
COUNT(1)
----------
1
 
SQL> select a.NAME, b.VALUE
2 from v$statname a, v$mystat b
3 where a.STATISTIC# = b.STATISTIC#
4 and lower(a.NAME) = 'table fetch continued row';
 
NAME VALUE
---------------------------------------------------------------- ----------
table fetch continued row 48
 
SQL> alter system flush buffer_cache;
 
System altered.
 
SQL> set autot on
SQL> set autot off
SQL> select a.NAME, b.VALUE
2 from v$statname a, v$mystat b
3 where a.STATISTIC# = b.STATISTIC#
4 and lower(a.NAME) = 'table fetch continued row';
 
NAME VALUE
---------------------------------------------------------------- ----------
table fetch continued row 48
 
SQL> set autot on
SQL> select count(1) from t_chain1 where id256=2;
 
COUNT(1)
----------
1
 
Execution Plan
----------------------------------------------------------
Plan hash value: 432753352
 
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T_CHAIN1 | 1 | 3 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
2 - filter("ID256"=2)
 
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
10 consistent gets
6 physical reads
0 redo size
411 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
 
SQL> alter system flush buffer_cache;
 
System altered.
 
SQL>
SQL> set autot off
SQL> select a.NAME, b.VALUE
2 from v$statname a, v$mystat b
3 where a.STATISTIC# = b.STATISTIC#
4 and lower(a.NAME) = 'table fetch continued row';
 
NAME VALUE
---------------------------------------------------------------- ----------
table fetch continued row 63
 
SQL> alter system flush BUFFER_CACHE;
 
System altered.
 
SQL> set autot on
SQL> select count(1) from t_chain1 where id256=2;
 
COUNT(1)
----------
1
 
Execution Plan
----------------------------------------------------------
Plan hash value: 432753352
 
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T_CHAIN1 | 1 | 3 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
2 - filter("ID256"=2)
 
Statistics
----------------------------------------------------------
477 recursive calls
0 db block gets
38 consistent gets
18 physical reads
0 redo size
411 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed
 
SQL> select a.NAME, b.VALUE
2 from v$statname a, v$mystat b
3 where a.STATISTIC# = b.STATISTIC#
4 and lower(a.NAME) = 'table fetch continued row';
 
NAME VALUE
---------------------------------------------------------------- ----------
table fetch continued row 87

我们可以看到虽然intra block chain,会由于对表的访问导致table fetch continued row 统计信息的增加,
然而通过测试我们可以发现,其本身并不会消耗额外的IO。
从我上面的2个测试来看:
t_chain1    256 列
t_chain2    255 列
这2个测试表均包含3条数据。 全表扫描的逻辑读几乎一致。
---------------------------------------------------------------  
以上是roger的博文,我基本照做了一遍,没有问题,当中有一个概念
对于超过255列的表,oracle会将其在block内的一行数据分成2个row piece来存放。但是会将有数据的列存放在
第一个row piece中,无数的列存放的在第2个row piece中.
这个概念有误,oracle用逆向存储数据的方式,即最新的数据存在最前段,所以即使是行链,也不是这种情况。(老杨说)
导出块的方式 alter system dump datafile 2 block 73688
nrid: 0x00811fd8.0 是对应的rdba,那是怎么转变的呢?
这里先把nrid转换成数字
    SQL> select to_number('00811fd8','xxxxxxxx') from dual;
 
    TO_NUMBER('00811FD8','XXXXXXXX
    ------------------------------
                       8462296
 
    执行dbms_utility.data_block_address_block(8462296)
            dbms_utility.data_block_address_file(8462296)
    得到相应的值

ps:我收藏这篇博文的目的是因为,我比较缺乏这种测试的想法,个人觉得这篇博文,知识点还有测试方法都不错,所以做一个记录
posted on 2013-07-05 09:46  Ivo落班  阅读(339)  评论(0编辑  收藏  举报