How Oracle's Clustering Factor Affects Query I/O

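In Oracle, the same query sometimes finishes quickly and sometimes runs slowly, even though the table structure and the data in the table are exactly the same. To see why, we have to look at a detail of the index.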

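The data dictionary view user_indexes has a column named clustering_factor. Its value is an estimate of how many table block visits it would take to read every row of the table through the index, in index key order. We can look at it with the following SQL statement.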

SELECT
        A.INDEX_NAME,
        B.NUM_ROWS,
        B.BLOCKS,
        A.CLUSTERING_FACTOR
FROM
        USER_INDEXES A,
        USER_TABLES B
WHERE
        A.INDEX_NAME = ?
        AND A.TABLE_NAME = B.TABLE_NAME

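In this statement, ? stands for the name of the index we want to inspect. When the rows in the table are stored in no particular order relative to the index key, CLUSTERING_FACTOR is close to NUM_ROWS. This means that when rows are read through the index, almost every index entry's ROWID points to a different data block from the one before it, so nearly every row costs a separate block visit; the I/O count is high and the query naturally takes longer. When the rows are stored in index order, CLUSTERING_FACTOR is close to BLOCKS: rows with adjacent key values sit in the same block, far fewer block visits are needed, and the query time drops considerably.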
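One way to see both extremes side by side is to load the same rows twice, once in key order and once shuffled, and compare the statistics. The following is only a sketch: the table and index names (cf_ordered, cf_random, cf_ordered_ix, cf_random_ix) and the row count are made up for illustration, and it assumes a schema where you may create tables and call DBMS_STATS.

```sql
-- Hypothetical demo objects; names and sizes are assumptions, not from the original post.
-- Same 100,000 rows, loaded in key order.
CREATE TABLE cf_ordered AS
SELECT LEVEL AS id, RPAD('x', 100, 'x') AS pad
FROM dual
CONNECT BY LEVEL <= 100000
ORDER BY LEVEL;

-- Same rows again, loaded in random order.
CREATE TABLE cf_random AS
SELECT id, pad
FROM cf_ordered
ORDER BY DBMS_RANDOM.VALUE;

CREATE INDEX cf_ordered_ix ON cf_ordered (id);
CREATE INDEX cf_random_ix ON cf_random (id);

-- Refresh table and index statistics so the dictionary views are populated.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'CF_ORDERED', cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'CF_RANDOM', cascade => TRUE);
END;
/

-- Compare CLUSTERING_FACTOR against BLOCKS and NUM_ROWS for both indexes.
SELECT a.index_name, b.num_rows, b.blocks, a.clustering_factor
FROM user_indexes a, user_tables b
WHERE a.table_name = b.table_name
AND a.index_name IN ('CF_ORDERED_IX', 'CF_RANDOM_IX');
```

If the load behaves as assumed, CF_ORDERED_IX should report a CLUSTERING_FACTOR near BLOCKS and CF_RANDOM_IX one near NUM_ROWS, although the exact numbers depend on block size and storage settings.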

The following is the description of CLUSTERING_FACTOR from the Oracle manual:

Indicates the amount of order of the rows in the table based on the values of the index.

If the value is near the number of blocks, then the table is very well ordered. In this case, the index entries in a single leaf block tend to point to rows in the same data blocks.
If the value is near the number of rows, then the table is very randomly ordered. In this case, it is unlikely that index entries in the same leaf block point to rows in the same data blocks.
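The idea in the manual's description can be approximated by hand: walk the rows in index key order and count how many times the data block changes from one row to the next. The query below is a rough sketch of that count, not Oracle's actual internal computation. The names my_table and id are hypothetical placeholders for a table and its indexed column, and it assumes a non-partitioned heap table with a single-column index.

```sql
-- my_table and id are placeholder names (assumptions for illustration).
-- Count the block changes seen while reading rows in index key order.
SELECT COUNT(*) AS approx_clustering_factor
FROM (
  SELECT
    DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) AS fno,
    DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) AS blk,
    LAG(DBMS_ROWID.ROWID_RELATIVE_FNO(rowid))
      OVER (ORDER BY id, rowid) AS prev_fno,
    LAG(DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid))
      OVER (ORDER BY id, rowid) AS prev_blk
  FROM my_table
  WHERE id IS NOT NULL  -- rows with a NULL key have no entry in a single-column B-tree index
)
WHERE prev_blk IS NULL      -- the first row always counts as one block visit
   OR blk <> prev_blk
   OR fno <> prev_fno;
```

With freshly gathered statistics, the result should land close to the CLUSTERING_FACTOR shown in user_indexes for that index. A well-ordered table gives a count near the number of blocks, and a randomly ordered one gives a count near the number of rows.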

posted on 2008-10-24 22:48 by 一江水