cache buffers chains【转载】

In order to understand why we get CBC latch contention we have to understand what the CBC latch protects. The CBC latch protects information controlling the buffer cache. Here is a schematic of computer memory and the Oracle processes, SGA and the main components of the SGA:
 
 
The buffer cache holds in memory versions of datablocks for faster access. Can you imagine though how we find a block we want in the buffer cache? The buffer cache doesn't have a index of blocks it contains and we certainly don't scan the whole cache looking for the block we want (though I have heard that as a concern when people increase the size of there buffer cache). The way we find a block in the buffer cache is by taking the block's address, ie it's file and block number and hashing it. What's hashing? A simple example of hashing is  the "Modulo" function
1 mod 4 = 1
2 mod 4 = 2
3 mod 4 = 3
4 mod 4 = 0
5 mod 4 = 1
6 mod 4 = 2
7 mod 4 = 3
8 mod 4 = 0
Using "mod 4" as a hash funtion creates 4 possible results. These results are used by Oracle as "buckets" or identifiers of locations to store things. The things in this case will be block headers. 
 
Block headers are meta data about data block including pointers to the actual datablock as well as pointers to the other headers in the same bucket. 
 
The block headers in the hash buckets are connected via a doubly linked list. One link points forward the other points backwards
 
The resulting layout looks like
 
the steps to find a block in the cache are

If there are a lot of sessions concurrently accessing the same buffer header (or buffer headers in the same bucket) then the latch that protects that bucket will get hot and users will have to wait getting "latch: cache buffers chains" wait.
 
Two ways this can happen (among probably several others)
 
 
For the nested loops example, Oracle will in some (most?) cases try and pin the root block of the index because Oracle knows we will be using it over and over. When a block is pinned we don't have to use the cbc latch. There seem to be cases (some I think might be bugs) where the root block doesn't get pinned. (I want to look into this more - let me know if you have more info)
 
One thing that can make CBC latch contention worse is if a session is modifying the data block that users are reading because readers will clone a block with uncommitted changes and roll back the changes in the cloned block:
 
all these clone copies will go in the same bucket and be protected by the same latch:
 
How many copies of a block are in the cache?
 
select 
       count(*)
     , name
     , file#
     , dbablk
     , hladdr 
from   x$bh bh
          , obj$ o
where 
      o.obj#(+)=bh.obj and
      hladdr in 
(
    select ltrim(to_char(p1,'XXXXXXXXXX') )
    from v$active_session_history 
    where event like 'latch: cache%'
    group by p1 
)
group by name,file#, dbablk, hladdr
having count(*) > 1
order by count(*);
 
 
CNT NAME        FILE#  DBABLK HLADDR
--- ---------- ------ ------- --------
 14 MYDUAL          1   93170 2C9F4B20
 
Notice that the number of copies, 14, is higher the the max number of copies allowed set by "_db_block_max_cr_dba = 6" in 10g. The reason is this value is just a directive not a restriction. Oracle tries to limit the  number of copies.
 
 
Solutions
Find SQL ( Why is application hitting the block so hard? )
Possibly change application logic
Eliminate hot spots
Nested loops, possibly
Hash Partition the index with hot block
Use Hash Join instead of Nested loop join
Use Hash clusters
Look up tables (“select language from lang_table where ...”)
Change application
Use plsql function
Spread data out to reduce contention, like set PCTFREE to 0 and recreate the table so that there is only one row per block
Select from dual
Possibly use x$dual
Note starting in 10g Oracle uses the "fast dual" table (ie x$dual) automatically when executing a query on dual as long as the column "dummy" is not accessed. Accessing dummy would be cases like
    select count(*) from dual;
    select * from dual;
    select dummy from dual;
an example of not accessing "dummy" would be:
    select 1 from dual;
    select sysdate from dual;
Updates, inserts , select for update on blocks while reading those blocks
Cause multiple copies and make things worse
posted @ 2018-11-15 21:36  dayu.liu  阅读(304)  评论(0编辑  收藏  举报