我们在学习Oracle的过程中,或多或少会存在个人对概念的理解错误、误解或者根本是教材编写存在不严谨的地方,这样或以讹传讹或三人言虎,导致在Oracle圈子存在着一些古老相传的迷信(superstition),因为这些迷信已经深入人心了,所以我们几乎很难纠正过来;这其实很有意思,IT作为一个高科技的领域也会出现迷信,说明我们在IT技术的"教学"和"思考"上存在问题,这一点值得深思。
这里我列出几个最为常见的迷信,算作抛砖引玉:
1.几乎所有的Oracle入门教程都会在介绍Large pool的时候这样描述:"RMAN 备份使用large pool作为磁盘I/O缓冲区,配置Large pool有助于提高RMAN备份性能"
Truth:除非你启用了slaves IO,否则rman并不使用large pool
RMAN I/O可以分成三种模式:
Mode |
Disk |
tape |
Asynchronous I/O |
绝大多数操作系统支持AIO,默认disk_asynch_io为TRUE,即默认启用磁盘异步IO。如果磁盘设备不支持AIO,那么会使用synchronous I/O。磁盘异步模式下RMAN I/O缓冲区域从PGA中分配,相关IO性能信息存放在V$backup_async_io视图中 |
磁带设备本身不支持AIO(tape I/O is always synchronous),虽然默认tape_asynch_io为TRUE,但磁带设备只能通过IO slaves模拟异步IO,所以启用磁带AIO需要另外设置backup_tape_io_slaves=TRUE。此模式下RMAN I/O缓冲区从shared pool或者large pool中分配,相关IO性能信息存放在V$backup_async_io视图中 |
Synchronous I/O |
若disk_asynch_io设置为false,或操作系统不支持异步IO,且dbwr_io_slaves=0时启用Synchronous I/O。此时RMAN I/O缓冲区从PGA中分配,相关IO性能信息存放在V$backup_sync_io视图中 |
默认backup_tape_io_slaves为false,即磁带设备默认不启用AIO而使用Synchronous I/O。此时RMAN I/O缓冲区从PGA中分配,相关性能信息存放在V$backup_sync_io视图中 |
Slaves I/O |
启用disk slaves I/O,要求设置disk_asynch_io=false且dbwr_io_slaves>0。此模式下RMAN I/O缓冲区从shared pool或者large pool中分配,相关IO性能信息存放在V$backup_async_io视图中 |
设置tape_asynch_io=true且backup_tape_io_slaves=true时启用,磁带的AIO模式其实就是使用slaves Io模拟获得的。所以此模式下的一切和tape AIO完全一样 |
我们在使用RMAN备份数据库时无论是磁盘备份还是磁带备份总是优先期望使用AIO异步IO特性(tape aio比较特殊,见上表),使用AIO的前提是设置合理的初始化参数以及操作系统支持AIO,如果我们使用的操作系统不支持AIO那么我们将不得不使用Synchronous IO同步IO。这并不是世界末日,因为Oracle提供了IO从属进程(slaves IO)来模拟AIO,当然这是退而求其次的。为了启用slaves IO,我们需要手动设置backup_tape_io_slaves或dbwr_io_slaves参数来启用IO从属特性,当使用磁带备份时设置backup_tape_io_slaves(
此时tape_asynch_io应当为true)为true,当使用磁盘设备时设置dbwr_io_slaves(此时disk_asynch_io应当为false)为非零值。在启用slaves IO的前提下RMAN才会从Large pool当中分配内存并加以利用,如果没有配置large pool(注意如果启用了ASMM,那么Oracle会自动为large pool分配一个granule大小的空间)或者large pool过小,那么RMAN的内存缓冲区将从shared pool中分配。如果Oracle仍不能获得足够内存,那么将本地进程获取足够的IO缓存。若我们启用了I/O slaves,那么很有必要配置一个足够大的Large pool(一般60-100M就足够了),这样RMAN的I/O缓存区可以从large pool中分配,避免了RMAN的I/O buffer和shared pool中的library cache等其他组件发生竞争。
If I/O slaves are used, I/O buffers are obtained from the SGA ,or the large pool, if configured.If LARGE_POOL_SIZE is set, then Oracle attempts to get memory from the large pool. If this value is not large enough, then Oracle does not try to get buffers from the shared pool.If Oracle cannot get enough memory, then it obtains I/O buffer memory from local process memory and writes a message to the alert.log file indicating that synchronous I/O is used for this backup.
在默认情况下Oracle对于磁盘设备使用AIO模式(disk_asynch_io=true & dbwr_io_slaves=0 by default),而对于磁带设备使用synchronous I/O(tape_asynch_io=true & backup_tape_io_slaves=false by default),都不会启用slaves I/O,所以默认情况下RMAN总是从PGA中分配缓存。换而言之在默认情况下,即便配置了较大的Large pool也不会为RMAN所用。
RMAN allocates the tape buffers in the SGA or the PGA, depending on whether I/O slaves are used. If you set the initialization parameter BACKUP_TAPE_IO_SLAVES = true, then RMAN allocates tape buffers from the SGA or the large pool if the LARGE_POOL_SIZE initialization parameter is set. If you set the parameter to false, then RMAN allocates the buffers from the PGA.
我们来通过以下演示,进一步验证AIO/Slave IO环境下RMAN内存缓冲区从哪里分配,并加强印象:
SQL> select * From v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> show parameter async
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
disk_asynch_io boolean TRUE
tape_asynch_io boolean TRUE
SQL> select * From v$sgastat where pool='large pool';
POOL NAME BYTES
------------ -------------------------- ----------
large pool PX msg pool 903840
large pool free memory 15873376
backup as backupset database skip offline;
SQL> select * From v$sgastat where pool='large pool';
POOL NAME BYTES
------------ -------------------------- ----------
large pool PX msg pool 903840
large pool free memory 15873376
/* 在AIO模式下,全库备份后发现large pool未发生变化 */
SQL> alter system set disk_asynch_io=false scope=spfile;
System altered.
SQL> alter system set dbwr_io_slaves=2 scope=spfile;
System altered.
/* 以上启用了磁盘I/O Slave特性 */
SQL> startup force;
[oracle@rh2 ~]$ ps -ef|grep i10|grep -v grep
oracle 20761 1 0 20:44 ? 00:00:00 ora_i101_G10R2
oracle 20763 1 0 20:44 ? 00:00:00 ora_i102_G10R2
/* 启用I/O Slave后会出现ora_ixxx_SID这样的后台进程 */
SQL> select * From v$sgastat where pool='large pool';
POOL NAME BYTES
------------ -------------------------- ----------
large pool PX msg pool 903840
large pool free memory 15873376
RMAN> backup as backupset database skip offline;
SQL> select * From v$sgastat where pool='large pool';
POOL NAME BYTES
------------ -------------------------- ----------
large pool PX msg pool 903840
large pool free memory 24151392
large pool KSFQ Buffers 25276416
SQL> /
POOL NAME BYTES
------------ -------------------------- ----------
large pool PX msg pool 903840
large pool free memory 41006432
large pool KSFQ Buffers 8421376
/* 启用了I/O Slave后执行备份操作,
large pool中出现了KSFQ Buffers,这个KSFQ buffer就是RMAN所使用的缓冲区,
实际决定该buffer大小的是隐藏参数_backup_ksfq_bufsz和_backup_ksfq_bufcnt */
SQL> col name for a30
SQL> col describ for a70
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
2 FROM SYS.x$ksppi x, SYS.x$ksppcv y
3 WHERE x.inst_id = USERENV ('Instance')
4 AND y.inst_id = USERENV ('Instance')
5 AND x.indx = y.indx
6 AND x.ksppinm LIKE '%ksfq%';
NAME VALUE DESCRIB
------------------------------ ---------- ----------------------------------------------------------------------
_backup_ksfq_bufsz 0 size of the ksfq buffer used for backup/restore
_backup_ksfq_bufcnt 0 number of the ksfq buffers used for backup/restore
/* 在10g中似乎Oracle会自动调控以上2个参数 */
SQL> alter system set "_backup_ksfq_bufsz"=131072;
System altered.
SQL> alter system set "_backup_ksfq_bufcnt"=1;
System altered.
RMAN> backup tablespace data01;
/* I/O slaves的IO统计信息仍存放在V$backup_sync_io视图中,
而非可能是你所预期的v$backup_sync_io视图 */
SQL> select type,buffer_size,buffer_count from v$backup_async_io;
TYPE BUFFER_SIZE BUFFER_COUNT
--------- ----------- ------------
AGGREGATE 0 0
INPUT 131072 1
OUTPUT 1048576 4
另外large pool的使用量可以通过下列公式来估算:
LARGE_POOL_SIZE =
(4 * {RMAN Channels} * {DB_BLOCK_SIZE} * {DB_DIRECT_IO_COUNT} * {Multiplexing Level})
+
(4 * {RMAN Channels} * {Tape Buffer Size})
事实上如果你probe过PGA的内存使用情况,那么你或许会在PGA headdump中看到过"KSFQ heap"的相关信息。显然当在非slaves IO模式下,RMAN会从PGA HEAP->KSFQ heap这个subheap子堆中分配必要的buffer。
我们在磁盘AIO模式下执行必要的backup操作,之后找出RMAN相关的shadow process并对其做heapdump,分析其pga内存使用情况
SQL> select spid,pga_used_mem,pga_max_mem from v$process where addr in
2 (select paddr from v$session where program like '%rman%')
3 order by pga_used_mem desc ;
SPID PGA_USED_MEM PGA_MAX_MEM
------------ ------------ -----------
24424 5750341 14410829
24425 4717957 12134125
24413 3308341 9626701
24423 435773 993005
SQL> oradebug setospid 24424;
Oracle pid: 25, Unix process pid: 24424, image: oracle@rh2.oracle.com (TNS V1-V3)
SQL> oradebug dump heapdump 536870917;
Statement processed.
SQL> oradebug tracefile_name;
/s01/admin/G10R2/udump/g10r2_ora_24424.trc
==========================heapdump details==============================
FIVE LARGEST SUB HEAPS for heap name="pga heap" desc=0x68d3ec0
Subheap ds=0x87c83e8 heap name= KSFQ heap size= 4205296
owner=(nil) latch=(nil)
******************************************************
HEAP DUMP heap name="KSFQ heap" desc=0x87c83e8
extent sz=0x1040 alt=32767 het=32767 rec=0 flg=2 opc=2
parent=0x68d3ec0 owner=(nil) nex=(nil) xsz=0x20228
EXTENT 0 addr=0x7f86bf788dd8
Chunk 7f86bf788de8 sz= 1049112 freeable "KSFQ Buffers "
EXTENT 1 addr=0x7f86bf988dd8
Chunk 7f86bf988de8 sz= 1049112 freeable "KSFQ Buffers "
EXTENT 2 addr=0x7f86bfb88dd8
Chunk 7f86bfb88de8 sz= 1049112 freeable "KSFQ Buffers "
EXTENT 3 addr=0x7f86bfc98dd8
Chunk 7f86bfc98de8 sz= 1049112 freeable "KSFQ Buffers "
EXTENT 4 addr=0x7f86bfddf358
Chunk 7f86bfddf368 sz= 5192 freeable "KSFQ ctx "
EXTENT 5 addr=0x87c7680
Chunk 0087c7690 sz= 984 perm "perm " alo=984
Chunk 0087c7a68 sz= 1944 free " "
Chunk 0087c8200 sz= 464 freeable "KSFQ buffer poo"
Total heap size = 4205032
FREE LISTS:
Bucket 0 size=0
Chunk 0087c7a68 sz= 1944 free " "
Total free space = 1944
UNPINNED RECREATABLE CHUNKS (lru first):
PERMANENT CHUNKS:
Chunk 0087c7690 sz= 984 perm "perm " alo=984
Permanent space = 984
/* 以上可以看到KSFQ Heap子堆共占用了4205296=4M内存,
而该服务进程的pga_used_memory总量为5750341 bytes,即KSFQ占该进程PGA的73%
另外这里KSFQ Buffer的大多Chunk是freeable的,仅少量为perm */
另外磁带I/O缓冲区的大小可以在配置通道时指定,其默认值由操作系统决定,一般为64k。我们使用allocate channel命令设置磁带I/O缓冲区,为了达到最佳性能,可以将磁带I/O缓冲区大小设置为256KB或更大,如:
allocate channel maclean1 device type sbt
parms="blksize=262144,ENV=(NB_ORA_SERV=nas,NB_ORA_POLICY=racdb,NB_ORA_CLIENT=rh2)";
结论:
- 在默认情况下(即disk backup使用AIO而tape backup使用sync IO),RMAN backup并不会从Large Pool中获益,而是从PGA中的KSFQ heap中分配必要的I/O内存缓冲区。当然我们还是推荐设置Large_pool_size为100M左右,因为即使是PC服务器也不会缺这一点内存
- 仅当启用I/O slaves时RMAN backup从Large Pool中分配ksfq buffer(ksfq buffer used for backup/restore),在9i/10g中常有因large pool过小而引起ORA-04031错误的案例;若large pool大小为0,那么ksfq buffer会从shared pool中分配,一方面ORA-04031:("shared pool","unknown object","sga heap(1,0)","KSFQ Buffers")错误的概率将大大增加,另一个方面将造成KSFQ与library cache间的竞争,不利于性能。这种情况下RMAN backup的确从Large pool中得到收益,设置large_pool_size为100M仍是被推荐的
- 在非slaves IO模式下RMAN从PGA的KSFQ heap子堆中分配I/O缓冲区,因为在非slaves IO模式下该内存缓冲区没有共享的必要
待修订!!
Mode |
Disk |
tape |
Asynchronous I/O |
绝大多数磁盘设备支持AIO,默认disk_asynch_io为TRUE,即默认启用磁盘异步IO。如果磁盘设备不支持AIO,那么会使用synchronous I/O。磁盘异步模式下RMAN I/O缓冲区域从PGA中分配,相关IO性能信息存放在V$backup_async_io视图中 |
磁带设备本身不支持AIO,虽然默认tape_asynch_io为TRUE,但磁带设备只能通过IO slaves模拟异步IO,所以启用磁带AIO需要另外设置backup_tape_io_slaves=TRUE。此模式下RMAN I/O缓冲区从shared pool或者large pool中分配,相关IO性能信息存放在V$backup_async_io视图中 |
Synchronous I/O |
若disk_asynch_io设置为false,或磁盘设备不支持异步IO,且dbwr_io_slaves=0时启用Synchronous I/O。此时RMAN I/O缓冲区从PGA中分配,相关IO性能信息存放在V$backup_sync_io视图中 |
默认backup_tape_io_slaves为false,即磁带设备默认不启用AIO而使用Synchronous I/O。此时RMAN I/O缓冲区从PGA中分配,相关性能信息存放在V$backup_sync_io视图中 |
Slaves I/O |
启用disk slaves I/O,要求设置disk_asynch_io=false且dbwr_io_slaves>0。此模式下RMAN I/O缓冲区从shared pool或者large pool中分配,相关IO性能信息存放在V$backup_async_io视图中 |
同tape AIO时一样 |