ora-04030 PGA溢出

最近的生产系统上报了ora-04030错误,log如下。

ORA-04030: out of process memory when trying to allocate 169040 bytes (pga heap,kgh stack)
ORA-04030: out of process memory when trying to allocate 16328 bytes (koh-kghu call ,pmucalm coll)

log上看起来是pga内存不足,系统采用的是dedicated的连接方式,所以pga应该包含uga,不确定是否是uga溢出还是pga本身溢出。如果是pga本身溢出的话,说明可能有大型的排序动作或者大量的绑定变量;如果是uga的话,就不知道了。从oracle官方的文档上来看,还是怀疑绑定变量过多,导致出了问题。后来请oracle的工程师来诊断,问题分析结果类似,贴一部分问题报告如下:

原因分析
从上述日志可以看出,spid = 27971 的会话遇到了ORA-4030错误,其参数是(koh-kghu call ,pmucalm coll)。检查相关的数据库会话信息,发现该会话的program是xxxx.bin,pid = 120:

select sid, to_char( logon_time, 'yyyy-mm-dd hh24:mi' ) logon_time, s.program, p.pid
from v$session s, v$process p 
where p.spid = 27971 and s.paddr = p.addr

      SID LOGON_TIME			  PROGRAM
---------- ------------------------------ ------------------------------------
          8 2012-11-07 05:00		  cmdssvc.tsk@MACHINE01 (TNS V1-V3)

查询该会话所占用的内存:
select * from v$process_memory where pid = 120;

      PID    SERIAL# CATEGORY	       ALLOCATED       USED MAX_ALLOCATED
---------- ---------- --------------- ---------- ---------- -------------
       120	  180 SQL			  136112      25944	  1055016
       120	  180 PL/SQL		  706616	     0    4278693376
       120	  180 Freeable		 1114112	     0
       120	  180 Other		 3092636		  9127260

其中PL/SQL部分有过最高的分配记录,即4278693376 byte = 4080.47998 MB。
根据以上现象,导致ORA-4030报错发生的原因是Oracle以及操作系统对PL/SQL程序中使用的内存有上限(在缺省配制下为4GB),详细可以参见MOS网站上的如下文档:
Support Recommended	PLSQL Procedure Causing ORA-04030: (pga heap,control file i/o buffer) And ORA-04030: (koh-kghu sessi,pmuccst: adt/record) or ORA-04030: (koh-kghucall ,pmucalm coll) Errors (Doc ID 1325100.1)
Bug 11852492: ORA-4030 OCCURS WHEN PGA EXCEEDS 4GB.
ORA-04030 Error For Memory Allocation Type "pl/sql vc2" (Doc ID 1315884.1)

根据这些文档,这个报错的典型发生场景是:
存储过程中有一个或者多个很大的VARRAYs变量,并且调用了很多次
存储过程中有一个或者多个很大的本地变量,并且调用了很多次
或者在上述的任何一个场景中,存储过程错误的进行了死循环调用。

比如,存储过程中有类似如下的变量定义:
DECLARE
TYPE BigVArray IS VARRAY(10000) OF VARCHAR2(32000); -- VARRAY type
...

DECLARE
BigVar VARCHAR2(32000);

该错误的通常原因是PL/SQL的编码问题,对于非常大的数据应该保存在表中,或者不能发生死循环。
之所以此时PGA_AGGREGATE_TARGET 或者PGA_MAX_SIZE参数没有发挥作用,是因为PL/SQL本地变量不是通过常见的Oracle memory heap所管理,它们的大小仅受到操作系统的限制。

原文如下:
The memory allocation type "pl/sql vc2" is used when Oracle needs to allocate memory for a local variable in PL/SQL code.  Some examples of PL/SQL local variables are as follows:

DECLARE
TYPE BigVArray IS VARRAY(10000) OF VARCHAR2(32000); -- VARRAY type
...

DECLARE
BigVar VARCHAR2(32000);
...

The problem can typically occur if:

    a procedure has 1 or more large VARRAYs defined, and is called many times,
    a procedure has 1 or more large local variables, and is called many times,
    or either of the above situations occur, but the procedure is (accidentally) called in an infinite loop.

The cause of the ORA-4030 error is usually a PL/SQL coding issue, as very larger amounts of data should really be stored in a table, or in the case of an infinite loop, the procedure call needs to have a limit.

The reason why neither PGA_AGGREGATE_TARGET nor _PGA_MAX_SIZE has any impact, is because PL/SQL local variables are not managed by the regular Oracle memory heaps. PL/SQL local variables are just like other programming languages' local variables and should adhere to good coding practices. Their size is only limited by the operating system memory limits.


解决方案建议
根据前文的说明,对这一问题的根本解决办法是对PL/SQL代码进行优化。
在剩余物理内存充足的情况下,也可以通过下面的配置,以突破缺省的4GB的限制:
修改操作系统参数vm.max_map_count,由缺省的65536增大为比如20000。

$ more /proc/sys/vm/max_map_count
$ sysctl -w vm.max_map_count=200000 (for example)

修改数据库参数: 

_use_realfree_heap=TRUE
_realfree_heap_pagesize_hint = 262144

缺省的realfree allocator pagesize是64KB,所以缺省最大的4GB。修改为256KB(262144)后,最大限制为16GB。
说明:建议以上修改在测试系统进行详细的测试。

  

 后记,的确是程序中有个无限循环,导致申请了N多变量。

posted @ 2012-11-09 16:23  valleylord  阅读(2861)  评论(0编辑  收藏  举报