了解Maclean Liu|向Maclean Liu提问 Oracle ALLSTARS 全明星(群内有多位Oracle高级售后support,N位OCM和ACE) QQ群 # QQ群号:23549328 # 已经升级到 2000人群,空位多多。欢迎有一定基础的Oracle骨友加入,现在入群需要经过Maclean的技术面试,欢迎面试,请加QQ号:47079569 为好友参加面试 2群基础群 适合刚入门的同学,会共享最佳入门实践和资料 QQ群 # QQ群号:171092051 # 已经升级到 500人的超级群,空位多多,无需面试

ora-7445 [kghalp+0500] [SIGSEGV]错误

今天没有外出(似乎人不到现场就特别容易出问题),早上10点左右接到电话被告知crm11实例上出现了7445错误,准备用web vpn拨上去查看一下,赫然发觉windows 7 不支持这种vpn(准确说ie8和firefox都不支持);无奈无奈只好用拨号。 发现alert log中出现大量 7445错误记录:
Fri Mar 26 09:24:53 2010
Errors in file /oravl01/oracle/admin/CRMDB1/udump/crmdb11_ora_6754320.trc:
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
Fri Mar 26 09:24:55 2010
Trace dumping is performing id=[cdmp_20100326092455]
Fri Mar 26 09:31:16 2010
Errors in file /oravl01/oracle/admin/CRMDB1/udump/crmdb11_ora_2994552.trc:
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
看到kghalp函数第一印象 ,是Oracle中堆管理使用的函数; 让我们猜猜字面意思? k -> kernel g -> generic h-> heap a-> allocation p-> point 再让我们来看一下当时的call stack:
Exception signal: 11 (SIGSEGV), code: 51 (Invalid permissions for mapped object), addr: 0x3b, PC: [0x1000973e0, kghalp+0500]
Registers:
iar: 00000001000973e0, msr: a00000000000d0b2
 lr: 00000001013a6df8,  cr: 0000000022292484
r00: 0000000000000010, r01: 0ffffffffffcb160, r02: 000000011022a9c0,
r03: 0000000000000002, r04: 0000000000000000, r05: 0000000000000100,
r06: 0000000000000001, r07: 0000000000000000, r08: 0000000000000000,
r09: 0000000000000000, r10: 00000000101b60d8, r11: 0000000000000004,
r12: 0000000024592484, r13: 000000011026bfe0, r14: 0000000000000000,
r15: 0000000000009000, r16: 0000000110195b2c, r17: 0000000000000000,
r18: 0000000000000001, r19: 0000000000000000, r20: 0000000000001000,
r21: 0000000000000000, r22: 0000000000000100, r23: 0000000000000001,
r24: 0000000000000000, r25: 0000000000000000, r26: 0000000000000001,
r27: 0000000104c7fd44, r28: 0000000000000000, r29: 0000000000000100,
r30: 0000000000000000, r31: 0000000110195a58,
*** 2010-03-26 09:57:28.679
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kghalp+0500] [SIGSEGV] [Invalid permissions for mapped object] [0x00000003B] [] []
Current SQL statement for this session:
INSERT INTO AUDIT_DDL_LOG (DDL_TIME, SESSION_ID, OS_USER, IP_ADDRESS, TERMINAL, HOST, USER_NAME, DDL_TYPE, OBJECT_TYPE, OWNER, OBJECT_NAME, SQL_TEXT) VALUES (SYSDATE, SYS_CONTEXT('USERENV','SESSIONID'), SYS_CONTEXT('USERENV','OS_USER'), SYS_CONTEXT('USERENV','IP_ADDRESS'), SYS_CONTEXT('USERENV','TERMINAL'), SYS_CONTEXT('USERENV','HOST'), ORA_LOGIN_USER, ORA_SYSEVENT, ORA_DICT_OBJ_TYPE, ORA_DICT_OBJ_OWNER, ORA_DICT_OBJ_NAME, :B1 )
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
70000043da500d0        10  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? 104A54EED ?
ksedmp+0290          bl       ksedst               104A54870 ?
ssexhd+03e0          bl       ksedmp               300001D15 ?
000044C0             ?        00000000
parchk+01f4          bl       kghalp               000000000 ?
                                                   2842288200000001 ?
                                                   000000000 ? 000000000 ?
                                                   000001040 ? 110195B2C ?
ptmak+0168           bl       parchk               FFFFFFFFFFCB560 ?
                                                   FFFFFFFFFFCB430 ?
                                                   FFFFFFFFFFCB430 ?
pdybF00_Init+0244    bl       ptmak                10008049C ? 000000000 ?
                                                   FFFFFFFFFFCB4F0 ? 07FFFFFFF ?
pdy1F79_Init+00c8    bl       pdybF00_Init         110BEB1D0 ?
pdy1F01_Driver+0048  bl       pdy1F79_Init         FFFFFFFFFFCBC40 ?
pdli_new_cog+00f0    bl       pdy1F01_Driver       FFFFFFFFFFCBCE0 ? 000000000 ?
pdlifu+0264          bl       pdli_new_cog         1013885F4 ? FFFFFFFFFFCCB00 ?
                                                   7000004383E7680 ?
phpcog+0010          bl       pdlifu               FFFFFFFFFFCD958 ?
                                                   7000004383E7680 ? 104C95048 ?
phpcmp+0f80          bl       phpcog               FFFFFFFFFFCC4F0 ? 000000000 ?
pcicms2+02d4         bl       phpcmp               FFFFFFFFFFCD958 ?


发生错误的最上层 kghalp 函数由 parchk 调用, 这似乎是一个package check函数(猜测,呵呵). 我们来整理一下思路, parchk 函数调用了 kghalp函数以帮其分配内存,但却得到了一个非法的低地址[[0x00000003B],正常情况下正文段使用的空间; 这看起来显然是一个bug。 让我们来查查support.oracle.com , 键入7445 kghalp 和sigsegv 关键字 (很多时候不需要使用ora 600/7445 lookup tools). bug 8244533 赫然显目:
Bug 8244533: ORA-07445 [KGHALP] ERRORS COMPILING PACKAGE WITH DEBUG
    STACK TRACE:
    ------------
       ksedst <- ksedmp <- ssexhd <- 000044BC <- parchk        <- ptmak <-
    pdybF00_Init <- pdy1F79_Init <- pdy1F01_Driver <- pdli_new_cog         <-
    pdlifu <- phpcog <- phpcmp <- pcicms2 <- pcicms          <- kkxcms <- kkxswcm
    <- kkxmpbms <- kkxmesu <- xtypls           <- qctopls <- qctcopn <- qctcopn

    Exception signal: 11 (SIGSEGV), code: 51 (Invalid permissions for mapped
    object),
    addr: 0x3b, PC: [0x1000973e0, kghalp+0500]
    Registers:
    iar: 00000001000973e0, msr: a00000000000d0b2
    lr: 000000010139ffb8,  cr: 00000000222a2484
    r00: 0000000000000010, r01: 0ffffffffffe2980, r02: 00000001101e5ab8,
    r03: 0000000000000002, r04: 0000000000000000, r05: 0000000000000100,
    r06: 0000000000000001, r07: 0000000000000000, r08: 0000000000000000,
    r09: 0000000000000000, r10: 0000000010171200, r11: 0000000000000004,
    r12: 00000000245a2484, r13: 000000011021fbc0, r14: 0000000000000000,
    r15: 0000000000009000, r16: 0000000110150c54, r17: 0000000000000000,
    r18: 0000000000000001, r19: 0000000000000000, r20: 0000000000001000,
    r21: 0000000000000000, r22: 0000000000000100, r23: 0000000000000001,
    r24: 0000000000000000, r25: 0000000000000000, r26: 0000000000000001,
    r27: 0000000104c5983c, r28: 0000000000000000, r29: 0000000000000100,
    r30: 0000000000000000, r31: 0000000110150b80,
    *** 16:37:14.603
    ksedmp: internal or fatal error
    ORA-7445: exception encountered: core dump [kghalp+0500] [SIGSEGV]
    [Invalid permissions for mapped object] [0x00000003B] [] []
    Current SQL statement for this session:
    select dummy from dual where  ora_dict_obj_type = 'TABLE'
----- Call Stack Trace -----ptmak pdybF00_Init pdy1F79_Init pdy1F01_Driver pdli_new_cog pdlifuphpcog phpcmp pcicms2 pcicms kkxcms kkxswcm kkxmpbms kkxmesu xtyplsTo Filer.Based on this call stack this would appear a likely match forbug 6951953 Abstract: ORA-7445 [PTMAK] IMPORTING PACKAGE COMPILED DEBUG.This bug is fixed on 10.2.0.5 and there is a 10.2.0.4 patch available for IBM AIX Based Systems (64-bit).It maybe worth while to have the customer apply the patch to seeif it resolves the issue.Also the uploaded files included test.sql is this a reproducable testcase?
这个bug 似乎仅在 IBM AIX on POWER Systems (64-bit) 发生,当以DEBUG 模式编译包时有一定几率出现。 好了,既然已经了解了可能发生的诱因,我们可以进一步分析了,接下来看看 errorstack trace信息中 的SO 记录。
      SO: 70000043d217668, type: 53, owner: 70000048cee2238, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: lock=70000043d217668 handle=700000446261588 mode=N
      call pin=0 session pin=0 hpc=0000 hlc=0000
      htl=70000043d2176e8[70000042b52b368,70000042bb9a808] htb=70000044929b460 ssga=70000044929ad68
      user=70000048cee2238 session=70000048eb33010 count=1 flags=[0000] savepoint=0x4bac1488
      LIBRARY OBJECT HANDLE: handle=700000446261588 mtx=7000004462616b8(1) cdp=1
      name=ALTER TRIGGER "SHUCRM3O"."TRI_PRODUCT_INSTANCE_RELATED" COMPILE DEBUG REUSE SETTINGS
      hash=164e6a8942406cee159f8943a1a3c85e timestamp=03-26-2010 09:52:12
      namespace=CRSR flags=RON/KGHP/TIM/PN0/SML/KST/DBN/MTX/[120100d0]
      kkkk-dddd-llll=0000-0001-0001 lock=N pin=0 latch#=16 hpc=0002 hlc=0002
      lwt=700000446261630[700000446261630,700000446261630] ltm=700000446261640[700000446261640,700000446261640]
      pwt=7000004462615f8[7000004462615f8,7000004462615f8] ptm=700000446261608[700000446261608,700000446261608]
      ref=700000446261660[700000446261660,700000446261660] lnd=700000446261678[700000446261678,700000446261678]
        LIBRARY OBJECT: object=70000045adbc1e8
        type=CRSR flags=EXS[0001] pflags=[0000] status=VALD load=0
        CHILDREN: size=16
        child#    table reference   handle
             5 70000041776f5c0 70000045ae44720 70000042bfa3a20
        DATA BLOCKS:
        data#     heap  pointer    status pins change whr
            0 70000043d9fed20 70000045adbc300 I/P/A/-/-    0 NONE   00
的确有以debug 模式编译对象的语句,不过对象不是包而是trigger ; 看起来只要是可以以debug 模式compile 的对象都有可能引发该问题。 好了,问题到这里已经比较明确了: 应用端以DEBUG模式重新编译包引发了 Oracle bug 8244533,从而导致了对应服务进程的崩溃;总算是虚惊一场,之后通过trace内的machine和user信息找到了实施变更的应用方人员并教育之。

posted on 2010-03-26 19:09  Oracle和MySQL  阅读(540)  评论(0编辑  收藏  举报

导航