Oracle内部错误:ORA-00600[kccchb_3]一例
一套Linux x86-64上的11.1.0.7 RAC系统,该RAC使用Netapps NFS作为共享存储。其中一个节点出现ORA-00600: internal error code, arguments: [kccchb_3]内部错误并导致实例意外终止,详细日志如下:
Mon Dec 27 00:03:13 2010 Error: Controlfile sequence number in file header is different from the one in memory Please check that the correct mount optionsare used if controlfile is located on NFS Errors in file /d02/oratst/TSMOTEBS/db/11.1.0/admin/TSMOTEBS1_l118464dcss3001/diag/rdbms/tsmotebs/TSMOTEBS1/trace/TSMOTEBS1_ckpt_15907.trc (incident=11353): ORA-00600: internal error code, arguments: [kccchb_3], [555541], [555540], [555540], [], [], [], [], [], [], [], [] Incident details in: /d02/oratst/TSMOTEBS/db/11.1.0/admin/TSMOTEBS1_l118464dcss3001/diag/rdbms/tsmotebs/TSMOTEBS1/incident/incdir_11353/TSMOTEBS1_ckpt_15907_i11353.trc Errors in file /d02/oratst/TSMOTEBS/db/11.1.0/admin/TSMOTEBS1_l118464dcss3001/diag/rdbms/tsmotebs/TSMOTEBS1/trace/TSMOTEBS1_ckpt_15907.trc: ORA-00600: internal error code, arguments: [kccchb_3], [555541], [555540], [555540], [], [], [], [], [], [], [], [] CKPT (ospid: 15907): terminating the instance due to error 469根据Metalink文档Note 303986.1 RAC instance using NFS via Netapps report Ora-600 [kccchb_3]:
Symptoms RAC instance using NFS via Netapps occasionally reports: ORA-00600 [kccchb_3], [689], [0], [],[],[],[] ORA-00600 [kclchkblk_3], [7], [738515663], [14], [], [], [], [] Cause This is most probably a Netapp issue wherein 'noac' doesn't work as expected. NetApp recommended mount options for Oracle9i RAC on Solaris: rw,bg,vers=3,proto=tcp,hard,intr,rsize=32768,wsize=32768,forcedirectio,noac NetApp recommended mount options for Linux SLES9, RHEL4, RHEL3 QU3 and later: rw,bg,hard,nointr,tcp,vers=3,timeo=600,rsize=ORACLE_BLOCK_SIZE, wsize=ORACLE_BLOCK_SIZE,actimeo=0 NetApp recommended mount options for Linux RHAS2.1, RHEL3 pre QU3: rw,bg,hard,nointr,tcp,vers=3,timeo=600,rsize=ORACLE_BLOCK_SIZE, wsize=ORACLE_BLOCK_SIZE,noac And for AS 2.1 in /etc/modules.conf add: options nfs nfs_uncached_io=1 Solution Workaround set filesystemio_options=DIRECTIO for all instances (to bypass any buffer cache)以上介绍了该ORA-00600 [kccchb_3]内部错误可以通过设置初始化参数filesystemio_options为DIRECTIO来workaround;实际上更好的选择可以是设置filesystemio_options为SETALL,即同时使用DIRECTIO和ASYNC异步IO。 顺便提一下在Linux上启用异步ASYNC IO不仅需要设置该filesystemio_options参数,我们还需要同时保证以下几点: 1.filesystemio_options参数设置为ASYNC或者SETALL 2.合理设置Linux Kernel内核参数fs.aio-max-nr,例如设置为3145728;以及参数fs.file-max =6815744 3.disk_asynch_io初始化参数设置为TRUE 4.在10gR2下oracle binary可能没有正确以async_on选项编译,我们需要手动重新make(The reason behind this behavior is that the LIBAIO_0.1 Linux OS library is not attached for io_getevents.)
SQL>shutdown immediate [maclean@rh8 ~]$ cd $ORACLE_HOME/rdbms/lib [maclean@rh8 lib]$ ln -s /usr/lib/libaio.so skgaio.o [maclean@rh8 lib]$ make PL_ORALIBS=-laio -f ins_rdbms.mk async_on rm -f /u01/oracle/product/10.2.0/db_1/rdbms/lib/skgaioi.o cp /u01/oracle/product/10.2.0/db_1/rdbms/lib/skgaio.o /u01/oracle/product/10.2.0/db_1/rdbms/lib/skgaioi.o chmod 755 /u01/oracle/product/10.2.0/db_1/bin - Linking Oracle rm -f /u01/oracle/product/10.2.0/db_1/rdbms/lib/oracle gcc -o /u01/oracle/product/10.2.0/db_1/rdbms/lib/oracle -L/u01/oracle/product/10.2.0/db_1/rdbms/lib/ -L/u01/oracle/product/10.2.0/db_1/lib/ -L/u01/oracle/product/10.2.0/db_1/lib/stubs/ -L/usr/lib -lirc -Wl,-E `test -f /u01/oracle/product/10.2.0/db_1/rdbms/lib/sskgpsmti.o && echo /u01/oracle/product/10.2.0/db_1/rdbms/lib/sskgpsmti.o` /u01/oracle/product/10.2.0/db_1/rdbms/lib/opimai.o /u01/oracle/product/10.2.0/db_1/rdbms/lib/ssoraed.o /u01/oracle/product/10.2.0/db_1/rdbms/lib/ttcsoi.o /u01/oracle/product/10.2.0/db_1/rdbms/lib/defopt.o -Wl,--whole-archive -lperfsrv10 -Wl,--no-whole-archive /u01/oracle/product/10.2.0/db_1/lib/nautab.o /u01/oracle/product/10.2.0/db_1/lib/naeet.o /u01/oracle/product/10.2.0/db_1/lib/naect.o /u01/oracle/product/10.2.0/db_1/lib/naedhs.o /u01/oracle/product/10.2.0/db_1/rdbms/lib/config.o -lserver10 -lodm10 -lnnet10 -lskgxp10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 -lhasgen10 -lcore10 -lskgxn2 -locr10 -locrb10 -locrutl10 -lhasgen10 -lcore10 -lskgxn2 -lclient10 -lvsn10 -lcommon10 -lgeneric10 -lknlopt `if /usr/bin/ar tv /u01/oracle/product/10.2.0/db_1/rdbms/lib/libknlopt.a | grep xsyeolap.o > /dev/null 2>&1 ; then echo "-loraolap10" ; fi` -lslax10 -lpls10 -lplp10 -lserver10 -lclient10 -lvsn10 -lcommon10 -lgeneric10 -lknlopt -lslax10 -lpls10 -lplp10 -ljox10 -lserver10 -lclsra10 -ldbcfg10 -locijdbcst10 -lwwg `cat /u01/oracle/product/10.2.0/db_1/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lnro10 `cat /u01/oracle/product/10.2.0/db_1/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lmm -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 `cat /u01/oracle/product/10.2.0/db_1/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lnro10 `cat /u01/oracle/product/10.2.0/db_1/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 `if /usr/bin/ar tv /u01/oracle/product/10.2.0/db_1/rdbms/lib/libknlopt.a | grep "kxmnsd.o" > /dev/null 2>&1 ; then echo " " ; else echo "-lordsdo10"; fi` -lctxc10 -lctx10 -lzx10 -lgx10 -lctx10 -lzx10 -lgx10 -lordimt10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 -lsnls10 -lunls10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 -laio `cat /u01/oracle/product/10.2.0/db_1/lib/sysliblist` -Wl,-rpath,/u01/oracle/product/10.2.0/db_1/lib -lm `cat /u01/oracle/product/10.2.0/db_1/lib/sysliblist` -ldl -lm -L/u01/oracle/product/10.2.0/db_1/lib /u01/oracle/product/10.2.0/db_1/lib//libcore10.a(lcd.o): In function `lcdprm': lcd.c:(.text+0x8e8): warning: the `gets' function is dangerous and should not be used. mv -f /u01/oracle/product/10.2.0/db_1/bin/oracle /u01/oracle/product/10.2.0/db_1/bin/oracleO mv /u01/oracle/product/10.2.0/db_1/rdbms/lib/oracle /u01/oracle/product/10.2.0/db_1/bin/oracle chmod 6751 /u01/oracle/product/10.2.0/db_1/bin/oracle SQL> startup; /* 通过以下方式可以验证ASYNC IO的工作情况 */ cat /proc/slabinfo | grep kio [maclean@rh8 ~]$ cat /proc/slabinfo | grep kio kioctx 56 72 320 12 1 : tunables 54 27 8 : slabdata 6 6 0 kiocb 15 15 256 15 1 : tunables 120 60 8 : slabdata 1 1 0 /* 这里kioctx的值大于零,说明正在使用异步ASYNC IO */ 如何是11g,那么无需那么麻烦要重新编译async_on,同时可以通过以下手段了解异步ASYNC IO的使用情况: SQL> col name for a60 SQL> SELECT name, asynch_io FROM v$datafile f,v$iostat_file i WHERE f.file# = i.file_no AND filetype_name = 'Data File' ; NAME ASYNCH_IO ------------------------------------------------------------ --------- /standby/oradata/PROD/datafile/o1_mf_system_6q9dwgwh_.dbf ASYNC_ON /standby/oradata/PROD/datafile/o1_mf_sysaux_6q9dwgyp_.dbf ASYNC_ON /standby/oradata/PROD/datafile/o1_mf_undotbs1_6q9dwh0r_.dbf ASYNC_ON /standby/oradata/PROD/datafile/o1_mf_users_6q9dwh2x_.dbf ASYNC_ON /standby/oradata/PROD/datafile/o1_mf_example_6q9dzhh1_.dbf ASYNC_ON /standby/oradata/PROD/datafile/o1_mf_enc_6q9fdzcx_.dbf ASYNC_ON 6 rows selected. /* 可以看到ASYNCH_IO的状态为ON,说明正使用异步IO */
posted on 2010-12-30 21:51 Oracle和MySQL 阅读(476) 评论(0) 编辑 收藏 举报