Top ASM diskgroup mount issues in RAC Environment (Doc ID 2246762.1)

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.4 to 12.1.0.2 [Release 11.2 to 12.1]
Information in this document applies to any platform.

PURPOSE

The purpose of this document is to provide a summary of top ASM diskgroup startup issues and the possible solutions in a RAC environment. 

SCOPE

 DBAs

DETAILS

Issue #1: ASM instance could not mount the diskgroup hosting the OCR/voting files

Symptoms :

I. The crsd log reports an error accessing the OCR/voting diskgroup

2010-04-30 12:58:53.288: [ CRSOCR][2566444800] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=8, opn=kgfoOpenFile01, dep=15056, loc=kgfokge
ORA-17503: ksfdopn:DGOpenFile05 Failed to open file +OCRVDG.255.4294967295
ORA-17503: ksfdopn:2 Failed to open file +OCRVDG.255.4294967295
ORA-15001: diskgroup
] [8]

2010-04-30 12:58:53.288: [ CRSD][2566444800][PANIC] CRSD exiting: Could not init OCR, code: 26


II. kfed read on the ASM disk reports a valid disk header

# kfed read /asmocrvote1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD <<<<<<<<<
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
.
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8 <<<<<<<
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
.
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL <<<<<<<<<
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER <<<<<<<<<
kfdhdb.dskname: OCRVDG_0000 ; 0x028: length=11
.
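
The header check above can also be scripted. The following is a minimal Python sketch, assuming the kfed output has been captured as text in the `field: value ; offset: description` layout shown above. The function names `parse_kfed` and `header_looks_like_member` are hypothetical, and the two fields tested are simply the ones flagged in the listing, not a complete validity test.

```python
import re

# One kfed line looks like: "kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD"
KFED_LINE = re.compile(r"^(\S+):\s*(.*?)\s*;\s*0x[0-9A-Fa-f]+:\s*(.*)$")


def parse_kfed(text):
    """Map each kfed field name to a (value, description) tuple."""
    fields = {}
    for line in text.splitlines():
        match = KFED_LINE.match(line.strip())
        if match:
            name, value, desc = match.groups()
            # Drop trailing "<<<<" markers that were added to the listing by hand.
            fields[name] = (value, desc.rstrip("< ").strip())
    return fields


def header_looks_like_member(fields):
    """True when the block is a disk header that belongs to a diskgroup."""
    block_type = fields.get("kfbh.type", ("", ""))[1]
    status = fields.get("kfdhdb.hdrsts", ("", ""))[1]
    return block_type == "KFBTYP_DISKHEAD" and status == "KFDHDR_MEMBER"


SAMPLE = """\
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD <<<<<<<<<
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL <<<<<<<<<
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER <<<<<<<<<
kfdhdb.dskname: OCRVDG_0000 ; 0x028: length=11
"""

if __name__ == "__main__":
    parsed = parse_kfed(SAMPLE)
    print(parsed["kfdhdb.dskname"][0], header_looks_like_member(parsed))
```

A header that passes this check while the diskgroup still fails to mount suggests the problem lies outside the disk itself, as in this issue.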

Cause :

The ASM spfile was corrupt. With a temporary ASM pfile, the OCR/voting diskgroup (+OCRVDG) could be mounted.

Solutions :

Start the ASM instance with a temporary pfile and recreate the spfile.

Refer to note 1095214.1 for detailed instructions.
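
As a rough illustration of what a temporary pfile might contain, the Python sketch below renders a minimal parameter file. `instance_type`, `asm_diskstring` and `asm_diskgroups` are standard ASM initialization parameters; the function name `render_asm_pfile`, the disk path pattern and the diskgroup list are hypothetical placeholders. The procedure in note 1095214.1 takes precedence over this sketch.

```python
def render_asm_pfile(diskstring, diskgroups=()):
    """Return the text of a minimal ASM pfile (illustrative values only)."""
    lines = [
        "instance_type='asm'",             # marks the instance as ASM
        f"asm_diskstring='{diskstring}'",  # where ASM should discover disks
    ]
    if diskgroups:
        # List-valued parameter: each diskgroup name is quoted separately.
        quoted = ",".join(f"'{name}'" for name in diskgroups)
        lines.append(f"asm_diskgroups={quoted}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical discovery path and diskgroup names.
    print(render_asm_pfile("/dev/oracleasm/disks/*", ["DATA", "FLASH"]), end="")
```

The rendered text would be saved to a file and passed to `STARTUP PFILE='...'` in SQL*Plus. Once the diskgroup mounts, the spfile can be recreated with `CREATE SPFILE ... FROM PFILE ...` as described in the note.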

Issue #2: ORA-15032, ORA-15017, ORA-15063 reported in the ASM alert log

Symptoms :

I. ASM tries to mount the diskgroup, but the mount fails because of insufficient or duplicate disks

alert_+ASM1.log:

Fri Oct 19 13:09:35 2012
NOTE: No asm libraries found in the system
ERROR: -5(Duplicate disk DATA:DATA_0000) <----------- Note this entry: ASM has discovered a duplicate disk (1)

.

Fri Oct 19 13:09:44 2012
SQL> ALTER DISKGROUP ALL MOUNT /* asm agent */

NOTE: Diskgroup used for Voting files is:
DATASOA
Diskgroup used for OCR is:DATA
Diskgroup used for OCR is:DATASOA.

WARNING: Disk Group DATA containing configured OCR is not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA" <---- ASM did not discover enough disks
ORA-15017: diskgroup "DATASOA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATASOA"
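
To pick these two symptoms out of a long alert log, a small scan can help. This is a minimal Python sketch that assumes the message formats shown in the excerpt above; the function name `scan_alert_log` is hypothetical, and messages from other releases may be worded differently.

```python
import re

# "ERROR: -5(Duplicate disk DATA:DATA_0000)" -> ("DATA", "DATA_0000")
DUPLICATE = re.compile(r"Duplicate disk ([^:\s]+):([^)\s]+)")
# 'ORA-15063: ... for diskgroup "DATA"' -> "DATA"
INSUFFICIENT = re.compile(r'ORA-15063: .*diskgroup "([^"]+)"')


def scan_alert_log(text):
    """Return (duplicate (group, disk) pairs, diskgroups short of disks)."""
    return DUPLICATE.findall(text), INSUFFICIENT.findall(text)


SAMPLE = """\
ERROR: -5(Duplicate disk DATA:DATA_0000)
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
ORA-15017: diskgroup "DATASOA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATASOA"
"""

if __name__ == "__main__":
    duplicates, short = scan_alert_log(SAMPLE)
    print("duplicate disks:", duplicates)
    print("insufficient disks:", short)
```

A diskgroup that appears in both lists is a hint that the duplicate discovery, rather than a missing device, is what leaves ASM short of usable disks.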

Cause :

ASM is discovering duplicate disks.

Solutions :

Verify that the disks are present at the OS level with the proper ownership and permissions. Also verify the asm_diskstring value and check that no disk is being discovered more than once, for example through two different device paths.

Refer to note 1501660.1 for additional details.
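
The ownership and permission check can be sketched in Python as follows. The assumptions are that the asm_diskstring patterns behave like shell globs, that the script runs on a Unix host, and that the expected owner, group and mode are supplied by the caller. The names `describe` and `check_disks`, and the `grid`/`asmadmin` values in the usage line, are hypothetical. The script does not detect duplicates; it only confirms that the devices are visible and accessible.

```python
import glob
import grp
import os
import pwd
import stat


def _name(lookup, ident, attr):
    """Resolve a uid/gid to a name, falling back to the number as text."""
    try:
        return getattr(lookup(ident), attr)
    except KeyError:
        return str(ident)


def describe(path):
    """Return (owner, group, permission bits) for one device path."""
    st = os.stat(path)
    owner = _name(pwd.getpwuid, st.st_uid, "pw_name")
    group = _name(grp.getgrgid, st.st_gid, "gr_name")
    return owner, group, stat.S_IMODE(st.st_mode)


def check_disks(diskstring, owner, group, mode=0o660):
    """List problems for every path matched by a comma-separated diskstring."""
    paths = []
    for pattern in diskstring.split(","):
        paths.extend(sorted(glob.glob(pattern.strip())))
    problems = []
    if not paths:
        problems.append(f"no devices match {diskstring!r}")
    for path in paths:
        found = describe(path)
        if found != (owner, group, mode):
            problems.append(
                f"{path}: owner={found[0]} group={found[1]} mode={oct(found[2])}"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical discovery string and expected ownership.
    for problem in check_disks("/dev/oracleasm/disks/*", "grid", "asmadmin"):
        print(problem)
```

Running it on every node makes differences between nodes easier to spot, since a disk can be correctly presented on one node and missing or misowned on another.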

Issue #3: Unable to mount diskgroups on one RAC node after an abnormal ASM restart

Symptoms :

I. In a two-node RAC, node 1 was evicted and then restarted. +ASM1 started, but the diskgroups failed to mount with the following errors:

alert_+ASM1.log

Sun Aug 30 19:06:07 2015
SQL> ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:9:62126} */
NOTE: Diskgroups listed in ASM_DISKGROUPS are
FLASH
NOTE: Diskgroup used for Voting files is:
DATA
Sun Aug 30 19:06:14 2015
ERROR: PST found another heartbeat (grp 1)
ERROR: diskgroup DATA was not mounted
ERROR: PST found another heartbeat (grp 2)
ERROR: diskgroup FLASH was not mounted

ORA-15032: not all alterations performed
ORA-15017: diskgroup "FLASH" cannot be mounted
ORA-15003: diskgroup "FLASH" already mounted in another lock name space
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15003: diskgroup "DATA" already mounted in another lock name space
ERROR: ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:9:62126} */
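
In an error stack like the one above, each ORA-15017 line is followed by the error that explains why that diskgroup could not be mounted. The Python sketch below pairs them up, assuming that ordering holds; the function name `mount_failure_reasons` is hypothetical.

```python
import re

ORA_LINE = re.compile(r"^(ORA-\d{5}): (.*)$")
GROUP_NAME = re.compile(r'diskgroup "([^"]+)"')


def mount_failure_reasons(text):
    """Map each diskgroup named in ORA-15017 to the next ORA error code."""
    reasons = {}
    pending = None  # diskgroup still waiting for its explanatory error
    for line in text.splitlines():
        match = ORA_LINE.match(line.strip())
        if not match:
            continue
        code, message = match.groups()
        if code == "ORA-15017":
            name = GROUP_NAME.search(message)
            pending = name.group(1) if name else None
        elif pending is not None:
            reasons[pending] = code
            pending = None
    return reasons


SAMPLE = """\
ORA-15032: not all alterations performed
ORA-15017: diskgroup "FLASH" cannot be mounted
ORA-15003: diskgroup "FLASH" already mounted in another lock name space
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15003: diskgroup "DATA" already mounted in another lock name space
"""

if __name__ == "__main__":
    for group, code in mount_failure_reasons(SAMPLE).items():
        print(group, code)
```

Here every diskgroup maps to ORA-15003, which matches the "PST found another heartbeat" messages: ASM believes the diskgroups are in use by another cluster.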

Cause :

The cause is a problem at the storage layer. Storage replication was running at that exact time and presented two identical copies of the voting disk, which led to a clusterware split brain and the node eviction.

Solution :

Ensure that the storage layer does not present identical disks at the same time. Refer to note 2053224.1 for more details.

posted @ 2018-06-25 15:17  静心のboke