SYSTEM.credentials.domains.root.ASM.Self.XXXX.root not found

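1. Symptom

After an Oracle 12.2.0.1.0 cluster was shut down, it could not be started again.

A CRS status check shows ora.storage stuck in STARTING, and the other resources cannot start either.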

[root@db2 ~]# /oracle/product/12.2.0.1/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.ctssd
      1        ONLINE  ONLINE       xd1archdb2               OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.driver.afd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE xd1archdb2               STABLE
ora.gipcd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xd1archdb2               STABLE
ora.storage
      1        ONLINE  OFFLINE      xd1archdb2               STARTING      <===================
--------------------------------------------------------------------------------
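With this many rows it is easy to overlook a stalled resource. As a rough sketch (the function name `not_online_resources` is made up for illustration), the listing can be piped through awk to print only the resources whose target is ONLINE but whose state is not:

```shell
# Hypothetical helper: read `crsctl status res -t -init` output on stdin and
# print "<resource> <state>" for every resource that should be ONLINE but is not.
not_online_resources() {
    awk '
        # A line starting with "ora." names the resource described by the next row.
        /^ora\./ { name = $1; next }
        # Detail rows look like: <instance> <target> <state> [server] <details>
        name != "" && $2 == "ONLINE" && $3 != "ONLINE" { print name, $3 }
    '
}
```

Fed the listing above, this should report ora.crf, ora.crsd, ora.evmd and ora.storage.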

The last entry in the CRS alert log points to [ORAROOTAGENT(199820)]CRS-5019: All OCR locations are on ASM disk groups [OCR_VOTE], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/oracle/gridbase/diag/crs/xd1archdb2/crs/trace/ohasd_orarootagent_root.trc".

That trace file shows:


2021-11-24 11:15:34.101 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS

2021-11-24 11:15:34.136 : CLSCRED:1556182784: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found
2021-11-24 11:15:34.136 : USRTHRD:1556182784: {0:5:3} 7872 Error 4 opening dom root in 0x7fc828518480

2021-11-24 11:15:35.190 : default:1556182784: clsCredDomClose: Credctx deleted 0x7fc828228ed0
2021-11-24 11:15:36.207 :   CLSNS:1556182784: clsns_SetTraceLevel:trace level set to 1.
2021-11-24 11:15:36.210 : default:1556182784: Inited LSF context: 0x7fc828321c50 
2021-11-24 11:15:36.214 : CLSCRED:1556182784: clsCredCommonInit: Inited singleton credctx.
2021-11-24 11:15:36.214 : CLSCRED:1556182784: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2021-11-24 11:15:36.237 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS

2021-11-24 11:15:36.241 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS

2021-11-24 11:15:36.276 : CLSCRED:1556182784: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found
2021-11-24 11:15:36.276 : USRTHRD:1556182784: {0:5:3} 7872 Error 4 opening dom root in 0x7fc82834a180

2021-11-24 11:15:37.361 : default:1556182784: clsCredDomClose: Credctx deleted 0x7fc828228ed0
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} -- trace dump on error exit --

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} Error [kgfoAl06] in [kgfokge] at kgfo.c:3115

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact
ORA-15077: could not locate ASM instance serving a required diskgroup


2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} Category: 7

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} DepInfo: 12547

2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} -- trace dump end --
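The trace repeats the same errors on every retry, so a summary of the distinct ORA- codes can be easier to read. A minimal sketch, assuming the trace text is supplied on stdin (the function name `ora_codes` is hypothetical):

```shell
# Hypothetical helper: list each distinct ORA-nnnnn code found on stdin, once.
ora_codes() {
    grep -o 'ORA-[0-9][0-9]*' | sort -u
}
```

For the excerpt above this would leave just ORA-12547 and ORA-15077.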

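"SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found" together with "ORA-15077: could not locate ASM instance serving a required diskgroup" means the agent cannot connect to the ASM instance. Although crsctl reports ora.asm as ONLINE, ASM has not actually started: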

[root@db2 trace]# ps -ef | grep asm
root     206099 200334  0 11:17 pts/2    00:00:00 grep --color=auto asm
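A plain `grep asm` matches its own process, as the single line above shows. One possible tighter check, sketched here under the assumption that the instance follows the usual `asm_pmon_+ASM<n>` background-process naming (the function name `asm_pmon_running` is hypothetical):

```shell
# Hypothetical helper: succeed if the ps listing on stdin contains an ASM PMON process.
# The [a] bracket keeps the pattern from matching this grep's own command line.
asm_pmon_running() {
    grep -q '[a]sm_pmon_+ASM'
}

# Example use:
#   ps -ef | asm_pmon_running && echo "ASM is up" || echo "ASM is down"
```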

2. Cause analysis

ASM never actually started, so the disk groups could not be located.

3. Solution

Start the ASM instance manually, running as the ASM owner:

sqlplus / as sysasm

startup
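The two commands above assume the Grid environment is already set. A sketch of the same step as a non-interactive function follows; the Grid home is taken from the crsctl path shown earlier, while the SID `+ASM2` is only an assumed value and must be replaced with the real SID on the node:

```shell
# Sketch only: start the local ASM instance as the ASM owner (e.g. grid).
# Runs in a subshell so the caller's environment is left untouched.
start_asm() (
    ORACLE_HOME=${ORACLE_HOME:-/oracle/product/12.2.0.1/grid}  # Grid home seen in the crsctl path
    ORACLE_SID=${ORACLE_SID:-+ASM2}                             # assumption: adjust per node
    PATH=$ORACLE_HOME/bin:$PATH
    export ORACLE_HOME ORACLE_SID PATH
    # Connect with OS authentication as SYSASM and issue STARTUP.
    sqlplus -S / as sysasm <<'EOF'
startup
exit
EOF
)
```

Once ASM is up, `ps -ef | grep asm_pmon` should show the instance's PMON process.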

4. Root cause

This version provides the Flex ASM feature, and this environment runs in Flex mode.

[grid@db1 ~]$  asmcmd
ASMCMD> showclustermode
ASM cluster : Flex mode enabled
ASMCMD> exit
[grid@db1 ~]$ srvctl config asm
ASM home: <CRS home>
Password file: +OCR_VOTE/orapwASM
Backup of Password file: 
ASM listener: LISTENER
ASM instance count: 3
Cluster ASM listener: ASMNET1LSNR_ASM

In Flex ASM, an ASM server connects to every ASM network when it starts.

Check the ASM listener:

[grid@db1 ~]$ lsnrctl status ASMNET1LSNR_ASM

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 24-NOV-2021 19:00:17

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias                     ASMNET1LSNR_ASM
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                24-NOV-2021 11:10:01
Uptime                    0 days 7 hr. 50 min. 15 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /oracle/product/12.2.0.1/grid/network/admin/listener.ora
Listener Log File         /oracle/gridbase/diag/tnslsnr/xd1archdb1/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
The listener supports no services
The command completed successfully

The ASM listener has no services registered.

The listener log shows:

Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
24-NOV-2021 11:10:03 * (ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)) * service_register * LsnrAgt * 0
2021-11-24T11:10:06.129013+08:00
24-NOV-2021 11:10:06 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=db1)(USER=grid))(COMMAND=status)(ARGUMENTS=64)(SERVICE=ASMNET1LSNR_ASM)(VERSION=203424000)) * status * 0
2021-11-24T11:10:08.041279+08:00
Incoming connection from 10.*.*.111 rejected 
24-NOV-2021 11:10:08 * 12546
TNS-12546: TNS:permission denied
 TNS-12560: TNS:protocol adapter error
  TNS-00516: Permission denied

Connections from the other node are rejected as well:

2021-11-24T11:13:50.958237+08:00
Incoming connection from 10.*.*.112 rejected 
24-NOV-2021 11:13:50 * 12546
TNS-12546: TNS:permission denied
 TNS-12560: TNS:protocol adapter error
  TNS-00516: Permission denied

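Connections from both .111 and .112 are rejected. Together with the ORA-12547 error in the CRS log on node 112, this points to the whitelist in sqlnet.ora; checking it shows that these addresses are not in the ACL.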
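To see at a glance which clients the listener turned away, the rejection lines can be pulled out of the listener's text log. A minimal sketch, assuming the log is fed on stdin and uses the "Incoming connection from ... rejected" wording quoted above (the function name `rejected_clients` is hypothetical):

```shell
# Hypothetical helper: print each distinct client address the listener rejected.
rejected_clients() {
    # The address is the fourth whitespace-separated field of the rejection line.
    awk '/^Incoming connection from .* rejected/ { print $4 }' | sort -u
}
```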

Add the corresponding addresses to sqlnet.ora, then restart the ASM listener so the change takes effect.
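The post does not show the sqlnet.ora contents. Assuming the whitelist is the standard valid-node check (`TCP.VALIDNODE_CHECKING = YES` with a `TCP.INVITED_NODES` list), a rough way to verify an address before and after the edit could look like this. The function name `ip_is_invited` is hypothetical, and the sketch only handles a list written on a single line:

```shell
# Hypothetical helper: succeed if the address appears in TCP.INVITED_NODES.
# Usage: ip_is_invited <path-to-sqlnet.ora> <address>
# Assumes the parameter and its whole list sit on one line.
ip_is_invited() {
    sqlnet_file=$1
    ip=$2
    grep -i '^[[:space:]]*tcp\.invited_nodes' "$sqlnet_file" \
        | tr -d ' \t()' \
        | cut -d= -f2 \
        | tr ',' '\n' \
        | grep -Fxq "$ip"
}
```

The file is probably the sqlnet.ora next to the listener.ora reported by lsnrctl above, but that location is an assumption as well.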

posted @ 2021-11-25 12:39  syksky