SYSTEM.credentials.domains.root.ASM.Self.XXXX.root not found
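1. Symptom
After an Oracle 12.2.0.1.0 cluster was shut down, it could not be started again.
The CRS status check shows ora.storage stuck in STARTING, and the other resources cannot start either.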
[root@db2 ~]# /oracle/product/12.2.0.1/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE xd1archdb2 STABLE
ora.cluster_interconnect.haip
1 ONLINE ONLINE xd1archdb2 STABLE
ora.crf
1 ONLINE OFFLINE STABLE
ora.crsd
1 ONLINE OFFLINE STABLE
ora.cssd
1 ONLINE ONLINE xd1archdb2 STABLE
ora.cssdmonitor
1 ONLINE ONLINE xd1archdb2 STABLE
ora.ctssd
1 ONLINE ONLINE xd1archdb2 OBSERVER,STABLE
ora.diskmon
1 OFFLINE OFFLINE STABLE
ora.driver.afd
1 ONLINE ONLINE xd1archdb2 STABLE
ora.drivers.acfs
1 ONLINE ONLINE xd1archdb2 STABLE
ora.evmd
1 ONLINE INTERMEDIATE xd1archdb2 STABLE
ora.gipcd
1 ONLINE ONLINE xd1archdb2 STABLE
ora.gpnpd
1 ONLINE ONLINE xd1archdb2 STABLE
ora.mdnsd
1 ONLINE ONLINE xd1archdb2 STABLE
ora.storage
1 ONLINE OFFLINE xd1archdb2 STARTING <===================
--------------------------------------------------------------------------------
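With this many resources, the ones that have not reached their target state are easy to miss. The sketch below is one way to filter them out; it assumes the two-line layout shown above (a resource name line followed by an instance line), and the function name not_online is made up for illustration.

```shell
# Print every resource whose Target is ONLINE but whose State is not.
# Reads "crsctl status res -t -init" output on stdin.
# Assumption: each resource name line is followed by one instance line
# of the form "<id> <target> <state> [server] <details>".
not_online() {
  awk '
    $1 ~ /^ora\./ { name = $1; next }        # remember the resource name
    name != "" && $1 ~ /^[0-9]+$/ {          # instance line for that resource
      if ($2 == "ONLINE" && $3 != "ONLINE") print name, $3
      name = ""
    }
  '
}

# Usage (run as root on the affected node):
#   /oracle/product/12.2.0.1/grid/bin/crsctl status res -t -init | not_online
```

Against the listing above this would report ora.crf, ora.crsd, ora.evmd and ora.storage; ora.diskmon is skipped because its target is OFFLINE.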
The CRS alert log ends with: [ORAROOTAGENT(199820)]CRS-5019: All OCR locations are on ASM disk groups [OCR_VOTE], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/oracle/gridbase/diag/crs/xd1archdb2/crs/trace/ohasd_orarootagent_root.trc".
That trace file shows:
2021-11-24 11:15:34.101 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS
2021-11-24 11:15:34.136 : CLSCRED:1556182784: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found
2021-11-24 11:15:34.136 : USRTHRD:1556182784: {0:5:3} 7872 Error 4 opening dom root in 0x7fc828518480
2021-11-24 11:15:35.190 : default:1556182784: clsCredDomClose: Credctx deleted 0x7fc828228ed0
2021-11-24 11:15:36.207 : CLSNS:1556182784: clsns_SetTraceLevel:trace level set to 1.
2021-11-24 11:15:36.210 : default:1556182784: Inited LSF context: 0x7fc828321c50
2021-11-24 11:15:36.214 : CLSCRED:1556182784: clsCredCommonInit: Inited singleton credctx.
2021-11-24 11:15:36.214 : CLSCRED:1556182784: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2021-11-24 11:15:36.237 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS
2021-11-24 11:15:36.241 : USRTHRD:1556182784: {0:5:3} 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS
2021-11-24 11:15:36.276 : CLSCRED:1556182784: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found
2021-11-24 11:15:36.276 : USRTHRD:1556182784: {0:5:3} 7872 Error 4 opening dom root in 0x7fc82834a180
2021-11-24 11:15:37.361 : default:1556182784: clsCredDomClose: Credctx deleted 0x7fc828228ed0
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} -- trace dump on error exit --
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} Error [kgfoAl06] in [kgfokge] at kgfo.c:3115
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact
ORA-15077: could not locate ASM instance serving a required diskgroup
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} Category: 7
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} DepInfo: 12547
2021-11-24 11:15:37.361 : USRTHRD:1556182784: {0:5:3} -- trace dump end --
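The key messages are "SYSTEM.credentials.domains.root.ASM.Self.9d0ad57d52f57f81bf9bdc78d36d559f.root not found" and "ORA-15077: could not locate ASM instance serving a required diskgroup": the agent cannot connect to an ASM instance. Although crsctl reports ora.asm as ONLINE, ASM is not actually running: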
[root@db2 trace]# ps -ef | grep asm
root 206099 200334 0 11:17 pts/2 00:00:00 grep --color=auto asm
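A plain "grep asm" also matches the grep command itself, as the output above shows. A slightly stricter check looks for the ASM PMON background process, which is named asm_pmon_+ASM followed by the instance number. The following is a minimal sketch; the function name asm_pmon_running is an assumption for illustration.

```shell
# Succeed if the process listing on stdin contains an ASM PMON process,
# for example "asm_pmon_+ASM2". The escaped "+" in the pattern means the
# grep command line itself does not match when it appears in the listing.
asm_pmon_running() {
  grep -Eq '(^|[[:space:]])asm_pmon_\+ASM[0-9]*([[:space:]]|$)'
}

# Usage:
#   if ps -ef | asm_pmon_running; then
#     echo "ASM instance is running"
#   else
#     echo "ASM instance is NOT running"
#   fi
```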
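2. Cause analysis
The ASM instance never actually started, so the disk group holding the OCR cannot be located.
3. Solution
Start the ASM instance manually, running as the ASM owner: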
sqlplus / as sysasm
startup
4. Root cause
This release supports Flex ASM, and this environment runs in Flex mode:
[grid@db1 ~]$ asmcmd
ASMCMD> showclustermode
ASM cluster : Flex mode enabled
ASMCMD> exit
[grid@db1 ~]$ srvctl config asm
ASM home: <CRS home>
Password file: +OCR_VOTE/orapwASM
Backup of Password file:
ASM listener: LISTENER
ASM instance count: 3
Cluster ASM listener: ASMNET1LSNR_ASM
In Flex ASM, an ASM server has to connect to all ASM networks when it starts.
Check the ASM listener:
[grid@db1 ~]$ lsnrctl status ASMNET1LSNR_ASM
LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 24-NOV-2021 19:00:17
Copyright (c) 1991, 2016, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias ASMNET1LSNR_ASM
Version TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date 24-NOV-2021 11:10:01
Uptime 0 days 7 hr. 50 min. 15 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/product/12.2.0.1/grid/network/admin/listener.ora
Listener Log File /oracle/gridbase/diag/tnslsnr/xd1archdb1/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
The listener supports no services
The command completed successfully
No services are registered with the ASM listener.
The listener log shows:
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.*.*.111)(PORT=1526)))
24-NOV-2021 11:10:03 * (ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)) * service_register * LsnrAgt * 0
2021-11-24T11:10:06.129013+08:00
24-NOV-2021 11:10:06 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=db1)(USER=grid))(COMMAND=status)(ARGUMENTS=64)(SERVICE=ASMNET1LSNR_ASM)(VERSION=203424000)) * status * 0
2021-11-24T11:10:08.041279+08:00
Incoming connection from 10.*.*.111 rejected
24-NOV-2021 11:10:08 * 12546
TNS-12546: TNS:permission denied
TNS-12560: TNS:protocol adapter error
TNS-00516: Permission denied
Connections from the other node are rejected as well:
2021-11-24T11:13:50.958237+08:00
Incoming connection from 10.*.*.112 rejected
24-NOV-2021 11:13:50 * 12546
TNS-12546: TNS:permission denied
TNS-12560: TNS:protocol adapter error
TNS-00516: Permission denied
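To see at a glance which sources the listener is turning away, the rejection lines can be tallied per address. This is a rough sketch that assumes the plain-text log format shown above; the function name rejected_sources and the LISTENER_LOG variable are hypothetical.

```shell
# Count rejected connections per source address.
# Reads listener log text on stdin and prints "<count> <address>" lines.
rejected_sources() {
  awk '
    /^Incoming connection from .* rejected$/ { count[$4]++ }
    END { for (addr in count) print count[addr], addr }
  ' | sort -k2
}

# Usage (LISTENER_LOG is a placeholder for the text listener log):
#   rejected_sources < "$LISTENER_LOG"
```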
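Connections from both .111 and .112 are rejected, which matches the ORA-12547 error in the CRS log on node .112. Checking the whitelist in sqlnet.ora shows that these addresses are not in the ACL.
Add the corresponding addresses to sqlnet.ora and restart the ASM listener for the change to take effect.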
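The whitelist check can be scripted. The sketch below assumes the ACL is expressed with the standard TCP.INVITED_NODES parameter (used together with TCP.VALIDNODE_CHECKING = YES) and that its value sits on a single line; the original does not show the file, so both points are assumptions. The function name is_invited, the SQLNET_ORA variable and the 192.0.2.x addresses are placeholders for illustration.

```shell
# Succeed if address $2 is listed in TCP.INVITED_NODES of sqlnet.ora file $1.
# Assumption: the parameter is written on one line, e.g.
#   TCP.INVITED_NODES = (192.0.2.111, 192.0.2.113)
# Multi-line values are not handled by this sketch.
is_invited() {
  local file=$1 addr=$2
  grep -i '^[[:space:]]*tcp\.invited_nodes' "$file" \
    | tr -d ' \t()' \
    | cut -d= -f2 \
    | tr ',' '\n' \
    | grep -Fxq -- "$addr"
}

# Usage (SQLNET_ORA is a placeholder for the Grid home sqlnet.ora,
# which normally sits next to the listener.ora shown earlier):
#   for ip in 192.0.2.111 192.0.2.112; do
#     is_invited "$SQLNET_ORA" "$ip" || echo "$ip is missing from TCP.INVITED_NODES"
#   done
```

Any address reported as missing would be one to add to the list before restarting the ASM listener.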