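After upgrading to 11.2.0.4, srvctl cannot start the database instance and reports CRS-0254: authorization failure
On a standby database, after upgrading from 11.2.0.3 to 11.2.0.4 and then applying Patch Set Update 11.2.0.4.190115, srvctl could no longer start the database instance on the second node: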
$ srvctl start instance -d rac -i rac2
PRCR-1013 : Failed to start resource ora.<dbname>.db
PRCR-1064 : Failed to start resource ora.<dbname>.db on node <node2>
CRS-2674: Start of 'ora.<dbname>.db' on '<node2>' failed
CRS-0254: authorization failure
CRS-2678: 'ora.<dbname>.db' on '<node2>' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5807: Agent failed to process the message
Check the CRS log:
[grid@db ~]$ vi $ORACLE_HOME/log/<node name>/crsd/crsd.log
2019-05-05 15:56:16.694: [ AGFW][2814367488]{2:33533:509} Initializing the resource ora.mpos_dg.db 2 1 for type ora.database.type
2019-05-05 15:56:16.694: [ AGFW][2814367488]{2:33533:509} SR: acl = owner:oracle:rwx,pgrp:oinstall:r--,other::r--,group:dba:r-x,group:oper:r-x,user:grid:r-x
2019-05-05 15:56:16.694: [ CRSSEC][2814367488]{2:33533:509} Exception: GroupEntry constructor failed to validate group name with error: 1 groupId: 0x7fcf2c0a7710 acl_string: group:oper:r-x
2019-05-05 15:56:16.695: [ CRSSEC][2814367488]{2:33533:509} Exception: ACL entry creation failed for: group:oper:r-x
2019-05-05 15:56:16.695: [ AGFW][2814367488]{2:33533:509} Error:fetchResource: CRS-0254: authorization failure
2019-05-05 15:56:16.695: [ AGFW][2814367488]{2:33533:509} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.mpos_dg.db 2 1] ID 4098:301055
2019-05-05 15:56:16.698: [UiServer][2801760000]{2:33533:509} Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-2674: Start of 'ora.mpos_dg.db' on 'mpos2' failed]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.mpos_dg.db 2 1]
WAIT:
TextMessage[0]
]
2019-05-05 15:56:16.698: [UiServer][2801760000]{2:33533:509} Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-0254: authorization failure]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.mpos_dg.db 2 1]
WAIT:
TextMessage[0]
]
2019-05-05 15:56:16.699: [ AGFW][2814367488]{2:33533:509} Agfw Proxy Server received the message: RESOURCE_CLEAN[ora.mpos_dg.db 2 1] ID 4100:301057
2019-05-05 15:56:16.699: [ CRSD][2814367488]{2:33533:509} {2:33533:509} Created alert : (:CRSAGF00126:) : Agent start failed
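In a large crsd.log the relevant lines are easy to miss. As a minimal sketch, the group names that failed ACL validation can be pulled out as shown here. The function name failed_acl_groups and the CRSD_LOG variable are made up for illustration, and the pattern is based only on the CRSSEC message format shown in the excerpt above:

```shell
#!/bin/sh
# Hypothetical helper: print each group name that appears in a
# "ACL entry creation failed for: group:<name>:<perms>" line.
# Reads the files given as arguments, or stdin if none are given.
failed_acl_groups() {
    sed -n 's/.*ACL entry creation failed for: group:\([^:]*\):.*/\1/p' "$@" | sort -u
}

# CRSD_LOG is an assumed variable; point it at the crsd.log of the failing node.
if [ -n "${CRSD_LOG:-}" ] && [ -r "$CRSD_LOG" ]; then
    failed_acl_groups "$CRSD_LOG"
fi
```

Run against the log excerpt above, this would print oper, the group named in the failing ACL entry.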
A check showed that the OS group oper does not exist on the second node (the grep returns nothing):
# grep oper /etc/group
Add the two groups:
# groupadd -g 503 oper
# groupadd -g 505 asmoper
After that, the instance started successfully.
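To catch the same problem before starting the instance, every group named in the resource ACL can be checked against the local group database. The following is a minimal sketch, assuming the ACL string has the format shown in the "SR: acl =" line of crsd.log. The function names are made up for illustration, and getent is assumed to be available, as it is on typical Linux systems:

```shell
#!/bin/sh
# Hypothetical helper: list the OS group names referenced by a CRS ACL string
# (entries of the form pgrp:<name>:<perms> and group:<name>:<perms>).
acl_groups() {
    printf '%s\n' "$1" | tr ',' '\n' |
        awk -F: '$1 == "group" || $1 == "pgrp" { print $2 }'
}

# Hypothetical helper: print the groups from the ACL that do not exist locally.
missing_groups() {
    for g in $(acl_groups "$1"); do
        getent group "$g" >/dev/null 2>&1 || echo "$g"
    done
}

# ACL string copied from the crsd.log excerpt in this post.
acl='owner:oracle:rwx,pgrp:oinstall:r--,other::r--,group:dba:r-x,group:oper:r-x,user:grid:r-x'
missing_groups "$acl"
```

Running this on each node lists the groups that still need to be created there; an empty result means every group in the ACL resolves.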
-----------------------------------------------------------------------------------
Reference, from the official documentation:
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
SYMPTOMS
Starting the instance fails with the following error:
$ srvctl start instance -d rac -i rac2
PRCR-1013 : Failed to start resource ora.<dbname>.db
PRCR-1064 : Failed to start resource ora.<dbname>.db on node <node2>
CRS-2674: Start of 'ora.<dbname>.db' on '<node2>' failed
CRS-0254: authorization failure
CRS-2678: 'ora.<dbname>.db' on '<node2>' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5807: Agent failed to process the message
CAUSE
1* The crsd agent log shows the following output:
2015-12-04 01:02:27.056: [UiServer][792693056]{4:24757:45} Done for ctx=0x2b61281a22e0
2* The corresponding group dba-oper does not exist:
# grep dba-oper /etc/group
<<<<< No entries returned.
3* The oracle user is also not assigned to dba-oper:
# id oracle
uid=3000(oracle) gid=101(dba) groups=101(dba)
4* The osdbagrp command shows that dba-oper is the operator group:
$ osdbagrp -o
dba-oper
5* The file $RDBMS_HOME/rdbms/lib/config.c confirms the same:
$ cat $RDBMS_HOME/rdbms/lib/config.c
###
#define SS_DBA_GRP "dba"
#define SS_OPER_GRP "dba-oper"
#define SS_ASM_GRP ""
###
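The comparison in steps 2 to 5 can be scripted. The sketch below is a rough illustration, assuming config.c contains `#define` lines in the form shown above. The function name config_group is made up, RDBMS_HOME is the same placeholder the note uses, and getent is assumed to be available:

```shell
#!/bin/sh
# Hypothetical helper: print the group name bound to a macro
# (for example SS_OPER_GRP) in a config.c-style file.
config_group() {
    awk -v m="$2" '$1 == "#define" && $2 == m { gsub(/"/, "", $3); print $3 }' "$1"
}

# RDBMS_HOME is assumed to point at the database home, as in the note above.
cfg="${RDBMS_HOME:-}/rdbms/lib/config.c"
if [ -r "$cfg" ]; then
    oper=$(config_group "$cfg" SS_OPER_GRP)
    # Report only when a group is configured but cannot be resolved locally.
    if [ -n "$oper" ] && ! getent group "$oper" >/dev/null 2>&1; then
        echo "group '$oper' from $cfg does not exist on this node"
    fi
fi
```

The group is read from config.c rather than hard-coded because the name compiled into the binaries (dba-oper in the note, oper in the case above) is the one that has to exist on every node.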
SOLUTION
1* Stop the RDBMS instances and any other resources, like any customized listener running from RDBMS_HOME.
$ srvctl stop instance -d rac -i rac2
$ srvctl remove instance -d rac -i rac2
$ srvctl remove database -d rac
###
4* Do the relink all for the RDBMS binary.
$ cd $ORACLE_HOME/bin
$ relink -all
5* Re-register the database/instances resource.
$ srvctl add database -d rac -o $RDBMS_HOME -p <spfile_location>
$ srvctl add instance -d rac -i rac1 -n rac_node1
$ srvctl add instance -d rac -i rac2 -n rac_node2
6* Restart the instances.
$ srvctl start database -d rac