After upgrading to 11.2.0.4, srvctl cannot start the database instance and reports CRS-0254: authorization failure

On a standby database, I upgraded from 11.2.0.3 to 11.2.0.4 and then applied Patch Set Update 11.2.0.4.190115. After that, srvctl could no longer start the database instance on the second node:

$ srvctl start instance -d rac -i rac2

PRCR-1013 : Failed to start resource ora.<dbname>.db
PRCR-1064 : Failed to start resource ora.<dbname>.db on node <node2>
CRS-2674: Start of 'ora.<dbname>.db' on '<node2>' failed
CRS-0254: authorization failure 
CRS-2678: 'ora.<dbname>.db' on '<node2>' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5807: Agent failed to process the message

Check the CRS log:

[grid@db ~]$ vi $ORACLE_HOME/log/<node_name>/crsd/crsd.log

2019-05-05 15:56:16.694: [ AGFW][2814367488]{2:33533:509} Initializing the resource ora.mpos_dg.db 2 1 for type ora.database.type
2019-05-05 15:56:16.694: [ AGFW][2814367488]{2:33533:509} SR: acl = owner:oracle:rwx,pgrp:oinstall:r--,other::r--,group:dba:r-x,group:oper:r-x,user:grid:r-x
2019-05-05 15:56:16.694: [ CRSSEC][2814367488]{2:33533:509} Exception: GroupEntry constructor failed to validate group name with error: 1 groupId: 0x7fcf2c0a7710 acl_string: group:oper:r-x
2019-05-05 15:56:16.695: [ CRSSEC][2814367488]{2:33533:509} Exception: ACL entry creation failed for: group:oper:r-x
2019-05-05 15:56:16.695: [ AGFW][2814367488]{2:33533:509} Error:fetchResource: CRS-0254: authorization failure

2019-05-05 15:56:16.695: [ AGFW][2814367488]{2:33533:509} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.mpos_dg.db 2 1] ID 4098:301055
2019-05-05 15:56:16.698: [UiServer][2801760000]{2:33533:509} Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-2674: Start of 'ora.mpos_dg.db' on 'mpos2' failed]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.mpos_dg.db 2 1]
WAIT:
TextMessage[0]
]
2019-05-05 15:56:16.698: [UiServer][2801760000]{2:33533:509} Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-0254: authorization failure]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.mpos_dg.db 2 1]
WAIT:
TextMessage[0]
]
2019-05-05 15:56:16.699: [ AGFW][2814367488]{2:33533:509} Agfw Proxy Server received the message: RESOURCE_CLEAN[ora.mpos_dg.db 2 1] ID 4100:301057
2019-05-05 15:56:16.699: [ CRSD][2814367488]{2:33533:509} {2:33533:509} Created alert : (:CRSAGF00126:) : Agent start failed
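
In a long crsd.log the decisive lines are the CRSSEC exceptions that name the ACL entry that could not be created. A minimal sketch for filtering them out follows; the function name failed_acl_entries is hypothetical, and it assumes the message text has exactly the form shown in the excerpt above.

```shell
#!/bin/sh
# failed_acl_entries LOGFILE
# Print each distinct ACL entry that CRSSEC reported as failing.
# Assumption: the log uses the message "ACL entry creation failed for: <entry>".
failed_acl_entries() {
    sed -n 's/.*ACL entry creation failed for: *//p' "$1" | sort -u
}

# Example call (replace <node_name> with the real node name):
# failed_acl_entries "$ORACLE_HOME/log/<node_name>/crsd/crsd.log"
```

For the log above this would print group:oper:r-x, which points directly at the group to check.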

A check showed that the OS group oper does not exist on the second node (the command returns nothing):

# grep oper /etc/group
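
A plain grep matches substrings (asmoper would also match oper), so it can help to test every group named in the resource ACL by exact name. The sketch below is an illustration only: missing_acl_groups is a hypothetical helper, it reads a local group file (default /etc/group), and groups served by LDAP or NIS would need getent group instead.

```shell
#!/bin/sh
# missing_acl_groups ACL [GROUP_FILE]
# Print every group named in a "group:" or "pgrp:" ACL entry that has
# no exact-name match in GROUP_FILE (default: /etc/group).
missing_acl_groups() {
    acl=$1
    group_file=${2:-/etc/group}
    # Split the ACL on commas and keep the name field of group entries.
    printf '%s\n' "$acl" | tr ',' '\n' |
    awk -F: '$1 == "group" || $1 == "pgrp" { print $2 }' | sort -u |
    while IFS= read -r g; do
        # Compare the first field of the group file exactly, not as a substring.
        awk -F: -v g="$g" '$1 == g { found = 1 } END { exit !found }' "$group_file" ||
            printf '%s\n' "$g"
    done
}

# Example, using the ACL string from crsd.log:
# missing_acl_groups 'owner:oracle:rwx,pgrp:oinstall:r--,other::r--,group:dba:r-x,group:oper:r-x,user:grid:r-x'
```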

 

Add the two groups:

# groupadd -g 503 oper

# groupadd -g 505 asmoper
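
Before running groupadd on a cluster node it can be useful to print the commands first and compare the GIDs with the first node, since group IDs are normally kept identical across RAC nodes. The helper below is a hypothetical dry-run sketch: print_groupadd_cmds only prints commands for groups missing from the given group file and changes nothing.

```shell
#!/bin/sh
# print_groupadd_cmds GROUP_FILE NAME:GID...
# Print a groupadd command for each NAME that is absent from GROUP_FILE.
# Nothing is executed; the output is meant to be reviewed first.
print_groupadd_cmds() {
    group_file=$1
    shift
    for spec in "$@"; do
        name=${spec%%:*}   # text before the first colon
        gid=${spec##*:}    # text after the last colon
        if ! awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$group_file"; then
            printf 'groupadd -g %s %s\n' "$gid" "$name"
        fi
    done
}

# Example, with the GIDs used in this post:
# print_groupadd_cmds /etc/group oper:503 asmoper:505
```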

After that, the instance started successfully.

-----------------------------------------------------------------------------------

Reference: the official Oracle Support document:

 


 


 

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS

Starting the instance fails with the following error:

$ srvctl start instance -d rac -i rac2

 

PRCR-1013 : Failed to start resource ora.<dbname>.db
PRCR-1064 : Failed to start resource ora.<dbname>.db on node <node2>
CRS-2674: Start of 'ora.<dbname>.db' on '<node2>' failed
CRS-0254: authorization failure 
CRS-2678: 'ora.<dbname>.db' on '<node2>' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5807: Agent failed to process the message

 

CAUSE

1* The crsd agent log shows the following output:

2015-12-04 01:02:27.056: [UiServer][792693056]{4:24757:45} Done for ctx=0x2b61281a22e0
2015-12-04 01:02:27.082: [ AGFW][780085568]{4:24757:2} Received the reply to the message: RESOURCE_START[ora.LISTENER.lsnr <node2> 1] ID 4098:175 from the agent <GI_HOME>/bin/oraagent_oracle
2015-12-04 01:02:27.082: [ AGFW][780085568]{4:24757:2} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.LISTENER.lsnr rac-node2 1] ID 4098:987651
2015-12-04 01:02:27.188: [ AGFW][780085568]{4:24757:2} Received the reply to the message: RESOURCE_START[ora.LISTENER.lsnr <node2> 1] ID 4098:175 from the agent <GI_HOME>/bin/oraagent_oracle
2015-12-04 01:02:27.188: [ AGFW][780085568]{4:24757:2} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.LISTENER.lsnr <node2> 1] ID 4098:987651
2015-12-04 01:02:27.212: [ AGFW][780085568]{4:24757:2} Agfw Proxy Server received the message: RESOURCE_START[ora.<dbname>.db 4 1] ID 4098:987673
2015-12-04 01:02:27.212: [ AGFW][780085568]{4:24757:2} Creating the resource: ora.<dbname>.db 2 1
2015-12-04 01:02:27.212: [ AGFW][780085568]{4:24757:2} Initializing the resource ora.<dbname>.db 2 1 for type ora.database.type
2015-12-04 01:02:27.212: [ AGFW][780085568]{4:24757:2} SR: acl = owner:oracle:rwx,pgrp:dba:rwx,other::r--,group:dba:r-x,group:dba-oper:r-x,user:oracle:r-x
2015-12-04 01:02:27.347: [ CRSSEC][780085568]{4:24757:2} Exception: GroupEntry constructor failed to validate group name with error: 1 groupId: 0x2b61300ed9e0 acl_string: group:dba-oper:r-x
2015-12-04 01:02:27.347: [ CRSSEC][780085568]{4:24757:2} Exception: ACL entry creation failed for: group:dba-oper:r-x
2015-12-04 01:02:27.347: [ AGFW][780085568]{4:24757:2} Error:fetchResource: CRS-0254: authorization failure

2* The group dba-oper named in the ACL does not exist.

# grep dba-oper /etc/group

 

<<<<< No entries returned.

3* The oracle user is not a member of dba-oper either.

# id oracle

uid=3000(oracle) gid=101(dba) groups=101(dba)

4* The osdbagrp command shows that dba-oper is configured as the OSOPER group.

$ osdbagrp -o

 

dba-oper

5* The file $RDBMS_HOME/rdbms/lib/config.c confirms the same.

$ cat $RDBMS_HOME/rdbms/lib/config.c

 

###
#define SS_DBA_GRP "dba"
#define SS_OPER_GRP "dba-oper"
#define SS_ASM_GRP ""
###
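
Checks 2* to 5* above can be tied together by reading the OSOPER group name from config.c and testing whether that group exists. This is a sketch under assumptions: oper_group_from_config is a hypothetical helper, and it expects the #define lines to look exactly like the excerpt above.

```shell
#!/bin/sh
# oper_group_from_config CONFIG_C
# Print the quoted value of SS_OPER_GRP from a config.c-style file.
# Assumption: the line has the form  #define SS_OPER_GRP "<group>"
oper_group_from_config() {
    sed -n 's/^#define[[:space:]]\{1,\}SS_OPER_GRP[[:space:]]\{1,\}"\([^"]*\)".*/\1/p' "$1"
}

# Example: report the group if the OS does not know it.
# grp=$(oper_group_from_config "$RDBMS_HOME/rdbms/lib/config.c")
# getent group "$grp" >/dev/null || echo "OSOPER group '$grp' does not exist"
```

An empty result means no OSOPER group is defined in the file.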

 

 

SOLUTION

1* Stop the RDBMS instances and any other resources running from RDBMS_HOME, such as a customized listener.

 

$ srvctl stop instance -d rac -i rac2
$ srvctl remove instance -d rac -i rac2



2* Remove the database resource from OCR

$ srvctl remove database -d rac


3* Back up config.c and config.o, then edit $ORACLE_HOME/rdbms/lib/config.c so that it names the correct group:

###
#define SS_DBA_GRP "dba"
#define SS_OPER_GRP "dba"
#define SS_ASM_GRP ""
###

4* Relink all RDBMS binaries.

$ cd $ORACLE_HOME/bin

$ relink -all

 

5* Re-register the database and instance resources.

srvctl add database -d rac -o $RDBMS_HOME -p <spfile_location>

srvctl add instance -d rac -i rac1 -n rac_node1

srvctl add instance -d rac -i rac2 -n rac_node2

 

6* Restart the instances.

srvctl start database -d rac
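
The backup and edit in step 3* can be sketched as a small function. This is an illustration, not part of the Oracle note: set_oper_group is a hypothetical name, it backs up only config.c (config.o still needs to be copied separately), and it assumes the group name contains no characters that are special to sed, such as / or &.

```shell
#!/bin/sh
# set_oper_group CONFIG_C NEW_GROUP
# Copy CONFIG_C to CONFIG_C.bak, then rewrite the SS_OPER_GRP define
# so that it names NEW_GROUP. All other lines are left unchanged.
set_oper_group() {
    cfg=$1
    new_group=$2
    cp -p "$cfg" "$cfg.bak" || return 1
    # Read from the backup and write the edited copy over the original.
    sed 's/^\(#define[[:space:]]\{1,\}SS_OPER_GRP[[:space:]]\{1,\}\)"[^"]*"/\1"'"$new_group"'"/' \
        "$cfg.bak" > "$cfg"
}

# Example, matching the edit shown in step 3*:
# cp -p "$ORACLE_HOME/rdbms/lib/config.o" "$ORACLE_HOME/rdbms/lib/config.o.bak"
# set_oper_group "$ORACLE_HOME/rdbms/lib/config.c" dba
```

Keeping the .bak copy makes it easy to diff the change before relinking.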

 

 

 

posted @ 2019-05-08 14:24  jimeper