[oracle 工程师手记]RMAN duplicate 发生ORA-19504、ORA-17502、ORA-15001、ORA-27140 错误的解决过程

客户报告,用RMAN 的 duplicate 命令,在具备RAC环境的standby 端,创建standby 数据库时,失败。

报:ORA-19504、ORA-17502、ORA-15001、ORA-27140

执行的过程如下:

[oracle @ racddb001g ~] $ export ORACLE_SID = tmt011
[oracle @ racddb001g ~] $
[oracle @ racddb001g ~] $ sqlplus/as sysdba

SQL * Plus: Release 12.2.0.1.0 Production on Fri April 4 02:26:18 2021

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 --64bit Production
Connected to.
SQL> shutdown immediate
ORA-01507: database is not mounted


The ORACLE instance has been shut down.
SQL> startup nomount pfile ='/media/dg/stby_inittmt01.ora'
The ORACLE instance has started.

Total System Global Area 1593835520 bytes
Fixed Size 8421136 bytes
Variable Size 453985776 bytes
Database Buffers 848860800 bytes
Redo Buffers 294367808 bytes
SQL> exit


[oracle@racddb001g ~]$ export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'
[oracle@racddb001g ~]$ rman target 'sys@tmt01H' auxiliary /


RMAN> duplicate target database for standby dorecover nofilenamecheck;

Duplicate Db started at 2021/04/04 02:28:29
Channel: ORA_AUX_DISK_1 assigned
Channel ORA_AUX_DISK_1: SID = 30 Instance = tmt011 Device Type = DISK
The current log has been archived.

Memory script content:
{
   set until scn 5645034;
   restore clone standby controlfile;
}
Running a memory script


Execution command: SET until clause

restore is starting at 2021/04/02 02:28:40
Use of channel ORA_AUX_DISK_1

Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup

Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup


Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210326-01
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup


Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060827010-20210401-00
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup


Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece/media/dg/backup_db_07stlrd3_1_1
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece/media/dg/backup_db_07stlrd3_1_1
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup


Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece/media/dg/backup_db_07stlrd3_1_1
Channel ORA_AUX_DISK_1: ORA-19870: error restoring backup piece/media/dg/backup_db_07stlrd3_1_1
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre: 3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
Failover to previous backup


Channel ORA_AUX_DISK_1: Starting restore of datafile backup set
Channel ORA_AUX_DISK_1: Restoring control file
Channel ORA_AUX_DISK_1: reading from backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
Channel ORA_AUX_DISK_1: ORA-19870: error during restore of backup piece /my/oracle/dbhome_1/dbs/c-2060537070-20210320-00
ORA-19505: failed to identify file "/my/oracle/dbhome_1/dbs/c-2060537070-20210320-00".
ORA-27037: unable to get file status
Failover to previous backup

RMAN-00571: ============================================== =============
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ============================================== =============
RMAN-03002: Duplicate Db command failed at 04/02/2021 02:45:39
RMAN-05501: abort copy of target database
RMAN-03015: error in stored script Memory Script
RMAN-06026: missing target--stop restore
RMAN-06024: cannot find backup or copy to restore control file

RMAN> 

 从上面的出错信息上看,这是duplicate过程创建辅助instance,向磁盘组 restore 控制文件失败了:

Channel ORA_AUX_DISK_1: Restoring control file
ORA-19504: failed to create file "+DG001/tmt01d/CONTROLFILE/control01.ctl".
ORA-17502: failed to create ksfdcre:3 file +DG001/tmt01d/CONTROLFILE/control01.ctl
ORA-15001: diskgroup "DG001" does not exist or is not mounted
ORA-27140: attach to post/wait facility failed
Failover to previous backup

 首先看看用户oracle 有没有对磁盘组的读写权限:

SQL> select NAME,STATE from v$asm_diskgroup;

NAME
------------------
STATE
------------------
DG001 
MOUNTED

SQL> select name,PATH from v$asm_disk;

NAME
-------------------------------------
PATH
-------------------------------------
DG001_0000
/dev/mapper/ora01

 查看磁盘组对应的物理磁盘,发现映射到 /dev/dm-11 设备。

$ ls -l /dev/mapper/ora01
lrwxrwxrwx 1 root root 8 Mar 27 00:52 /dev/mapper/ora01 -> ../dm-11

 该设备的 owner 和组,分别是 grid:asmadmin。

$ ls -l /dev/dm-*

brw-rw---- 1 grid asmadmin 253, 10 Mar 27 00:58 /dev/dm-10
brw-rw---- 1 grid asmadmin 253, 11 Mar 27 00:58 /dev/dm-11

 看一下 grid 用户的组:

# su - grid
$ id
uid=10000(grid) gid=11000(oinstall) groups=11000(oinstall),11002(asmadmin) ,11003(asmdba),11004(asmoper)

 再看一下oracle 用户的组:

# su - oracle
$ id
uid=10001(oracle) gid=11000(oinstall) groups=11000(oinstall),11001(dba),11003(asmdba),11005(racdba),11006(backupdba),11007(dgdba),11008(kmdba),11009(oper)

 发现 oracle 用户,根本不在 asmadmin 组里。可能是因为这个原因,没有办法访问数据库。

 请客户把 oracle 用户也加入到 asmadmin 中,但是再次执行duplicate 没有什么变化:

usermod -a -G asmadmin oracle

 现在,还需要考虑其它的原因,就是 oracle 可执行文件,是否有权限以 grid 用户身份运行。

<primary>

# su - oracle
$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 408674152  3月 19 00:22 /my/oracle/dbhome_1/bin/oracle

[root@rachdb001g ~]# su - grid
最終ログイン: 2021/04/01 (木) 16:29:18 JST
[grid@rachdb001g ~]$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 grid oinstall 373409344 Mar 18 03:05 /opt/oracle/grid/12.2.0/grid/bin/oracle
[grid@rachdb001g ~]$ 


<standby>

#su - oracle
$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 408674152  3月 24 22:19 /my/oracle/dbhome_1/bin/oracle

# su - grid
$ ls -l $ORACLE_HOME/bin/oracle
-rwxr-x--x 1 grid oinstall 373409344 Mar 20 01:56 /opt/oracle/grid/12.2.0/grid/bin/oracle

 可以看到,主库和备库上的 grid 用户的 $ORACLE_HOME/bin/oracle 的权限是不一样的。

一个是: -rwsr-x--x ,一个是 -rwxr-x--x。

需要进行设置:

chown grid:oinstall $GI_HOME/bin/oracle
chmod 6751 $GI_HOME/bin/oracle 

 重新启动 standby 端的辅助instance,再次执行 duplicate, 已经可以成功执行。

 

posted @ 2021-04-05 21:48  健哥的数据花园  阅读(1380)  评论(0编辑  收藏  举报