代码改变世界

案例:Oracle报错ASM磁盘组不存在或没有mount

2017-08-09 23:29  AlfredZhao  阅读(4038)  评论(0编辑  收藏  举报

案例:Oracle报错ASM磁盘组不存在或没有mount

环境:RHEL 6.5 + Oracle Standby RAC 11.2.0.4
我做Standby RAC实验时,在恢复控制文件时,报错无法在磁盘组创建文件,具体报错内容如下:

[oracle@jystdrac1 standby]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Wed Aug 9 22:35:41 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: JYZHAO (not mounted)

RMAN> restore controlfile from '/public/hotback/jyzhao/standby/control_for_standby.ctl';

Starting restore at 09-AUG-17
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=34 instance=jyzhao1 device type=DISK

channel ORA_DISK_1: restoring control file
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 08/09/2017 22:36:11
ORA-19870: error while restoring backup piece /public/hotback/jyzhao/standby/control_for_standby.ctl
ORA-19504: failed to create file "+DATA"
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15040: diskgroup is incomplete
ORA-15040: diskgroup is incomplete
ORA-15040: diskgroup is incomplete

RMAN> exit

马上去查磁盘组是否正常mount,结果是正常的:

[root@jystdrac1 ~]# su - grid
[grid@jystdrac1 ~]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     15360    15261                0           15261              0             N  DATA/
MOUNTED  EXTERN  N         512   4096  1048576      5120     5025                0            5025              0             N  FRA/
MOUNTED  NORMAL  N         512   4096  1048576      3072     2146             1024             561              0             Y  OCR/
ASMCMD> 

去看数据库的alert日志:

Wed Aug 09 22:29:50 2017
NOTE: Loaded library: System 
ORA-15025: could not open disk "/dev/asm-diske"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm-diskf"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm-diskg"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
Wed Aug 09 22:29:50 2017
SUCCESS: diskgroup DATA was dismounted
ERROR: diskgroup DATA was not mounted
ORA-15025: could not open disk "/dev/asm-diskh"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
SUCCESS: diskgroup FRA was dismounted
ERROR: diskgroup FRA was not mounted

居然说是无法打开asm磁盘,迅速去查asm磁盘情况:

[root@jystdrac1 ~]# ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8,  16 Aug  9 22:55 /dev/asm-diskb
brw-rw---- 1 grid asmadmin 8,  32 Aug  9 22:55 /dev/asm-diskc
brw-rw---- 1 grid asmadmin 8,  48 Aug  9 22:55 /dev/asm-diskd
brw-rw---- 1 grid asmadmin 8,  64 Aug  9 22:55 /dev/asm-diske
brw-rw---- 1 grid asmadmin 8,  80 Aug  9 22:11 /dev/asm-diskf
brw-rw---- 1 grid asmadmin 8,  96 Aug  9 22:11 /dev/asm-diskg
brw-rw---- 1 grid asmadmin 8, 112 Aug  9 22:55 /dev/asm-diskh

/dev/asm:
total 0
[root@jystdrac1 ~]# id grid
uid=700(grid) gid=800(oinstall) groups=800(oinstall),700(asmadmin),701(asmdba),702(asmoper)
[root@jystdrac1 ~]# id oracle
uid=800(oracle) gid=800(oinstall) groups=800(oinstall),701(asmdba),801(dba),802(oper)

发现磁盘组权限没有问题,grid和oracle用户的组也都符合官方要求。

拿着alert日志的关键报错信息“ORA-15025 ORA-27041 Linux-x86_64 Error: 13: Permission denied”去查MOS:
找到一篇MOS匹配:Database Will Not Mount: ORA-15025, ORA-27041, 'Permission denied', ORA-15081 (文档 ID 1378747.1)
最终说是

CAUSE

However the problem is that the Database oracle executable is not assigned to the ASM OSASM group (instead is assigned to the 'oinstall' group):


-rwsr-s--x 1 oracle oinstall 210973186 May 31 21:25 /appl/oracle/OracleHomes/11.2.0/bin/oracle

It needs to be:


-rwsr-s--x 1 oracle asmadmin 210973186 May 31 21:25 /appl/oracle/OracleHomes/11.2.0/bin/oracle

按照MOS的说法,去查oracle用户的这个oracle二进制文件的属组,果然发现是oinstall:

[root@jystdrac1 ~]# su - oracle
[oracle@jystdrac1 ~]$ cd $ORACLE_HOME/bin/
[oracle@jystdrac1 bin]$ ls -l oracle
-rwsr-s--x 1 oracle oinstall 239626641 Aug  6 00:42 oracle

将其改为MOS说的asmadmin,

As the <asm_home sfw owner>:
$ cd <asm_home>/bin
$ ./setasmgidwrap o=<db_home>/bin/oracle

ON NODE1:
[root@jystdrac1 bin]# pwd
/opt/app/oracle/product/11.2.0/dbhome_1/bin
[root@jystdrac1 bin]# ls -l oracle
-rwsr-s--x 1 oracle oinstall 239626641 Aug  6 00:42 oracle
[grid@jystdrac1 bin]$ ./setasmgidwrap o=/opt/app/oracle/product/11.2.0/dbhome_1/bin/oracle
[root@jystdrac1 bin]# ls -l oracle
-rwsr-s--x 1 oracle asmadmin 239626641 Aug  6 00:42 oracle

ON NODE2:
[root@jystdrac2 bin]# pwd
/opt/app/oracle/product/11.2.0/dbhome_1/bin
[root@jystdrac2 bin]# ls -l oracle
-rwsr-s--x 1 oracle oinstall 239626641 Aug  6 00:48 oracle
[root@jystdrac2 bin]# chown oracle:asmadmin oracle
[root@jystdrac2 bin]# ls -l oracle
-rwxr-x--x 1 oracle asmadmin 239626641 Aug  6 00:48 oracle
[root@jystdrac2 bin]# chmod 6751 oracle
[root@jystdrac2 bin]# ls -l oracle
-rwsr-s--x 1 oracle asmadmin 239626641 Aug  6 00:48 oracle

如上,两个节点都需要改正确,我这里特意演示了两种方法,在一节点使用MOS推荐的方法修改可以一次成功修改,且能保证不改错,只是等的时间长。在二节点直接通过操作系统命令修改,速度快,需要确认最终改对。

最后重启库再试恢复控制文件:

[oracle@jystdrac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Aug 9 23:16:38 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile=$ORACLE_HOME/dbs/pfile_for_standby.txt
ORACLE instance started.

Total System Global Area  534462464 bytes
Fixed Size                  2254952 bytes
Variable Size             436209560 bytes
Database Buffers           92274688 bytes
Redo Buffers                3723264 bytes
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
[oracle@jystdrac1 ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Wed Aug 9 23:18:28 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: JYZHAO (not mounted)

RMAN> restore controlfile from '/public/hotback/jyzhao/standby/control_for_standby.ctl';

Starting restore at 09-AUG-17
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=34 instance=jyzhao1 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:16
output file name=+DATA/mynas/controlfile/current.256.951607115
output file name=+FRA/mynas/controlfile/current.256.951607123
Finished restore at 09-AUG-17

RMAN> 

成功备份没有问题,alert也不再报错,其实之前在一些客户的生产环境,遇到一些类型的问题,也都和这二进制文件的权限相关,如此看来,掌握这个oracle二进制文件的权限的小知识点也很重要哦。