重新初始化RAC的OCR盘和Votedisk盘,修复RAC系统

假设我们的RAC环境中OCR磁盘和votedisk磁盘全部被破坏,并且都没有备份,那么我们该如何恢复我们的RAC环境。
最近简单的办法就是重新初始化我们的ocr盘和votedisk盘,把集群中的所有相关资源重新注册到OCR磁盘和votedisk磁盘中。

 

1.停掉所有节点的Clusterware Stack

[root@rac3 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources 
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.



[root@rac4 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources 
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

 

2.为安全期间,我们先备份一下ocr和votedisk

为防止我们的实验失败,我们先备份一下ocr盘和votedisk盘。
当然,在正式的RAC环境中是不会不备份ocr盘和votedisk盘,我们模拟的也是一种极端情况。

[root@rac3 bin]# ./crsctl query css votedisk
 0.     0    /dev/raw/raw2

located 1 votedisk(s).
[root@rac3 bin]# dd if=/dev/raw/raw2 of=/home/oracle/votedisk.bak
208864+0 records in
208864+0 records out


[root@rac3 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     104344
         Used space (kbytes)      :       4340
         Available space (kbytes) :     100004
         ID                       : 1887132889
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

[root@rac3 bin]# ./ocrconfig -export /home/oracle/ocr.bak

 

3.我们先破坏一下ocr和votedisk

[root@rac3 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=100
100+0 records in
100+0 records out
[root@rac3 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=130
dd: writing `/dev/raw/raw1': No space left on device
102+0 records in
101+0 records out
[root@rac3 bin]# dd if=/dev/zero of=/dev/raw/raw2 bs=1M count=130   
dd: writing `/dev/raw/raw2': No space left on device
102+0 records in
101+0 records out

现在ocr和votedisk已经被我们破坏,目前我们的RAC肯定是启不来了。

现在我们就利用重建的方式重新把信息注册到ocr和votedisk中去。

 

4.分别在每个节点上执行$CRS_HOME/install/rootdele.sh

[root@rac3 install]# ./rootdelete.sh 
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac4 install]# ./rootdelete.sh 
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'

 

5.在任意一个节点上执行脚本$CRS_HOME/install/rootdeinstall.sh

只需要一个节点上执行即可

[root@rac3 install]# ./rootdeinstall.sh 
Removing contents from OCR mirror device
2560+0 records in
2560+0 records out
Removing contents from OCR device
2560+0 records in
2560+0 records out

 

6.在和步骤5同一个节点上执行$CRS_HOME/root.sh脚本

[root@rac3 crs_1]# ./root.sh
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac3 rac3-priv rac3
node 2: rac4 rac4-priv rac4
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rac3
CSS is inactive on these nodes.
        rac4
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.

 

 

7.在其他节点执行$CRS_HOME/root.sh脚本,注意最后一个节点的输出

[root@rac4 crs_1]# ./root.sh 
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Checking to see if Oracle CRS stack is already configured

Current Oracle Cluster Registry mirror location '/dev/raw/raw7' in '/etc/oracle/ocr.loc' and '' does not match
Update either '/etc/oracle/ocr.loc' to use '' or variable CRS_OCR_LOCATIONS in rootconfig.sh with '/dev/raw/raw7' then rerun rootconfig.sh

发现有报错,报错信息应该是ocr mirror location和当前不匹配,这是之前我们试验ocr转移位置时留下的(/dev/raw/raw7),
/etc/oracle/ocr.loc文件里我们已经ocrmirrorconfig_loc参数注释掉了,系统怎么还能看得到那??

http://www.cnblogs.com/myrunning/p/4253696.html

 

[root@rac4 oracle]# cat ocr.loc 
#Device/file /dev/raw/raw1 getting replaced by device /dev/raw/raw8 
ocrconfig_loc=/dev/raw/raw1
#ocrmirrorconfig_loc=/dev/raw/raw7
local_only=false

我们把/etc/oracle/ocr.loc文件的"#ocrmirrorconfig_loc=/dev/raw/raw7" 去掉,重新执行$CRS_HOME/root.sh

[root@rac4 oracle]# cat ocr.loc 
#Device/file /dev/raw/raw1 getting replaced by device /dev/raw/raw8 
ocrconfig_loc=/dev/raw/raw1
local_only=false

 

去掉"#ocrmirrorconfig_loc=/dev/raw/raw7",重新执行$CRS_HOME/root.sh,发现问题解决:

[root@rac4 crs_1]# ./root.sh 
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
WARNING: directory '/opt' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac3 rac3-priv rac3
node 2: rac4 rac4-priv rac4
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rac3
        rac4
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.

发现问题,由于"eth0" is not public.vipca没有执行成功,这需要我们手动地在这个节点上执行vipca

 

8.重新执行vipca命令

执行vipca报出Error 0(Native: listNetInterfaces:[3])错误,如图:

 

这是因为我们需要重新设置一下RAC的公共网络及私有网络:

 

使用root用户重新手动执行vipca:

 

 

 

9.验证ONS/GSD/VIP有没有正常注册到集群中

[oracle@rac4 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4  

 

10.使用netca命令重新配置监听

使用netca命令配置监听,该命令会自动把Listener注册到Clusterware中。

使用oracle用户手动执行netca命令:

 


确认一下我们刚刚配置的Listener有没有注册到监听中:

[oracle@rac4 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4  

可以看到至此,我们把listener、ons、gsd、vip都已经注册到ocr中了,下一步还需要把ASM、数据库注册到ocr中我们的实验就完成了。

 

11.把ASM注册到OCR中

[oracle@rac3 ~]$ srvctl add asm -n rac3 -i +ASM1 -o /opt/ora10g/product/10.2.0/db_1
[oracle@rac3 ~]$ srvctl add asm -n rac4 -i +ASM2 -o /opt/ora10g/product/10.2.0/db_1  
[oracle@rac3 ~]$ 
[oracle@rac3 ~]$ srvctl config asm -n rac3
+ASM1 /opt/ora10g/product/10.2.0/db_1
[oracle@rac3 ~]$ srvctl config asm -n rac4
+ASM2 /opt/ora10g/product/10.2.0/db_1

 

12.启动ASM验证

[oracle@rac3 ~]$ srvctl start asm -n rac3
[oracle@rac3 ~]$ srvctl start asm -n rac4
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3        
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....SM2.asm application    ONLINE    ONLINE    rac4        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4   

通过上面的输出可以看出ASM已经成功启动,启动的时候可以关注一下asm的启动日志,方便有错误的时候即使发现问题:

[oracle@rac3 bdump]$ pwd
/opt/ora10g/admin/+ASM/bdump
[oracle@rac3 bdump]$ tail -f alert_+ASM1.log

 

13.把数据库注册到OCR中

[oracle@rac3 ~]$ srvctl add database -d racdb -o $ORACLE_HOME
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3        
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....SM2.asm application    ONLINE    ONLINE    rac4        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4        
ora.racdb.db   application    OFFLINE   OFFLINE               

 

14.把实例注册到OCR中

[oracle@rac3 ~]$ srvctl add instance -d racdb -n rac3 -i racdb1
[oracle@rac3 ~]$ srvctl add instance -d racdb -n rac4 -i racdb2
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3        
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....SM2.asm application    ONLINE    ONLINE    rac4        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4        
ora.racdb.db   application    OFFLINE   OFFLINE               
ora....b1.inst application    OFFLINE   OFFLINE               
ora....b2.inst application    OFFLINE   OFFLINE    

 

15.修改实例和ASM实例的依赖关系

[oracle@rac3 ~]$ srvctl modify instance -d racdb -i racdb1 -s +ASM1
[oracle@rac3 ~]$ srvctl modify instance -d racdb -i racdb2 -s +ASM2 
[oracle@rac3 ~]$ 

 

16.启动数据库进行验证

[oracle@rac3 ~]$ srvctl start database -d racdb
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3        
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....SM2.asm application    ONLINE    ONLINE    rac4        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4        
ora.racdb.db   application    ONLINE    ONLINE    rac3        
ora....b1.inst application    ONLINE    ONLINE    rac3        
ora....b2.inst application    ONLINE    ONLINE    rac4   

现在可以看到我们注册到CRS中的数据已经正常启动了

 

[oracle@rac3 ~]$ sqlplus '/as sysdba'

SQL*Plus: Release 10.2.0.1.0 - Production on Thu Jan 29 12:47:02 2015

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> col name for a50
SQL> select * from v$dbfile;

     FILE# NAME
---------- --------------------------------------------------
         4 +DATA/racdb/datafile/users.259.845203503
         3 +DATA/racdb/datafile/sysaux.257.845203501
         2 +DATA/racdb/datafile/undotbs1.258.845203501
         1 +DATA/racdb/datafile/system.256.845203499
         5 +DATA/racdb/datafile/undotbs2.264.845203661
         6 +DATA/racdb/datafile/rlst.268.852657465

6 rows selected.

SQL> 

 


17.手动添加service到ocr中

[oracle@rac3 ~]$ srvctl add service -d racdb -s racdbservice -r racdb1 -a racdb2 -P BASIC
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3        
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....SM2.asm application    ONLINE    ONLINE    rac4        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4        
ora.racdb.db   application    ONLINE    ONLINE    rac3        
ora....b1.inst application    ONLINE    ONLINE    rac3        
ora....b2.inst application    ONLINE    ONLINE    rac4        
ora....vice.cs application    OFFLINE   OFFLINE               
ora....db1.srv application    OFFLINE   OFFLINE    
[oracle@rac3 ~]$ srvctl start service -d racdb
[oracle@rac3 ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac3        
ora....C3.lsnr application    ONLINE    ONLINE    rac3        
ora.rac3.gsd   application    ONLINE    ONLINE    rac3        
ora.rac3.ons   application    ONLINE    ONLINE    rac3        
ora.rac3.vip   application    ONLINE    ONLINE    rac3        
ora....SM2.asm application    ONLINE    ONLINE    rac4        
ora....C4.lsnr application    ONLINE    ONLINE    rac4        
ora.rac4.gsd   application    ONLINE    ONLINE    rac4        
ora.rac4.ons   application    ONLINE    ONLINE    rac4        
ora.rac4.vip   application    ONLINE    ONLINE    rac4        
ora.racdb.db   application    ONLINE    ONLINE    rac3        
ora....b1.inst application    ONLINE    ONLINE    rac3        
ora....b2.inst application    ONLINE    ONLINE    rac4        
ora....vice.cs application    ONLINE    ONLINE    rac3        
ora....db1.srv application    ONLINE    ONLINE    rac3  

 

至此,我们已经正确的重新初始化了我们的OCR盘和VoteDisk盘,并且没有用到备份。

 

posted @ 2015-03-03 20:06  蚂蚁快跑  阅读(754)  评论(0编辑  收藏  举报