Recovering from OCR and voting disk corruption

Posted on 2019-03-19 10:26 by datalife

1. Check the voting disk and the OCR backups
[root@rh6rac1 bin]# ./ocrconfig -showbackup
rh6rac1     2019/03/19 16:57:40     /oracle/grid/cdata/rh6rac-cluster/backup00.ocr

rh6rac1     2019/03/19 12:57:40     /oracle/grid/cdata/rh6rac-cluster/backup01.ocr

rh6rac1     2019/03/19 08:57:40     /oracle/grid/cdata/rh6rac-cluster/backup02.ocr

rh6rac1     2019/03/18 00:57:38     /oracle/grid/cdata/rh6rac-cluster/day.ocr

rh6rac2     2019/03/13 04:13:03     /oracle/grid/cdata/rh6rac-cluster/week.ocr


[root@rh6rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   4621da78f6cb4f45bfc427515ba7d5fc (/dev/asm-diskb) [OCRVOTE]
Located 1 voting disk(s).
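
The backup listing above is what the later restore step chooses from. As a minimal sketch, assuming the `node  date  time  path` layout shown in that output, the newest backup path could be picked like this. The function name `latest_ocr_backup` is hypothetical and not part of any Oracle tool.

```shell
# latest_ocr_backup: read `ocrconfig -showbackup` style lines on stdin
# (node  YYYY/MM/DD  HH:MM:SS  path) and print the path of the newest backup.
# Hypothetical helper; it only parses text and does not call ocrconfig itself.
latest_ocr_backup() {
    # Keep only lines whose 2nd field looks like a date, emit "date time path",
    # then rely on the YYYY/MM/DD HH:MM:SS layout sorting chronologically.
    awk 'NF >= 4 && $2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ { print $2, $3, $4 }' |
        sort | tail -n 1 | awk '{ print $3 }'
}
```

It would be used as `./ocrconfig -showbackup | latest_ocr_backup`. Blank lines and any line without a date in the second column are ignored.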

2. Completely shut down Clusterware (OHASD) on all nodes
[root@rh6rac1 bin]# ./crsctl stop has -f

Use the dd command to destroy the disk group that holds the OCR and voting disk:
dd if=/dev/zero of=/dev/asm-diskb bs=1024k count=1
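
The dd command above overwrites exactly the first 1 MB of the device. If you want to be able to undo the experiment, one option is to save that same region beforehand and write it back later while the cluster stack is down. This is only a sketch under that assumption. The function names and the output file are hypothetical.

```shell
# save_disk_header DEVICE OUTFILE: copy the first 1 MiB of DEVICE to OUTFILE.
# Hypothetical helper; run it before the destructive dd.
save_disk_header() {
    dd if="$1" of="$2" bs=1024k count=1 2>/dev/null
}

# restore_disk_header SAVED DEVICE: write the saved 1 MiB back to the start
# of DEVICE. conv=notrunc keeps the rest of the target intact when the
# target is a regular file; on a block device it makes no difference.
restore_disk_header() {
    dd if="$1" of="$2" bs=1024k count=1 conv=notrunc 2>/dev/null
}
```

For example, `save_disk_header /dev/asm-diskb /root/asm-diskb.hdr` before the test. This post does not use it and instead rebuilds the disk group from scratch.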

[root@rh6rac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.

[root@rh6rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager

alertrh6rac1.log:
[/oracle/grid/bin/orarootagent.bin(3725)]CRS-5822:Agent '/oracle/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:5:903} in /oracle/grid/log/rh6rac1/agent/crsd/orarootagent_root//orarootagent_root.log.
2019-03-19 17:10:20.811:
[ctssd(3174)]CRS-2405:The Cluster Time Synchronization Service on host rh6rac1 is shutdown by user
2019-03-19 17:10:20.830:
[mdnsd(2820)]CRS-5602:mDNS service stopping by request.
[client(15431)]CRS-10001:19-Mar-19 17:10 ACFS-9290: Waiting for ASM to shutdown.
2019-03-19 17:10:31.655:
[cssd(2915)]CRS-1603:CSSD on node rh6rac1 shutdown by user.
2019-03-19 17:10:31.762:
[ohasd(2230)]CRS-2767:Resource state recovery not attempted for 'ora.cssdmonitor' as its target state is OFFLINE
2019-03-19 17:10:31.857:
[cssd(2915)]CRS-1660:The CSS daemon shutdown has completed
2019-03-19 17:10:35.402:
[gpnpd(2841)]CRS-2329:GPNPD on node rh6rac1 shutdown.
2019-03-19 17:12:07.003:
[ohasd(16139)]CRS-2112:The OLR service started on node rh6rac1.
2019-03-19 17:12:07.013:
[ohasd(16139)]CRS-1301:Oracle High Availability Service started on node rh6rac1.
2019-03-19 17:12:07.014:
[ohasd(16139)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2019-03-19 17:12:10.417:
[/oracle/grid/bin/orarootagent.bin(16241)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2019-03-19 17:12:14.840:
[gpnpd(16359)]CRS-2328:GPNPD started on node rh6rac1.
2019-03-19 17:12:17.341:
[cssd(16429)]CRS-1713:CSSD daemon is started in clustered mode
2019-03-19 17:12:19.093:
[ohasd(16139)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2019-03-19 17:12:25.937:
[cssd(16429)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oracle/grid/log/rh6rac1/cssd/ocssd.log
2019-03-19 17:12:40.950:
[cssd(16429)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oracle/grid/log/rh6rac1/cssd/ocssd.log
2019-03-19 17:12:55.959:
[cssd(16429)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oracle/grid/log/rh6rac1/cssd/ocssd.log
2019-03-19 17:13:10.968:
[cssd(16429)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oracle/grid/log/rh6rac1/cssd/ocssd.log
2019-03-19 17:13:25.976:
[cssd(16429)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oracle/grid/log/rh6rac1/cssd/ocssd.log
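
The alert log above keeps repeating CRS-1714 every 15 seconds. A small sketch for counting how often a given message code appears in a log read from stdin. The function name `count_crs_message` is hypothetical.

```shell
# count_crs_message CODE: count lines on stdin that contain "CODE:",
# e.g. CRS-1714. Hypothetical helper that only filters text.
count_crs_message() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    grep -cF -- "$1:" || true
}
```

It could be run as `count_crs_message CRS-1714 < alertrh6rac1.log`. A steadily growing count suggests CSSD is still stuck in voting file discovery.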


ocssd.log:
2019-03-19 17:14:56.025: [    CSSD][1181738752]clssnmReadDiscoveryProfile: voting file discovery string(/dev/asm*)
2019-03-19 17:14:56.025: [    CSSD][1181738752]clssnmvDDiscThread: using discovery string /dev/asm* for initial discovery
2019-03-19 17:14:56.025: [   SKGFD][1181738752]Discovery with str:/dev/asm*:

2019-03-19 17:14:56.025: [   SKGFD][1181738752]UFS discovery with :/dev/asm*:

2019-03-19 17:14:56.025: [   SKGFD][1181738752]Execute glob on the string /dev/asm*
2019-03-19 17:14:56.025: [   SKGFD][1181738752]running stat on disk:/dev/asm-diske
2019-03-19 17:14:56.026: [   SKGFD][1181738752]running stat on disk:/dev/asm-diskd
2019-03-19 17:14:56.026: [   SKGFD][1181738752]running stat on disk:/dev/asm-diskc
2019-03-19 17:14:56.027: [   SKGFD][1181738752]running stat on disk:/dev/asm-diskb
2019-03-19 17:14:56.027: [   SKGFD][1181738752]Fetching UFS disk :/dev/asm-diskb:

2019-03-19 17:14:56.027: [   SKGFD][1181738752]Fetching UFS disk :/dev/asm-diskc:

2019-03-19 17:14:56.027: [   SKGFD][1181738752]Fetching UFS disk :/dev/asm-diskd:

2019-03-19 17:14:56.027: [   SKGFD][1181738752]Fetching UFS disk :/dev/asm-diske:

2019-03-19 17:14:56.027: [   SKGFD][1181738752]OSS discovery with :/dev/asm*:

2019-03-19 17:14:56.027: [   SKGFD][1181738752]Handle 0x7fe334136b60 from lib :UFS:: for disk :/dev/asm-diskb:

2019-03-19 17:14:56.027: [   SKGFD][1181738752]Handle 0x7fe334130f60 from lib :UFS:: for disk :/dev/asm-diskc:

2019-03-19 17:14:56.028: [   SKGFD][1181738752]Handle 0x7fe334131790 from lib :UFS:: for disk :/dev/asm-diskd:

2019-03-19 17:14:56.028: [   SKGFD][1181738752]Handle 0x7fe33413f4a0 from lib :UFS:: for disk :/dev/asm-diske:

2019-03-19 17:14:56.028: [   SKGFD][1181738752]Lib :UFS:: closing handle 0x7fe334136b60 for disk :/dev/asm-diskb:

2019-03-19 17:14:56.028: [   SKGFD][1181738752]Lib :UFS:: closing handle 0x7fe334130f60 for disk :/dev/asm-diskc:

2019-03-19 17:14:56.028: [   SKGFD][1181738752]Lib :UFS:: closing handle 0x7fe334131790 for disk :/dev/asm-diskd:

2019-03-19 17:14:56.028: [   SKGFD][1181738752]Lib :UFS:: closing handle 0x7fe33413f4a0 for disk :/dev/asm-diske:

2019-03-19 17:14:56.028: [    CSSD][1181738752]clssnmvDiskVerify: Successful discovery of 0 disks
2019-03-19 17:14:56.028: [    CSSD][1181738752]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2019-03-19 17:14:56.028: [    CSSD][1181738752]clssnmvFindInitialConfigs: No voting files found
2019-03-19 17:14:56.028: [    CSSD][1181738752](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
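
In the ocssd.log excerpt above, the key line is `clssnmvDiskVerify: Successful discovery of 0 disks`: the devices match the discovery string but none carries a voting file any more. As a sketch, assuming that exact message wording, the most recent count can be pulled out of the log like this. The function name is hypothetical.

```shell
# last_discovered_count: read ocssd.log text on stdin and print the number
# from the most recent "Successful discovery of N disks" line.
# Hypothetical helper; prints nothing if no such line is present.
last_discovered_count() {
    sed -n 's/.*clssnmvDiskVerify: Successful discovery of \([0-9][0-9]*\) disks.*/\1/p' |
        tail -n 1
}
```

For example, `last_discovered_count < ocssd.log`.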



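The actual steps for recovering the disk group that holds the OCR and voting disk are as follows.
First clean up all grid processes.
1. Start the cluster with -excl -nocrs; this starts the ASM instance but does not start CRS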

[root@rh6rac1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rh6rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rh6rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rh6rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rh6rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rh6rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rh6rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rh6rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rh6rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rh6rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rh6rac1'
CRS-2676: Start of 'ora.diskmon' on 'rh6rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rh6rac1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rh6rac1'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rh6rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rh6rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rh6rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rh6rac1'
CRS-2676: Start of 'ora.drivers.acfs' on 'rh6rac1' succeeded
CRS-2676: Start of 'ora.ctssd' on 'rh6rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rh6rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rh6rac1'
CRS-2676: Start of 'ora.asm' on 'rh6rac1' succeeded

2. Re-create the disk group that held the OCR and voting disk; note that compatible.asm must be at least 11.2
[root@rh6rac1 bin]# su - grid
[grid@rh6rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Tue Mar 19 17:29:39 2019

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create diskgroup OCRVOTE external redundancy disk '/dev/asm-diskb' ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';

Diskgroup created.
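
Since the disk group needs compatible.asm of 11.2 or later to hold the OCR and voting files, a version string can be sanity-checked before going on. This is a sketch only. The function name `compat_ok` is hypothetical, and it assumes you have already obtained the attribute value yourself (for example from the `v$asm_attribute` view).

```shell
# compat_ok VERSION: succeed if VERSION (e.g. 11.2 or 11.2.0.0.0) is at
# least 11.2. Hypothetical helper; compares only major and minor numbers.
compat_ok() {
    printf '%s\n' "$1" |
        awk -F. '{ maj = $1 + 0; min = $2 + 0
                   exit !(maj > 11 || (maj == 11 && min >= 2)) }'
}
```

For example, `compat_ok 11.2.0.0.0 && echo "compatible.asm is high enough"`.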

3. Restore the OCR from an OCR backup and verify it with ocrcheck:
[root@rh6rac1 bin]# ./ocrconfig -restore /oracle/grid/cdata/rh6rac-cluster/backup00.ocr
[root@rh6rac1 bin]#
[root@rh6rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
     Version                  :          3
     Total space (kbytes)     :     262120
     Used space (kbytes)      :       3084
     Available space (kbytes) :     259036
     ID                       :  276343585
     Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

     Cluster registry integrity check succeeded

     Logical corruption check succeeded
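
The two lines that matter in the ocrcheck output above are the registry integrity check and the logical corruption check. A sketch that tests for both, assuming the message wording shown above. The function name is hypothetical, and it only inspects text on stdin.

```shell
# ocr_check_ok: read `ocrcheck` output on stdin and succeed only if both
# success messages are present. Hypothetical helper, not an Oracle tool.
ocr_check_ok() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'Cluster registry integrity check succeeded' &&
        printf '%s\n' "$out" | grep -q 'Logical corruption check succeeded'
}
```

It could be used as `./ocrcheck | ocr_check_ok && echo "OCR looks good"`.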
    
4. Restore the voting disk; you may run into the following error:
[root@rh6rac1 bin]# ./crsctl replace votedisk  +OCRVOTE
CRS-4602: Failed 27 to add voting file 942480699ad84f50bfbd253181a05ad1.
Failed to replace voting disk group with +OCRVOTE.
CRS-4000: Command Replace failed, or completed with errors.
The ASM parameter needs to be set again and ASM restarted:
SQL> alter system set asm_diskstring='/dev/asm*';

System altered.
SQL> create pfile from memory;

File created.

SQL> startup force mount
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started

Total System Global Area 1135747072 bytes
Fixed Size            2260728 bytes
Variable Size         1108320520 bytes
ASM Cache           25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled

[root@rh6rac1 bin]# ./crsctl replace votedisk  +OCRVOTE
Successful addition of voting disk 59835fcfdc874f55bfccc60a6be79ca4.
Successfully replaced voting disk group with +OCRVOTE.
CRS-4266: Voting file(s) successfully replaced

5. Restart the HAS service and check that the cluster is healthy:
[root@rh6rac1 bin]# ./crsctl stop has -f
[root@rh6rac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.

[root@rh6rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[root@rh6rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   59835fcfdc874f55bfccc60a6be79ca4 (/dev/asm-diskb) [OCRVOTE]
Located 1 voting disk(s).
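
To confirm the voting file is back without reading the table by eye, the ONLINE rows can be counted. A sketch assuming the `crsctl query css votedisk` column layout shown above. The function name `online_votedisks` is hypothetical.

```shell
# online_votedisks: read `crsctl query css votedisk` output on stdin and
# print how many numbered rows are in state ONLINE.
# Hypothetical helper that only parses text.
online_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}
```

For example, `./crsctl query css votedisk | online_votedisks`. With the external redundancy disk group used here, the expected count is 1.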

Check the CRS status:
./crsctl status res -t
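
In step 5 the `crsctl check crs` output still shows CRS-4535 right after the restart, which may simply mean that crsd has not finished starting yet. A sketch for deciding whether the stack is fully up, based only on the message texts seen in this post. The function name is hypothetical.

```shell
# crs_stack_online: read `crsctl check crs` output on stdin and succeed only
# if no line reports a communication problem and at least one service is
# reported online. Hypothetical helper that only parses text.
crs_stack_online() {
    out=$(cat)
    if printf '%s\n' "$out" | grep -Eq 'Cannot communicate|Communications failure'; then
        return 1
    fi
    printf '%s\n' "$out" | grep -q 'is online'
}
```

It could be polled with something like `until ./crsctl check crs | crs_stack_online; do sleep 10; done` before running `./crsctl status res -t`.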