Oracle 10g RAC root.sh fails with "Failure at final check of Oracle CRS stack 10": troubleshooting

 

1. Problem description

 

Installation environment: Oracle Linux 6.1

Database: 10.2.0.1

While installing Oracle 10g RAC, running root.sh on the first node failed with the following error:

 

[root@rac1 ~]# /u01/app/10.2.0/grid/root.sh

WARNING: directory '/u01/app/10.2.0' is not owned by root

WARNING: directory '/u01/app' is not owned by root

WARNING: directory '/u01' is not owned by root

Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory

Setting up NS directories

Oracle Cluster Registry configuration upgraded successfully

WARNING: directory '/u01/app/10.2.0' is not owned by root

WARNING: directory '/u01/app' is not owned by root

WARNING: directory '/u01' is not owned by root

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node <nodenumber>: <nodename> <private interconnect name> <hostname>

node 1: rac1 rac1-priv rac1

node 2: rac2 rac2-priv rac2

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.

Now formatting voting device: /dev/raw/raw3

 

Now formatting voting device: /dev/raw/raw4

Now formatting voting device: /dev/raw/raw5

Format of 3 voting devices complete.

Startup will be queued to init within 90 seconds.

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

 

Failure at final check of Oracle CRS stack.

10

 

2. MOS notes describing this problem:

2.1 Note 1:

Root.sh failed at Failure at final check of Oracle CRS stack 10 [ID 725878.1]

 

Case

This particular case is caused by the OS init system not working.

"Failure at final check of Oracle CRS stack. 10" means the CRS daemons did not start up within the 600-second period.

In the root.sh script, it adds CRS-related entries to /etc/inittab, runs "init q", and expects 3 CRS-related daemon processes to start, e.g.:

init.cssd
init.crsd
init.evmd

 

With an init system problem, none of these daemon processes are spawned; this causes the CRS processes to fail at startup, as they rely on the CRS daemon processes starting first.
-- This says that a problem with the init system keeps the processes from starting. The note gives the following way to verify it:


This can be verified by adding a simple entry in /etc/inittab:

test:2:once:/usr/bin/echo "HELLOTEST" > /tmp/test.log


Run "init q" as the root user. If init is working, a file /tmp/test.log should be generated.
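The same idea can be applied to the CRS entries themselves. The following is a minimal sketch, not taken from the MOS note: the function names are hypothetical, and the grep pattern assumes that the entries root.sh adds to the inittab mention init.cssd, init.crsd, and init.evmd by name, as the list above suggests.

```shell
#!/bin/sh
# Sketch only: function names and the default path are assumptions.

# Print the non-comment inittab lines that reference the CRS init scripts.
crs_inittab_entries() {
    inittab="${1:-/etc/inittab}"
    grep -E '^[^#]*init\.(cssd|crsd|evmd)' "$inittab"
}

# Print how many of the three scripts (init.cssd, init.crsd, init.evmd)
# are referenced by at least one active inittab line.
count_crs_inittab_entries() {
    crs_inittab_entries "$1" \
        | grep -oE 'init\.(cssd|crsd|evmd)' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

Called with no argument, count_crs_inittab_entries reads /etc/inittab; a result of 3 would mean that all three entries are present. If they are present but `ps -ef | grep '[i]nit\.cssd'` shows nothing, init is not spawning them. Oracle Linux 6 uses Upstart as its init, which reads /etc/inittab only for the default runlevel, so that may be what happened here; the source does not confirm this.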

 

Solution

 

-- MOS only gives a solution for AIX, as follows:

Please consult with the system administrator to fix the init issue.

Here the solution is only valid for the AIX platform:

1. Start the script install_assist (AIX GUI utility Installation Assistance)
2. Update, for example, the date, then exit install_assist properly
3. Reboot the system
After that, the daemon processes in /etc/inittab started and the CRS installation completed.

 

 

2.2 Note 2:

Clusterware Fails To Start During Root.sh -- "Failure at final check of Oracle CRS stack 10" [ID 329450.1]

Oracle Clusterware runs as root, but for some operations it needs to run as the oracle user, and it uses "su -l", which invokes the oracle user's shell login/profile script. If that shell profile script has interactive or CPU-bound operations or prompts, this may affect Clusterware operation.

 

-- This note says that the oracle user's .bash_profile contains interactive commands; removing them resolves the problem.
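To look for such commands, a profile can be scanned before running root.sh. This is only a rough sketch: the function name is hypothetical, and the list of commands (read, stty, clear, tset, select) is an illustrative guess at what counts as "interactive", not a list from the MOS note.

```shell
#!/bin/sh
# Sketch only: the command list below is an assumption, extend it as needed.

# Print, with line numbers, the lines of a profile script that begin with a
# command that usually expects a terminal or waits for input.
find_interactive_lines() {
    grep -nE '^[[:space:]]*(read|stty|clear|tset|select)([[:space:]]|$)' "$1"
}
```

For example, `find_interactive_lines ~oracle/.bash_profile` prints any suspicious lines, and prints nothing if none are found. Running `su -l oracle -c true </dev/null` as root and seeing whether it returns promptly is another quick check of the login path that Clusterware uses.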

 

Other notes:

Troubleshooting 10g or 11.1 Oracle Clusterware Root.sh Problems [ID 240001.1]

 

 

3. Problem analysis

Check the relevant logs:

[oracle@rac1 client]$ pwd

/u01/app/10.2.0/grid/log/rac1/client

[oracle@rac1 client]$ ls

clscfg_6337.log  clsc.log css.log  ocrconfig_6285.log

 

[oracle@rac1 client]$ tail -30 css.log

2012-07-12 23:23:15.565: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

2012-07-12 23:23:16.977: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

2012-07-12 23:23:18.390: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

2012-07-12 23:23:19.885: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9
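The excerpt shows the CSS client retrying its connection about every 1.4 seconds, which is consistent with the CSS daemon never having started. When the log is long, a summary is easier to read than tail output. The following is a small sketch; the function name is hypothetical, and it assumes every matching line starts with a date and a time followed by a colon, as in the excerpt above.

```shell
#!/bin/sh
# Sketch only: assumes the "date time: ..." line layout shown in css.log above.

# Count the "connect failed" lines and report the first and last timestamps.
summarize_css_failures() {
    awk '/clsssInitNative: connect failed/ {
             n++
             ts = $1 " " $2
             sub(/:$/, "", ts)        # drop the colon after the time field
             if (n == 1) first = ts
             last = ts
         }
         END {
             if (n) printf "%d failures, first %s last %s\n", n, first, last
             else   print "0 failures"
         }' "$1"
}
```

It would be called as `summarize_css_failures css.log` from the client log directory.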

 

[oracle@rac1 client]$ tail -10 clsc.log

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

2012-07-12 23:24:51.389: [default][4093163264]Terminating clsd session

2012-07-12 23:25:00.274: [default][135894784]Terminating clsd session

 

[oracle@rac1 client]$ tail clscfg_6337.log

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

2012-07-12 23:12:24.477: [  CLSCFG][1566725888]clscfg: Nodelist is [rac1 rac2 ]

 

[oracle@rac1 rac1]$ cat alertrac1.log

2012-07-12 10:06:10.703

[client(6285)]CRS-1006:The OCR location /dev/raw/raw2 is inaccessible. Details in /u01/app/10.2.0/grid/log/rac1/client/ocrconfig_6285.log.

2012-07-12 10:06:11.076

[client(6285)]CRS-1001:The OCR was formatted using version 2.

2012-07-12 10:12:24.479

[client(6337)]CRS-1801:Cluster crs configured with nodes rac1 rac2 .

 

 

-- On one node, run the following command as root to clear the information in the OCR:

[root@rac1 ~]# sh /u01/app/10.2.0/grid/install/rootdeinstall.sh

Removing contents from OCR mirror device

2560+0 records in

2560+0 records out

10485760 bytes (10 MB) copied, 2.46509 s, 4.3 MB/s

Removing contents from OCR device

2560+0 records in

2560+0 records out

10485760 bytes (10 MB) copied, 1.18886 s, 8.8 MB/s

 

 

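Running root.sh again afterwards produced the same failure.

I then tried the following:

1.     Disable the firewall

I had already turned the firewall off before the installation, so this was only a check.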

 

[root@rac1 tmp]# service iptables status

iptables: Firewall is not running.

[root@rac1 tmp]# chkconfig iptables --list

iptables        0:off  1:off   2:off   3:off  4:off   5:off   6:off

 

2.     Commented out the following file:

[root@rac1 tmp]# cat /etc/pam.d/other

#%PAM-1.0

auth    required       pam_deny.so

account required       pam_deny.so

password required       pam_deny.so

session required       pam_deny.so

 

3.     Removed the related socket files

# rm -f /usr/tmp/.oracle/*

# rm -f /tmp/.oracle/*

# rm -f /var/tmp/.oracle/*

 

Unable To Connect To Cluster Manager Ora-29701 as Network Socket Files are Removed [ID 391790.1]
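As the title of that note indicates, deleting these files can itself cause connection errors, so it may be worth listing them before removing anything. The following is a dry-run sketch with a hypothetical function name; with no arguments it assumes the three directories used in the rm commands above.

```shell
#!/bin/sh
# Sketch only: lists files, deletes nothing.

# Print the entries that "rm -f DIR/*" would match in each directory.
list_oracle_socket_files() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/tmp/.oracle /tmp/.oracle /var/tmp/.oracle
    fi
    for dir in "$@"; do
        # Skip directories that do not exist on this host.
        [ -d "$dir" ] || continue
        for f in "$dir"/*; do
            # An unmatched glob stays literal, so print only real entries.
            if [ -e "$f" ] || [ -L "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
}
```

Running `list_oracle_socket_files` as root shows what would be removed; an empty result means there is nothing to clean up.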

 

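After cleaning up with sh /u01/app/10.2.0/grid/install/rootdeinstall.sh and installing again, the problem persisted, so it may still be a compatibility issue.

I later switched the OS to Redhat 5.4 and the installation succeeded, so this is most likely a compatibility problem between Oracle 10g and Oracle Linux 6. On Oracle Linux 6 I have tested an Oracle 11.2.0.3 RAC installation, and it completed without problems.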

 

 

 

 

 

 

-------------------------------------------------------------------------------------------------------

All rights reserved. This article may be reposted, but only with a link to the original source; otherwise legal action may be taken.


posted @ 2012-07-21 14:44  davedba