bay——巡检RAC操作.txt

 

Oracle Cluster Software 包括下列组件:
Event Management (EVM)
Cluster Synchronization Services (CSS)
Cluster Ready Service (CRS)

CRS、CSS、EVM、RACG 这些都是clusterware下的服务,真正对应的进程名称为:CRSD、OCSSD、WVMD、RACGIMON

CRS服务:负责监控集群中资源,负责集群的高可用性;
CSS服务:维护节点集群中,节点的关系,维护管理配置信息的;
EVM服务: 当CRS检查到某节点存在异常时,会产生一个事件。而EVM就负责把这个事件发布出来用的;
RACG服务:负责监控数据库健康状态的,负责服务的启动、停止;


crsctl 应该是cluster control 用的,它管理的是cluster level的内容,如crsd cssd css 等进程和配置的管理
srvctl 是资源的管理,是cluster内资源的管理,比如instance asm listener vip ons gsd??service .....

 


db_create_file_dest string +ASMDG
standby_archive_dest string ?/dbs/arch
diagnostic_dest string /oracle/db
log_archive_dest string
log_archive_dest_1 string LOCATION=+ASMDG/ARCLOG/ARC1
SQL> show SGA

Total System Global Area 5344731136 bytes
Fixed Size 2262656 bytes
Variable Size 1325402496 bytes
Database Buffers 3992977408 bytes
Redo Buffers 24088576 bytes
SQL> show parameter pga;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 98222M

 


-查找超过800M大小文件,并显示查找出来文件的具体大小,可以使用下面命令

find . -type f -size +800M -print0 | xargs -0 du -h

--查看当前目录下每个文件夹的大小,并以倒叙排列

du --time -ah * |sort -nr

4.7G ./u01/app/11.2.0/grid/log/diag/tnslsnr/rac1/listener_scan1/trace/listener_scan1.log
6.0G ./u01/app/11.2.0/grid/crf/db/rac1/crfclust.bdb
3.8G ./u01/app/grid/diag/tnslsnr/rac1/listener/trace/listener.log

 

再次查看监听状态,就可以看到服务了:
[grid@www.cndba.cn ~]$ lsnrctl status LISTENER
[grid@node1 admin]$ ps -ef|grep lsnr

注意:集群中listener_scan1.log清理方式有所不同
lsnrctl命令行模式
set current_listener LISTENER
set log_status off
cat /dev/null > listener_scan1.log
set log_status on
cat listener_scan1.log

清理日志:
cat /dev/null > /oracle/grid/diag/tnslsnr/rac2/listener/alert/log.xml
/oracle/asm/log/diag/tnslsnr/rac2/listener_scan1/trace/listener_scan1.log

查看数据库:--------------------------------------

[oracle@rac2 ~]$ sqlplus / as sysdba

select instance_name,status from v$instance;
select instance_name,host_name,status from gv$instance;

show parameter cluster -->查看集群的参数,cluster_database为true表示为集群数据库,否则,非集群数据库
SELECT a.NAME,a.DATABASE_ROLE,a.OPEN_MODE,a.LOG_MODE FROM V$DATABASE a;

查看数据库状态
[oracle@rac2 ~]$srvctl status database -d dba
[oracle@rac2 ~]$srvctl status database -d dba -v

log_archive_dest_1 string LOCATION=+ASMDG/ARCLOG/ARC1


[root@rac1 admin]# su - oracle
[oracle@rac1 ~]$ crontab -l
0 22 * * * /home/oracle/expdp.sh > /dev/null 2>&1
0 23 * * * /home/oracle/incr0.sh > /dev/null 2>&1
0 */2 * * * /home/oracle/incr1.sh > /dev/null 2>&1
30 23 * * * /home/oracle/delete.sh > /dev/null 2>&1


exit

 

在RAC环境中,需要切换到grid用户来查询监听日志
su - grid
ps -ef | grep tns
lsnrctl


LSNRCTL> show log_file
LSNRCTL> status


[oracle@rac1 ~]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#public IP
10.100.2.10 rac1
10.100.2.20 rac2
#private IP
10.100.3.10 rac1-p
10.100.3.20 rac2-p
#virtual IP
10.100.2.110 rac1-v
10.100.2.120 rac2-v
#scan ip
10.100.2.100 rac-scan

#Backup
10.10.10.11 BAKServer01
172.16.1.10 WebServer01
172.16.1.11 WebServer02


[oracle@rac1 admin]$ cat tnsnames.ora
# tnsnames.ora Network Configuration File: /u01/app/oracle/product/11.2.0/dbhome_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

DBSERVER =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = DBserver)
)
)

 

[grid@rac1 ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
export ORACLE_BASE=/oracle/grid
export ORACLE_HOME=/oracle/asm
export ORACLE_SID=+ASM1
export PATH=$PATH:$ORACLE_HOME/bin:$HOME/bin
export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/Apache/Apache/bin:$PATH
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:/usr/local/lib
export LD_LIBRARY_PATH
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib
export CLASSPATH


参看集群状态


使用grid用户登陆oracle控制台

检查CRS是否正常启动
注:若命令不能运行,添加path,或直接进入命令所在目录运行。本例位于:
/u01/crs/oracle/product/11.1.0/crs/bin

Crsctl
---------------------------------------->>>>>>>>>>>>>>>>>>>>

Crsctl命令可以用来检查CRS进程栈,每个crs进程状态,管理Votedisk,跟踪CRS进程功能。

集群的资源状态: [grid@rac1 ~]$ crs_stat -t
集群的资源状态: [grid@rac1 ~]$ crsctl status res -t
集群服务状态 : [grid@rac1 ~]$ crsctl check crs

CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
–检查单个状态
[grid@rac1 ~]$ crsctl check evmd
CRS-272: This command remains for backward compatibility only
Event Manager is online
[grid@rac1 ~]$ crsctl check crsd
CRS-272: This command remains for backward compatibility only
Cluster Ready Services is online

[grid@rac2 ~]$ crsctl check crsd
CRS-272: This command remains for backward compatibility only
Cannot communicate with Cluster Ready Services

[grid@rac1 ~]$ crsctl check cssd
CRS-272: This command remains for backward compatibility only
Cluster Synchronization Services is online


仲裁盘信息:
[grid@rac1 ~]$ crsctl query css votedisk


配置crs栈是否自启动
CRS进程栈默认随着操作系统的启动而自启动,有时出于维护目的需要关闭这个特性,可以用root用户执行下面命令。
[root@rac1 bin]# ./crsctl disable crs
[root@rac1 bin]# ./crsctl enable crs
这个命令实际是修改了/etc/oracle/scls_scr/raw/root/crsstart这个文件里的内容。

–启动CRS:
[root@rac1 bin]# ./crsctl start crs

Attempting to start CRS stack
The CRS stack will be started shortly
–关闭CRS:
[root@rac1 bin]# ./crsctl stop crs


跟踪crs模块,提供辅助功能
CRS由CRS,CSS,EVM三个服务组成,每个服务又是由一系列module组成,crsctl允许对每个module进行跟踪,并把跟踪内容记录到日志中。
[root@rac1 bin]# ./crsctl lsmodules css
[root@rac1 bin]# ./crsctl lsmodules evm
–跟踪CSSD模块,需要root用户执行:
[root@rac1 bin]# ./crsctl debug log css "CSSD:1"
Configuration parameter trace is now set to 1.
Set CRSD Debug Module: CSSD Level: 1
–查看跟踪日志
[root@rac1 cssd]# pwd
/u01/app/oracle/product/crs/log/rac1/cssd
[root@rac1 cssd]# more ocssd.log


ocr命令系列
Oracle Clusterware把整个集群的配置信息放在共享存储上,这个存储就是OCR Disk.在整个集群中,只有一个节点能对OCR Disk进行读写操作,这个节点叫作Master Node,所有节点都会在内存中保留一份OCR的拷贝,同时哟一个OCR Process从这个内存中读取内容。OCR内容发生改变时,由Master Node的OCR Process负责同步到其他节点的OCR Process。
因为OCR的内容如此重要,Oracle每4个小时对其做一次备份,并且保留最后的3个备份,以及前一天,前一周的最后一个备份。这个备份由Master Node CRSD进程完成,备份的默认位置是$CRS_HOME\crs\cdata\<cluster_name>目录下。每次备份后,备份文件名自动更改,以反应备份时间顺序,最近一次的备份叫作backup00.ocr。这些备份文件除了保存在本地,DBA还应该在其他存储设备上保留一份,以防止意外的存储故障。

---------------------------------------->>>>>>>>>>>>>>>>>>>>
应用层:
应用层就是指RAC数据库了,这一层有若干资源组成,每个资源都是一个进程或者一组进程组成的完整服务,
这一层的管理和维护都是围绕这些资源进行的。有如下命令:srvctl, onsctl, crs_stat三个命令
列出当前RAC下所有的信息 :

 

=====================================================
查看所有节点实例和服务的状态

[grid@rac1 ~]$ srvctl config database
[grid@rac1 ~]$ srvctl status database -d racdb
[grid@rac1 ~]$ srvctl config database -d racdb -a

onsctl
这个命令用于管理配置ONS(Oracle Notification Service). ONS是Oracle Clusterware实现FAN Event Push模型的基础。
[root@rac1 bin]#ps -aef|grep ons

集群OCR配置情况:

[grid@rac1 ~]$ ocrcheck
[grid@rac2 ~]$ ocrconfig -showbackup --巡检过程中,对OCR自动备份进行检查:


检查节点应用: 参数-n -i -p

$ olsnodes
$ olsnodes -s列出集群中节点
$ srvctl status nodeapps
$ srvctl status nodeapps -n node1

$ srvctl stop database -d orcl (停止数据库所有实例)
$ srvctl stop nodeapps -n host1 (停节点1服务)
$ srvctl stop nodeapps -n host2 (停节点2服务)
crsctl check cluster -all 检查几圈状态


关闭数据库,全部节点都要关闭

srvctl stop database -d racdb


4.启动另外一个节点数据库

srvctl start instance -d racdb -i racdb1

.将一个节点数据库启动到mount状态5并修改数据库的归档模式并启动数据库

srvctl start instance -d racdb -i racdb2 -o mount

------------------------------------------------------------
查看监听信息

再次查看监听状态,就可以看到服务了:
[grid@rac2 ~]$ lsnrctl status LISTENER
[root@rac1 ~]# su - grid
[grid@rac1 ~]$ lsnrctl
[grid@rac1 ~]$ srvctl config listener -n rac1

看监听状态

[grid@rac1 ~]$ srvctl status instance -d DBserver -i DBserver1,DBserver2
Instance DBserver1 is running on node rac1
Instance DBserver2 is not running on node rac2


-------------------------------------------------------
>>>>>--在Oracle RAC环境下,使用grid帐号执行

运行asmcmd进入asm命令模式,如:

[grid@rac1 trace]$ pwd
/oracle/grid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log

export ORACLE_BASE=/oracle/grid
export ORACLE_HOME=/oracle/asm

/oracle/asm/bin crs_start


[oracle@racdb1 ~]$ export ORACLE_SID=+ASM1
[oracle@racdb1 ~]$ asmcmd
ASMCMD> lsct //列出当前ASM客户端的信息
ASMCMD> lsdg //列出所有磁盘组 使用情况
ASMCMD> lsof //列出数据文件信息
ASMCMD> lsdsk //列出盘的信息
ASMCMD> lsdsk --statistics -G ASMDG --查看磁盘I/O信息,datadg为磁盘组名
ASMCMD> iostat -G DATA 包括ASM磁盘列表


若要使用图形界面,可以使用ASM助手命令:asmca

[grid@oradb-node1 ~]$ sqlplus / as sysasm

Oracle11g中asm实例不能使用sys用户,使用sysasm用户:

>>>>>--挂载ASM磁盘组:
SQL> alter diskgroup archdg mount;
SQL> select name,state from v$asm_diskgroup;

>>>>>--查看磁盘组信息
SQL> select name,state,type,total_mb ,free_mb from v$asm_diskgroup;
NAME STATE TYPE TOTAL_MB FREE_MB
------------------------------ ----------- ------ ---------- ----------
ASMDG MOUNTED EXTERN 921600 822903
ASMVOTE MOUNTED EXTERN 102400 101547

col PATH for a40
set pagesize 400
select name,path from v$asm_disk_stat;

NAME PATH
---------- ---------------
VOL1 ORCL:VOL1
VOL2 ORCL:VOL2
VOL3 ORCL:VOL3
VOL4 ORCL:VOL4


>>>>>--查看磁盘信息--
col PATH for a10;
col name for a10;
col STATE for a10;
set linesize 455
select group_number,path,state,total_mb,free_mb from v$asm_disk;

GROUP_NUMBER PATH STATE TOTAL_MB FREE_MB
------------ ------------------------------ -------- ---------- ----------
2 /dev/mapper/mpathc NORMAL 921600 823137
3 /dev/mapper/mpathb NORMAL 102400 101547


检查ASM状态:-
查看磁盘

[root@rac1 ~]# ls -l /dev/sd*
[root@rac1 ~]# ls -l /dev/asm*
[root@rac1 ~]# lsblk

[grid@rac1 ~]$ srvctl status asm -a
[grid@rac1 ~]$ srvctl status asm -n node1
[grid@rac1 ~]$ srvctl config asm -n rac1
[grid@rac2 ~]$ srvctl config asm -a


查看ASM日志

[root@ATFDB1 ~]# su - grid
[grid@ATFDB1 ~]$ sqlplus / as sysasm

SQL> show parameter dump

[root@rac1 trace]# ls *.log
alert_+ASM1.log
[root@rac1 trace]# pwd
/u01/app/grid/diag/asm/+asm/+ASM1/trace


-------------------------------------------------------

rac CTSS时间同步
校验集群的时间:
cluvfy comp clocksync -verbose 检查也没同步

再次su - grid

crsctl check ctss

???NTP同步模式
开启NTP:

mv /etc/ntp.conf.bak /etc/ntp.conf

service ntpd status

/sbin/service ntpd start

# chkconfig ntpd off

ps -ef|grep ntp


节点1?:

[root@raclhr-11gR2-N1 ~]# crsctl check ctss
CRS-4700: The Cluster Time Synchronization Service is in?Observer mode.
[root@raclhr-11gR2-N1 ~]#? crsctl stat res -t -init

 

查看实例下表空间使用情况:

set pagesize 999
set linesize 999
col file_name for a50;
col TABLESPACE_NAME for a20;
SELECT a.tablespace_name "tablespace_name",
100-ROUND((NVL(b.bytes_free,0)/a.bytes_alloc)*100,2) "percent(%)",
ROUND(a.bytes_alloc/1024/1024,2) "free(M)",
ROUND((a.bytes_alloc-NVL(b.bytes_free,0))/1024/1024,2) "use(M)",
TO_CHAR(SYSDATE,'yyyy-mm-dd hh24:mi:ss') "time"
FROM (SELECT f.tablespace_name,
SUM(f.bytes) bytes_alloc,
SUM(DECODE(f.autoextensible,'YES',f.maxbytes,'NO',f.bytes)) maxbytes
FROM dba_data_files f GROUP BY tablespace_name) a,
(SELECT f.tablespace_name, SUM(f.bytes) bytes_free
FROM dba_free_space f GROUP BY tablespace_name) b
WHERE a.tablespace_name = b.tablespace_name
ORDER BY 1 ASC ;

查看数据文件使用情况:

col TABLESPACE_NAME for a20;
col FILE_NAME for a40;
select a.tablespace_name,a.FILE_NAME,bytes/1024/1024||'M' "size",a.AUTOEXTENSIBLE,a.INCREMENT_BY from dba_data_files a order by a.FILE_NAME;

 

posted @ 2019-07-02 13:42  上帝_BayaiM  阅读(266)  评论(0编辑  收藏  举报