
Running orachk to Check the Database Environment

2023-04-07 00:36  AlfredZhao

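Exadata environments are inspected with the dedicated exachk tool, while ordinary Oracle environments can run orachk to check the health of the cluster and its databases.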

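1. Checking health status with orachk

Run orachk as the root user. During the run you may be prompted several times for the root password of the other node, so have the password ready and enter it correctly: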

[root@db01rac1 ~]# orachk
This version of AHF is older than 180 days and you should upgrade AHF using ahfctl upgrade.
 
Clusterware stack is running from /u01/app/19.3.0/grid. Is this the correct Clusterware Home?[y/n][y] 
root@db01rac2's password: 
root@db01rac2's password: 

Searching for running databases . . . . .

.  .  
List of running databases registered in OCR

1. demorac
2. None of above

Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1]. 
.  .  .  .  .  .  


Either Cluster Verification Utility pack (cvupack) does not exist at /opt/oracle.ahf/common/cvu or it is an old or invalid cvupack

Checking Cluster Verification Utility (CVU) version at CRS Home - /u01/app/19.3.0/grid

This version of Cluster Verification Utility (CVU) was released on 10-Jul-2022 and it is older than 180 days. It is highly recommended that you download the latest version of CVU from MOS patch 30839369 to ensure the highest level of accuracy of the data contained within the report

Do you want to download latest version of Cluster Verification Utility (CVU) from my oracle support? [y/n] [y] n

Running older version of Cluster Verification Utility (CVU) from CRS Home - /u01/app/19.3.0/grid


Starting to run orachk in background on db01rac2 using socket
root@db01rac2's password: 
root@db01rac2's password: 
This version of AHF is older than 180 days and you should upgrade AHF using ahfctl upgrade.
 
.  
.  .  .  .
.  .  

Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS on db01rac1

.  .  . . . .  
.  .  . . . .  .  .  .  .  .  .  .  
-------------------------------------------------------------------------------------------------------
                                                 Oracle Stack Status                          
-------------------------------------------------------------------------------------------------------
  Host Name       CRS Installed  RDBMS Installed    CRS UP    ASM UP  RDBMS UP    DB Instance Name
-------------------------------------------------------------------------------------------------------
   db01rac1                   Yes          Yes          Yes      Yes      Yes               jydb1
-------------------------------------------------------------------------------------------------------
. 
.  .  .  .  .  .  


. 
. 
. 
.  

. 



*** Checking Best Practice Recommendations ( Pass / Warning / Fail ) ***


.  

Collections and audit checks log file is 
/u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130/log/orachk.log


============================================================
               Node name - db01rac1
============================================================
. . . . . . 


 Collecting - ASM Disk Groups
 Collecting - ASM Disk I/O stats
 Collecting - ASM Diskgroup Attributes
 Collecting - ASM disk partnership imbalance
 Collecting - ASM diskgroup attributes
 Collecting - ASM diskgroup usable free space
 Collecting - ASM initialization parameters
 Collecting - Database Parameters for demorac database
 Collecting - Files not opened by ASM
 Collecting - List of active logon and logoff triggers for demorac database
 Collecting - Percentage of asm disk  Imbalance
 Collecting - Testing
 Collecting - /proc/cmdline
 Collecting - /proc/modules
 Collecting - CPU Information
 Collecting - CRS active version
 Collecting - CRS oifcfg
 Collecting - CRS software version
 Collecting - CSS Reboot time
 Collecting - Cluster interconnect (clusterware)
 Collecting - Clusterware OCR healthcheck
 Collecting - Clusterware Resource Status
 Collecting - Disk I/O Scheduler on Linux
 Collecting - DiskFree Information
 Collecting - DiskMount Information
 Collecting - Huge pages configuration
 Collecting - Interconnect network card speed
 Collecting - Kernel parameters
 Collecting - Linux module config.
 Collecting - Maximum number of semaphore sets on system
 Collecting - Maximum number of semaphores on system
 Collecting - Maximum number of semaphores per semaphore set
 Collecting - Memory Information
 Collecting - NUMA Configuration
 Collecting - Network Interface Configuration
 Collecting - Network Performance
 Collecting - Network Service Switch
 Collecting - OS Packages
 Collecting - OS version
 Collecting - Operating system release information and kernel version
 Collecting - Oracle executable attributes
 Collecting - Patches for Grid Infrastructure
 Collecting - Patches for RDBMS Home
 Collecting - RDBMS and GRID software owner UID across cluster
 Collecting - Shared memory segments
 Collecting - Table of file system defaults
 Collecting - Voting disks (clusterware)
 Collecting - number of semaphore operations per semop system call
 Collecting - CHMAnalyzer to report potential Operating system resources usage
 Collecting - CRS Opatch version
 Collecting - CRS user time zone check
 Collecting - Custom rc init scripts (rc.local)
 Collecting - Disk Information
 Collecting - Grid Infastructure user shell limits configuration
 Collecting - Interconnect interface config
 Collecting - Network interface stats
 Collecting - Root user limits
 Collecting - Verify ORAchk scheduler configuration
 Collecting - Verify TCP Selective Acknowledgement is enabled
 Collecting - Verify no database server kernel out of memory errors
 Collecting - Verify the vm.min_free_kbytes configuration
 Collecting - root time zone check
 Collecting - slabinfo
 Collecting - umask setting for GI owner


Data collections completed. Checking best practices on db01rac1.
------------------------------------------------------------



 INFO =>     Important Automatic Storage Management (ASM) Notes and Technical White Papers
 INFO =>     Oracle Data Pump Best practices.
 WARNING =>  Linux swap configuration does not meet recommendation
 INFO =>     Most recent ADR incidents for /u01/app/oracle/product/19.3.0/db_1
 INFO =>     Oracle GoldenGate failure prevention best practices
 WARNING =>  OCR and OCR backup locations are the same path
 FAIL =>     The vm.min_free_kbytes configuration is not set as recommended
 CRITICAL => The RMAN snapshot control file location is not shared on all database nodes in the cluster for demorac
 INFO =>     $CRS_HOME/log/hostname/client directory has too many older log files.
 CRITICAL => ORAchk scheduler is not configured correctly
 WARNING =>  Package compat-libstdc++-33-3.2.3-61-x86_64 is recommended but not installed
 INFO =>     Important Storage Minimum Requirements for Grid & Database Homes
 CRITICAL => Operating system hugepages count does not satisfy total SGA requirements
 WARNING =>  NIC bonding is not configured for interconnect
 WARNING =>  NIC bonding is NOT configured for public network (VIP)
 WARNING =>  RAC interconnect network card speed does not meet recommendation
 INFO =>     Cluster health analyzer (CHA)  is not configured as recommended
 FAIL =>     system service rngd is not running
 WARNING =>  OSWatcher is not running as is recommended.
 INFO =>     Jumbo frames (MTU >= 9000) are not configured for interconnect
 WARNING =>  NTP is not running with correct setting
 WARNING =>  All disk groups should have compatible.rdbms attribute set to recommended values
 WARNING =>  All disk groups should have compatible.advm attribute set to recommended values
 FAIL =>     Database parameter DB_LOST_WRITE_PROTECT is not set to recommended value on jydb1 instance
 FAIL =>     Database parameter DB_BLOCK_CHECKING on STANDBY is NOT set to the recommended value. for demorac
 FAIL =>     Flashback on STANDBY is not configured for demorac
 INFO =>     Operational Best Practices
 INFO =>     Database Consolidation Best Practices
 INFO =>     Computer failure prevention best practices
 INFO =>     Data corruption prevention best practices
 INFO =>     Logical corruption prevention best practices
 INFO =>     Database/Cluster/Site failure prevention best practices
 INFO =>     Client failover operational best practices
 WARNING =>  fast_start_mttr_target should be greater than or equal to 300 on jydb1 instance
 FAIL =>     Standby redo logs should be configured on the standby for demorac
 WARNING =>  Oracle patch 26749785 is not applied on RDBMS_HOME /u01/app/oracle/product/19.3.0/db_1
 WARNING =>  Oracle patch 29259068 is not applied on RDBMS_HOME /u01/app/oracle/product/19.3.0/db_1
 INFO =>     Information about hanganalyze and systemstate dump
 FAIL =>     Database control files are not configured as recommended for demorac
 WARNING =>  Oracle patch 28907129 is not applied on RDBMS_HOME /u01/app/oracle/product/19.3.0/db_1
 INFO =>     While initialization parameter LOG_ARCHIVE_CONFIG is set it should be verified for your environment on Standby Database for demorac
 WARNING =>  Redo log files should be appropriately sized for demorac
 INFO =>     Database failure prevention best practices
 WARNING =>  Perl Patch 33912872 is not found in 19c RDBMS_HOME. /u01/app/oracle/product/19.3.0/db_1
 WARNING =>  Perl Patch 33912872 is not found in 19c CRS_HOME. /u01/app/19.3.0/grid
 WARNING =>  Oracle patch 32043701 is not applied on RDBMS_HOME /u01/app/oracle/product/19.3.0/db_1
 WARNING =>  Oracle patch 31211220 is not applied on RDBMS_HOME /u01/app/oracle/product/19.3.0/db_1
 WARNING =>  TFA Collector is either not installed or not running
 CRITICAL => Linux transparent huge pages are enabled
 FAIL =>     Listener(s) running under GI Home are not healthy
 FAIL =>     FRA space management problem file types are present without an RMAN backup completion within the last 7 days for demorac
 INFO =>     Oracle recovery manager(rman) best practices
 INFO =>     Database feature usage statistics for demorac
 WARNING =>  Linux Disk I/O Scheduler should be configured to Deadline


------------------------------------------------------------
                      CLUSTERWIDE CHECKS
------------------------------------------------------------

------------------------------------------------------------
Detailed report (html) -  /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130/orachk_db01rac1_demorac_040623_234130.html
root@db01rac2's password: 
root@db01rac2's password: 





UPLOAD [if required] - /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130.zip
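
With this many findings, a per-severity tally helps before opening the full report. The following is a minimal sketch, not part of orachk itself: it assumes the console output above was saved to a text file (the function name `summarize_findings` and the log path are hypothetical) and that finding lines keep the `SEVERITY => message` layout shown above.

```shell
# Count orachk findings by severity from a saved copy of the console output.
# Assumption: each finding line starts with INFO, WARNING, FAIL or CRITICAL
# followed by "=>", as in the transcript above.
summarize_findings() {
    # $1: path to the saved console output (hypothetical file)
    awk '$2 == "=>" && $1 ~ /^(INFO|WARNING|FAIL|CRITICAL)$/ { count[$1]++ }
         END { for (s in count) print s, count[s] }' "$1" | sort
}
```

Calling `summarize_findings /tmp/orachk_console.log` prints one line per severity with its count, sorted by severity name.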

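2. Copying the result files

When orachk finishes, its final lines tell you which files it generated for further analysis.

As shown below, the full zip archive is over 100 MB (111M). If transferring large files out of the customer environment is difficult, you can copy just the HTML report instead (30M here, and under 7M once compressed):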
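
If the console output was saved to a file, the two result paths can be pulled out of it rather than copied by hand. This is only a sketch: the function name `result_paths` and the saved log file are assumptions, and the patterns rely on the `Detailed report (html) -` and `UPLOAD [if required] -` prefixes appearing exactly as in the run above.

```shell
# Print the HTML report path and the zip path reported at the end of a run.
# Assumption: the two lines keep the exact prefixes seen in the transcript.
result_paths() {
    # $1: path to the saved console output (hypothetical file)
    sed -n -e 's/^Detailed report (html) - *//p' \
           -e 's/^UPLOAD \[if required\] - *//p' "$1"
}
```

The function prints the HTML report path first and the zip path second, one per line.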

[root@db01rac1 ~]# ls -lrth /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130.zip
-r--r----- 1 root root 111M Apr  6 23:53 /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130.zip
[root@db01rac1 ~]# ls -lrth /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130/orachk_db01rac1_demorac_040623_234130.html
-rw-r----- 1 root dba 30M Apr  6 23:53 /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130/orachk_db01rac1_demorac_040623_234130.html
[root@db01rac1 ~]# 
[root@db01rac1 ~]# cp /u01/app/grid/oracle.ahf/data/db01rac1/orachk/user_root/output/orachk_db01rac1_demorac_040623_234130/orachk_db01rac1_demorac_040623_234130.html /tmp/
[root@db01rac1 ~]# cd /tmp
[root@db01rac1 tmp]# ls -lrth orachk_db01rac1_demorac_040623_234130.html
-rw-r-----  1 root   root     30M Apr  7 00:07 orachk_db01rac1_demorac_040623_234130.html
[root@db01rac1 tmp]# tar zcvf orachk_db01rac1_demorac_040623_234130.tar.gz orachk_db01rac1_demorac_040623_234130.html 
orachk_db01rac1_demorac_040623_234130.html
[root@db01rac1 tmp]# ls -lrth orachk_db01rac1_demorac_040623_234130*
-rw-r----- 1 root root  30M Apr  7 00:07 orachk_db01rac1_demorac_040623_234130.html
-rw-r--r-- 1 root root 6.5M Apr  7 00:08 orachk_db01rac1_demorac_040623_234130.tar.gz
[root@db01rac1 tmp]# cp orachk_db01rac1_demorac_040623_234130.tar.gz /public/
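
The copy-and-compress steps above can be wrapped in a small helper so the long file name is typed only once. This is a sketch under stated assumptions: the function name `pack_report` is hypothetical, the report path is whatever your own run printed, and the staging directory must differ from the directory holding the report.

```shell
# Copy one orachk HTML report to a staging directory and compress it there.
# Nothing here is orachk-specific; both paths are supplied by the caller.
pack_report() {
    report=$1      # full path to the .html report
    dest=$2        # staging directory, e.g. /tmp
    name=$(basename "$report" .html)
    cp "$report" "$dest/" || return 1
    # -C stores only the file name in the archive, not the full path
    tar -zcf "$dest/$name.tar.gz" -C "$dest" "$name.html" || return 1
    # Show both sizes so the savings from compression are visible
    ls -lh "$dest/$name.html" "$dest/$name.tar.gz"
}
```

For example, `pack_report /path/to/orachk_report.html /tmp` leaves both the copied report and a `.tar.gz` of it in `/tmp`, ready to be moved to a shared location.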

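3. Analyzing the generated HTML report

The report looks like the following. The actual report is long and the HTML layout is clear, so only a few text fragments are excerpted here as an example: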

Oracle RAC Assessment Report
System Health Score is 88 out of 100 (detail)

OS Check	Linux transparent huge pages are enabled	All Database Servers	

OS Check	Operating system hugepages count does not satisfy total SGA requirements	All Database Servers	

Database Check	The RMAN snapshot control file location is not shared on all database nodes in the cluster	All Databases View
...
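
When comparing several runs, the health score can be read from each report without opening a browser. The sketch below assumes the phrase `System Health Score is N out of M` appears in the file as uninterrupted text, as in the excerpt above. If the real HTML wraps the numbers in tags, the pattern would need adjusting. The function name `health_score` is hypothetical.

```shell
# Print "<score> <max>" from the first health score line in a report file.
# Assumption: the phrase is plain text on one line, as in the excerpt above.
health_score() {
    # $1: path to the report (or to a text copy of it)
    grep -Eo 'System Health Score is [0-9]+ out of [0-9]+' "$1" \
        | head -n 1 \
        | awk '{ print $5, $8 }'
}
```

The function prints the score and the maximum separated by a space, and prints nothing if the phrase is not found.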