TFA: Collecting and Analyzing Logs

Download

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=272133523880062&id=1513912.1&_afrWindowMode=0&_adf.ctrl-state=fghvcgapa_617a 

Installation

[root@rhel75 ~]# ./ahf_setup 

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_59936_2020_06_04-02_25_33.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

Default AHF Location : /opt/oracle.ahf

Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N : Y

AHF Location : /opt/oracle.ahf

AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.

Please Enter AHF Data Directory : /opt

AHF Data Directory : /opt/oracle.ahf/data

Do you want to add AHF Notification Email IDs ? [Y]|N : N

Extracting AHF to /opt/oracle.ahf

Configuring TFA Services

Discovering Nodes and Oracle Resources
Successfully generated certificates. 
 
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.----------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port  | Version    | Build ID             |
+--------+---------------+-------+-------+------------+----------------------+
| rhel75 | RUNNING       | 60857 | 13274 | 20.1.3.0.0 | 20130020200429095054 |
'--------+---------------+-------+-------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.----------------------------------------------------.
|            Summary of AHF Configuration            |
+-----------------+----------------------------------+
| Parameter       | Value                            |
+-----------------+----------------------------------+
| AHF Location    | /opt/oracle.ahf                  |
| TFA Location    | /opt/oracle.ahf/tfa              |
| Orachk Location | /opt/oracle.ahf/orachk           |
| Data Directory  | /opt/oracle.ahf/data             |
| Repository      | /opt/oracle.ahf/data/repository  |
| Diag Directory  | /opt/oracle.ahf/data/rhel75/diag |
'-----------------+----------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /opt/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_59936_2020_06_04-02_25_33.log to /opt/oracle.ahf/data/rhel75/diag/ahf/

[root@rhel75 ~]#  
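
The installer states that the AHF data directory needs at least 5GB of free space (10GB recommended). A minimal pre-check sketch, assuming a POSIX shell with df and awk available; the function name has_free_space is made up for illustration and is not part of AHF:

```shell
# Succeeds when the filesystem holding DIR has at least MIN_GB gigabytes available.
# has_free_space is a hypothetical helper, not an AHF command.
has_free_space() {
    dir=$1
    min_gb=$2
    # df -Pk prints one POSIX-format line per filesystem in 1 KiB blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# /opt is the parent directory entered at the installer prompt above.
if has_free_space /opt 5; then
    echo "/opt has at least 5 GB free"
else
    echo "/opt has less than 5 GB free" >&2
fi
```

Running this before ./ahf_setup lets you pick a different data directory up front instead of finding out during installation.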

--- Check the process
[root@rhel75 ~]# ps -ef |grep -i ahf
root      60857      1 11 02:26 ?        00:00:26 /opt/oracle.ahf/jre/bin/java -server -Xms32m -Xmx64m -Djava.awt.headless=true -Ddisable.checkForUpdate=true -XX:HeapDumpPath=/opt/oracle.ahf/data/rhel75/diag/tfa oracle.rat.tfa.TFAMain /opt/oracle.ahf/tfa
root      63344  55078  0 02:30 pts/8    00:00:00 grep --color=auto -i ahf
[root@rhel75 ~]# 

Starting and stopping

[root@vm01 soft]# tfactl
tfactl> start
Oracle Trace File Analyzer (TFA) is already running
[root@vm01 soft]# tfactl
tfactl> stop
Stopping TFA from the Command Line
Stopped OSWatcher
Nothing to do !
Please wait while TFA stops
Please wait while TFA stops
TFA-00002 Oracle Trace File Analyzer (TFA) is not running
TFA Stopped Successfully
Telemetry adapter is not running
Successfully stopped TFA..

Usage

Collect only database-related logs

[root@vm01 soft]# tfactl
tfactl> diagcollect -database orc

TFA will collect diagnostics for the last 1 hour(s).
Please enter the time of the incident [YYYY-MM-DD HH24:MI:SS], or <RETURN> to collect for the last 1 hour(s). (Q|q to Quit):


Collecting data for the last 1 hours for this component ...

Collecting data for all nodes

TFA is using system timezone for collection, All times shown in EDT.

Collection Id : 20230419045638vm01

Detailed Logging at : /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all/diagcollect_20230419045638_vm01.log

Waiting up to 120 seconds for collection to start
2023/04/19 04:56:48 EDT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2023/04/19 04:56:48 EDT : Collection Name : tfa_Wed_Apr_19_04_56_40_EDT_2023.zip
2023/04/19 04:56:48 EDT : Collecting diagnostics from hosts : [vm01]
2023/04/19 04:56:49 EDT : Getting list of files satisfying time range [04/19/2023 03:56:48 EDT, 04/19/2023 04:56:49 EDT]
2023/04/19 04:56:49 EDT : Collecting Additional Diagnostic Information...
2023/04/19 04:56:51 EDT : Collecting ADR incident files...
2023/04/19 04:56:58 EDT : Executing SQL Script db_feature_usage.sql on orc with timeout of 600 seconds...
2023/04/19 04:56:59 EDT : Completed Collection of Additional Diagnostic Information...
2023/04/19 04:57:01 EDT : Completed Local Collection
2023/04/19 04:57:01 EDT : Not Redacting this Collection ...
2023/04/19 04:57:01 EDT : Completed collection of zip files.

.---------------------------------.
|        Collection Summary       |
+------+-----------+-------+------+
| Host | Status    | Size  | Time |
+------+-----------+-------+------+
| vm01 | Completed | 143kB |  13s |
'------+-----------+-------+------'

Logs are being collected to: /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all
/opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all/vm01.tfa_Wed_Apr_19_04_56_40_EDT_2023.zip
tfactl> 

Collect all trace logs for a specific date

tfactl> diagcollect -for Apr/19/2023

Collecting data for all nodes

TFA is using system timezone for collection, All times shown in EDT.
Scanning files for Apr/19/2023

Collection Id : 20230419050047vm01

Detailed Logging at : /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_05_00_51_EDT_2023_node_all/diagcollect_20230419050047_vm01.log

Waiting up to 120 seconds for collection to start
2023/04/19 05:00:58 EDT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2023/04/19 05:00:58 EDT : Collection Name : tfa_Wed_Apr_19_05_00_50_EDT_2023.zip
2023/04/19 05:00:58 EDT : Collecting diagnostics from hosts : [vm01]
2023/04/19 05:00:59 EDT : Scanning of files for Collection in progress...
2023/04/19 05:00:59 EDT : Collecting Additional Diagnostic Information...
2023/04/19 05:01:12 EDT : Executing Collection for ASM with timeout of 1800 seconds...
2023/04/19 05:01:12 EDT : Executing Collection for AFD with timeout of 1860 seconds...
2023/04/19 05:01:12 EDT : Executing Collection for CRS with timeout of 1920 seconds...
2023/04/19 05:01:13 EDT : Executing Collection for ACFS with timeout of 1980 seconds...
2023/04/19 05:01:13 EDT : Executing Collection for OS with timeout of 2040 seconds...
2023/04/19 05:01:14 EDT : Getting list of files satisfying time range [04/19/2023 00:00:00 EDT, 04/19/2023 05:00:59 EDT]
2023/04/19 05:01:16 EDT : Collecting ADR incident files...
2023/04/19 05:01:16 EDT : Executing Collection for SOSREPORT with timeout of 2100 seconds...
2023/04/19 05:02:45 EDT : Executing Collection for QOS with timeout of 2160 seconds...
2023/04/19 05:02:46 EDT : Executing Collection for DATAGUARD with timeout of 2220 seconds...
2023/04/19 05:02:46 EDT : Executing Collection for SYSLENS with timeout of 2280 seconds...
2023/04/19 05:02:49 EDT : Completed Collection of Additional Diagnostic Information...
2023/04/19 05:02:51 EDT : Completed Local Collection
2023/04/19 05:02:51 EDT : Not Redacting this Collection ...
2023/04/19 05:02:51 EDT : Completed collection of zip files.

.--------------------------------.
|       Collection Summary       |
+------+-----------+------+------+
| Host | Status    | Size | Time |
+------+-----------+------+------+
| vm01 | Completed | 11MB | 113s |
'------+-----------+------+------'

Logs are being collected to: /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_05_00_51_EDT_2023_node_all
/opt/oracle.ahf/data/repository/collection_Wed_Apr_19_05_00_51_EDT_2023_node_all/vm01.tfa_Wed_Apr_19_05_00_50_EDT_2023.zip

You can also target the database or the cluster individually; both forms are commonly used.

[root@rac19cn1 diag]# ls
asm  clients  crs  rdbms  tnslsnr
// Listener logs and cluster logs are also collected by default


*********** Collect only the database trace logs for 2020-11-02 ***********:
tfactl> diagcollect -database ora19c -for Nov/2/2020

*********** Collect only the cluster logs for 2020-11-02 ***********:
tfactl> diagcollect -crs -for Nov/2/2020
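
The -for option in the examples above takes a date written as Mon/D/YYYY (for example Nov/2/2020). As a convenience, here is a small sketch that converts an ISO date to that form; to_tfa_date is a hypothetical helper written for this post, not a tfactl feature:

```shell
# Convert YYYY-MM-DD to the Mon/D/YYYY form used with -for above.
# to_tfa_date is a hypothetical helper, not part of tfactl.
to_tfa_date() {
    # Accept only the YYYY-MM-DD shape.
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "expected YYYY-MM-DD, got: $1" >&2; return 1 ;;
    esac
    year=${1%%-*}
    rest=${1#*-}
    month=${rest%%-*}
    day=${rest#*-}
    case "$month" in
        01) mon=Jan ;; 02) mon=Feb ;; 03) mon=Mar ;; 04) mon=Apr ;;
        05) mon=May ;; 06) mon=Jun ;; 07) mon=Jul ;; 08) mon=Aug ;;
        09) mon=Sep ;; 10) mon=Oct ;; 11) mon=Nov ;; 12) mon=Dec ;;
        *) echo "invalid month: $month" >&2; return 1 ;;
    esac
    # Drop one leading zero so that 02 becomes 2.
    day=${day#0}
    if [ "$day" -lt 1 ] || [ "$day" -gt 31 ]; then
        echo "invalid day: $day" >&2
        return 1
    fi
    echo "$mon/$day/$year"
}

to_tfa_date 2020-11-02   # prints Nov/2/2020
```

The helper only checks the shape of the date (it does not know how many days each month has), so treat it as a typing aid rather than a validator.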

Collect database logs for a specific time range:

tfactl> diagcollect -database ora19c -from "2020-11-02 18:00:00" -to "2020-11-03 08:00:00"

Collecting data for all nodes
Scanning files from nov/02/2020 18:00:00 to nov/03/2020 08:00:00
Collection Id : 20201106090421rac19cn1
Detailed Logging at : /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/diagcollect_20201106090421_rac19cn1.log
2020/11/06 09:04:31 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/11/06 09:04:31 CST : Collection Name : tfa_Fri_Nov_06_09_04_21_CST_2020.zip
2020/11/06 09:04:32 CST : Collecting diagnostics from hosts : [rac19cn1, rac19cn2]
2020/11/06 09:04:32 CST : Scanning of files for Collection in progress...
2020/11/06 09:04:32 CST : Collecting additional diagnostic information...
2020/11/06 09:04:57 CST : Getting list of files satisfying time range [11/02/2020 18:00:00 CST, 11/03/2020 08:00:00 CST]
2020/11/06 09:05:47 CST : Completed collection of additional diagnostic information...
2020/11/06 09:12:57 CST : Collecting ADR incident files...
2020/11/06 09:12:57 CST : Completed Local Collection
2020/11/06 09:12:58 CST : Remote Collection in Progress...
.-------------------------------------.
|          Collection Summary         |
+----------+-----------+-------+------+
| Host     | Status    | Size  | Time |
+----------+-----------+-------+------+
| rac19cn2 | Completed | 10kB  |  83s |
| rac19cn1 | Completed | 185kB | 505s |
'----------+-----------+-------+------'

Logs are being collected to: /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/rac19cn1.tfa_Fri_Nov_06_09_04_21_CST_2020.zip
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/rac19cn2.tfa_Fri_Nov_06_09_04_21_CST_2020.zip
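
Because each timestamp contains a space, it has to be wrapped in plain ASCII double quotes when the command is run from a shell; curly quotes pasted from a web page are not treated as quotes by the shell. The sketch below only prints the command for review rather than running it. print_diagcollect_range is a hypothetical helper name, and it assumes the YYYY-MM-DD HH:MM:SS format used in this example:

```shell
# Print (do not run) a diagcollect command for one database and a time window.
# print_diagcollect_range is a hypothetical helper, not part of tfactl.
print_diagcollect_range() {
    db=$1
    from=$2
    to=$3
    for ts in "$from" "$to"; do
        # Reject anything that is not exactly YYYY-MM-DD HH:MM:SS,
        # including timestamps that still carry curly quotes.
        case "$ts" in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\ [0-9][0-9]:[0-9][0-9]:[0-9][0-9]) ;;
            *) echo "expected YYYY-MM-DD HH:MM:SS, got: $ts" >&2; return 1 ;;
        esac
    done
    printf 'tfactl diagcollect -database %s -from "%s" -to "%s"\n' "$db" "$from" "$to"
}

print_diagcollect_range ora19c "2020-11-02 18:00:00" "2020-11-03 08:00:00"
```

Printing the command first makes it easy to spot a malformed timestamp before starting a collection that, as the summary above shows, can take several minutes per node.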

*********** Collect database logs from the last hour ***********:
tfactl> diagcollect -database ora19c -since 1h

Collect database logs from a specific node:

tfactl> diagcollect -database ora19c -node rac19cn1 -for Nov/2/2020

Log analysis

Oracle provides the tfactl analyze command to help analyze the database's current trace files.

Common usage:
tfactl> analyze -search "ORA-" -since 1d
INFO: analyzing all (Alert and Unix System Logs) logs for the last 1440 minutes...  Please wait...
INFO: analyzing host: rac19cn1
                    Report title: Analysis of Alert,System Logs
               Report date range: last ~1 day(s)
      Report (default) time zone: CST - China Standard Time
             Analysis started at: 06-Nov-2020 09:28:08 AM CST
           Elapsed analysis time: 15 second(s).
              Configuration file: /oracle/grid/crs_1/tfa/rac19cn1/tfa_home/ext/tnt/conf/tnt.prop
             Configuration group: all
                       Parameter: ORA-
             Total message count:         24,938, from 25-Aug-2020 06:22:19 PM CST to 06-Nov-2020 09:20:01 AM CST
Messages matching last ~1 day(s):          2,180, from 05-Nov-2020 09:30:01 AM CST to 06-Nov-2020 09:20:01 AM CST
                  Matching regex: ORA-
                  Case sensitive: false
                     Match count: 0

INFO: analyzing all (Alert and Unix System Logs) logs for the last 1440 minutes...  Please wait...
INFO: analyzing host: rac19cn2

                    Report title: Analysis of Alert,System Logs
               Report date range: last ~1 day(s)
      Report (default) time zone: CST - China Standard Time
             Analysis started at: 06-Nov-2020 09:28:26 AM CST
           Elapsed analysis time: 8 second(s).
              Configuration file: /oracle/grid/crs_1/tfa/rac19cn2/tfa_home/ext/tnt/conf/tnt.prop
             Configuration group: all
                       Parameter: ORA-
             Total message count:         35,060, from 25-Aug-2020 06:30:24 PM CST to 06-Nov-2020 09:20:02 AM CST
Messages matching last ~1 day(s):          4,940, from 05-Nov-2020 09:30:01 AM CST to 06-Nov-2020 09:20:02 AM CST
                  Matching regex: ORA-
                  Case sensitive: false
                     Match count: 0
The command above analyzes all of the logs: os, db, asm, crs, and so on.

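Analyze only the database instance logs from the last two days
tfactl> analyze -comp db -since 2d
The -comp parameter selects the component to analyze: os, db, asm, acfs, crs, or all. The default is all, which analyzes every component.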

Other operations

List the users currently allowed to use tfactl
tfactl> access lsusers                                                                  
.---------------------------------.
|      TFA Users in rac19cn1      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
'-----------+-----------+---------'
.---------------------------------.
|      TFA Users in rac19cn2      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
'-----------+-----------+---------'

By default, TFA grants access only to the root and grid users
[oracle@rac19cn1 bin]$ ./tfactl 
TFA-00519 Oracle Trace File Analyzer (TFA) is not installed.
// Running tfactl as the oracle user reports that TFA is not installed

Grant the oracle user access to TFA
[root@rac19cn1 bin]# tfactl access add -user oracle
Successfully added 'oracle' to TFA Access list.
.---------------------------------.
|      TFA Users in rac19cn1      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
| oracle    | USER      | Allowed |
'-----------+-----------+---------'
.---------------------------------.
|      TFA Users in rac19cn2      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
| oracle    | USER      | Allowed |
'-----------+-----------+---------'
  

Check the current status of the hosts
tfactl> print status                                                                                        
.-----------------------------------------------------------------------------------------------.
| Host     | Status of TFA | PID  | Port | Version    | Build ID             | Inventory Status |
+----------+---------------+------+------+------------+----------------------+------------------+
| rac19cn1 | RUNNING       | 9127 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE         |
| rac19cn2 | RUNNING       | 1848 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE         |
'----------+---------------+------+------+------------+----------------------+------------------'

Addendum: see the official AHF documentation.

posted @ 2023-04-19 17:12  蚌壳里夜有多长