TFA: Collecting and Analyzing Logs
Download
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=272133523880062&id=1513912.1&_afrWindowMode=0&_adf.ctrl-state=fghvcgapa_617a
Installation
[root@rhel75 ~]# ./ahf_setup
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_59936_2020_06_04-02_25_33.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 20.1.3 Build Date: 202004290950
Default AHF Location : /opt/oracle.ahf
Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N : Y
AHF Location : /opt/oracle.ahf
AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.
Please Enter AHF Data Directory : /opt
AHF Data Directory : /opt/oracle.ahf/data
Do you want to add AHF Notification Email IDs ? [Y]|N : N
Extracting AHF to /opt/oracle.ahf
Configuring TFA Services
Discovering Nodes and Oracle Resources
Successfully generated certificates.
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
.----------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port  | Version    | Build ID             |
+--------+---------------+-------+-------+------------+----------------------+
| rhel75 | RUNNING       | 60857 | 13274 | 20.1.3.0.0 | 20130020200429095054 |
'--------+---------------+-------+-------+------------+----------------------'
Running TFA Inventory...
Adding default users to TFA Access list...
.----------------------------------------------------.
|            Summary of AHF Configuration            |
+-----------------+----------------------------------+
| Parameter       | Value                            |
+-----------------+----------------------------------+
| AHF Location    | /opt/oracle.ahf                  |
| TFA Location    | /opt/oracle.ahf/tfa              |
| Orachk Location | /opt/oracle.ahf/orachk           |
| Data Directory  | /opt/oracle.ahf/data             |
| Repository      | /opt/oracle.ahf/data/repository  |
| Diag Directory  | /opt/oracle.ahf/data/rhel75/diag |
'-----------------+----------------------------------'
Starting orachk daemon from AHF ...
AHF binaries are available in /opt/oracle.ahf/bin
AHF is successfully installed
Moving /tmp/ahf_install_59936_2020_06_04-02_25_33.log to /opt/oracle.ahf/data/rhel75/diag/ahf/
[root@rhel75 ~]#

--- Check the process
[root@rhel75 ~]# ps -ef |grep -i ahf
root 60857 1 11 02:26 ? 00:00:26 /opt/oracle.ahf/jre/bin/java -server -Xms32m -Xmx64m -Djava.awt.headless=true -Ddisable.checkForUpdate=true -XX:HeapDumpPath=/opt/oracle.ahf/data/rhel75/diag/tfa oracle.rat.tfa.TFAMain /opt/oracle.ahf/tfa
root 63344 55078 0 02:30 pts/8 00:00:00 grep --color=auto -i ahf
[root@rhel75 ~]#
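The installer states that the AHF data directory needs at least 5GB of free space (10GB recommended). A minimal sketch of a check that could be run before ./ahf_setup, assuming a POSIX df is available; the function name has_free_gb is hypothetical and is not part of AHF.

```shell
# Hypothetical helper (not part of AHF): succeed if the filesystem
# holding directory $1 has at least $2 GB available.
has_free_gb() (
    dir=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] || exit 1
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
)

# Example: check /opt, the directory entered at the installer prompt.
if has_free_gb /opt 5; then
    echo "/opt has at least 5GB free"
else
    echo "/opt has less than 5GB free; pick another data directory"
fi
```

The function body runs in a subshell, so its variables do not leak into the calling script.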
Starting and Stopping
[root@vm01 soft]# tfactl
tfactl> start
Oracle Trace File Analyzer (TFA) is already running
[root@vm01 soft]# tfactl
tfactl> stop
Stopping TFA from the Command Line
Stopped OSWatcher
Nothing to do !
Please wait while TFA stops
Please wait while TFA stops
TFA-00002 Oracle Trace File Analyzer (TFA) is not running
TFA Stopped Successfully
Telemetry adapter is not running
Successfully stopped TFA..
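The same two actions can also be passed to tfactl as arguments (tfactl stop, tfactl start), which is convenient in scripts. A rough sketch of a restart helper with a preview mode; run, tfa_restart and the DRY_RUN variable are hypothetical names, and the sketch assumes tfactl is on PATH and that the caller is root.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop TFA, then start it again.
# Assumes tfactl is on PATH (the installer reported the AHF binaries
# in /opt/oracle.ahf/bin) and that the caller is root.
tfa_restart() {
    run tfactl stop
    run tfactl start
}

# Preview the commands without touching the running daemon.
DRY_RUN=1 tfa_restart
```

With DRY_RUN unset, the same call would execute both commands for real.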
Usage
Collect database-related logs only
[root@vm01 soft]# tfactl
tfactl> diagcollect -database orc
TFA will collect diagnostics for the last 1 hour(s).
Please enter the time of the incident [YYYY-MM-DD HH24:MI:SS], or <RETURN> to collect for the last 1 hour(s). (Q|q to Quit):
Collecting data for the last 1 hours for this component ...
Collecting data for all nodes
TFA is using system timezone for collection, All times shown in EDT.
Collection Id : 20230419045638vm01
Detailed Logging at : /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all/diagcollect_20230419045638_vm01.log
Waiting up to 120 seconds for collection to start
2023/04/19 04:56:48 EDT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2023/04/19 04:56:48 EDT : Collection Name : tfa_Wed_Apr_19_04_56_40_EDT_2023.zip
2023/04/19 04:56:48 EDT : Collecting diagnostics from hosts : [vm01]
2023/04/19 04:56:49 EDT : Getting list of files satisfying time range [04/19/2023 03:56:48 EDT, 04/19/2023 04:56:49 EDT]
2023/04/19 04:56:49 EDT : Collecting Additional Diagnostic Information...
2023/04/19 04:56:51 EDT : Collecting ADR incident files...
2023/04/19 04:56:58 EDT : Executing SQL Script db_feature_usage.sql on orc with timeout of 600 seconds...
2023/04/19 04:56:59 EDT : Completed Collection of Additional Diagnostic Information...
2023/04/19 04:57:01 EDT : Completed Local Collection
2023/04/19 04:57:01 EDT : Not Redacting this Collection ...
2023/04/19 04:57:01 EDT : Completed collection of zip files.
.---------------------------------.
|       Collection Summary        |
+------+-----------+-------+------+
| Host | Status    | Size  | Time |
+------+-----------+-------+------+
| vm01 | Completed | 143kB | 13s  |
'------+-----------+-------+------'
Logs are being collected to: /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all
/opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all/vm01.tfa_Wed_Apr_19_04_56_40_EDT_2023.zip
tfactl>
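diagcollect ends by printing the full path of each zip it produced. When the output has been captured to a file or a pipe, those paths can be pulled out for a later copy or upload step. A small sketch, assuming the output layout shown in this article; collection_zips is a hypothetical name.

```shell
# Hypothetical helper: read diagcollect output on stdin and print every
# whitespace-separated field that is an absolute path ending in .zip.
collection_zips() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\/.*\.zip$/) print $i }'
}

# Example using three lines copied from the run above. Only the last
# line is printed: the "Collection Name" value is not an absolute path.
collection_zips <<'EOF'
2023/04/19 04:56:48 EDT : Collection Name : tfa_Wed_Apr_19_04_56_40_EDT_2023.zip
Logs are being collected to: /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all
/opt/oracle.ahf/data/repository/collection_Wed_Apr_19_04_56_42_EDT_2023_node_all/vm01.tfa_Wed_Apr_19_04_56_40_EDT_2023.zip
EOF
```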
Collect all trace logs for a specific date
tfactl> diagcollect -for Apr/19/2023
Collecting data for all nodes
TFA is using system timezone for collection, All times shown in EDT.
Scanning files for Apr/19/2023
Collection Id : 20230419050047vm01
Detailed Logging at : /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_05_00_51_EDT_2023_node_all/diagcollect_20230419050047_vm01.log
Waiting up to 120 seconds for collection to start
2023/04/19 05:00:58 EDT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2023/04/19 05:00:58 EDT : Collection Name : tfa_Wed_Apr_19_05_00_50_EDT_2023.zip
2023/04/19 05:00:58 EDT : Collecting diagnostics from hosts : [vm01]
2023/04/19 05:00:59 EDT : Scanning of files for Collection in progress...
2023/04/19 05:00:59 EDT : Collecting Additional Diagnostic Information...
2023/04/19 05:01:12 EDT : Executing Collection for ASM with timeout of 1800 seconds...
2023/04/19 05:01:12 EDT : Executing Collection for AFD with timeout of 1860 seconds...
2023/04/19 05:01:12 EDT : Executing Collection for CRS with timeout of 1920 seconds...
2023/04/19 05:01:13 EDT : Executing Collection for ACFS with timeout of 1980 seconds...
2023/04/19 05:01:13 EDT : Executing Collection for OS with timeout of 2040 seconds...
2023/04/19 05:01:14 EDT : Getting list of files satisfying time range [04/19/2023 00:00:00 EDT, 04/19/2023 05:00:59 EDT]
2023/04/19 05:01:16 EDT : Collecting ADR incident files...
2023/04/19 05:01:16 EDT : Executing Collection for SOSREPORT with timeout of 2100 seconds...
2023/04/19 05:02:45 EDT : Executing Collection for QOS with timeout of 2160 seconds...
2023/04/19 05:02:46 EDT : Executing Collection for DATAGUARD with timeout of 2220 seconds...
2023/04/19 05:02:46 EDT : Executing Collection for SYSLENS with timeout of 2280 seconds...
2023/04/19 05:02:49 EDT : Completed Collection of Additional Diagnostic Information...
2023/04/19 05:02:51 EDT : Completed Local Collection
2023/04/19 05:02:51 EDT : Not Redacting this Collection ...
2023/04/19 05:02:51 EDT : Completed collection of zip files.
.--------------------------------.
|       Collection Summary       |
+------+-----------+------+------+
| Host | Status    | Size | Time |
+------+-----------+------+------+
| vm01 | Completed | 11MB | 113s |
'------+-----------+------+------'
Logs are being collected to: /opt/oracle.ahf/data/repository/collection_Wed_Apr_19_05_00_51_EDT_2023_node_all
/opt/oracle.ahf/data/repository/collection_Wed_Apr_19_05_00_51_EDT_2023_node_all/vm01.tfa_Wed_Apr_19_05_00_50_EDT_2023.zip
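The -for value above is written as Mon/D/YYYY (Apr/19/2023). Scripts usually hold dates as YYYY-MM-DD, so a small converter can be handy. The following is only a sketch: tfa_for_date is a hypothetical name, and other date formats may also be accepted depending on the TFA version; this helper merely reproduces the form used in this article.

```shell
# Hypothetical helper: convert YYYY-MM-DD into the Mon/D/YYYY form
# used with "diagcollect -for" in this article (e.g. Nov/2/2020).
tfa_for_date() (
    iso=$1
    case $iso in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "expected YYYY-MM-DD, got: $iso" >&2; exit 1 ;;
    esac
    year=${iso%%-*}
    rest=${iso#*-}
    month=${rest%-*}
    day=${rest#*-}
    case $month in
        01) mon=Jan ;;
        02) mon=Feb ;;
        03) mon=Mar ;;
        04) mon=Apr ;;
        05) mon=May ;;
        06) mon=Jun ;;
        07) mon=Jul ;;
        08) mon=Aug ;;
        09) mon=Sep ;;
        10) mon=Oct ;;
        11) mon=Nov ;;
        12) mon=Dec ;;
        *) echo "invalid month: $month" >&2; exit 1 ;;
    esac
    # Range check only; month lengths are not verified.
    case $day in
        0[1-9]|[12][0-9]|3[01]) ;;
        *) echo "invalid day: $day" >&2; exit 1 ;;
    esac
    # Drop a leading zero so 02 becomes 2, as in Nov/2/2020.
    echo "$mon/${day#0}/$year"
)

# Build (but do not run) the command used in this section.
echo "tfactl diagcollect -for $(tfa_for_date 2023-04-19)"
```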
You can also target only the database or only the cluster (CRS); both forms are commonly used.
[root@rac19cn1 diag]# ls
asm clients crs rdbms tnslsnr    // tnslsnr holds the listener logs; cluster logs are also collected by default

Collect only the database trace logs for 2020-11-02:
tfactl> diagcollect -database ora19c -for Nov/2/2020

Collect only the cluster logs for 2020-11-02:
tfactl> diagcollect -crs -for Nov/2/2020
Collect database logs for a specific time range:
tfactl> diagcollect -database ora19c -from "2020-11-02 18:00:00" -to "2020-11-03 08:00:00"
Collecting data for all nodes
Scanning files from nov/02/2020 18:00:00 to nov/03/2020 08:00:00
Collection Id : 20201106090421rac19cn1
Detailed Logging at : /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/diagcollect_20201106090421_rac19cn1.log
2020/11/06 09:04:31 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/11/06 09:04:31 CST : Collection Name : tfa_Fri_Nov_06_09_04_21_CST_2020.zip
2020/11/06 09:04:32 CST : Collecting diagnostics from hosts : [rac19cn1, rac19cn2]
2020/11/06 09:04:32 CST : Scanning of files for Collection in progress...
2020/11/06 09:04:32 CST : Collecting additional diagnostic information...
2020/11/06 09:04:57 CST : Getting list of files satisfying time range [11/02/2020 18:00:00 CST, 11/03/2020 08:00:00 CST]
2020/11/06 09:05:47 CST : Completed collection of additional diagnostic information...
2020/11/06 09:12:57 CST : Collecting ADR incident files...
2020/11/06 09:12:57 CST : Completed Local Collection
2020/11/06 09:12:58 CST : Remote Collection in Progress...
.-------------------------------------.
|         Collection Summary          |
+----------+-----------+-------+------+
| Host     | Status    | Size  | Time |
+----------+-----------+-------+------+
| rac19cn2 | Completed | 10kB  | 83s  |
| rac19cn1 | Completed | 185kB | 505s |
'----------+-----------+-------+------'
Logs are being collected to: /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/rac19cn1.tfa_Fri_Nov_06_09_04_21_CST_2020.zip
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/rac19cn2.tfa_Fri_Nov_06_09_04_21_CST_2020.zip

Collect database logs from the last hour:
tfactl> diagcollect -database ora19c -since 1h
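Because the -from and -to values contain a space, they need plain ASCII double quotes; typographic quotes or en dashes picked up when copying a command from a web page are not treated as quoting or as option prefixes. A minimal sketch that validates the timestamps and prints a correctly quoted command; is_tfa_timestamp and diagcollect_range_cmd are hypothetical names.

```shell
# Hypothetical helper: succeed if $1 looks like "YYYY-MM-DD HH:MM:SS".
# This checks the shape only, not whether the date exists on the calendar.
is_tfa_timestamp() {
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\ [0-9][0-9]:[0-9][0-9]:[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print a diagcollect command for database $1
# between timestamps $2 and $3, using plain ASCII double quotes.
diagcollect_range_cmd() {
    is_tfa_timestamp "$2" || { echo "bad -from value: $2" >&2; return 1; }
    is_tfa_timestamp "$3" || { echo "bad -to value: $3" >&2; return 1; }
    printf 'tfactl diagcollect -database %s -from "%s" -to "%s"\n' "$1" "$2" "$3"
}

# Prints the command for the time range used in this section.
diagcollect_range_cmd ora19c "2020-11-02 18:00:00" "2020-11-03 08:00:00"
```

Printing the command instead of running it makes it easy to review before pasting it into a root shell.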
Collect database logs from a specific node:
diagcollect -database ora19c -node rac19cn1 -for Nov/2/2020
Log Analysis
Oracle provides the analyze command to help analyze the database's current trace files.
Common usage:
tfactl> analyze -search "ORA-" -since 1d
INFO: analyzing all (Alert and Unix System Logs) logs for the last 1440 minutes... Please wait...
INFO: analyzing host: rac19cn1
Report title: Analysis of Alert,System Logs
Report date range: last ~1 day(s)
Report (default) time zone: CST - China Standard Time
Analysis started at: 06-Nov-2020 09:28:08 AM CST
Elapsed analysis time: 15 second(s).
Configuration file: /oracle/grid/crs_1/tfa/rac19cn1/tfa_home/ext/tnt/conf/tnt.prop
Configuration group: all
Parameter: ORA-
Total message count: 24,938, from 25-Aug-2020 06:22:19 PM CST to 06-Nov-2020 09:20:01 AM CST
Messages matching last ~1 day(s): 2,180, from 05-Nov-2020 09:30:01 AM CST to 06-Nov-2020 09:20:01 AM CST
Matching regex: ORA-
Case sensitive: false
Match count: 0
INFO: analyzing all (Alert and Unix System Logs) logs for the last 1440 minutes... Please wait...
INFO: analyzing host: rac19cn2
Report title: Analysis of Alert,System Logs
Report date range: last ~1 day(s)
Report (default) time zone: CST - China Standard Time
Analysis started at: 06-Nov-2020 09:28:26 AM CST
Elapsed analysis time: 8 second(s).
Configuration file: /oracle/grid/crs_1/tfa/rac19cn2/tfa_home/ext/tnt/conf/tnt.prop
Configuration group: all
Parameter: ORA-
Total message count: 35,060, from 25-Aug-2020 06:30:24 PM CST to 06-Nov-2020 09:20:02 AM CST
Messages matching last ~1 day(s): 4,940, from 05-Nov-2020 09:30:01 AM CST to 06-Nov-2020 09:20:02 AM CST
Matching regex: ORA-
Case sensitive: false
Match count: 0

The command above analyzes all logs (os, db, asm, crs, and so on).

Analyze only the database instance logs from the last two days:
tfactl> analyze -comp db -since 2d

The -comp parameter accepts os, db, asm, acfs, crs, or all. The default is all, which covers every component.
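On a multi-node cluster the analyze report repeats once per host, so the "Match count" lines are easy to lose in the output. A short sketch that reduces captured output to one line per host, assuming the report prints each field on its own line as in the run above; analyze_match_counts is a hypothetical name.

```shell
# Hypothetical helper: read "tfactl analyze" output on stdin and print
# "<host> <match count>" for each host section.
analyze_match_counts() {
    awk '
        /analyzing host:/ { host = $NF }          # remember the current host
        /^ *Match count:/ { print host, $NF }     # emit its match count
    '
}

# Example using lines copied from the report above.
analyze_match_counts <<'EOF'
INFO: analyzing host: rac19cn1
Matching regex: ORA-
Case sensitive: false
Match count: 0
INFO: analyzing host: rac19cn2
Matching regex: ORA-
Case sensitive: false
Match count: 0
EOF
```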
Other Operations
List the users who are currently allowed to use tfactl:
tfactl> access lsusers
.---------------------------------.
|      TFA Users in rac19cn1      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
'-----------+-----------+---------'
.---------------------------------.
|      TFA Users in rac19cn2      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
'-----------+-----------+---------'

By default, TFA grants access only to the root and grid users:
[oracle@rac19cn1 bin]$ ./tfactl
TFA-00519 Oracle Trace File Analyzer (TFA) is not installed.    // the oracle user is told that TFA is not installed

Grant the oracle user access to TFA:
[root@rac19cn1 bin]# tfactl access add -user oracle
Successfully added 'oracle' to TFA Access list.
.---------------------------------.
|      TFA Users in rac19cn1      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
| oracle    | USER      | Allowed |
'-----------+-----------+---------'
.---------------------------------.
|      TFA Users in rac19cn2      |
+-----------+-----------+---------+
| User Name | User Type | Status  |
+-----------+-----------+---------+
| grid      | USER      | Allowed |
| oracle    | USER      | Allowed |
'-----------+-----------+---------'

Check the status of the current hosts:
tfactl> print status
.-----------------------------------------------------------------------------------------------.
| Host     | Status of TFA | PID  | Port | Version    | Build ID             | Inventory Status |
+----------+---------------+------+------+------------+----------------------+------------------+
| rac19cn1 | RUNNING       | 9127 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE         |
| rac19cn2 | RUNNING       | 1848 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE         |
'----------+---------------+------+------+------------+----------------------+------------------'
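The print status table lends itself to a quick scripted health check. A minimal sketch that reads captured print status output and lists any host whose "Status of TFA" column is not RUNNING, assuming the table layout shown in this article; tfa_hosts_not_running is a hypothetical name.

```shell
# Hypothetical helper: read "tfactl print status" output on stdin and
# print each host whose "Status of TFA" column is not RUNNING.
# Exit status is 0 when every host is RUNNING, 1 otherwise.
tfa_hosts_not_running() {
    awk -F'|' '
        /^\|/ {
            host = $2; status = $3
            gsub(/ /, "", host); gsub(/ /, "", status)
            if (status == "StatusofTFA") next      # header row
            if (status != "RUNNING") { print host; bad = 1 }
        }
        END { exit bad + 0 }
    '
}

# Example using the table from this section.
if tfa_hosts_not_running <<'EOF'
| Host     | Status of TFA | PID  | Port | Version    | Build ID             | Inventory Status |
+----------+---------------+------+------+------------+----------------------+------------------+
| rac19cn1 | RUNNING       | 9127 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE         |
| rac19cn2 | RUNNING       | 1848 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE         |
EOF
then
    echo "TFA is running on all hosts"
fi
```

Border rows start with a dot, plus sign, or quote rather than a pipe, so they are skipped without special handling.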
Addendum: the official AHF documentation
Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAchk (Doc ID 2550798.1)
This article is from cnblogs (博客园). Author: 蚌壳里夜有多长. When reposting, please cite the original link: https://www.cnblogs.com/dbahrz/p/17333971.html