
Oracle Autonomous Health Framework(AHF)

Posted by abce

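I recently upgraded AHF because of the log4j security vulnerability. I had not touched Oracle in a long time, and this was my first time using AHF.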

AHF bundles ORAchk, EXAchk, and Trace File Analyzer (TFA). ORAchk replaced the older RACcheck tool (MOS ID 1268927.1).

AHF is generally updated about every three months (see My Oracle Support note 2550798.1).

Before installing, make sure the environment is set up correctly: the umask command should return 22, 022, or 0022.
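The umask requirement above can be checked before launching the installer. The following is a minimal sketch, not part of AHF itself; the function name `umask_ok` is hypothetical and only compares the value against the three accepted spellings listed above.

```shell
# Hypothetical helper: succeed only when the given umask value is one of
# the accepted spellings (22, 022 or 0022).
umask_ok() {
  case "$1" in
    22|022|0022) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the umask of the current shell before running ./ahf_setup.
if umask_ok "$(umask)"; then
  echo "umask $(umask) is acceptable"
else
  echo "umask is $(umask); run 'umask 022' before installing"
fi
```

Run it in the same root shell that will start `ahf_setup`, because umask is a per-process setting.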

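A RAC cluster installation checks for root user equivalence (passwordless root access) between the nodes of the cluster. If you do not want to configure root equivalence, run the installer locally on each node, then use the tfactl syncnodes command to generate and deploy the required SSL certificates.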

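An upgrade is much like a first-time installation: run the ahf_setup script as root. If AHF is already present, re-running the installer upgrades it in place at the existing location. When AHF is already installed, a cluster upgrade does not need SSH validation, because it uses the existing daemon secure socket to communicate between nodes.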

 

1. Upgrade installation

[root@testdb001 tfa]# ./ahf_setup
 
AHF Installer for Platform Linux Architecture x86_64
 
AHF Installation Log : /tmp/ahf_install_214000_401256_2022_01_04-14_40_36.log
 
Starting Autonomous Health Framework (AHF) Installation
 
AHF Version: 21.4.0 Build Date: 202112200745
 
AHF is already installed at /opt/oracle.ahf
 
Installed AHF Version: 20.4.4 Build Date: 202103031514
 
Do you want to upgrade AHF [Y]|N : Y
 
AHF will also be installed/upgraded on these Cluster Nodes :
 
1. testdb002
 
The AHF Location and AHF Data Directory must exist on the above nodes
AHF Location : /opt/oracle.ahf
AHF Data Directory : /u01/app/grid/oracle.ahf/data
 
Do you want to install/upgrade AHF on Cluster Nodes ? [Y]|N : Y
 
Upgrading /opt/oracle.ahf
TFA-00002 Oracle Trace File Analyzer (TFA) is not running
 
Shutting down AHF Services
Nothing to do !
Shutting down TFA
Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
. . . . .
. . .
Successfully shutdown TFA..
 
Starting AHF Services
Starting TFA..
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Waiting up to 100 seconds for TFA to be started..
. . . . .
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
No new directories were added to TFA
Directory /u01/app/grid/crsdata/testdb001/trace/chad was already added to TFA Directories.
 
 
AHF upgrade completed on testdb001
 
Upgrading AHF on Remote Nodes :
 
AHF will be installed on testdb002, Please wait.
 
AHF will prompt twice to install/upgrade per Remote Node. So total 2 prompts
 
Do you want to continue Y|[N] : Y
 
AHF will continue with Upgrading on remote nodes
 
Upgrading AHF on testdb002 :
 
[testdb002] Copying AHF Installer
root@testdb002's password:
 
[testdb002] Running AHF Installer
root@testdb002's password:
 
Do you want AHF to store your My Oracle Support Credentials for Automatic Upload ? Y|[N] : N
 
AHF is successfully upgraded to latest version
 
.-----------------------------------------------------------------.
| Host      | TFA Version | TFA Build ID         | Upgrade Status |
+-----------+-------------+----------------------+----------------+
| testdb001 |  21.4.0.0.0 | 21400020211220074549 | UPGRADED       |
| testdb002 |  21.4.0.0.0 | 21400020211220074549 | UPGRADED       |
'-----------+-------------+----------------------+----------------'
 
Moving /tmp/ahf_install_214000_401256_2022_01_04-14_40_36.log to /u01/app/grid/oracle.ahf/data/testdb001/diag/ahf/
 
[root@testdb001 tfa]# tfactl status
 
.--------------------------------------------------------------------------------------------------.
| Host      | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-----------+---------------+--------+------+------------+----------------------+------------------+
| testdb001 | RUNNING       | 404608 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| testdb002 | RUNNING       | 334591 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'-----------+---------------+--------+------+------------+----------------------+------------------'

  

2. Fresh installation

[root@abcdb01 tfa]# ./ahf_setup
 
AHF Installer for Platform Linux Architecture x86_64
 
AHF Installation Log : /tmp/ahf_install_214000_121264_2022_01_04-14_48_26.log
 
Starting Autonomous Health Framework (AHF) Installation
 
AHF Version: 21.4.0 Build Date: 202112200745
 
Default AHF Location : /opt/oracle.ahf
 
Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N : Y
 
AHF Location : /opt/oracle.ahf
 
AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.
 
Choose Data Directory from below options :
 
1. /u01/app/grid [Free Space : 149616 MB]
2. Enter a different Location
 
Choose Option [1 - 2] : 1
 
AHF Data Directory : /u01/app/grid/oracle.ahf/data
 
Do you want to add AHF Notification Email IDs ? [Y]|N : N
 
AHF will also be installed/upgraded on these Cluster Nodes :
 
1. abcdb02
 
The AHF Location and AHF Data Directory must exist on the above nodes
AHF Location : /opt/oracle.ahf
AHF Data Directory : /u01/app/grid/oracle.ahf/data
 
Do you want to install/upgrade AHF on Cluster Nodes ? [Y]|N : Y
 
Extracting AHF to /opt/oracle.ahf
 
Configuring TFA Services
 
Discovering Nodes and Oracle Resources
 
Not generating certificates as GI discovered
 
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
 
.-----------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             |
+---------+---------------+--------+------+------------+----------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 |
'---------+---------------+--------+------+------------+----------------------'
 
Running TFA Inventory...
 
Adding default users to TFA Access list...
 
.--------------------------------------------------------------.
|                 Summary of AHF Configuration                 |
+-----------------+--------------------------------------------+
| Parameter       | Value                                      |
+-----------------+--------------------------------------------+
| AHF Location    | /opt/oracle.ahf                            |
| TFA Location    | /opt/oracle.ahf/tfa                        |
| Orachk Location | /opt/oracle.ahf/orachk                     |
| Data Directory  | /u01/app/grid/oracle.ahf/data              |
| Repository      | /u01/app/grid/oracle.ahf/data/repository   |
| Diag Directory  | /u01/app/grid/oracle.ahf/data/abcdb01/diag |
'-----------------+--------------------------------------------'
 
 
Starting orachk scheduler from AHF ...
 
AHF install completed on abcdb01
 
Installing AHF on Remote Nodes :
 
AHF will be installed on abcdb02, Please wait.
 
AHF will prompt twice to install/upgrade per Remote Node. So total 2 prompts
 
Do you want to continue Y|[N] : Y
 
AHF will continue with Installing on remote nodes
 
Installing AHF on abcdb02 :
 
[abcdb02] Copying AHF Installer
root@abcdb02's password:
 
[abcdb02] Running AHF Installer
root@abcdb02's password:
 
AHF binaries are available in /opt/oracle.ahf/bin
 
AHF is successfully installed
 
Do you want AHF to store your My Oracle Support Credentials for Automatic Upload ? Y|[N] : N
 
Moving /tmp/ahf_install_214000_121264_2022_01_04-14_48_26.log to /u01/app/grid/oracle.ahf/data/abcdb01/diag/ahf/

  

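Troubleshooting

After the installation, checking the status showed a problem: over successive runs of tfactl status, abcdb02 went from RUNNING to NOT RUNNING and then dropped out of the table, and tfactl toolstatus failed with TFA-00104.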

[root@abcdb01 tfa]# tfactl status
 
.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | RUNNING          |
| abcdb02 | RUNNING       | 297423 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'
[root@abcdb01 tfa]# tfactl status
 
.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb02 | RUNNING       | 297423 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'
 
[root@abcdb01 tfa]# tfactl status
 
.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb02 | NOT RUNNING   | -      |      |            |                      |                  |
'---------+---------------+--------+------+------------+----------------------+------------------'
[root@abcdb01 tfa]# tfactl status
 
.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'
[root@abcdb01 tfa]# tfactl toolstatus
TFA-00104 Cannot establish connection with TFA Server. Please check TFA Certificates
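The symptom in the session above, where a node shows NOT RUNNING or vanishes from the table, can be detected by comparing the `tfactl status` table against the nodes you expect. This is a rough sketch under assumptions: the function name `tfa_check_nodes` is hypothetical, and it relies on the pipe-delimited table layout shown in this post, which may differ in other AHF versions.

```shell
# Hypothetical helper: read the `tfactl status` table on stdin and report
# every expected node (given as arguments) that is missing from the table
# or whose "Status of TFA" column is not RUNNING.
tfa_check_nodes() {
  table=$(cat)
  rc=0
  for node in "$@"; do
    if ! printf '%s\n' "$table" | awk -F'|' -v n="$node" '
        NF >= 4 {
          h = $2; s = $3
          gsub(/^ +| +$/, "", h)   # trim the host column
          gsub(/^ +| +$/, "", s)   # trim the status column
          if (h == n && s == "RUNNING") found = 1
        }
        END { exit (found ? 0 : 1) }'; then
      echo "$node: TFA not reported as RUNNING"
      rc=1
    fi
  done
  return $rc
}

# Example (on a cluster node, as root):
#   tfactl status | tfa_check_nodes abcdb01 abcdb02
```

Passing the expected node names explicitly is deliberate: a node that has dropped out of the table entirely produces no row to inspect, so it can only be caught by checking against a known list.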

  

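Fix: run the sync operation on the healthy node. In the session below, answering the node-list prompt with the index 1 instead of a host name produced "Unable to ping Host 1", so enter host names at that prompt. The status output that follows nevertheless showed both nodes as RUNNING.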

[root@abcdb02 tmp]# tfactl syncnodes
 
Current Node List in TFA :
 
1. abcdb02
2. abcdb01
 
Node List in Cluster :
 
1. abcdb01
2. abcdb02
 
Node List to sync TFA Certificates :
     1  abcdb01
 
Do you want to update this node list? Y|[N]: Y
 
Please Enter all the remote nodes you want to sync...
 
Enter Remote Node List (separated by space) : 1
 
Node List to sync TFA Certificates :
     1  1
 
Unable to ping Host 1. Please verify.
 
.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb02 | RUNNING       | 308376 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb01 | RUNNING       | 226067 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'

  

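Other notes

For production databases, consider turning off TFA's automatic diagnostic collection and analysis (autodiagcollect), so that problems like those described in the references below, such as an oversized collection filling the root disk or excessive memory use, do not disrupt normal database operation.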

[root@abcdb02 tmp]# tfactl get autodiagcollect
.-------------------------------------------------.
|                     abcdb02                     |
+-----------------------------------------+-------+
| Configuration Parameter                 | Value |
+-----------------------------------------+-------+
| Auto Diagcollection ( autodiagcollect ) | ON    |
'-----------------------------------------+-------'
 
[root@abcdb02 tmp]# tfactl set autodiagcollect=off
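To apply the setting only where it is still enabled, the ON/OFF value can be read from the table that `tfactl get autodiagcollect` prints. This is a sketch under assumptions: `autodiag_value` is a hypothetical helper, and it depends on the table layout shown above.

```shell
# Hypothetical helper: print the value (ON or OFF) from the table printed
# by `tfactl get autodiagcollect`, read on stdin.
autodiag_value() {
  awk -F'|' 'index($0, "( autodiagcollect )") {
    v = $3
    gsub(/ /, "", v)   # strip the column padding
    print v
  }'
}

# Example (on a cluster node, as root):
#   if [ "$(tfactl get autodiagcollect | autodiag_value)" = "ON" ]; then
#     tfactl set autodiagcollect=off
#   fi
```

The example is left as a comment so the snippet can be sourced on a machine without tfactl; the two tfactl commands in it are the ones used in the session above.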

  

 

References:

Aftereffects of installing AHF

An oversized 93 GB orachk.zip fills the root disk (orachk -autostop)

A case of excessive memory usage on a production database system

Oracle Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAchk (Doc ID 2550798.1)

 
