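ORA-00800 Error

After our platform was upgraded from 11g to 19c, Oracle 19c on Linux began logging an ORA-00800 error in the alert log at startup. The error can be traced to the startup of the VKTM process.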

Environment:

Operating system: Red Hat Enterprise Linux release 8.8

Database: 19.24.0.0.0 Enterprise Edition

Problem description:

When Oracle 19c starts, the following message appears in the Oracle alert log:

Errors in file /oracle/oracle/diag/rdbms/prod/trace/gsp_vktm_1900.trc  (incident=51251) (PDBNAME=CDB$ROOT):
ORA-00800: soft external error, arguments: [Set Priority Failed], [VKTM], [Check traces and OS configuration], [Check Oracle document and MOS notes], []
Incident details in: /oracle/prod/.../incdir_51251/gsp_vktm_1900_i51251.trc
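
To spot which process and which operation failed without reading the whole alert log, the first two bracketed arguments of the ORA-00800 line can be extracted. The following is only a minimal sketch: the function name ora800_args is made up for illustration, and it assumes the message layout shown above.

```shell
# Hypothetical helper: read alert-log text on stdin and print
# "<failed operation>|<process>" for every ORA-00800 line.
# Assumption: the line has the form
#   ORA-00800: ... arguments: [op], [process], ...
ora800_args() {
    sed -n 's/^ORA-00800: .*arguments: \[\([^]]*\)], \[\([^]]*\)].*/\1|\2/p'
}
```

It would be used as `ora800_args < /path/to/alert.log`, where the path is a placeholder for your own alert log. Only the first two arguments are read, so a wrapped tail of the line does not matter.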

Analysis and resolution:

$ oerr ora 00800
00800, 00000, "soft external error, arguments: [%s], [%s], [%s], [%s], [%s]"
// *Cause:  An improper system configuration or setting resulted in failure.
//          This failure is not fatal to the instance at the moment, however, this might result
//          in an unexpected behavior during query execution.
// *Action: Check the database trace files and rectify system settings or the configuration.
//          For additional information, refer to Oracle database documentation or refer to
//          My Oracle Support (MOS) notes.

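As the output shows, the error is caused by an improper system configuration or setting. The failure is not fatal to the instance for now, but it may lead to unexpected behavior during query execution, so it is best to resolve it.

Reference, official support document: https://support.oracle.com/epmos/faces/DocumentDisplay?id=2718971

First, check the permissions of the oradism file. This is the key point: in my database, the permissions of this file were wrong.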

$ cd $ORACLE_HOME/bin
$ ls -lrt oradism
-rwsr-x--- 1 root oinstall 147848 Apr 17  2019 oradism

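This file should be owned by root and carry the setuid (+s) bit. Oddly, when we changed the permissions, the change did not take effect.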

chown root $ORACLE_HOME/bin/oradism
chmod 4750 $ORACLE_HOME/bin/oradism
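
The result of the two commands can be verified with a script instead of by reading the ls output. This is a small sketch assuming GNU stat (coreutils, as shipped with RHEL). The function names are made up for illustration, and the expected values are simply the ones set by the commands above.

```shell
# Hypothetical helpers for checking oradism ownership and mode.
# Assumption: GNU stat, which supports -c with %U (owner) and %a (octal mode).

has_expected_mode() {
    # Succeeds only if the octal mode is exactly 4750 (setuid + rwxr-x---).
    [ "$(stat -c '%a' "$1")" = "4750" ]
}

is_root_owned() {
    # Succeeds only if the file's owner is root.
    [ "$(stat -c '%U' "$1")" = "root" ]
}
```

For example: `has_expected_mode "$ORACLE_HOME/bin/oradism" && is_root_owned "$ORACLE_HOME/bin/oradism" && echo "oradism looks fine"`.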

Next, check which processes the database runs at the highest priority: VKTM or LMS*.

SQL> set linesize 680
SQL> col Parameter for a30
SQL> col "Session Value" for a16
SQL> col "Instance Value" for a16
SQL> col "Description"  for a30
SQL> select a.ksppinm "Parameter", b.ksppstvl "Session Value", c.ksppstvl "Instance Value", a.KSPPDESC "Description" 
  2  from x$ksppi a, x$ksppcv b, x$ksppsv c 
  3  where a.indx = b.indx and a.indx = c.indx and a.ksppinm like '_%' and a.ksppinm like '_highest_priority_process%';

Parameter                      Session Value    Instance Value   Description
------------------------------ ---------------- ---------------- ------------------------------
_highest_priority_processes    VKTM             VKTM             Highest Priority Process Name
                                                                 Mask

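As shown above, the parameter is set correctly. If it were not, VKTM would have to be given high priority with the statement below. Note that the statement sets _high_priority_processes, whose name differs from the _highest_priority_processes parameter queried above, so confirm against the MOS note which one applies to your version. Because of scope=spfile, the change takes effect only after the instance is restarted.

alter system set "_high_priority_processes"='VKTM' scope=spfile;

Next, check the cgroup configuration.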

$ ps -eaf|grep -i vktm |grep -v grep
oracle      1900       1  0 13:53 ?        00:00:00 ora_vktm_gsp
$ cat /proc/1900/cgroup | grep cpu
6:cpu,cpuacct:/user.slice
2:cpuset:/

$ ps -eaf|grep -i pmon|grep -v grep
oracle      1888       1  0 13:53 ?        00:00:00 ora_pmon_gsp
$ cat /proc/1888/cgroup | grep cpu
6:cpu,cpuacct:/user.slice
2:cpuset:/
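
The grep step above can be narrowed so that it prints only the path of the cgroup that holds the cpu controller. This is a minimal sketch assuming the cgroup v1 line format seen in the output (id:controllers:path). The function name is made up for illustration.

```shell
# Hypothetical helper: print the cgroup path of the cpu controller.
# $1: a file in /proc/<pid>/cgroup format.
# Assumption: cgroup v1 lines such as "6:cpu,cpuacct:/user.slice".
# On a pure cgroup v2 host (a single "0::/path" line) it prints nothing.
cpu_cgroup_path() {
    awk -F: '$2 ~ /(^|,)cpu(,|$)/ { print $3 }' "$1"
}
```

For the VKTM process above this would be `cpu_cgroup_path /proc/1900/cgroup`. The cpuset line is skipped because the match requires cpu as a whole controller name.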

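The output shows that both processes sit under /user.slice for the cpu controller rather than in the root cgroup, so check the value of cpu.rt_runtime_us, as shown below: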

# cat /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
0
# cat /sys/fs/cgroup/cpu,cpuacct/user.slice/cpu.rt_runtime_us
0
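
As far as I understand the kernel's real-time group scheduling, a cpu.rt_runtime_us of 0 leaves the tasks in that cgroup with no real-time CPU budget, which would fit the failed priority change for VKTM. The sketch below turns the check into a yes/no test. The function names are made up for illustration, and the cgroup directory is passed in as an argument.

```shell
# Hypothetical helpers for inspecting a cgroup's real-time budget.
# $1: a cgroup directory, e.g. /sys/fs/cgroup/cpu,cpuacct/user.slice

rt_runtime_us() {
    # Print the raw value of cpu.rt_runtime_us in that directory.
    cat "$1/cpu.rt_runtime_us"
}

allows_realtime() {
    # Succeeds if the value is non-zero (assumption: 0 means no RT budget).
    [ "$(rt_runtime_us "$1")" -ne 0 ]
}
```

For example: `allows_realtime /sys/fs/cgroup/cpu,cpuacct/user.slice || echo "no real-time budget for user.slice"`.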

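According to the official document, the values should be 0 for system.slice and 950000 for user.slice. They can be changed with the commands below, but the setting is lost after a system reboot.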

echo 0 > /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
echo 950000 > /sys/fs/cgroup/cpu,cpuacct/user.slice/cpu.rt_runtime_us

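Oddly enough, once these values had been changed, modifying the oradism file permissions took effect. After the database was restarted again, the error no longer appeared.

To make the setting persistent, it must be configured in the cgconfig.conf file. The procedure is simple, and the official document gives detailed steps, as follows: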

Install libcgroup-tools* on the system. (You can find this package in the OL7 latest repository.)

# yum install libcgroup-tools

/etc/cgconfig.conf will be created automatically when you start the cgconfig service.

# systemctl start cgconfig

Edit /etc/cgconfig.conf and add the user.slice parameters below.

group user.slice {
    cpu {
        cpu.rt_period_us = 1000000;
        cpu.rt_runtime_us = 950000;
    }
}

Restart the cgconfig service so the value takes effect.

# systemctl restart cgconfig

Enable cgconfig so it takes effect at boot.

# systemctl enable cgconfig

Reboot the server and check whether the value is now persistent.
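
For the final check after the reboot, the configured value can be read out of cgconfig.conf and compared with the live one. This is only a sketch: the function name is made up for illustration, and the parser assumes the layout of the snippet above (opening brace on the same line as the group name, spaces around the equals sign).

```shell
# Hypothetical helper: print the cpu.rt_runtime_us value configured for a group.
# $1: path to a cgconfig.conf-style file, $2: group name (e.g. user.slice)
# Assumptions: "group <name> {" on one line and "key = value;" settings.
conf_rt_runtime() {
    awk -v grp="$2" '
        $1 == "group" && $2 == grp { in_grp = 1; depth = 0 }
        in_grp {
            # Track nesting so the search stops at the end of the group block.
            depth += gsub(/[{]/, "{") - gsub(/[}]/, "}")
            if ($1 == "cpu.rt_runtime_us") {
                v = $3; sub(/;$/, "", v); print v; exit
            }
            if (depth <= 0) in_grp = 0
        }
    ' "$1"
}
```

After the reboot, the output of `conf_rt_runtime /etc/cgconfig.conf user.slice` should match `cat /sys/fs/cgroup/cpu,cpuacct/user.slice/cpu.rt_runtime_us`.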
posted @ 2024-10-25 13:41  老牛的田