ORA-27300 ORA-27301 ORA-27302 ORA-27157

A database crashed. The environment is:

os:
[root@oracle ~]# cat /etc/redhat-release 
CentOS Linux release 7.2.1511 (Core)
db:11.2.0.4.0

 

The database went down at 10:25 this morning. The errors were:

 

Fri Nov 23 10:25:34 2018
Errors in file /home/app/oracle/diag/rdbms/oracrm/oracrm/trace/oracrm_dbw2_20900.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Errors in file /home/app/oracle/diag/rdbms/oracrm/oracrm/trace/oracrm_dbw3_20902.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1

 

The explanation given on Metalink is:

 

The semaphores used by Oracle have been inadvertently removed.
The errors are signalling that something happened at the OS level with shared memory and/or semaphores. The semaphore sets could be removed manually, or they could be dying for some reason due to a hardware error.
This can happen when /dev/shm is remounted. You may also want to check whether a dba user ran the "ipcrm" command and removed the semaphores by accident, since error ORA-27301 (OS failure message: Identifier removed) suggests that. It could also have been a bad memory stick or something else at the OS level. Someone could also have removed the shared memory segments at the OS level, deliberately or by accident. Most likely something removed the shared memory and semaphore sets in use by 'oracle'. This can only be done by a root-level user or by 'oracle' itself, which owns the resources. If someone logged in as root and removed all IPC resources, Oracle would crash when it lost the allocated shared memory/semaphores.
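
To see whether the semaphore sets owned by 'oracle' are still present, the output of 'ipcs -s' can be filtered by owner. The following is only a minimal sketch: the function name sem_ids_for_owner is hypothetical, and it assumes the usual Linux 'ipcs -s' column layout (key, semid, owner, perms, nsems).

```shell
# Hypothetical helper: read `ipcs -s` output on stdin and print the semid of
# every semaphore array owned by the user given as $1.
# Assumes the Linux column layout: key semid owner perms nsems.
sem_ids_for_owner() {
    # Data rows start with a hexadecimal key (0x...); header lines do not.
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}
```

A typical call would be `ipcs -s | sem_ids_for_owner oracle`. If the instance is supposed to be running and this prints nothing, the semaphore sets are gone.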

 

solution:

 

This could be due to some outside user or application removing the semaphores/shared memory. 
To monitor the semaphore/shared memory state we can use the following methods:

  1. Set up a cron job to run every 5-10 min and dump the output of 'ipcs' and 'ps -ef' to a file with a timestamp. 
    Rotate your logs every 4-7 days to build a history. 
    Then if the problem re-occurs, we can at least try to make sure 'ipcrm' wasn't the culprit and get some general information of the state of the IPC resources plus the processes running.

  2. You can also consult your sysadmin to check whether any OS-level auditing can be turned on to audit the use of commands like 'ipcrm', which can remove shared memory segments/semaphore sets.
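
The snapshot job from method 1 could look roughly like the sketch below. The function names snapshot_ipc and prune_snapshots, the ipc_*.log file naming, and the 7-day default are all assumptions made for illustration. They are not part of the Oracle note.

```shell
#!/bin/sh
# Hypothetical helper: write one timestamped snapshot of the IPC resources
# and the process list into the directory given as $1, then print the path
# of the file that was written.
snapshot_ipc() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1
    out_file="$out_dir/ipc_$(date +%Y%m%d_%H%M%S).log"
    {
        echo "==== ipcs -a ===="
        ipcs -a 2>&1 || echo "(ipcs failed or is not installed)"
        echo "==== ps -ef ===="
        ps -ef 2>&1 || echo "(ps failed or is not installed)"
    } > "$out_file"
    echo "$out_file"
}

# Hypothetical helper: delete snapshots in directory $1 that are older than
# $2 days (7 if $2 is omitted), so the history does not grow without bound.
prune_snapshots() {
    find "$1" -name 'ipc_*.log' -type f -mtime +"${2:-7}" -exec rm -f {} +
}
```

Saved as a script that ends with `snapshot_ipc /var/log/ipc_history && prune_snapshots /var/log/ipc_history 7` (the directory is an assumed example), it can be scheduled with a crontab entry whose time fields are `*/5 * * * *`. After a crash, the last snapshot before the failure shows which semaphore sets and processes still existed.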

Note: This issue can happen on different platforms, but if you encounter it on RHEL 7.2, please also check the RHEL 7.2 specific information below.

On the RHEL 7.2 operating system, setting RemoveIPC=YES crashes the database. The default value for RemoveIPC in RHEL 7.2 is YES.

         Workaround :

      1) Set RemoveIPC=no in /etc/systemd/logind.conf (add the line if it is not already in that file)

      2) Reboot the server or restart systemd-logind as follows:
       # systemctl daemon-reload
       # systemctl restart systemd-logind

         OR

       Migrating to Oracle Linux 7.2 resolves the problem.
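
Before and after applying the workaround, it can help to check what the configuration file actually says. The following is a minimal sketch: the function names removeipc_value and removeipc_is_safe are hypothetical, and only the single file passed in is examined (drop-in files, if any are in use, are not considered).

```shell
# Hypothetical helper: print the value of the last uncommented RemoveIPC=
# line in a logind.conf-style file, or "unset" if there is none.
# $1 is the file to read (defaults to /etc/systemd/logind.conf).
removeipc_value() {
    conf=${1:-/etc/systemd/logind.conf}
    value=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-unset}"
}

# Hypothetical helper: succeed only if the file explicitly disables
# RemoveIPC. "unset" counts as unsafe because the note above states that
# the RHEL 7.2 default is YES.
removeipc_is_safe() {
    case $(removeipc_value "$1" | tr '[:upper:]' '[:lower:]') in
        no|false|off|0) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `removeipc_is_safe /etc/systemd/logind.conf || echo "RemoveIPC is not disabled"` flags a server that still needs the workaround.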

posted on 数据与人文