Common System Errors: A Summary

SSH

resource temporarily unavailable

Symptoms

  • Logging in over VNC reports fork failed: resource temporarily unavailable, and any command typed into an already-established SSH session reports bash: fork: Cannot allocate memory
  • ansible all -m ping reports Failed to connect to the host via ssh: kex_exchange_identification: read: Connection reset by peer\r\nConnection reset by xxxx to port xx

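Possible causes
Either the system is out of memory or the number of processes has hit its limit.
Once the total number of processes on the system (threads included, since each thread also takes a PID) reaches pid_max, creating a new process fails with fork: Cannot allocate memory.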

Solution
1. Check whether memory usage is too high: free
2. Check for OOM records: grep "Out of memory" /var/log/messages

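The Linux kernel has a mechanism called the OOM killer (Out Of Memory killer). When the system is about to run out of memory, it picks processes that use a lot of memory, especially ones whose usage has grown very quickly, and kills them automatically to keep memory from being exhausted.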
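Steps 1 and 2 can also be scripted. The following is a minimal sketch, not a definitive tool: mem_used_percent and count_oom_kills are hypothetical helper names, and the sketch assumes a kernel whose /proc/meminfo exposes the MemAvailable field (on kernels without it, the result would read as 100).

```shell
#!/bin/sh
# Sketch only: both function names are made up for illustration.

# Print memory usage as an integer percentage.
# Reads a meminfo-format file (default: /proc/meminfo) and computes
# (MemTotal - MemAvailable) / MemTotal.
mem_used_percent() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END {
            if (total > 0) printf "%d\n", (total - avail) * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}

# Print how many lines of the given log file mention "Out of memory".
count_oom_kills() {
    grep -c "Out of memory" "$1"
}
```

Calling mem_used_percent with no argument reads the live /proc/meminfo; count_oom_kills /var/log/messages gives the number of OOM records in the log file used in step 2.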

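3. Check whether the process count exceeds the limit: pstree -p | wc -l
Run sysctl kernel.pid_max to see the current limit. If the count has reached the limit, use ps -efL to find the programs that have started the most processes or threads. Temporary fix: sysctl -w kernel.pid_max=65535. Permanent fix: echo "kernel.pid_max = 65535" >> /etc/sysctl.conf && sysctl -p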
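The comparison in step 3 can be sketched as a small script. This is an illustration under assumptions, not the original author's method: the function names are hypothetical, the task count is taken from the fourth field of /proc/loadavg (running/total scheduling entities, which includes threads), and top_thread_users assumes the procps version of ps.

```shell
#!/bin/sh
# Sketch only: all function names are made up for illustration.

# Total number of scheduling entities (processes + threads),
# taken from the "running/total" field of a loadavg-format file.
total_tasks() {
    awk '{ split($4, a, "/"); print a[2] }' "${1:-/proc/loadavg}"
}

# Current PID limit, equivalent to `sysctl -n kernel.pid_max`.
pid_max() {
    cat "${1:-/proc/sys/kernel/pid_max}"
}

# Integer percentage of the PID space in use: pid_usage_percent TASKS MAX
pid_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

# List the processes with the most threads (assumes procps ps).
# The first output line is the ps header.
top_thread_users() {
    ps -eo nlwp,pid,comm --sort=-nlwp | head -n "${1:-10}"
}
```

For example, pid_usage_percent "$(total_tasks)" "$(pid_max)" prints how close the system is to the limit; a value near 100 suggests raising kernel.pid_max or tracking down the offending program with top_thread_users.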

su: Module is unknown

  • Check the login configuration file for errors
    cat /etc/pam.d/login
  • Use the log in /var/log/secure to track down the problem
Apr 17 11:01:40 vm-1603172433 su: PAM unable to dlopen(/lib/security/pam_rootok.so): /lib/security/pam_rootok.so: cannot open shared object file: No such file or directory
Apr 17 11:01:40 vm-1603172433 su: PAM adding faulty module: /lib/security/pam_rootok.so
Apr 17 11:01:40 vm-1603172433 su: PAM unable to dlopen(/lib/security/pam_wheel.so): /lib/security/pam_wheel.so: cannot open shared object file: No such file or directory
Apr 17 11:01:40 vm-1603172433 su: PAM adding faulty module: /lib/security/pam_wheel.so
Apr 17 11:01:40 vm-1603172433 su: PAM unable to dlopen(/usr/lib64/security/pam_tally.so): /usr/lib64/security/pam_tally.so: cannot open shared object file: No such file or directory
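  1. The log shows warnings that shared object files cannot be opened
  2. ls /lib/security confirms that the files are indeed missing from that directory
  3. Copy the corresponding files into /lib/security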
sudo cp /lib64/security/pam_rootok.so /lib/security/
sudo cp /lib64/security/pam_wheel.so /lib/security/
  4. Run sudo su again to escalate privileges; this time it succeeds.
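The manual check in steps 1 and 2 can be automated. Below is a rough sketch that lists every module referenced by a PAM config file but missing on disk. missing_pam_modules is a hypothetical helper, and both the config path (for example /etc/pam.d/su) and the module directory (/lib/security, as seen in the log above) are assumptions that may differ between distributions.

```shell
#!/bin/sh
# Sketch only: missing_pam_modules is a made-up helper name.
# Usage: missing_pam_modules CONFIG_FILE MODULE_DIR
# Prints the full path of each referenced module that does not exist.
missing_pam_modules() {
    conf=$1
    dir=$2
    # In each non-comment line, the module is the first field ending in
    # ".so" (the control field may be bracketed and contain spaces).
    awk '!/^[[:space:]]*#/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /\.so$/) { print $i; break }
    }' "$conf" |
    sort -u |
    while read -r mod; do
        case $mod in
            /*) path=$mod ;;        # absolute module path given in the config
            *)  path=$dir/$mod ;;   # bare name, resolved against MODULE_DIR
        esac
        [ -e "$path" ] || echo "$path"
    done
}
```

Running missing_pam_modules /etc/pam.d/su /lib/security on the affected machine would be expected to print the same paths that the log complains about; empty output means every referenced module was found.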
posted @ 2023-01-11 11:24  MegaloBox