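2021-05-09: After a maintenance reboot, the system could not be logged into through iLO (the console showed a black screen), SSH clients could not connect, and monitoring reported the host as down.
Database information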
Oracle 11.2.0.4
Linux Red Hat Enterprise Linux Server release 7.6 (Maipo)
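Resolution:
1. Checked the server for hardware problems; none were found.
2. Tried disabling the HBA card. Login then succeeded, but after a while no new connection windows could be opened.
3. Suspected that some process on the system was consuming excessive resources (CPU, memory). With the HBA card disabled or the fibre cable unplugged, checked the vmstat logs and found no abnormal spikes.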
zzz ***Sun May 9 12:11:56 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 15717112 24 82470040 0 0 2908 695 0 0 11 3 83 3 0
1 0 0 15719288 24 82470080 0 0 5000 45 4250 4337 1 0 98 0 0
1 2 0 15783448 24 82472392 0 0 3332 3291 6834 6997 1 1 97 1 0
zzz ***Sun May 9 12:12:29 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 21664552 24 79940448 0 0 2908 695 0 0 11 3 83 3 0
3 7 0 21657500 24 79940624 0 0 1144 42016 15363 30440 2 1 97 1 0
1 3 0 21653532 24 79943136 0 0 0 210522 44690 130180 1 1 95 2 0
zzz ***Sun May 9 12:13:02 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 99840808 80 2259040 0 0 2908 695 0 0 11 3 83 3 0
2 0 0 99844848 80 2259056 0 0 64 0 8102 7758 2 1 98 0 0
0 0 0 99848672 80 2259492 0 0 24 0 3535 2067 1 0 98 0 0
zzz ***Sun May 9 12:13:36 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 99858632 80 2258984 0 0 2908 695 0 0 11 3 83 3 0
1 0 0 99859952 80 2259144 0 0 0 0 3764 2371 1 0 98 0 0
1 0 0 99858624 80 2258960 0 0 0 0 3970 2551 1 0 98 0 0
zzz ***Sun May 9 12:14:10 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 99838520 80 2267132 0 0 2908 695 0 0 11 3 83 3 0
10 0 0 99829304 80 2267564 0 0 0 20 18294 20611 2 4 94 0 0
1 0 0 99835808 80 2265804 0 0 0 24 16739 17576 2 3 95 0 0
zzz ***Sun May 9 12:14:44 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 99859136 80 2262756 0 0 2908 695 0 0 11 3 83 3 0
1 0 0 99857792 80 2262628 0 0 0 0 5279 1988 1 0 98 0 0
0 0 0 99854688 80 2262912 0 0 0 4 4837 2356 1 0 98 0 0
zzz ***Sun May 9 12:15:17 CST 2021
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 99857664 80 2262476 0 0 2908 695 0 0 11 3 83 3 0
1 0 0 99854832 80 2262628 0 0 0 0 4546 3287 1 0 98 0 0
1 0 0 99855280 80 2262864 0 0 0 0 3563 2420 1 0 98 0 0
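Reading a long vmstat archive by eye is error-prone, so the check in step 3 could be scripted roughly as follows. This is a minimal sketch, not the method used during the incident: the function name `scan_vmstat` and both thresholds are assumptions chosen for illustration. It also skips the first row after each `zzz` timestamp, because the first report vmstat prints is the average since boot rather than a live sample, which is why that row is identical in every snapshot above.

```shell
# Flag vmstat samples whose idle CPU drops below a threshold or whose
# blocked-process count (column b) exceeds a limit.
# Input: a log in the format shown above, read from stdin.
# $1 = minimum acceptable idle % (illustrative default: 50)
# $2 = maximum acceptable blocked processes (illustrative default: 5)
scan_vmstat() {
    awk -v min_idle="${1:-50}" -v max_blocked="${2:-5}" '
        # Timestamp line: remember it and mark the next data row as the first.
        /^zzz/ { stamp = $0; sub(/^zzz \*+/, "", stamp); first = 1; next }
        # Data rows have 17 purely numeric fields; header rows are ignored.
        NF == 17 && $0 !~ /[^0-9 \t]/ {
            if (first) { first = 0; next }   # since-boot averages, skip
            if ($15 < min_idle || $2 > max_blocked)
                printf "%s: r=%s b=%s id=%s wa=%s\n", stamp, $1, $2, $15, $16
        }
    '
}
```

It would be called as `scan_vmstat 50 5 < vmstat.log`, where `vmstat.log` is a placeholder file name. A flagged line is only a pointer to where to look; a brief rise in blocked processes is not necessarily a fault.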
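4. Suspected that a key system process had failed to start. Ran the command below and compared the result with a healthy system; this server was missing one process, systemd-journald: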
root 9830 1 0 10:48 ? 00:00:01 /usr/lib/systemd/systemd-journald
[root@host01 ~]$ ps -ef|grep -i systemd
root 1 0 0 10:48 ? 00:00:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root 9859 1 0 10:48 ? 00:00:02 /usr/lib/systemd/systemd-udevd
dbus 22465 1 0 10:48 ? 00:00:02 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 22481 1 0 10:48 ? 00:00:00 /usr/lib/systemd/systemd-logind
Monitor+ 42713 42233 0 12:01 pts/0 00:00:00 grep --color=auto -i systemd
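The comparison with a healthy system in step 4 can be done mechanically instead of by eye. The sketch below assumes the process names from each host have already been saved to a file, one per line; the function name `missing_procs` and the file names are hypothetical.

```shell
# Print process names present on the healthy host but absent on the suspect host.
# $1 = file of process names from the healthy host (one per line)
# $2 = file of process names from the suspect host (one per line)
missing_procs() {
    healthy_sorted=$(mktemp) && suspect_sorted=$(mktemp) || return 1
    sort -u "$1" > "$healthy_sorted"
    sort -u "$2" > "$suspect_sorted"
    # comm -23 keeps only the lines unique to the first (healthy) list.
    comm -23 "$healthy_sorted" "$suspect_sorted"
    rm -f "$healthy_sorted" "$suspect_sorted"
}
```

One way to produce the input files is `ps -eo comm= | grep '^systemd' > procs.txt` on each host. Note that `comm` names are truncated to 15 characters, so systemd-journald appears as `systemd-journal`.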
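5. Checked the status of this service and found it was masked, meaning it was locked and could not be started. The unit-file listing also showed it as masked: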
[root@host01 ~]# systemctl status systemd-journald
● systemd-journald.service
Loaded: masked (/dev/null; bad)
Active: inactive (dead) since Thu 2021-05-13 09:04:48 CST; 46min ago
Main PID: 452 (code=exited, status=0/SUCCESS)
Status: "Processing requests..."
May 13 09:04:33 localhost.localdomain systemd-journal[452]: Runtime journal is using 8.0M (ma…G).
May 13 09:04:33 localhost.localdomain systemd-journal[452]: Journal started
May 13 09:04:48 localhost.localdomain systemd-journal[452]: Journal stopped
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
Warning: systemd-journald.service changed on disk. Run 'systemctl daemon-reload' to reload units.
Hint: Some lines were ellipsized, use -l to show in full.
[root@host01 ~]# systemctl list-unit-files|grep -i journald
systemd-journald.service masked
systemd-journald.socket static
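Since one masked unit was enough to cause this outage, it may be worth listing every masked unit on the host rather than only journald. The following is a small sketch under that assumption; `masked_units` and `is_masked_link` are hypothetical helper names. The second helper relies on what the output above shows: a masked unit is a symlink to `/dev/null`.

```shell
# Filter `systemctl list-unit-files` output (read from stdin) down to the
# names of units whose state column is masked or masked-runtime.
masked_units() {
    awk '$2 ~ /^masked/ { print $1 }'
}

# Succeed if the given unit file path is a symlink pointing at /dev/null.
is_masked_link() {
    [ "$(readlink "$1")" = "/dev/null" ]
}
```

Typical use would be `systemctl list-unit-files --no-legend --no-pager | masked_units`, followed by `is_masked_link /etc/systemd/system/systemd-journald.service` to confirm where the mask lives.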
6. Ran unmask, then started the service:
[root@host01 083729c7194e4009b815519a35942f9b]# systemctl unmask systemd-journald
Removed symlink /etc/systemd/system/systemd-journald.service.
[root@host01 083729c7194e4009b815519a35942f9b]# systemctl status systemd-journald -l
● systemd-journald.service - Journal Service
Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
Active: inactive (dead) since Thu 2021-05-13 09:04:48 CST; 1h 25min ago
Docs: man:systemd-journald.service(8)
man:journald.conf(5)
Main PID: 452 (code=exited, status=0/SUCCESS)
Status: "Processing requests..."
May 13 09:04:33 localhost.localdomain systemd-journal[452]: Runtime journal is using 8.0M (max allowed 4.0G, trying to leave 4.0G free of 125.7G available → current limit 4.0G).
May 13 09:04:33 localhost.localdomain systemd-journal[452]: Journal started
May 13 09:04:48 localhost.localdomain systemd-journal[452]: Journal stopped
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
[root@host01 083729c7194e4009b815519a35942f9b]# systemctl restart systemd-journald
[root@host01 083729c7194e4009b815519a35942f9b]# systemctl status systemd-journald
● systemd-journald.service - Journal Service
Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
Active: active (running) since Thu 2021-05-13 10:30:33 CST; 3s ago
Docs: man:systemd-journald.service(8)
man:journald.conf(5)
Main PID: 56691 (systemd-journal)
Status: "Processing requests..."
CGroup: /system.slice/systemd-journald.service
└─56691 /usr/lib/systemd/systemd-journald
May 13 10:30:33 host01 systemd-journal[56691]: Runtime journal is using 8.0M (max allowed 4.0G, trying to leave 4.0G free of 125.7G available → curr…imit 4.0G).
May 13 10:30:33 host01 systemd-journal[56691]: Journal started
May 13 09:04:48 host01 systemd[1]: Current command vanished from the unit file, execution of the command list won't be resumed.
May 13 09:04:48 host01 systemd[1]: Cannot add dependency job for unit systemd-journald.service, ignoring: Unit is masked.
May 13 09:04:48 host01 systemd[1]: Stopped systemd-journald.service.
Hint: Some lines were ellipsized, use -l to show in full.
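The recovery in step 6 can be wrapped in one function so that the steps run in a fixed order and stop at the first failure. This is only a sketch: `recover_unit` and the `DRY_RUN` variable are assumptions, and the explicit `daemon-reload` was not part of the session above; it is added here because the earlier status output hinted at it. By default the function only prints the commands.

```shell
# Unmask a unit, reload systemd, restart the unit and check that it is active.
# $1 = unit name, e.g. systemd-journald
# DRY_RUN=1 (the default in this sketch) prints the commands instead of running them.
recover_unit() {
    unit=$1
    if [ "${DRY_RUN:-1}" = 1 ]; then run=echo; else run=; fi
    $run systemctl unmask "$unit" &&
    $run systemctl daemon-reload &&
    $run systemctl restart "$unit" &&
    $run systemctl is-active "$unit"
}
```

Running `recover_unit systemd-journald` shows what would be executed; `DRY_RUN=0 recover_unit systemd-journald` as root would actually apply it.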
Introduction to the systemd-journald service:
http://www.jinbuguo.com/systemd/systemd-journald.service.html