Applicable version:
OCP Community Edition: 4.2.2-20240315150922
Background
- After OCP takes over a host, the "set clock source" step of host standardization keeps failing.
Analysis
2024-05-10 14:44:37.552 INFO 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://10.186.61.51:62888/api/v1/system/setClockSource, request body:SetClockSourceRequest(sourceType=tsc), params:null
2024-05-10 14:44:37.565 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.c.c.i.r.methods.RepairClockSource : set clock source to tsc failed: [AgentClient]:http request is failed, response:Unexpected error: symlink /usr/lib/systemd/system/set_clocksource.service /etc/systemd/system/multi-user.target.wants/set_clocksource.service: file exists
2024-05-10 14:44:37.586 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.c.c.i.h.SystemCheckerHelperImpl : Failed to repair 277. Please see the log for details
2024-05-10 14:44:37.592 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.ocp.core.util.ExceptionUtils : Checked Exception: com.oceanbase.ocp.core.exception.UnexpectedException occurred with code error.common.unexpected, and args [4]
2024-05-10 14:44:37.597 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.c.t.e.c.w.subtask.SubtaskExecutor : An unknown error has occurred. Cause: 4. Error message: {1}. Contact the administrator.
com.oceanbase.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=COMMON_UNEXPECTED, args=4
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.oceanbase.ocp.core.util.ExceptionUtils.newException(ExceptionUtils.java:96)
    at com.oceanbase.ocp.core.util.ExceptionUtils.throwException(ExceptionUtils.java:90)
    at com.oceanbase.ocp.core.util.ExceptionUtils.unExpected(ExceptionUtils.java:71)
    at com.oceanbase.ocp.compute.checker.internal.task.RepairCheckItemTask.run(RepairCheckItemTask.java:59)
    at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.execute(JavaSubtaskRunner.java:64)
    at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.doRun(JavaSubtaskRunner.java:32)
    at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.run(JavaSubtaskRunner.java:26)
    at com.oceanbase.ocp.core.task.engine.runner.RunnerFactory.doRun(RunnerFactory.java:76)
    at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.doRun(SubtaskExecutor.java:203)
    at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.redirectConsoleOutput(SubtaskExecutor.java:197)
    at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.lambda$submit$2(SubtaskExecutor.java:134)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Set state for subtask: 2609, operation:EXECUTE, state: DISREGARDED
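- Checking /usr/lib/systemd/system and /etc/systemd/system/multi-user.target.wants/ shows that the symlink for set_clocksource.service already exists, which means the service is already enabled to start at boot through systemd.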
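Conclusion
When OCP takes over a host, it already writes the set clock source service into /etc/systemd/system. The automatic repair then tries to create the same entry in /etc/systemd/system again. If the file is already there when the repair runs, the repair fails with a "file exists" error.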
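The failure mode can be reproduced without touching the real host. The sketch below is only an illustration: it assumes the agent's repair step behaves like a plain `ln -s` (the log shows a `symlink ... file exists` error, but the agent's actual implementation is not shown in this note), and every path is a temporary stand-in rather than a real systemd directory.

```shell
#!/usr/bin/env bash
# Reproduce the "file exists" failure in a throwaway directory.
# Nothing under /etc or /usr is touched; all paths are temporary stand-ins.
export LC_ALL=C
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/lib" "$demo_dir/wants"
touch "$demo_dir/lib/set_clocksource.service"

# First creation succeeds (stands in for the link made when the host was taken over).
ln -s "$demo_dir/lib/set_clocksource.service" "$demo_dir/wants/set_clocksource.service"
first_rc=$?

# Second creation fails because the link name is already taken (stands in for the repair).
err_msg=$(ln -s "$demo_dir/lib/set_clocksource.service" "$demo_dir/wants/set_clocksource.service" 2>&1)
second_rc=$?

echo "first ln exit code:  $first_rc"
echo "second ln exit code: $second_rc"
echo "second ln message:   $err_msg"

rm -rf "$demo_dir"
```

The second `ln -s` returns a non-zero exit code and reports that the file exists, which matches the shape of the error in the OCP log.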
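Solution
As shown above, OCP already wrote the set clock source service into /etc/systemd/system when it took over the host, so the automatic repair fails with "file exists" when it tries to create it again. The service is already enabled: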
[root@localhost multi-user.target.wants]# systemctl list-unit-files | egrep set_clocksource.service
set_clocksource.service    enabled
[root@localhost multi-user.target.wants]#
-- Workaround
Rename /etc/systemd/system/multi-user.target.wants/set_clocksource.service:
mv /etc/systemd/system/multi-user.target.wants/set_clocksource.service /etc/systemd/system/multi-user.target.wants/set_clocksource.service.bak
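The rename above can be wrapped in a small guard so that it only acts when the path really is a symlink. This is a minimal sketch: `backup_wants_link` is a hypothetical helper name, not something shipped with OCP or systemd, and it does nothing beyond the `mv` shown above.

```shell
#!/usr/bin/env bash
# backup_wants_link is a hypothetical helper, not part of OCP or systemd.
# It renames the given link to <name>.bak, but only when the path is a symlink.
backup_wants_link() {
    local link=$1
    if [ ! -L "$link" ]; then
        echo "skip: $link is not a symlink" >&2
        return 1
    fi
    # Show where the link points before moving it aside.
    echo "current target: $(readlink "$link")"
    mv -- "$link" "$link.bak"
}

# Usage on the affected host (run as root):
# backup_wants_link /etc/systemd/system/multi-user.target.wants/set_clocksource.service
```

Printing the link target first leaves a record of what the link pointed to, in case it needs to be restored by hand.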
- Run the repair again from the OCP web console. An identical symlink is created and the error no longer occurs.
Additional note:
On hosts deployed with OAT, the clock source setting is written to /etc/rc.local instead:
[root@10-186-57-25 ~]# cat /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.
touch /var/lock/subsys/local
/usr/local/bin/set_deadline.sh
echo never > /sys/kernel/mm/transparent_hugepage/enabled
/usr/local/sbin/set_nic_irq_ob.sh start
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
/usr/local/bin/auto_start_ob.sh >> /var/log/ob.autostart.log 2>&1 &
/usr/local/bin/set_cpufreq.sh
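Whichever mechanism sets the clock source (the systemd unit or /etc/rc.local), the result can be checked by reading the same sysfs directory that the rc.local line writes to. A small sketch follows; `show_clocksource` is a hypothetical helper name, and its optional directory argument exists only so the function can be pointed at a test directory.

```shell
#!/usr/bin/env bash
# show_clocksource is a hypothetical helper for illustration only.
# With no argument it reads the kernel's sysfs clocksource directory.
show_clocksource() {
    local dir=${1:-/sys/devices/system/clocksource/clocksource0}
    echo "current: $(cat "$dir/current_clocksource")"
    echo "available: $(cat "$dir/available_clocksource")"
}

# Usage on the host:
# show_clocksource
```

If the current value is not tsc after a reboot, the boot-time setting did not take effect; the list of available sources shows whether tsc is offered on that machine at all.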