kubelet on a node reports: node \"xxxxx\" not found

1. [Error details]

After the server was drained and rebooted, master02 (crust-m02) stayed NotReady.

1.1 The node stays NotReady

[root@crust-m01 ~]# kubectl get node
NAME        STATUS     ROLES                  AGE   VERSION
crust-m01   Ready      control-plane,master   84d   v1.21.2
crust-m02   NotReady   control-plane,master   84d   v1.21.2
crust-m03   Ready      control-plane,master   84d   v1.21.2
crust-n01   Ready      <none>                 83d   v1.21.2
crust-n02   Ready      <none>                 83d   v1.21.2
crust-n03   Ready      <none>                 83d   v1.21.2

1.2 Inspect the node details

[root@crust-m01 ~]# kubectl describe node crust-m02

Output:

……
Unschedulable:      false
Lease:
  HolderIdentity:  crust-m02
  AcquireTime:     <unset>
  RenewTime:       Tue, 28 Sep 2021 14:37:08 +0800
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Tue, 28 Sep 2021 14:32:16 +0800   Tue, 28 Sep 2021 14:38:17 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Tue, 28 Sep 2021 14:32:16 +0800   Tue, 28 Sep 2021 14:38:17 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Tue, 28 Sep 2021 14:32:16 +0800   Tue, 28 Sep 2021 14:38:17 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Tue, 28 Sep 2021 14:32:16 +0800   Tue, 28 Sep 2021 14:38:17 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
……

1.3 Check the kubelet logs on that node

[root@crust-m2 ~]# service kubelet status -l
Redirecting to /bin/systemctl status -l kubelet.service
kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since 二 2021-09-28 14:51:57 CST; 4min 6s ago
Docs: https://kubernetes.io/docs/
Main PID: 21165 (kubelet)
Tasks: 19
Memory: 43.0M
CGroup: /system.slice/kubelet.service
└─21165 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.4.1
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.119645 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.220694 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.321635 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.385100 21165 eviction_manager.go:255] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.422387 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.523341 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.624021 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.724418 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.825475 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
Sep 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.926199 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
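
When the log is long, the node name that kubelet is looking for can be pulled out mechanically. A minimal sketch, assuming log lines in the format shown above and that kubelet logs to the systemd journal; the function name missing_node_names is hypothetical and used only for illustration:

```shell
#!/usr/bin/env bash
# Print each distinct node name that appears in kubelet
# 'node \"NAME\" not found' errors read from stdin.
missing_node_names() {
  # Capture the text between the escaped quotes; sort -u removes repeats.
  sed -n 's/.*node \\"\([^\\]*\)\\" not found.*/\1/p' | sort -u
}

# Example (assumption: kubelet runs as a systemd unit named "kubelet"):
#   journalctl -u kubelet --no-pager | missing_node_names
```

The printed name can then be compared with the NAME column of kubectl get node.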

2. [Troubleshooting]

  • The kubelet start command shown in the output of 1.3 is:
/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.4.1

I checked every configuration file referenced by the start command and found nothing wrong.

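  • The error printed in 1.3 is err="node \"crust-m2\" not found". Why crust-m2?
    On the master, kubectl get node lists this node as crust-m02.
  • The hostname of the master02 server is indeed crust-m2 at the moment.
  • Conclusion
    When the node was first brought up, master02's hostname was crust-m02, and kubelet registered the node under that name (by default kubelet uses the hostname as the node name).
    /etc/hostname had been mistakenly written as crust-m2, so after the reboot the hostname became crust-m2.
    The hostname after the reboot no longer matches the registered node name, so kubelet keeps reporting the error.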
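
The comparison behind this conclusion can be sketched as a small check. It assumes kubelet registers under the machine's hostname (its default when no --hostname-override flag is set) and that kubectl is configured wherever the check runs; the function name node_is_registered is hypothetical:

```shell
#!/usr/bin/env bash
# Return 0 if the given hostname appears in the list of node names read
# from stdin (one per line, optionally prefixed with "node/").
node_is_registered() {
  local host="$1" name
  while IFS= read -r name; do
    name="${name#node/}"   # strip the prefix printed by `kubectl get nodes -o name`
    if [ "$name" = "$host" ]; then
      return 0
    fi
  done
  return 1
}

# Example on the affected machine (assumption: kubectl can reach the API server):
#   kubectl get nodes -o name | node_is_registered "$(hostname)" \
#     || echo "hostname $(hostname) is not a registered node name"
```

In the situation described here, the check would fail for crust-m2 and succeed for crust-m02.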

3. [Fix]

  • Correct /etc/hostname to crust-m02 and run the hostname command so the running hostname changes as well
  • Restart kubelet
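
The two steps can be sketched as follows. Note that this sketch uses hostnamectl set-hostname, which on systemd hosts updates both /etc/hostname and the running hostname, in place of editing the file and calling hostname separately. That substitution, the function name fix_hostname, and the DRY_RUN switch are assumptions made for illustration, not part of the original procedure:

```shell
#!/usr/bin/env bash
# Restore the hostname the node was registered under, then restart kubelet.
# With DRY_RUN=1 (the default) the commands are printed instead of executed.
fix_hostname() {
  local want="$1" run=""
  if [ "${DRY_RUN:-1}" = "1" ]; then
    run="echo"
  fi
  $run hostnamectl set-hostname "$want"
  $run systemctl restart kubelet
}

# Preview the commands:  fix_hostname crust-m02
# Apply them as root:    DRY_RUN=0 fix_hostname crust-m02
```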

Posted by 运维开发玄德公
