kubelet on a node reports the error: node \"xxxxx\" not found
1. [Error information]
After the server was drained and rebooted, master02 stayed NotReady.
1.1 The node stays NotReady
[root@crust-m01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
crust-m01 Ready control-plane,master 84d v1.21.2
crust-m02 NotReady control-plane,master 84d v1.21.2
crust-m03 Ready control-plane,master 84d v1.21.2
crust-n01 Ready <none> 83d v1.21.2
crust-n02 Ready <none> 83d v1.21.2
crust-n03 Ready <none> 83d v1.21.2
1.2 Check the node details
[root@crust-m01 ~]# kubectl describe node crust-m02
The output is as follows:
...
Unschedulable: false
Lease:
HolderIdentity: crust-m02
AcquireTime: <unset>
RenewTime: Tue, 28 Sep 2021 14:37:08 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure Unknown Tue, 28 Sep 2021 14:32:16 +0800 Tue, 28 Sep 2021 14:38:17 +0800 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Tue, 28 Sep 2021 14:32:16 +0800 Tue, 28 Sep 2021 14:38:17 +0800 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Tue, 28 Sep 2021 14:32:16 +0800 Tue, 28 Sep 2021 14:38:17 +0800 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Tue, 28 Sep 2021 14:32:16 +0800 Tue, 28 Sep 2021 14:38:17 +0800 NodeStatusUnknown Kubelet stopped posting node status.
...
1.3 Check the kubelet log on the node
[root@crust-m2 ~]# service kubelet status -l
Redirecting to /bin/systemctl status -l kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since 二 2021-09-28 14:51:57 CST; 4min 6s ago
Docs: https://kubernetes.io/docs/
Main PID: 21165 (kubelet)
Tasks: 19
Memory: 43.0M
CGroup: /system.slice/kubelet.service
└─21165 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.4.1
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.119645 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.220694 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.321635 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.385100 21165 eviction_manager.go:255] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.422387 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.523341 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.624021 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.724418 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.825475 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
9月 28 14:56:03 crust-m2 kubelet[21165]: E0928 14:56:03.926199 21165 kubelet.go:2291] "Error getting node" err="node \"crust-m2\" not found"
2. [Troubleshooting]
- The startup command shown in the output of 1.3 is:
/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.4.1
All configuration files referenced in the startup command were checked and analyzed; none of them had a problem.
- The error in the output of 1.3 is err="node \"crust-m2\" not found". crust-m2? Yet kubectl get node on the master lists this node as crust-m02.
- The hostname of the master02 server is indeed crust-m2 at the moment.
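- Conclusion
When the node was first brought up, master02's hostname was crust-m02, and that is the name it registered with the cluster.
However, /etc/hostname mistakenly contained crust-m2, so after the reboot the server's hostname became crust-m2.
The hostname after the reboot no longer matches the name that was registered earlier, so kubelet keeps reporting the error.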
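The comparison made above (local hostname versus the node names the cluster knows) can be sketched as a small shell helper. This is only an illustration: the function name node_name_registered is made up here, and it assumes kubelet derives its node name from the hostname, which is the default when --hostname-override is not set (as in the startup command from 1.3).

```shell
# Hypothetical helper (not from the original post).
# Returns 0 if the given hostname is one of the registered node names,
# 1 otherwise.
# Usage: node_name_registered <hostname> <node-name>...
node_name_registered() {
    nnr_host="$1"
    shift
    for nnr_name in "$@"; do
        if [ "$nnr_name" = "$nnr_host" ]; then
            return 0
        fi
    done
    return 1
}

# Example, using the node names listed in 1.1:
#   node_name_registered crust-m2 crust-m01 crust-m02 crust-m03 \
#       || echo "hostname crust-m2 is not a registered node name"
```

On a machine with a working kubectl, the node list could be taken from kubectl get nodes -o jsonpath='{.items[*].metadata.name}' and the first argument from $(hostname).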
3. [Fix]
- Correct the /etc/hostname file, and run the hostname command to change the server's hostname
- Restart kubelet
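The two steps above can be written out as commands. The sketch below only prints them so they can be reviewed before being run as root on the affected node; the function name print_hostname_fix is hypothetical, and it assumes kubelet is managed by systemd, as the service output in 1.3 suggests.

```shell
# Hypothetical helper (not from the original post).
# Prints, but does not execute, the commands for the fix.
# Usage: print_hostname_fix <correct-hostname>
print_hostname_fix() {
    # Step 1: correct the persistent hostname file.
    printf 'echo %s > /etc/hostname\n' "$1"
    # Step 1 (continued): change the running hostname without a reboot.
    printf 'hostname %s\n' "$1"
    # Step 2: restart kubelet so it looks up the node under the right name.
    printf 'systemctl restart kubelet\n'
}

# Example: print_hostname_fix crust-m02
```

After running the printed commands, kubectl get node on a master should eventually show the node as Ready again.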
posted on 2021-10-09 10:09 by 运维开发玄德公