How do you safely shut down a node in k8s so that the host can be taken offline for maintenance?


0. Key points

 

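If you need to take a node in a k8s cluster offline for maintenance, such as a kernel upgrade, a hardware upgrade, or a repair, what is the safest way to do it?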

 

The rest of this post walks through the procedure step by step.

 

1. Evict all pods from the host

 

Look up the name of the node you want to drain

[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME               STATUS   ROLES                       AGE    VERSION
nccztsjb-node-23   Ready    control-plane,master        294d   v1.23.8
nccztsjb-node-24   Ready    <none>                      294d   v1.23.8
nccztsjb-node-25   Ready    ingress,prometheus-server   294d   v1.23.8

 

 

Run the drain command

[root@nccztsjb-node-23 ~]# kubectl drain nccztsjb-node-24
node/nccztsjb-node-24 cordoned
error: unable to drain node "nccztsjb-node-24" due to error:cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-5ndlp, kube-system/kube-proxy-kgxtc, continuing command...
There are pending nodes to be drained:
 nccztsjb-node-24
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-5ndlp, kube-system/kube-proxy-kgxtc

 

 

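If you see the error above, the node is running DaemonSet-managed pods. kubectl drain will not delete these, so the node has only been cordoned at this point; rerun the command with the following flag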

 

--ignore-daemonsets

[root@nccztsjb-node-23 ~]# kubectl drain --ignore-daemonsets nccztsjb-node-24
node/nccztsjb-node-24 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-5ndlp, kube-system/kube-proxy-kgxtc
node/nccztsjb-node-24 drained

 

 

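This removes every pod from the node except the DaemonSet-managed ones. The pods are terminated gracefully, so expect the drain to take as long as their graceful shutdown periods.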
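
DaemonSet pods are not the only thing that can stop a drain: pods using emptyDir volumes are refused by default, and a pod that is slow to exit can keep the command waiting. The sketch below wraps the drain in a small function to cover those cases. The name `drain_node` and the 300-second default timeout are my own illustrative choices, not something the cluster above requires, and `--delete-emptydir-data` discards the contents of those volumes, so leave it out if that data matters.

```shell
# Hypothetical helper: drain one node while tolerating DaemonSet pods.
# The function name and default timeout are assumptions for illustration.
drain_node() {
  # $1 = node name, $2 = optional timeout (defaults to 300s)
  #
  # --ignore-daemonsets    : skip DaemonSet-managed pods (they stay on the node)
  # --delete-emptydir-data : also evict pods that use emptyDir volumes;
  #                          the data in those volumes is deleted
  # --timeout              : give up instead of waiting indefinitely
  kubectl drain "$1" \
    --ignore-daemonsets \
    --delete-emptydir-data \
    --timeout="${2:-300s}"
}

# Usage (not executed here):
#   drain_node nccztsjb-node-24
#   drain_node nccztsjb-node-24 600s
```

Adding `--dry-run=client` to the same command should let you preview which pods would be evicted without touching them.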

 

2. Confirm the node's status

 

The node should now be marked unschedulable (SchedulingDisabled)

[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME               STATUS                     ROLES                       AGE    VERSION
nccztsjb-node-23   Ready                      control-plane,master        294d   v1.23.8
nccztsjb-node-24   Ready,SchedulingDisabled   <none>                      294d   v1.23.8
nccztsjb-node-25   Ready                      ingress,prometheus-server   294d   v1.23.8
[root@nccztsjb-node-23 ~]# 
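
If you want to check this from a script rather than reading the STATUS column, the same information is in the node's `spec.unschedulable` field. A minimal sketch, where `is_cordoned` is a hypothetical helper name:

```shell
# Hypothetical helper: exit 0 if the node is cordoned, non-zero otherwise.
is_cordoned() {
  # .spec.unschedulable is "true" on a cordoned node and unset otherwise,
  # in which case the jsonpath expression prints nothing
  [ "$(kubectl get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}

# Usage (not executed here):
#   is_cordoned nccztsjb-node-24 && echo "scheduling is disabled"
```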

 

 

3. Check which pods are still running on the node

 

[root@nccztsjb-node-23 ~]# kubectl get pods -A -o wide  | grep nccztsjb-node-24 
kube-system            calico-node-5ndlp                            1/1     Running   2 (5d17h ago)   294d   172.20.58.65     nccztsjb-node-24   <none>           <none>
kube-system            kube-proxy-kgxtc                             1/1     Running   2 (5d17h ago)   140d   172.20.58.65     nccztsjb-node-24   <none>           <none>
[root@nccztsjb-node-23 ~]# 

 

 

As you can see, everything except the DaemonSet-managed pods has been evicted.
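
One caveat about the grep used above: it matches the node name anywhere in the line, so a pod whose own name contains the node name, or a node whose name merely starts with the same characters, would also show up. A field selector asks the API server to filter on the pod's `spec.nodeName` instead. A small sketch, with `pods_on_node` being a hypothetical helper name:

```shell
# Hypothetical helper: list the pods scheduled on one node, in all namespaces.
pods_on_node() {
  # Filter server-side on the exact node name instead of grepping the output
  kubectl get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}

# Usage (not executed here):
#   pods_on_node nccztsjb-node-24
```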

 

4. Host maintenance

 

At this point you can carry out the maintenance work on the host.

 

And yes, that includes shutting the machine down.

 

5. Re-enable scheduling on the node

 

Once the host is back up, put the node back into service by making it schedulable again

 

[root@nccztsjb-node-23 ~]# kubectl uncordon nccztsjb-node-24
node/nccztsjb-node-24 uncordoned

 

 

[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME               STATUS   ROLES                       AGE    VERSION
nccztsjb-node-23   Ready    control-plane,master        294d   v1.23.8
nccztsjb-node-24   Ready    <none>                      294d   v1.23.8
nccztsjb-node-25   Ready    ingress,prometheus-server   294d   v1.23.8

 

 

The node can now accept newly scheduled pods again.
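
After a reboot the kubelet may need a little while before the node reports Ready again, so in a script it can be convenient to wait for that before uncordoning. A minimal sketch under that assumption; `uncordon_when_ready` and the 300-second default are illustrative names and values, not part of the procedure above:

```shell
# Hypothetical helper: wait for the node to become Ready, then uncordon it.
uncordon_when_ready() {
  # $1 = node name, $2 = optional timeout (defaults to 300s)
  # kubectl wait blocks until the Ready condition is true or the timeout
  # expires; uncordon only runs if the wait succeeded
  kubectl wait --for=condition=Ready "node/$1" --timeout="${2:-300s}" &&
    kubectl uncordon "$1"
}

# Usage (not executed here):
#   uncordon_when_ready nccztsjb-node-24
```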

posted @ 2022-11-16 10:23  Zhai_David