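How do you safely shut down a k8s node so the host can be serviced?
0. Key point
If a node in a k8s cluster has to be taken down for maintenance, such as a kernel upgrade, a hardware upgrade, or a repair, what is the safest way to do it?
The sections below walk through the procedure.
1. Evict all pods from the host
Look up the name of the node you want to drain: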
[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME               STATUS   ROLES                       AGE    VERSION
nccztsjb-node-23   Ready    control-plane,master        294d   v1.23.8
nccztsjb-node-24   Ready    <none>                      294d   v1.23.8
nccztsjb-node-25   Ready    ingress,prometheus-server   294d   v1.23.8
Run the drain command:
[root@nccztsjb-node-23 ~]# kubectl drain nccztsjb-node-24
node/nccztsjb-node-24 cordoned
error: unable to drain node "nccztsjb-node-24" due to error:cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-5ndlp, kube-system/kube-proxy-kgxtc, continuing command...
There are pending nodes to be drained:
 nccztsjb-node-24
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-5ndlp, kube-system/kube-proxy-kgxtc
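If you see the message above, the node is running DaemonSet-managed pods. drain refuses to delete them, because the DaemonSet controller ignores the unschedulable mark and would recreate them on the same node right away. Note that the node has already been cordoned at this point, even though the drain itself failed. To proceed, add the following flag: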
--ignore-daemonsets
[root@nccztsjb-node-23 ~]# kubectl drain --ignore-daemonsets nccztsjb-node-24
node/nccztsjb-node-24 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-5ndlp, kube-system/kube-proxy-kgxtc
node/nccztsjb-node-24 drained
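This evicts every pod on the node except the DaemonSet-managed ones. The pods are terminated gracefully, so the command can take as long as the longest termination grace period among them.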
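For repeated or scripted use, the drain step can be wrapped in a small shell function. The following is only a sketch under stated assumptions, not part of the original procedure: the function name drain_node, the KUBECTL override variable, and the 300s default timeout are made up for illustration. --delete-emptydir-data is needed only when pods on the node use emptyDir volumes, and it discards that data, so drop the flag if that is not acceptable.

```shell
#!/usr/bin/env bash
# KUBECTL is a hypothetical override so the function can be dry-run,
# e.g. KUBECTL=echo prints the arguments instead of calling the cluster.
KUBECTL="${KUBECTL:-kubectl}"

# drain_node NODE [TIMEOUT]
# Cordon the node and evict its pods, skipping DaemonSet-managed pods.
drain_node() {
  local node="$1"
  local timeout="${2:-300s}"   # assumed default; give up after this long
  "$KUBECTL" drain "$node" \
    --ignore-daemonsets \
    --delete-emptydir-data \
    --timeout="$timeout"
}
```

With the function loaded, `drain_node nccztsjb-node-24` drains the node used in this article, and `KUBECTL=echo drain_node nccztsjb-node-24` only prints the arguments that would be passed to kubectl.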
2. Confirm the node's status
The node should now be unschedulable (SchedulingDisabled):
[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME               STATUS                     ROLES                       AGE    VERSION
nccztsjb-node-23   Ready                      control-plane,master        294d   v1.23.8
nccztsjb-node-24   Ready,SchedulingDisabled   <none>                      294d   v1.23.8
nccztsjb-node-25   Ready                      ingress,prometheus-server   294d   v1.23.8
[root@nccztsjb-node-23 ~]#
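The SchedulingDisabled status corresponds to the node's .spec.unschedulable field, so a script can check it without parsing the table. A minimal sketch, assuming a hypothetical helper named is_cordoned and the same illustrative KUBECTL override variable:

```shell
#!/usr/bin/env bash
# KUBECTL is a hypothetical override, useful for testing with a stub.
KUBECTL="${KUBECTL:-kubectl}"

# is_cordoned NODE
# Exit status 0 when the node is marked unschedulable, non-zero otherwise.
is_cordoned() {
  local node="$1"
  local flag
  # Prints "true" for a cordoned node and nothing when the field is unset.
  flag="$("$KUBECTL" get node "$node" -o jsonpath='{.spec.unschedulable}')"
  [ "$flag" = "true" ]
}
```

Usage: `is_cordoned nccztsjb-node-24 && echo "safe to continue"`.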
3. Check which pods are still running on the node
[root@nccztsjb-node-23 ~]# kubectl get pods -A -o wide | grep nccztsjb-node-24
kube-system   calico-node-5ndlp   1/1   Running   2 (5d17h ago)   294d   172.20.58.65   nccztsjb-node-24   <none>   <none>
kube-system   kube-proxy-kgxtc    1/1   Running   2 (5d17h ago)   140d   172.20.58.65   nccztsjb-node-24   <none>   <none>
[root@nccztsjb-node-23 ~]#
As the output shows, everything except the DaemonSet-managed pods has been evicted.
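Filtering with grep matches the node name anywhere in the line. As an alternative, the check can select pods by the spec.nodeName field and drop the DaemonSet-owned ones automatically. This is a sketch only: the function name remaining_pods and the KUBECTL override are assumptions for illustration, and it looks only at each pod's first owner reference.

```shell
#!/usr/bin/env bash
# KUBECTL is a hypothetical override, useful for testing with a stub.
KUBECTL="${KUBECTL:-kubectl}"

# remaining_pods NODE
# Print "namespace/name" for every pod on NODE whose first owner is not a
# DaemonSet. Empty output means only DaemonSet pods are left.
remaining_pods() {
  local node="$1"
  "$KUBECTL" get pods --all-namespaces \
    --field-selector "spec.nodeName=${node}" \
    --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind' |
    awk '$3 != "DaemonSet" { print $1 "/" $2 }'
}
```

Usage: `remaining_pods nccztsjb-node-24`. On a control-plane node, static pods (owned by the Node object) would also be listed, since drain does not remove them either.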
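4. Host maintenance
At this point you can carry out the maintenance work on the host.
And yes, that includes powering it off.
5. Re-enable scheduling on the node
Once maintenance is done, return the host to the cluster by marking the node schedulable again: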
[root@nccztsjb-node-23 ~]# kubectl uncordon nccztsjb-node-24
node/nccztsjb-node-24 uncordoned
[root@nccztsjb-node-23 ~]# kubectl get nodes
NAME               STATUS   ROLES                       AGE    VERSION
nccztsjb-node-23   Ready    control-plane,master        294d   v1.23.8
nccztsjb-node-24   Ready    <none>                      294d   v1.23.8
nccztsjb-node-25   Ready    ingress,prometheus-server   294d   v1.23.8
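The node can now accept newly scheduled pods again. Pods that were evicted earlier are not moved back automatically; the node only receives pods that are scheduled from now on.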
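After a reboot, the node may take a while to report Ready again. One possible way to script this last step is to wait for the Ready condition before uncordoning. The sketch below is an assumption-laden illustration rather than the article's own procedure: restore_node, the KUBECTL override, and the 600s default timeout are hypothetical names and values.

```shell
#!/usr/bin/env bash
# KUBECTL is a hypothetical override so the function can be dry-run.
KUBECTL="${KUBECTL:-kubectl}"

# restore_node NODE [TIMEOUT]
# Wait until the node reports Ready, then make it schedulable again.
# If the wait fails or times out, the node stays cordoned.
restore_node() {
  local node="$1"
  local timeout="${2:-600s}"   # assumed default
  "$KUBECTL" wait --for=condition=Ready "node/${node}" --timeout="$timeout" &&
    "$KUBECTL" uncordon "$node"
}
```

Usage: `restore_node nccztsjb-node-24`. Chaining with && keeps a node that never became Ready out of scheduling, which is usually the safer failure mode.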