Kubernetes: fixing slow pod eviction when a node goes down
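Background:
When a node goes down, the pods on it should be rescheduled onto other nodes quickly so that they keep serving traffic. In testing, however, the pods were only evicted after a 5-minute wait.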
Articles online suggest editing /etc/kubernetes/manifests/kube-controller-manager.yaml and adding these flags:
    - --node-monitor-grace-period=10s
    - --node-monitor-period=2s
    - --pod-eviction-timeout=10s
In my tests, however, this had no effect.
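Solution:
Change the Deployment instead. By default the pods carry NoExecute tolerations of 300s for the not-ready and unreachable taints, which matches the 5-minute delay. Lowering tolerationSeconds in the Deployment's pod template shortens the wait.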
[root@node-01 testnginx]# kubectl describe pod nginx-deployment|grep -i toleration -A 2
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
--
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
[root@node-01 testnginx]# cat test-nginx.yaml
# Note: extensions/v1beta1 Deployments were removed in Kubernetes 1.16;
# on newer clusters use apps/v1 and add a matching spec.selector.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      # Evict pods 2 seconds after the node is tainted unreachable or not-ready
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 443
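For a Deployment that is already running, the same tolerations could also be applied with a patch instead of editing the YAML file. The following is only a rough sketch: the helper name build_toleration_patch is made up for illustration and is not part of the original setup, and the kubectl commands assume the Deployment is named my-nginx and labelled app=my-nginx as in the manifest above. Note that a patch replaces the whole tolerations list rather than merging into it.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): prints a patch body that
# sets both NoExecute tolerations to the given number of seconds.
build_toleration_patch() {
  seconds="$1"
  # Reject anything that is not a non-negative integer.
  case "$seconds" in
    ''|*[!0-9]*)
      echo "usage: build_toleration_patch <seconds>" >&2
      return 1
      ;;
  esac
  cat <<EOF
{"spec": {"template": {"spec": {"tolerations": [
  {"key": "node.kubernetes.io/unreachable", "operator": "Exists", "effect": "NoExecute", "tolerationSeconds": $seconds},
  {"key": "node.kubernetes.io/not-ready", "operator": "Exists", "effect": "NoExecute", "tolerationSeconds": $seconds}
]}}}}
EOF
}

# Example usage (needs cluster access, so it is left commented out):
# kubectl patch deployment my-nginx -p "$(build_toleration_patch 2)"
# kubectl describe pod -l app=my-nginx | grep -i toleration -A 2
```

After the patch rolls out, the describe output should show the new value in place of 300s. A value as low as 2 seconds may cause unnecessary rescheduling on brief network blips, so a somewhat larger value might be safer in production.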
Tested and confirmed working!