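Kubernetes: an incident where insufficient disk space caused pods to be evicted endlessly (pod status Evicted)
Pods in production were being evicted over and over: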
fxxx-xxxx-deploy-86684b76ff-2vkdx   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-5j6fd   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-5tlcs   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-69qsp   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-7d5d5   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-897xm   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-8qnbt   0/1   Evicted   0   6d17h   <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-8s2h2   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-9g72f   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-dhpq8   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-djb2b   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-drn5c   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-gxvht   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-h25xv   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-hknlz   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-ltddg   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-mxtl9   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-nd6pd   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-nn776   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-pjkk2   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-pxbx7   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-qmh2d   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-qq42q   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-rt25l   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-vbb7c   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-vzc4v   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-wcp8m   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-whs28   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
fxxx-xxxx-deploy-86684b76ff-xqkjn   0/1   Evicted   0   30m     <none>   10.10.10.10   <none>   <none>
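With this many rows it helps to confirm that every eviction comes from the same node. The following is a minimal sketch, not part of the original troubleshooting: the helper name `count_evicted_by_node` is hypothetical, and it assumes the row layout shown above (status in column 3, node in column 7, no extra tokens in the RESTARTS column).

```shell
# Count Evicted pods per node from rows shaped like `kubectl get pods -o wide`.
# Assumption: column 3 is STATUS and column 7 is NODE, as in the listing above.
count_evicted_by_node() {
  awk '$3 == "Evicted" { n[$7]++ } END { for (node in n) print node, n[node] }'
}

# Possible usage against a live cluster (namespace is an assumption; adjust as needed):
#   kubectl get pods -n itsm -o wide --no-headers | count_evicted_by_node
```

If a single node accounts for all the evictions, as it appears to here, the problem is most likely local to that node rather than to the workload.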
Check the events for errors:
[root@centos ~]# kubectl get event -A
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-pxbx7   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-pxbx7 to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-pxbx7   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-qmh2d   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-qmh2d to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-qmh2d   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-qq42q   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-qq42q to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-qq42q   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-rt25l   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-rt25l to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-rt25l   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-vbb7c   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-vbb7c to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-vbb7c   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-vzc4v   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-vzc4v to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-vzc4v   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-wcp8m   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-wcp8m to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-wcp8m   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-whs28   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-whs28 to 10.10.10.10
itsm   4h8m        Warning   Evicted     pod/fxxx-xxxx-deploy-86684b76ff-whs28   The node had condition: [DiskPressure].
itsm   <unknown>   Normal    Scheduled   pod/fxxx-xxxx-deploy-86684b76ff-xqkjn   Successfully assigned itsm/fxxx-xxxx-deploy-86684b76ff-xqkjn to 10.10.10.10
Describing node 10.10.10.10 shows that its events all report failures to free disk space:
[root@centos ~]# kubectl describe node 10.10.10.10
(... output omitted ...)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests       Limits
  --------           --------       ------
  cpu                5700m (71%)    5850m (73%)
  memory             25338Mi (79%)  28560Mi (89%)
  ephemeral-storage  0 (0%)         0 (0%)
Events:
  Type     Reason               Age                     From                  Message
  ----     ------               ----                    ----                  -------
  Warning  FreeDiskSpaceFailed  4h10m (x352 over 313d)  kubelet, 10.10.10.10  (combined from similar events): failed to garbage collect required amount of images. Wanted to free 2599863091 bytes, but freed 0 bytes
  Warning  ImageGCFailed        3h10m (x359 over 313d)  kubelet, 10.10.10.10  (combined from similar events): failed to garbage collect required amount of images. Wanted to free 2167268147 bytes, but freed 0 bytes
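The kubelet reports the shortfall in raw bytes, which is hard to compare with `df -h` output. The sketch below converts the "Wanted to free" figures to GiB; the helper name `gc_wanted_gib` is hypothetical and it simply parses the message text shown above.

```shell
# Extract "Wanted to free N bytes" from kubelet image GC events and print N in GiB.
gc_wanted_gib() {
  grep -o 'Wanted to free [0-9]* bytes' |
    awk '{ printf "%s bytes = %.2f GiB\n", $4, $4 / (1024 * 1024 * 1024) }'
}

# Possible usage:
#   kubectl describe node 10.10.10.10 | gc_wanted_gib
```

For the two events above this works out to roughly 2.42 GiB and 2.02 GiB. Since 0 bytes were freed, image garbage collection apparently found no unused images to remove, so the space had to be reclaimed some other way.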
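In the end, clearing logs on node 10.10.10.10 freed enough space for the pods to start successfully. For historical reasons the system disk is only 50G, so the cluster is to be expanded and Docker's data directory migrated to a new disk.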
[root@VM_10_10_centos ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G   24K   16G   1% /dev/shm
tmpfs            16G  1.8M   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/vda1        50G   38G  9.6G  80% /
tmpfs            16G   12K   16G   1% /var/lib/kubelet/pods/ae0fb419-1f77-4707-bd44-d3d732706591/volumes/kubernetes.io~secret/tke-bridge-agent-token-5ncbf
tmpfs            16G   12K   16G   1% /var/lib/kubelet/pods/71c2fe89-200f-4c83-8731-cd6b1ca33597/volumes/kubernetes.io~secret/default-token-96wzl
tmpfs            16G   12K   16G   1% /var/lib/kubelet/pods/44dcc11f-8f18-4540-a4e9-fefc2a70a2e0/volumes/kubernetes.io~secret/kube-proxy-token-nst9k
tmpfs            16G   12K   16G   1% /var/lib/kubelet/pods/81819a9a-d4d7-4f34-8de2-2edf9774828a/volumes/kubernetes.io~secret/default-token-96wzl
(... output omitted ...)
[root@VM_248_116_centos ~]# du -sh /var/lib/docker
13G     /var/lib/docker
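To catch this earlier next time, a simple check on `df` output can flag filesystems that are getting close to the kubelet's limits. This is only a sketch: the helper name `fs_over_threshold` and the 80 percent default are assumptions for illustration. As far as I know, the kubelet's default hard eviction threshold for the node filesystem is `nodefs.available<10%` and image garbage collection starts at 85 percent usage by default, but a cluster may override both.

```shell
# Print mount points whose Use% is at or above a limit (default 80).
# Reads `df -P` style output on stdin: column 5 is Use%, column 6 the mount point.
fs_over_threshold() {
  awk -v limit="${1:-80}" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)              # strip the trailing percent sign
      if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Possible usage on a node:
#   df -hP | fs_over_threshold 80
```

Run against the `df -h` output above, this would flag only `/` at 80%, which suggests the root disk is still close to the limit even after the log cleanup.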