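Protecting a k8s cluster from "avalanche" failures
Limit the resources that containers can consume on a node.
1. Node overview
1.1 Master node output
The Capacity and Allocatable sections show that essentially all resources may be allocated to pods, i.e. nothing has been reserved for the system or for Kubernetes components: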
    [root@devops-master ~]# kubectl describe nodes devops-master
    Name:               devops-master
    Roles:              master
    # Labels attached to the node: architecture, operating system, role, and so on.
    Labels:             beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/os=linux
                        kubernetes.io/arch=amd64
                        kubernetes.io/hostname=devops-master
                        kubernetes.io/os=linux
                        node-role.kubernetes.io/master=
    # Virtual MAC address of the flannel interface; it also shows up in `ip a`.
    Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"9e:d1:1a:e6:83:2e"}
    # vxlan stands for Virtual Extensible LAN.
                        flannel.alpha.coreos.com/backend-type: vxlan
                        flannel.alpha.coreos.com/kube-subnet-manager: true
                        flannel.alpha.coreos.com/public-ip: 10.252.97.56
                        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                        node.alpha.kubernetes.io/ttl: 0
                        volumes.kubernetes.io/controller-managed-attach-detach: true
    CreationTimestamp:  Wed, 29 Apr 2020 16:33:15 +0800
    Taints:             node-role.kubernetes.io/master:NoSchedule
    Unschedulable:      false
    Conditions:
      Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
      ----             ------  -----------------                 ------------------                ------                       -------
      MemoryPressure   False   Wed, 12 Aug 2020 18:50:58 +0800   Wed, 29 Apr 2020 16:33:11 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
      DiskPressure     False   Wed, 12 Aug 2020 18:50:58 +0800   Wed, 29 Apr 2020 16:33:11 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
      PIDPressure      False   Wed, 12 Aug 2020 18:50:58 +0800   Wed, 29 Apr 2020 16:33:11 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
      Ready            True    Wed, 12 Aug 2020 18:50:58 +0800   Wed, 29 Apr 2020 16:45:45 +0800   KubeletReady                 kubelet is posting ready status
    Addresses:
      InternalIP:  10.252.97.56
      Hostname:    devops-master
    # Total hardware resources of the node
    Capacity:
      cpu:                8
      ephemeral-storage:  25792732Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             32765896Ki
      pods:               110
    # Resources that can be allocated to pods
    Allocatable:
      cpu:                8
      ephemeral-storage:  23770581772
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             32663496Ki
      pods:               110
    System Info:
      Machine ID:                 dff543df0a0c44e2962f1438f92b6868
      System UUID:                42277530-DD16-E8F5-B3AB-C6831B9F49FA
      Boot ID:                    45ec85f2-0237-4d25-b684-6ec886f0c824
      Kernel Version:             3.10.0-514.el7.x86_64
      OS Image:                   CentOS Linux 7 (Core)
      Operating System:           linux
      Architecture:               amd64
      Container Runtime Version:  docker://18.6.1
      Kubelet Version:            v1.15.2
      Kube-Proxy Version:         v1.15.2
    PodCIDR:                      10.244.0.0/24
    Non-terminated Pods:          (8 in total)
    # All pods running on this node
      Namespace     Name                                                 CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
      ---------     ----                                                 ------------  ----------  ---------------  -------------  ---
      kube-system   coredns-bccdc95cf-vrxck                              100m (1%)     0 (0%)      70Mi (0%)        170Mi (0%)     105d
      kube-system   etcd-devops-master                                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         105d
      kube-system   kube-apiserver-devops-master                         250m (3%)     0 (0%)      0 (0%)           0 (0%)         105d
      kube-system   kube-controller-manager-devops-master                200m (2%)     0 (0%)      0 (0%)           0 (0%)         105d
      kube-system   kube-flannel-ds-amd64-bh5gv                          100m (1%)     100m (1%)   50Mi (0%)        50Mi (0%)      105d
      kube-system   kube-proxy-6r9sg                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         105d
      kube-system   kube-scheduler-devops-master                         100m (1%)     0 (0%)      0 (0%)           0 (0%)         105d
      monitoring    prometheus-operator-prometheus-node-exporter-qb48r   0 (0%)        0 (0%)      0 (0%)           0 (0%)         96d
    # Resources already allocated
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource           Requests    Limits
      --------           --------    ------
      cpu                750m (9%)   100m (1%)
      memory             120Mi (0%)  220Mi (0%)
      ephemeral-storage  0 (0%)      0 (0%)
    Events:              <none>
1.2 Worker node output
The same command on the worker node shows that all of its resources are likewise allocatable to pods:
    Capacity:
      cpu:                4
      ephemeral-storage:  43400496Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             16247820Ki
      pods:               110
    Allocatable:
      cpu:                4
      ephemeral-storage:  43400496Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             16247820Ki
      pods:               110
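Since only the Capacity and Allocatable blocks matter for this comparison, a small filter can cut them out of the full describe output. This is a minimal sketch: node_resources is a hypothetical helper name, and it assumes the output layout shown above, where Capacity: is followed by Allocatable: and then System Info:.

```shell
# Print only the Capacity and Allocatable blocks from
# `kubectl describe nodes <name>` output read on stdin.
# Printing starts at the "Capacity:" line and stops before "System Info:".
node_resources() {
  awk '/^Capacity:/ {show=1} /^System Info:/ {show=0} show'
}

# Usage on a machine with cluster access (node name taken from this article):
#   kubectl describe nodes devops-master | node_resources
```

Running it against both nodes before and after the change makes the difference in Allocatable easy to spot.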
Note: the rest of this article changes the cgroup resource limits on this worker node.
2. Configure Docker's cgroup driver
Check which cgroup driver Docker is using:
    # docker info | grep "Cgroup Driver"
    Cgroup Driver: cgroupfs
If it is not cgroupfs, set it in Docker's daemon configuration:
    # vim /etc/docker/daemon.json
    {
      "exec-opts": ["native.cgroupdriver=cgroupfs"],
      .............
    }
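3. Configure the kubelet cgroup driver
3.1 Configuration file
/var/lib/kubelet/kubeadm-flags.env
Purpose:
This file is used here to reserve resources for the Kubernetes components and for system processes, so that both still have enough resources when the node is fully loaded.
3.2 Default configuration
    KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.1"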
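The --cgroup-driver value in this file should match the driver Docker reported in section 2. The check below is a rough sketch: both function names are hypothetical helpers, and they only parse text in the two formats shown in this article.

```shell
# Extract the driver name from `docker info` output read on stdin.
docker_cgroup_driver() {
  sed -n 's/^ *Cgroup Driver: *//p'
}

# Extract the --cgroup-driver value from kubeadm-flags.env content read on stdin.
kubelet_cgroup_driver() {
  sed -n 's/.*--cgroup-driver=\([a-z]*\).*/\1/p'
}

# Usage on the node itself:
#   d="$(docker info 2>/dev/null | docker_cgroup_driver)"
#   k="$(kubelet_cgroup_driver < /var/lib/kubelet/kubeadm-flags.env)"
#   [ "$d" = "$k" ] && echo "drivers match: $d" || echo "mismatch: docker=$d kubelet=$k"
```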
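Parameter notes
Node Capacity: all hardware resources of the node
kube-reserved: resources reserved for the Kubernetes components
system-reserved: resources reserved for system processes
eviction-threshold: the (hard) eviction threshold
allocatable: the amount that can be allocated to pods
Allocatable on a node = capacity - kube-reserved - system-reserved - eviction threshold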
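As a worked example of this formula, the sketch below plugs in the worker node's capacity from section 1.2 (4 CPUs, 16247820Ki of memory). The reservations of 1 CPU and 1Gi each, and the 5% hard eviction threshold for memory, are assumptions taken from the flags configured in section 3.3.

```shell
# allocatable = capacity - kube-reserved - system-reserved - eviction threshold
capacity_ki=16247820                           # node memory capacity in Ki
capacity_bytes=$((capacity_ki * 1024))
kube_reserved=$((1024 * 1024 * 1024))          # assumed: --kube-reserved memory=1Gi
system_reserved=$((1024 * 1024 * 1024))        # assumed: --system-reserved memory=1Gi
eviction_hard=$((capacity_bytes * 5 / 100))    # assumed: memory.available<5%

allocatable=$((capacity_bytes - kube_reserved - system_reserved - eviction_hard))
allocatable_cpu=$((4 - 1 - 1))                 # 4 cores minus 1 core per reservation

echo "allocatable memory: ${allocatable} bytes"
echo "allocatable cpu:    ${allocatable_cpu}"
```

The memory figure comes out within a few bytes of the 13658395636 that kubectl reports in section 4; the small gap is presumably rounding in how the percentage is converted.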
3.3 Modified configuration
    KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs \
    --network-plugin=cni \
    --pod-infra-container-image=nexus.10010sh.cn/pause:3.1 \
    --enforce-node-allocatable=pods,kube-reserved,system-reserved \
    --kube-reserved-cgroup=/system.slice/kubelet.service \
    --system-reserved-cgroup=/system.slice \
    --kube-reserved=cpu=1,memory=1Gi \
    --system-reserved=cpu=1,memory=1Gi \
    --eviction-hard=memory.available<5%,nodefs.available<10%,imagefs.available<10% \
    --eviction-soft=memory.available<10%,nodefs.available<15%,imagefs.available<15% \
    --eviction-soft-grace-period=memory.available=2m,nodefs.available=2m,imagefs.available=2m \
    --eviction-max-pod-grace-period=30 \
    --eviction-minimum-reclaim=memory.available=0Mi,nodefs.available=500Mi,imagefs.available=500Mi"
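Annotated version (for explanation only; the comments must not be copied into the real file):
    --cgroup-driver=cgroupfs \
    --network-plugin=cni \
    --pod-infra-container-image=nexus.10010sh.cn/pause:3.1 \
    # Enforce the allocatable limit for pods and the reservations for Kubernetes components and system daemons
    --enforce-node-allocatable=pods,kube-reserved,system-reserved \
    # cgroup of the Kubernetes components
    --kube-reserved-cgroup=/system.slice/kubelet.service \
    # cgroup of the system daemons
    --system-reserved-cgroup=/system.slice \
    # Resources reserved for Kubernetes components
    --kube-reserved=cpu=1,memory=1Gi \
    # Resources reserved for the system
    --system-reserved=cpu=1,memory=1Gi \
    # Hard eviction thresholds: pods are evicted as soon as one is crossed
    --eviction-hard=memory.available<5%,nodefs.available<10%,imagefs.available<10% \
    # Soft eviction thresholds
    --eviction-soft=memory.available<10%,nodefs.available<15%,imagefs.available<15% \
    # How long a soft threshold must stay crossed before eviction starts
    --eviction-soft-grace-period=memory.available=2m,nodefs.available=2m,imagefs.available=2m \
    # Maximum grace period, in seconds, given to a pod evicted because of a soft threshold
    --eviction-max-pod-grace-period=30 \
    # Minimum amount of each resource to reclaim once eviction has started
    --eviction-minimum-reclaim=memory.available=0Mi,nodefs.available=500Mi,imagefs.available=500Mi"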
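To get a feel for what the percentage thresholds mean in absolute terms, the sketch below converts the two memory thresholds into Mi for the worker node from section 1.2 (16247820Ki). threshold_mi is a hypothetical helper that rounds down to whole Mi; kubelet's own rounding may differ slightly.

```shell
# Convert a percentage of node memory (capacity given in Ki) into whole Mi.
threshold_mi() {
  capacity_ki="$1"
  percent="$2"
  echo $(( capacity_ki * percent / 100 / 1024 ))
}

hard_mi="$(threshold_mi 16247820 5)"     # --eviction-hard memory.available<5%
soft_mi="$(threshold_mi 16247820 10)"    # --eviction-soft memory.available<10%

echo "hard eviction once available memory drops below ${hard_mi}Mi"
echo "soft eviction once available memory stays below ${soft_mi}Mi for 2m"
```

On this node the soft threshold sits at roughly 1.5Gi of available memory and the hard threshold at roughly 0.8Gi, so the soft limit gives pods a window to be evicted gracefully before the hard limit is reached.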
3.4 Modify the kubelet unit file
    [Unit]
    Description=kubelet: The Kubernetes Node Agent
    Documentation=https://kubernetes.io/docs/

    [Service]
    ExecStart=/usr/bin/kubelet
    # Add the following two lines
    ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
    ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
    Restart=always
    StartLimitInterval=0
    RestartSec=10

    [Install]
    WantedBy=multi-user.target
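The two ExecStartPre lines create the cgroup directories named by --kube-reserved-cgroup under the cpuset and hugetlb subsystems before kubelet starts. A quick way to confirm they exist is sketched below; check_kubelet_cgroups is a hypothetical helper, and the optional first argument (the cgroup root) exists only so the function can be tried against a scratch directory.

```shell
# Report whether the kubelet cgroup directory exists under each subsystem
# touched by the unit file. Returns 0 if all exist, 1 otherwise.
check_kubelet_cgroups() {
  root="${1:-/sys/fs/cgroup}"
  missing=0
  for subsystem in cpuset hugetlb; do
    dir="$root/$subsystem/system.slice/kubelet.service"
    if [ -d "$dir" ]; then
      echo "ok:      $dir"
    else
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the node, after kubelet has been restarted:
#   check_kubelet_cgroups
```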
4. Restart the services and check the result
Restart the services
If Docker's configuration was changed, restart Docker.
Restart kubelet.
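The restart steps can be sketched as the function below. It assumes a systemd host, as in this article. The DRY_RUN switch and both function names are illustrative additions, and by default the commands are only printed so the sequence can be reviewed first.

```shell
# Run a command, or only print it while DRY_RUN is 1 (the default here).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Restart sequence. Pass "docker" as the first argument if
# /etc/docker/daemon.json was changed in section 2.
restart_node_services() {
  run systemctl daemon-reload        # pick up the edited kubelet unit file
  if [ "${1:-}" = "docker" ]; then
    run systemctl restart docker
  fi
  run systemctl restart kubelet
}

# Usage:
#   restart_node_services docker             # print the commands only
#   DRY_RUN=0 restart_node_services docker   # actually run them (as root)
```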
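Check the result
    Capacity:
      cpu:                4
      ephemeral-storage:  43400496Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             16247820Ki
      pods:               110
    Allocatable:
      cpu:                2
      ephemeral-storage:  43400496Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             13658395636
      pods:               110
Allocatable CPU has dropped from 4 to 2 (one core each for kube-reserved and system-reserved), and allocatable memory has dropped from 16247820Ki to 13658395636 bytes, which reflects the two 1Gi reservations plus the 5% hard eviction threshold.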