Kubernetes Pod Priority and Preemption
1. Concepts
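When defining a pod, you can reference a PriorityClass in spec.priorityClassName. The scheduler's PrioritySort plugin orders the scheduling queue by the priority that class defines: higher-priority pods are scheduled first, and pods with the same priority are scheduled in the order of their queue timestamps. When no suitable node is found for a pod, the pod stays in the Pending state and the scheduler starts the "preemption" process for it: one or more lower-priority pods are evicted from a node so that the node can accommodate the higher-priority pod.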
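A pod priority is a 32-bit integer. User-defined PriorityClass objects may use any value up to 1000000000, and a larger value means a higher priority. Values above 1000000000 are reserved for system-critical pods, so that ordinary workloads cannot preempt or evict them.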
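Two built-in classes cover this reserved range: system-cluster-critical, with priority 2000000000, and system-node-critical, with priority 2000001000. Control-plane components (API server, controller manager, scheduler, etcd) and add-ons such as metrics-server and CoreDNS are typically assigned one of these two classes; which one a given component uses depends on the deployment tool and version.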
If a pod does not specify a PriorityClass and the cluster has no global default PriorityClass, the pod's priority is 0.
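2. Defining a PriorityClass
A PriorityClass with globalDefault: true supplies the priority for pods that do not set priorityClassName. Kubernetes expects at most one such object in a cluster; if more than one global default does end up existing, the one with the smallest value takes effect.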
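As an illustration of a global default, a minimal sketch of such a PriorityClass might look like the following. The name default-priority and the value 1000 are assumptions chosen for this example and are not part of the setup described here.

```yaml
# Hypothetical global default class; name and value are illustrative only.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: default-priority
value: 1000
# Applies to pods created without priorityClassName after this object exists.
globalDefault: true
description: "Example cluster-wide default priority"
```

With a class like this in place, pods created afterwards without priorityClassName would get priority 1000 instead of 0. Pods that already exist keep the priority they were admitted with.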
demoappv11, a Deployment that uses the default priority (no priorityClassName):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demoappv11
spec:
  replicas: 4
  selector:
    matchLabels:
      app: demoappv11
  template:
    metadata:
      labels:
        app: demoappv11
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["demoappv11"]}
            topologyKey: kubernetes.io/hostname
      containers:
      - name: demoappv11
        image: harbor.myland.com/baseimages/centos/centos-tsinghua:7.9.2009
        command: ["sleep","1000"]
        resources:
          requests:
            memory: 2Gi
            cpu: 150m
demoappv12, a PriorityClass with value 222222 and a Deployment that uses it:
kind: PriorityClass
apiVersion: scheduling.k8s.io/v1
metadata:
  name: demoappv12
value: 222222
description: "demoappv12"
globalDefault: false
preemptionPolicy: PreemptLowerPriority
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demoappv12
spec:
  replicas: 4
  selector:
    matchLabels:
      app: demoappv12
  template:
    metadata:
      labels:
        app: demoappv12
    spec:
      priorityClassName: demoappv12
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["demoappv12"]}
            topologyKey: kubernetes.io/hostname
      containers:
      - name: demoappv12
        image: ikubernetes/demoapp:v1.2
        resources:
          requests:
            memory: 2Gi
            cpu: 150m
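Because the nodes do not have enough memory for both workloads, the scheduler preempts the lower-priority demoappv11 pods to make room for demoappv12. The evicted demoappv11 pods are recreated by their Deployment but can no longer be scheduled, so they remain in the Pending state.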
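If the higher-priority workload should only move ahead in the scheduling queue without evicting running pods, the preemptionPolicy field of the PriorityClass can be set to Never instead of PreemptLowerPriority. The following is a minimal sketch under that assumption; the name demoappv12-nonpreempting is hypothetical and is not used elsewhere in this example.

```yaml
# Hypothetical non-preempting variant of the demoappv12 PriorityClass.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: demoappv12-nonpreempting
value: 222222
globalDefault: false
# Pods of this class are queued ahead of lower-priority pods
# but do not evict pods that are already running.
preemptionPolicy: Never
description: "demoappv12 without preemption (example)"
```

A Deployment referencing this class through priorityClassName would leave the demoappv11 pods running; its own pods would stay Pending until enough memory becomes free on a node.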