5-Kubernetes Resource Scheduling

1. Workflow for creating a pod

Master node components

1. apiserver  --> etcd
2. scheduler
3. controller-manager

Node components

1. kubelet
2. kube-proxy
3. docker (container runtime)

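What happens when you run kubectl apply -f pod.yaml

1. kubectl converts the YAML to JSON and submits it to apiserver, which stores the data in etcd.
2. scheduler watches for the new-pod event, picks a node based on the pod's properties, and records the chosen node on the pod (the spec.nodeName field).
3. apiserver receives the scheduling result and writes it to etcd.
4. The kubelet on each node watches apiserver and learns which pods have been assigned to its node.
5. kubelet creates the containers by calling docker through docker.sock.
6. docker creates the containers as kubelet requested and returns their status to kubelet.
7. kubelet reports the pod status to apiserver.
8. apiserver writes the status data to etcd.
9. kubectl get pods then reads that status back from apiserver.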
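
To make step 2 concrete: once the scheduler has bound the pod, the chosen node shows up in the pod's spec.nodeName field and the PodScheduled condition becomes True. The excerpt below is a trimmed, illustrative sketch of what kubectl get pod my-pod -o yaml might return. The names my-pod and node-1 are placeholders, and real output carries many more fields.

```yaml
# Illustrative excerpt only; pod and node names are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeName: node-1          # filled in when the scheduler binds the pod
  containers:
  - name: web
    image: nginx
status:
  phase: Running
  conditions:
  - type: PodScheduled      # set once a node has been chosen
    status: "True"
```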

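2. Main pod properties that affect scheduling

resources: {}         # basis for resource-based scheduling
schedulerName: default-scheduler  # the default; normally no need to change
nodeName: ""          # schedule by node name
nodeSelector: {}      # schedule by node labels
affinity: {}          # affinity
tolerations: []       # taint tolerations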
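
As a rough sketch of where these fields sit in a manifest: resources is set per container, while the others belong to the pod spec. The pod name, image, label and taint key below are placeholders chosen for illustration, and affinity is left out to keep the fragment short.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo            # hypothetical name
spec:
  schedulerName: default-scheduler # usually left at the default
  # nodeName: node-1               # uncomment to bypass the scheduler entirely
  nodeSelector:
    disktype: ssd                  # only nodes carrying this label qualify
  tolerations:
  - key: gpu
    operator: Exists               # tolerate any taint whose key is "gpu"
  containers:
  - name: web
    image: nginx
    resources:                     # declared per container, not per pod
      requests:
        cpu: 250m
        memory: 128Mi
```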

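3. How resource limits affect pod scheduling

Resource requests and limits for pods and containers

spec.containers[].resources.limits.cpu       # maximum CPU the container may use
spec.containers[].resources.limits.memory    # maximum memory the container may use
spec.containers[].resources.requests.cpu     # CPU reserved for the container
spec.containers[].resources.requests.memory  # memory reserved for the container

If neither value is set, a pod can consume all of the host's resources, which over time can set off a cascading (avalanche) failure.

requests  # the amount of resources reserved for the container at startup
limits    # hard ceiling; if unset, the container can use up to all of the host's resources
requests  # if requests exceed what any node can allocate, the pod cannot be scheduled and stays Pending

CPU units:

2000m = 2 cores
1000m = 1 core
500m  = 0.5 core
100m  = 0.1 core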
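
In a manifest, the millicore form and the decimal form are interchangeable, so cpu: 500m and cpu: "0.5" request the same amount. Memory uses a different set of suffixes: Mi is a power-of-two mebibyte (1024 × 1024 bytes), and the suffix is case-sensitive. A small fragment, with values picked only for illustration:

```yaml
resources:
  requests:
    cpu: 500m        # same as cpu: "0.5"
    memory: 256Mi    # 256 * 1024 * 1024 bytes
  limits:
    cpu: "1"         # same as cpu: 1000m
    memory: 512Mi    # the suffix is case-sensitive: Mi, not mi
```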

k8s uses the requests value to find a node with enough free resources and schedules the pod there.

For example:

cat deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web2
spec:
  replicas: 1
  selector:
    matchLabels:
      project: demo
      app: java
  template:
    metadata:
      labels:
        project: demo
        app: java
    spec:
      containers:
      - name: web
        image: lizhenliang/java-demo
        ports:
        - containerPort: 8080
        resources:
          requests:         # resources reserved at startup; the basis for scheduling
            cpu: 500m
            memory: 500Mi
          limits:           # maximum the container may use
            cpu: 1
            memory: 600Mi

Apply the configuration

kubectl apply -f deployment.yaml

Show a node's resource allocation in detail: allocatable and allocated resources, pod age, and so on

kubectl describe node node-1

4. nodeSelector & nodeAffinity

nodeSelector schedules a pod onto nodes whose labels match.

Label a node

kubectl label nodes [node] key=value
kubectl label nodes node-1 disktype=ssd  # label the node
kubectl get node --show-labels           # show node labels
kubectl get pods --show-labels           # show pod labels
kubectl get svc --show-labels            # show service labels

Example:

Label node-1

kubectl label nodes node-1 disktype=ssd

Write the manifest

cat nodeselector.yaml

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeSelector:
    disktype: "ssd"  # schedule by label; the pod only lands on nodes labeled disktype=ssd
  containers:
  - name: web
    image: lizhenliang/nginx-php

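nodeAffinity: node affinity is similar to nodeSelector; it uses node labels to constrain which nodes a pod can be scheduled onto.

Compared with nodeSelector, nodeAffinity offers:

1. More expressive matching, with logical operators rather than exact string equality only

2. Both soft and hard scheduling policies, rather than a hard requirement only

Hard (required): must be satisfied
Soft (preferred): satisfied if possible

Operators: In, NotIn, Exists, DoesNotExist, Gt, Lt

For example: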

cat nodeaffinity.yaml

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: with-node-affinity
    image: nginx
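
The manifest above uses the hard (required) policy. For comparison, a soft (preferred) variant might look like the sketch below: the scheduler favors nodes labeled disktype=ssd but still places the pod elsewhere if none is available. The pod name and the weight value are chosen only for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-preferred-affinity    # hypothetical name
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1                  # 1-100; a higher weight is a stronger preference
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: with-preferred-affinity
    image: nginx
```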

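5. Taints

5.1. Taints: keep pods from being scheduled onto particular nodes

Use cases:

- Dedicated nodes, for example nodes with special hardware

- Taint-based eviction

Set a taint

kubectl taint node [node] key=value:[effect]   # effect is one of NoSchedule, PreferNoSchedule, NoExecute
# taint node-1
kubectl taint node node-1 gpu=yes:NoSchedule

View taints

kubectl describe node | grep Taint

Remove a taint (note the trailing "-")

kubectl taint node [node] key:[effect]-
kubectl taint node node-2 gpu-

[effect] can be one of:

1. NoSchedule: pods that do not tolerate the taint are never scheduled onto the node

2. PreferNoSchedule: the scheduler tries to avoid the node

3. NoExecute: besides blocking scheduling, pods already running on the node that do not tolerate the taint are evicted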
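
For reference, a taint set with kubectl taint is stored in the node's spec.taints list. A sketch of how a gpu=yes taint with the NoSchedule effect would appear on node-1, showing only the relevant fields:

```yaml
apiVersion: v1
kind: Node
metadata:
  name: node-1
spec:
  taints:
  - key: gpu
    value: "yes"          # quoted so YAML treats it as a string, not a boolean
    effect: NoSchedule
```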

Example:

Allow all pods to be scheduled onto the master node

kubectl describe node test-k8s-master | grep Taint

# The node name and taint in the command below come from the output of the query above
kubectl taint nodes test-k8s-master node-role.kubernetes.io/master:NoSchedule-

5.2. Tolerations

Tolerations: allow a pod to be scheduled onto nodes that carry matching taints

For example:

cat tolerations.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  strategy: {}
  template:
    metadata:
      labels:
        app: web          # must match spec.selector.matchLabels
    spec:
      tolerations:
      - key: "gpu"
        operator: "Equal"
        value: "yes"
        effect: "NoSchedule"
      containers:
      - image: nginx
        name: nginx

Apply the manifest

kubectl apply -f tolerations.yaml

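Note: taints and tolerations are mainly used to run specific pods on dedicated nodes such as control-plane nodes. Without a toleration, a pod is never placed on a tainted node. A toleration only permits scheduling onto the tainted node; the pod may still end up on a node that has no taint.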
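
Two other toleration forms are worth knowing, sketched here as a fragment of a pod spec. With operator: Exists the value is omitted and any taint with that key is tolerated. For NoExecute taints, tolerationSeconds bounds how long an already-running pod may stay on the node after the taint is added. The key maintenance and the 300-second value are made up for illustration.

```yaml
tolerations:
- key: "gpu"
  operator: "Exists"        # matches the key "gpu" with any value
  effect: "NoSchedule"
- key: "maintenance"        # hypothetical taint key
  operator: "Equal"
  value: "true"
  effect: "NoExecute"
  tolerationSeconds: 300    # pod is evicted 300s after the taint appears
```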

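6. NodeName

nodeName: assigns the pod directly to the named node, bypassing the scheduler

Use case: the scheduler is broken and cannot schedule pods, so the pod has to be pinned to a specific node

Example: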

cat nodename.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  strategy: {}
  template:
    metadata:
      labels:
        app: web          # must match spec.selector.matchLabels
    spec:
      nodeName: node-2
      containers:
      - image: nginx
        name: nginx

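7. DaemonSet controller

What a DaemonSet does

1. Runs one pod on every node
2. Automatically starts a pod on nodes that join later
   Use cases: network plugins, monitoring agents, log collection agents

Note: taints apply here too; no pod is created on a tainted node unless the taint is tolerated.

A DaemonSet only creates a pod on every node when the matching tolerations are configured.

Example: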

cat daemonset.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: web3
spec:
  selector:
    matchLabels:
      project: demo
      app: java
  template:
    metadata:
      labels:
        project: demo
        app: java
    spec:
      tolerations:
        - key: gpu
          operator: Equal
          value: "yes"
          effect: "NoSchedule"
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
      - name: web
        image: lizhenliang/java-demo
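
The manifest above tolerates two specific taints. When a DaemonSet has to run on every node regardless of which taints exist, a toleration with no key and operator: Exists matches all taints. A fragment under that assumption, to be placed in the pod template's spec:

```yaml
tolerations:
- operator: Exists   # no key, value or effect: tolerates every taint
```

This is deliberately broad, so it is usually reserved for node-level agents such as log collectors or network plugins.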

Analyzing scheduling failures

Check the scheduling result:

kubectl get pod <podname> -o wide

Find out why scheduling failed:

kubectl describe pod <podname>

1. Not enough CPU or memory on the nodes
2. A node is tainted and the pod has no toleration
3. No node matches the required labels

Pod stuck in Pending

1. The image is still being pulled
2. Not enough CPU: 0/3 nodes are available: 3 Insufficient cpu.
3. No node with matching labels: 0/3 nodes are available: 3 node(s) didn't match node selector
4. No toleration for the taints: 0/3 nodes are available: 1 node(s) had taint {disktype: ssd}, that the pod didn't tolerate, 1 node(s) had taint {gpu: yes}, that the pod didn't tolerate, 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

posted @ 2020-10-05 16:35 缺个好听的昵称
