Chapter 5: Kubernetes Scheduling

The Pod creation workflow, and the main Pod attributes that affect scheduling

Which components run on the master node?

1. apiserver -> etcd (the apiserver persists all cluster state to etcd)
2. scheduler
3. controller-manager

Which components run on a worker node?
1. kubelet
2. kube-proxy
3. docker (the container runtime)

Workflow for creating a Pod

0. kubectl apply -f pod.yaml
1. kubectl converts the YAML to JSON and submits it to the apiserver, which stores the object in etcd
2. the scheduler watches for the newly created Pod, selects a node based on the Pod's attributes, and binds the Pod to it (the chosen node is recorded in the Pod's spec.nodeName)
3. the apiserver writes the scheduling result to etcd
4. the kubelet on that node learns from the apiserver that the Pod has been assigned to it
5. the kubelet calls the Docker socket to create the containers
6. once Docker has created the containers as requested, it reports their status back to the kubelet
7. the kubelet updates the Pod status in the apiserver
8. the apiserver writes the status to etcd
9. kubectl get pods now shows the Pod
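To watch this flow end to end, a minimal manifest is enough (the Pod name flow-demo and the nginx image are just placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: flow-demo
spec:
  containers:
  - name: web
    image: nginx

After kubectl apply, the node the scheduler picked can be read back from the field it filled in:

kubectl get pod flow-demo -o jsonpath='{.spec.nodeName}'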

Main Pod attributes that affect scheduling
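The scheduling-related fields in a Pod spec are resources (requests/limits), schedulerName, nodeName, nodeSelector, affinity, and tolerations; each is covered in the sections below. A sketch of where they sit in the spec (the name, labels, and values here are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: sched-attrs
spec:
  schedulerName: default-scheduler    # which scheduler claims this Pod
  # nodeName: k8s-n1                  # set this to pin the Pod and bypass the scheduler
  nodeSelector:                       # exact match on node labels
    disktype: ssd
  tolerations:                        # allow placement on tainted nodes
  - key: gpu
    operator: Exists
    effect: NoSchedule
  containers:
  - name: web
    image: nginx
    resources:
      requests:                       # the scheduler filters nodes by requests
        cpu: 100m
        memory: 64Mi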


How resource requests and limits affect Pod scheduling

requests: the resource quota used for scheduling; must be less than or equal to limits
limits: the hard cap on resource usage
If neither value is set, the Pod can consume all of the host's resources.

CPU units:
2000m = 2 cores
1000m = 1 core
500m = 0.5 core
100m = 0.1 core
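Putting requests and limits together, a minimal sketch (the values are arbitrary examples):

apiVersion: v1
kind: Pod
metadata:
  name: resources-demo
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:         # reserved on the node at scheduling time
        cpu: 500m       # 0.5 core
        memory: 256Mi
      limits:           # runtime ceiling; requests must not exceed these
        cpu: 1000m      # 1 core
        memory: 512Mi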


If actual consumption outgrows what the node can supply, the node can be dragged down.
The master does not detect this: the scheduler allocates by requests rather than by real usage, so a Pod that sets no requests is counted as consuming nothing.
[root@k8s-m1 chp5]# kubectl describe nodes k8s-n1
Name:               k8s-n1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    disktype=ssd
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8s-n1
                    kubernetes.io/os=linux
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 172.16.1.24/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 10.244.215.64
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 23 Jul 2020 22:34:32 +0800
Taints:             gpu=yes:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  k8s-n1
  AcquireTime:     <unset>
  RenewTime:       Sun, 09 Aug 2020 23:13:47 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 06 Aug 2020 04:03:43 +0800   Thu, 06 Aug 2020 04:03:43 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Sun, 09 Aug 2020 23:12:38 +0800   Fri, 07 Aug 2020 22:09:45 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Sun, 09 Aug 2020 23:12:38 +0800   Fri, 07 Aug 2020 22:09:45 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Sun, 09 Aug 2020 23:12:38 +0800   Fri, 07 Aug 2020 22:09:45 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Sun, 09 Aug 2020 23:12:38 +0800   Fri, 07 Aug 2020 22:09:45 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.0.0.24
  Hostname:    k8s-n1
Capacity:
  cpu:                2
  ephemeral-storage:  27245572Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             4026372Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  25109519114
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             3923972Ki
  pods:               110
System Info:
  Machine ID:                 60f04d707dff4506a290fb01a760a736
  System UUID:                DAA64D56-F012-CD06-9629-DB9504A8AD2F
  Boot ID:                    31b9eb98-e462-43ee-a665-f940eaff6ac8
  Kernel Version:             3.10.0-957.el7.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.12
  Kubelet Version:            v1.18.0
  Kube-Proxy Version:         v1.18.0
PodCIDR:                      10.244.1.0/24
PodCIDRs:                     10.244.1.0/24
Non-terminated Pods:          (9 in total)
  Namespace                   Name                               CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                               ------------  ----------  ---------------  -------------  ---
  default                     nginx-f89759699-qjjjb              0 (0%)        0 (0%)      0 (0%)           0 (0%)         10d
  default                     web-6b9c78c94f-tqxvj               0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d
  default                     web-6b9c78c94f-zs9pk               0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d
  default                     web-k8s-n1                         0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d12h
  default                     web2-57b578c78c-kkh8p              500m (25%)    1 (50%)     0 (0%)           0 (0%)         3d23h
  default                     web3-8chqf                         0 (0%)        0 (0%)      0 (0%)           0 (0%)         66m
  kube-system                 calico-node-62gxk                  250m (12%)    0 (0%)      0 (0%)           0 (0%)         16d
  kube-system                 kube-proxy-t2xst                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         17d
  kube-system                 metrics-server-7875f8bf59-n2lr2    0 (0%)        0 (0%)      0 (0%)           0 (0%)         10d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                750m (37%)  1 (50%)
  memory             0 (0%)      0 (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:              <none>

A Pod whose CPU request exceeds every node's allocatable cores can never be scheduled.
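For example, the node described above has only 2 allocatable cores, so a request like the following can never be satisfied and the Pod stays Pending (a deliberately broken sketch):

apiVersion: v1
kind: Pod
metadata:
  name: too-big
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: 3          # more than the 2 allocatable cores on any node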


nodeSelector and nodeAffinity

nodeSelector

nodeSelector: constrains a Pod to Nodes whose labels match.
Label a node:
kubectl label nodes [node] key=value

If the node label referenced in the YAML does not exist on any node, scheduling fails and the Pod stays Pending, as the transcripts below show:
[root@k8s-m1 chp5]# kubectl get node --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-m1 Ready master 17d v1.18.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-m1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-n1 Ready <none> 17d v1.18.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssh,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-n1,kubernetes.io/os=linux
k8s-n2 Ready <none> 17d v1.18.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-n2,kubernetes.io/os=linux

[root@k8s-m1 chp5]# kubectl label nodes k8s-n1 disktype=ssd
error: 'disktype' already has a value (ssh), and --overwrite is false
[root@k8s-m1 chp5]# kubectl label nodes k8s-n1 disktype=ssd --overwrite
node/k8s-n1 labeled

[root@k8s-m1 chp5]# cat nodeSelector.yml
apiVersion: v1
kind: Pod
metadata:
  name: nodeselector
spec:
  nodeSelector:
    disktype: "ssd"
  containers:
  - name: web
    image: lizhenliang/nginx-php

[root@k8s-m1 chp5]# kubectl get pod -o wide |grep nodeselector
nodeselector                      0/1     Pending   0          26s     <none>           <none>   <none>           <none>


nodeAffinity

nodeAffinity: node affinity is similar to nodeSelector; it uses node labels to constrain which Nodes a Pod can be scheduled onto.
Compared with nodeSelector:

  • richer matching logic, not just exact string equality
  • rules can be hard or soft, rather than only a hard requirement (see the sketch after this list)
    • hard (required): must be satisfied
    • soft (preferred): satisfied when possible, but not guaranteed

Operators: In, NotIn, Exists, DoesNotExist, Gt, Lt
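A sketch combining one hard rule and one soft rule, reusing this chapter's disktype and gpu labels (the Pod name is a placeholder; the image follows the chapter's examples):

apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:    # hard: must match
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      preferredDuringSchedulingIgnoredDuringExecution:   # soft: best effort
      - weight: 1
        preference:
          matchExpressions:
          - key: gpu
            operator: Exists
  containers:
  - name: web
    image: lizhenliang/nginx-php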

nodeName, taints, the DaemonSet controller, and analyzing scheduling failures

Taint

Taints: keep Pods from being scheduled onto particular Nodes.

This is why no ordinary Pods land on the master node by default, which keeps the master safer.


Use cases:

  • dedicated nodes, for example nodes equipped with special hardware
  • Taint-based eviction
Set a taint:
kubectl taint node [node] key=value:effect
where [effect] can be:
NoSchedule: Pods will never be scheduled here.
PreferNoSchedule: try to avoid scheduling Pods here.
NoExecute: new Pods are not scheduled here, and existing Pods on the Node are evicted.

Remove a taint:
kubectl taint node [node] key=value:effect-
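For example, using the gpu taint that appears later in this chapter:

kubectl taint node k8s-n1 gpu=yes:NoSchedule      # set the taint
kubectl taint node k8s-n1 gpu=yes:NoSchedule-     # remove it (note the trailing "-")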

Toleration

Toleration: allows a Pod to be scheduled onto Nodes that carry the corresponding Taints.

[root@k8s-m1 chp5]# kubectl describe nodes|grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: gpu=yes:NoSchedule
Taints: <none>


[root@k8s-m1 chp5]# cat toleration.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-toleration
spec:
  selector:
    matchLabels:
      project: demo
      app: java
  template:
    metadata:
      labels:
        project: demo
        app: java
    spec:
      tolerations:
      - key: gpu
        operator: Equal
        value: "yes"
        effect: "NoSchedule"
      containers:
      - name: web3
        image: lizhenliang/java-demo
        ports:
        - containerPort: 8080

[root@k8s-m1 chp5]# kubectl apply -f toleration.yml
deployment.apps/web-toleration created

[root@k8s-m1 chp5]# kubectl get pod -o wide |grep web-toleration
web-toleration-5b867ffcf7-sjs9v 1/1 Running 0 32s 10.244.111.221 k8s-n2 <none> <none>

nodeName

nodeName: places the Pod onto the specified Node directly, without going through the scheduler.

[root@k8s-m1 chp5]# cat nodeName-pod.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-nodename
spec:
  selector:
    matchLabels:
      project: demo
      app: java
  template:
    metadata:
      labels:
        project: demo
        app: java
    spec:
      nodeName: k8s-n2
      containers:
      - name: web-nodename
        image: lizhenliang/java-demo
        ports:
        - containerPort: 8080

[root@k8s-m1 chp5]# kubectl get pods -o wide |grep web-nodename
web-nodename-6646f87bf5-d52rj   1/1     Running   0          77s     10.244.111.219   k8s-n2   <none>           <none>
Taints are only honored by the scheduler; a Pod placed via nodeName bypasses the scheduler, so taints are not considered at all.


DaemonSet

DaemonSet functionality:

  • runs one Pod on every Node
  • a Node that newly joins the cluster automatically gets a Pod as well

Use cases: network plugins, agents

[root@k8s-m1 ~]# kubectl describe node|grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: <none>
Taints: gpu=yes:NoSchedule

[root@k8s-m1 ~]# kubectl taint node k8s-n2 gpu-
node/k8s-n2 untainted
[root@k8s-m1 ~]# kubectl describe node|grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: <none>
Taints: <none>

[root@k8s-m1 ~]# kubectl taint node k8s-n2 gpu=yes:NoSchedule
node/k8s-n2 tainted


[root@k8s-m1 chp5]# cat daemon-toleration.yml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: web3
spec:
  selector:
    matchLabels:
      project: demo
      app: java
  template:
    metadata:
      labels:
        project: demo
        app: java
    spec:
      tolerations:
      - key: gpu
        operator: Equal
        value: "yes"
        effect: "NoSchedule"
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: web3
        image: lizhenliang/java-demo
        ports:
        - containerPort: 8080

[root@k8s-m1 chp5]# kubectl apply -f daemon-toleration.yml
daemonset.apps/web3 created
[root@k8s-m1 chp5]# kubectl get pod -o wide|grep web3
web3-8chqf 1/1 Running 0 34m 10.244.215.86 k8s-n1 <none> <none>
web3-sh7dq 1/1 Running 0 34m 10.244.111.218 k8s-n2 <none> <none>
web3-x87zb 1/1 Running 0 34m 10.244.42.144 k8s-m1 <none> <none>


Cluster Management Commands:
  certificate   Modify certificate resources.
  cluster-info  Display cluster info
  top           Display Resource (CPU/Memory/Storage) usage.
  cordon        Mark node as unschedulable
  uncordon      Mark node as schedulable
  drain         Drain node in preparation for maintenance
  taint         Update the taints on one or more nodes
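A typical maintenance sequence with these commands (the node name is just an example):

kubectl cordon k8s-n1                        # stop new Pods from landing on the node
kubectl drain k8s-n1 --ignore-daemonsets     # evict its Pods before maintenance
kubectl uncordon k8s-n1                      # make it schedulable again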


Analyzing scheduling failures

Check where a Pod was scheduled:
kubectl get pod <NAME> -o wide
Check why scheduling failed: kubectl describe pod <NAME>

  • insufficient CPU/memory on the nodes
  • a taint with no matching toleration
  • no node matches the Pod's label selector

A Pod in the Pending state usually means:
1. the image is still being pulled, or
2. scheduling failed for one of the reasons above
Insufficient CPU:
0/3 nodes are available: 3 Insufficient cpu.
No node matches the label selector:
0/3 nodes are available: 3 node(s) didn't match node selector
No toleration for a taint:
0/3 nodes are available: 1 node(s) had taint {disktype: ssd}, that the pod didn't tolerate, 1 node(s) had taint {gpu: yes}, that the pod didn't tolerate, 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

