Pod in Depth: Probes

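Probes are the mechanism for monitoring the application inside a container. The kubelet uses the different probe types to determine the application's current state.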

I. Types

1. StartupProbe

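A probe added in Kubernetes 1.16, used to determine whether the application has finished starting.
When a startupProbe is configured, the other probes are disabled until it succeeds; only then do the liveness and readiness probes begin.
Purpose: it is often hard to predict exactly how long an application takes to start, which makes it awkward to choose an initial delay for the other two probes. With a startupProbe configured, the other two probes run only after the application has started successfully, so the three are easier to combine.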

startupProbe:
  httpGet:
    path: /api/startup
    port: 80
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
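
As a rough illustration of the startup budget: the application has at most about failureThreshold * periodSeconds seconds to start, so the settings above allow roughly 3 * 10 = 30 seconds before the container is killed and restarted. The sketch below uses illustrative values to raise that budget to about 300 seconds for a slow-starting application. The /api/startup path and port are reused from the example above; the /health endpoint is an assumption.

```yaml
# Sketch only: the values are illustrative, not recommendations.
startupProbe:
  httpGet:
    path: /api/startup
    port: 80
  failureThreshold: 30   # up to 30 failed attempts ...
  periodSeconds: 10      # ... 10s apart, roughly 300s to finish starting
livenessProbe:           # starts running only after startupProbe succeeds
  httpGet:
    path: /health        # assumed health endpoint
    port: 80
  periodSeconds: 10
  failureThreshold: 3
```

With this split, the liveness probe can keep a short failure window without restarting the container while it is still booting.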

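2. LivenessProbe (is the application healthy?)

Detects whether the application in the container is still running. If the probe fails, the kubelet kills the container, which is then restarted according to the Pod's restart policy. If no liveness probe is configured, the result defaults to success and probing never triggers a restart.

For example, after a Java application runs out of memory, the application may stop working while the Pod still exists. The container is restarted according to the restartPolicy in the YAML, but how does the kubelet know the application is broken? That is what the liveness probe is for.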

livenessProbe:
  failureThreshold: 5
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 60
  periodSeconds: 10 # interval between probes
  successThreshold: 1
  timeoutSeconds: 5

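3. ReadinessProbe (has initialization finished?)

Detects whether the application in the container is ready to serve. If the probe returns success, the container is considered fully started and able to receive external traffic.

For example, after the service starts it may still need to initialize, load data into memory, or do some data processing. Until that work finishes, the container should not be marked ready and should not receive external requests. The readiness probe handles this case.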

readinessProbe:
  failureThreshold: 3 # number of failures allowed
  httpGet:
    path: /ready
    port: 8181
    scheme: HTTP
  periodSeconds: 10 # interval between probes
  successThreshold: 1
  timeoutSeconds: 1
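
The three probe types are configured per container and can be used together. The manifest below is a minimal sketch of where they sit in a Pod spec. The Pod name, container name, image, port, and endpoint paths are all placeholders chosen for illustration, not values from a real application.

```yaml
# Sketch only: every name, image, port, and path here is a placeholder.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  restartPolicy: Always        # applied when the liveness or startup probe fails
  containers:
    - name: app
      image: example/app:1.0
      ports:
        - containerPort: 8080
      startupProbe:            # runs first; the other two wait for it
        httpGet:
          path: /api/startup
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      livenessProbe:           # failure leads to a container restart
        httpGet:
          path: /health
          port: 8080
        periodSeconds: 10
        failureThreshold: 5
      readinessProbe:          # failure stops traffic, no restart
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
```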

II. Probe Methods

1. ExecAction

Runs a command inside the container. If the exit code is 0, the container is considered healthy.

livenessProbe:
  exec:
    command:
      - cat
      - /health

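2. TCPSocketAction

Opens a TCP connection to check whether the port inside the container is open. If the connection succeeds, the container is considered healthy.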

livenessProbe:
  tcpSocket:
    port: 80

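3. HTTPGetAction

The method most commonly used for Java applications in production. It sends an HTTP GET request to the application in the container; if the response status code is at least 200 and below 400, the container is considered healthy.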

livenessProbe:
  failureThreshold: 5
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
    httpHeaders:
      - name: xxx
        value: xxx

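III. Parameters

A rough estimate of the time from when a container fails completely until the application is stopped: elapsed = (failureThreshold * timeoutSeconds) + (failureThreshold * periodSeconds - failureThreshold)

initialDelaySeconds: 60 # initial delay, the approach used before startupProbe existed: readiness and liveness probes do not run until it has elapsed. It does not replace startupProbe, because the actual startup time cannot be estimated
timeoutSeconds: 2 # timeout for a single probe
periodSeconds: 5 # interval between probes
successThreshold: 1 # one successful check counts as success
failureThreshold: 2 # two failed checks count as failure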
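
To make the estimate concrete, the fragment below plugs the sample values from this section into the formula stated above. It is a worked example of this article's rough estimate, not an exact kubelet guarantee. The /health endpoint and port are assumptions.

```yaml
livenessProbe:
  httpGet:
    path: /health          # assumed endpoint
    port: 8080             # assumed port
  initialDelaySeconds: 60
  timeoutSeconds: 2
  periodSeconds: 5
  successThreshold: 1
  failureThreshold: 2
# Estimate using the formula from this section:
#   (failureThreshold * timeoutSeconds) + (failureThreshold * periodSeconds - failureThreshold)
#   = (2 * 2) + (2 * 5 - 2)
#   = 4 + 8
#   = 12 seconds
```

Raising failureThreshold or periodSeconds makes the probe more tolerant of brief hiccups at the cost of slower failure detection.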

IV. Usage

1. A configuration to use as a reference

kubectl get deployment -n kube-system
# Open the CoreDNS Deployment YAML (a useful reference for studying probe parameters)
kubectl edit deploy coredns -n kube-system

2. Specify a StartupProbe and its probe method in nginx-demo.yaml

# startupProbe: timeout 5s, 3 failures allowed, 10s interval
startupProbe:
  #httpGet:
  #  path: /index.html
  #tcpSocket:
  #  port: 80
  exec:
    command:
      - sh
      - -c
      - "echo success > /inited"
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
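
For context, here is a minimal sketch of what a complete nginx-demo.yaml might look like with this startupProbe in place. The Pod name nginx-po and container name nginx match the kubectl commands used later in this article; the image tag and container port are assumptions.

```yaml
# nginx-demo.yaml (sketch; the image tag is an assumption)
apiVersion: v1
kind: Pod
metadata:
  name: nginx-po
spec:
  containers:
    - name: nginx
      image: nginx:1.25
      ports:
        - containerPort: 80
      startupProbe:              # probes are set per container
        exec:
          command:
            - sh
            - -c
            - "echo success > /inited"
        failureThreshold: 3
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
```

The liveness and readiness probes shown in the following steps go at the same level as startupProbe, under the container entry.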

3. Specify a LivenessProbe and its probe method in nginx-demo.yaml

startupProbe:
  exec:
    command:
      - sh
      - -c
      - "sleep 3;echo success > /inited"
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
# livenessProbe: timeout 5s, 3 failures allowed, 10s interval; the probe passes once /started.html can be fetched
livenessProbe:
  httpGet:
    path: /started.html
    port: 80
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5

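At first, the output of kubectl describe reports the probe failing with a 404 for /started.html. During the first or second retry, run the following commands to copy the HTML file into the container.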

echo 'started' > started.html
kubectl cp started.html nginx-po:/usr/share/nginx/html

Once the file is in place, the third attempt succeeds and the container runs normally.

4. Specify a ReadinessProbe and its probe method in nginx-demo.yaml

startupProbe:
  exec:
    command:
      - sh
      - -c
      - "sleep 3;echo success > /inited"
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
# readinessProbe: timeout 3s, 5 failures allowed, 10s interval; the probe passes once /started.html can be fetched
readinessProbe:
  httpGet:
    path: /started.html
    port: 80
  failureThreshold: 5
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 3

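At first, the output of kubectl describe reports the probe failing with a 404 for /started.html, and the Pod cannot be reached from outside. During the first or second retry, run the following commands to copy the HTML file into the container.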

echo 'started' > started.html
kubectl cp started.html nginx-po:/usr/share/nginx/html

Once the file is in place, the third attempt succeeds and the Pod can be reached from outside.

5. Delete and recreate the Pod

kubectl delete po nginx-po
kubectl get pods
kubectl create -f nginx-demo.yaml
kubectl describe po nginx-po
kubectl exec -it nginx-po -c nginx -- cat /inited    # run a command inside the container
kubectl cp started.html nginx-po:/usr/share/nginx/html  # copy the file into the container

posted @ 2023-10-12 12:35 yifanSJ