Kubernetes: Pod Liveness Probes

Pod Liveness Probes

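  Many programs gradually become unusable after running for a long time and can only be recovered by a restart. The Kubernetes container liveness probe mechanism detects problems of this kind and, based on the probe result together with the Pod's restart policy, triggers the follow-up action.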

  A liveness probe is a container-level setting; the kubelet uses it to decide when a container needs to be restarted.

1. Configuring an exec probe

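  An exec probe determines a container's health by running a user-defined command inside the target container: an exit status of 0 means the check "succeeded", and any non-zero value means it "failed". The "spec.containers.livenessProbe.exec" field defines this kind of check; it has a single property, "command", which specifies the command to run.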

spec:
  containers:
  - name: liveness-exec-demo
    image: busybox
    args: ["/bin/sh", "-c", "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600"]
    livenessProbe:
      exec:
        command: ["test", "-e", "/tmp/healthy"]

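  The manifest above starts a container from the busybox image that runs the command "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600". The command creates the file /tmp/healthy when the container starts and deletes it 60 seconds later.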

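  The liveness probe runs "test -e /tmp/healthy" to check whether /tmp/healthy exists. While the file exists the command returns exit status 0 and the probe passes. Once the file has been deleted the command returns a non-zero status, and after failureThreshold consecutive failures (3 by default) the kubelet restarts the container.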
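  The fragment above starts at "spec:", so it cannot be applied on its own. The following is a minimal sketch of a complete Pod manifest built around it. The Pod name, the label, and the two timing values are assumptions added for illustration and are not part of the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec            # hypothetical Pod name
  labels:
    test: liveness               # hypothetical label
spec:
  restartPolicy: Always          # the default; a failed liveness probe leads to a container restart
  containers:
  - name: liveness-exec-demo
    image: busybox
    # Create the marker file, remove it after 60 s, then keep the container running.
    args: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 60; rm -rf /tmp/healthy; sleep 600"]
    livenessProbe:
      exec:
        # Exit status 0 while the file exists, non-zero once it is gone.
        command: ["test", "-e", "/tmp/healthy"]
      initialDelaySeconds: 5     # assumed value: wait 5 s before the first probe
      periodSeconds: 5           # assumed value: probe every 5 s
```

  Assuming the manifest is saved as liveness-exec.yaml, it can be created with "kubectl apply -f liveness-exec.yaml". With these assumed values, the container should be restarted roughly 15 seconds after the file is deleted (three failed probes, 5 seconds apart). "kubectl describe pod liveness-exec" lists the probe failures under Events, and the RESTARTS column of "kubectl get pod liveness-exec" increases with each cycle.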

2. Configuring an HTTP probe

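  An HTTP-based probe (HTTPGetAction) sends an HTTP GET request to the target container and judges the result by the response code: a code of the form 2xx or 3xx counts as a pass. The "spec.containers.livenessProbe.httpGet" field defines this kind of check, and its configurable fields are as follows: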

kubectl explain deployment.spec.template.spec.containers.livenessProbe.httpGet
KIND:     Deployment
VERSION:  apps/v1
 
RESOURCE: httpGet <Object>
 
DESCRIPTION:
     HTTPGet specifies the http request to perform.
 
     HTTPGetAction describes an action based on HTTP Get requests.
 
FIELDS:
   host <string>
     Host name to connect to, defaults to the pod IP. You probably want to set
     "Host" in httpHeaders instead.
 
   httpHeaders  <[]Object>
     Custom headers to set in the request. HTTP allows repeated headers.
 
   path <string>
     Path to access on the HTTP server.
 
   port <string> -required-
     Name or number of the port to access on the container. Number must be in
     the range 1 to 65535. Name must be an IANA_SVC_NAME.
 
   scheme   <string>
     Scheme to use for connecting to the host. Defaults to HTTP.

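  Below is a livenessProbe fragment from the manifest file livenesshttp.yaml. It sends GET /actuator/health to port 8291 over plain HTTP, starting 60 seconds after the container starts and repeating every 5 seconds with a 1-second timeout; the container is restarted only after 30 consecutive failures: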

      livenessProbe:
        httpGet:
          path: /actuator/health
          port: 8291
          scheme: HTTP
        initialDelaySeconds: 60
        timeoutSeconds: 1
        periodSeconds: 5
        successThreshold: 1
        failureThreshold: 30
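
  To show where such a fragment sits in a full manifest, here is a minimal sketch of a Deployment that carries the same probe settings. The Deployment name, the labels, the container name, and the image are placeholders invented for illustration; only the probe values come from the example above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: liveness-http-demo                       # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: liveness-http-demo                    # hypothetical label
  template:
    metadata:
      labels:
        app: liveness-http-demo
    spec:
      containers:
      - name: app                                # hypothetical container name
        image: registry.example.com/demo-app:1.0 # placeholder image, not from the original
        ports:
        - name: http
          containerPort: 8291
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: http                           # the named port above, i.e. 8291
            scheme: HTTP
          initialDelaySeconds: 60                # first probe about 60 s after the container starts
          timeoutSeconds: 1                      # a response slower than 1 s counts as a failure
          periodSeconds: 5                       # probe every 5 s
          successThreshold: 1                    # must be 1 for liveness probes
          failureThreshold: 30                   # 30 failures x 5 s = roughly 150 s before a restart
```

  The large failureThreshold is a design choice: it lets a slow-starting application fail its health check for about two and a half minutes before the kubelet restarts it, at the cost of reacting just as slowly to a genuine hang later on.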

3. Configuring a TCP probe

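  A TCP-based liveness probe (TCPSocketAction) tries to open a TCP connection to a specific port of the container; if the connection is established, the check passes. Compared with an HTTP probe it is cheaper and uses fewer resources, but it is less precise, since a successful connection does not necessarily mean the page or resource behind it is available. The "spec.containers.livenessProbe.tcpSocket" field defines this kind of check, and it has the following properties: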

kubectl explain deployment.spec.template.spec.containers.livenessProbe.tcpSocket
KIND:     Deployment
VERSION:  apps/v1
 
RESOURCE: tcpSocket <Object>
 
DESCRIPTION:
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported
 
     TCPSocketAction describes an action based on opening a socket
 
FIELDS:
   host <string>
     Optional: Host name to connect to, defaults to the pod IP.
 
   port <string> -required-
     Number or name of the port to access on the container. Number must be in
     the range 1 to 65535. Name must be an IANA_SVC_NAME.

  Below is an example defined in the manifest file liveness-tcp.yaml. It opens a connection to port 80/tcp on the Pod IP and judges the result by whether the connection is established:

spec:
  containers:
  - name: liveness-tcp-demo
    image: nginx:1.18-alpine
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: http
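
  As the kubectl explain output notes, "port" accepts either a name or a number. The sketch below is a complete Pod manifest that uses the numeric form, which in this Pod is equivalent to "port: http". The Pod name and the explicit timing values are assumptions added for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp             # hypothetical Pod name
spec:
  containers:
  - name: liveness-tcp-demo
    image: nginx:1.18-alpine
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 80                 # numeric form; same target as the named port "http"
      periodSeconds: 10          # assumed value (also the default)
      timeoutSeconds: 1          # assumed value (also the default)
```

  Using the port name keeps the probe in sync if containerPort changes later, while the number is more direct when the container declares no named ports.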

4. Liveness probe behavior properties

kubectl explain deployment.spec.template.spec.containers.livenessProbe
KIND:     Deployment
VERSION:  apps/v1
 
RESOURCE: livenessProbe <Object>
 
DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
 
     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.
 
FIELDS:
   exec <Object>
     One and only one of the following should be specified. Exec specifies the
     action to take.
 
   failureThreshold <integer>
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.
 
   httpGet  <Object>
     HTTPGet specifies the http request to perform.
 
   initialDelaySeconds  <integer>
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
 
   periodSeconds    <integer>
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.
 
   successThreshold <integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness and startup.
     Minimum value is 1.
 
   tcpSocket    <Object>
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported
 
   timeoutSeconds   <integer>
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes 
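
  To make the behavior fields listed above concrete, the following sketch writes every one of them out with its default value, so leaving them all out would behave the same way. The Pod and container names are hypothetical, and the probe target (nginx answering on / at port 80) is an assumption chosen only to keep the manifest self-contained.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-defaults        # hypothetical Pod name
spec:
  containers:
  - name: web                    # hypothetical container name
    image: nginx:1.18-alpine
    livenessProbe:
      httpGet:                   # exactly one of exec, httpGet, tcpSocket is set
        path: /
        port: 80
      initialDelaySeconds: 0     # default: start probing as soon as the container has started
      periodSeconds: 10          # default: probe every 10 s (minimum 1)
      timeoutSeconds: 1          # default: each probe times out after 1 s (minimum 1)
      successThreshold: 1        # default, and the only value allowed for liveness probes
      failureThreshold: 3        # default: restart after 3 consecutive failures (minimum 1)
```

  With these defaults a container that stops responding is restarted after about 30 seconds of consecutive failures (3 probes, 10 seconds apart).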
posted @ 左扬