Kubernetes: Pod Liveness Probes
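Many programs gradually become unusable after running for a long time and can only be recovered by a restart. The Kubernetes container liveness probe mechanism detects problems of this kind and, based on the probe result together with the restart policy, triggers the follow-up action.
A liveness probe is configured at the container level; the kubelet uses it to decide when a container needs to be restarted.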
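To make the "container level" point concrete, here is a minimal sketch of a two-container Pod. The Pod name, container names, and the choice of a TCP check (covered in section 3) are assumptions made for illustration only.

```yaml
# Sketch only: names and images are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-scope-demo        # hypothetical name
spec:
  restartPolicy: Always            # Pod level: applies to every container in the Pod
  containers:
  - name: app
    image: nginx:1.18-alpine
    livenessProbe:                 # container level: only this container is probed
      tcpSocket:
        port: 80
  - name: sidecar                  # no livenessProbe, so no liveness-triggered restarts
    image: busybox
    args: ["/bin/sh", "-c", "sleep 3600"]
```

If the probe on "app" fails, the kubelet restarts only that container; "sidecar" keeps running.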
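1. Configuring an exec probe
An exec probe judges a container's health by running a user-defined command inside the target container: an exit status of 0 means the check passed, and any non-zero status means it failed. The "spec.containers.livenessProbe.exec" field defines this type of check; its only attribute, "command", specifies the command to run.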
    spec:
      containers:
      - name: liveness-exec-demo
        image: busybox
        args: ["/bin/sh", "-c", "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600"]
        livenessProbe:
          exec:
            command: ["test", "-e", "/tmp/healthy"]
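The example above starts a container from the busybox image that runs the command "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600". The command creates the file /tmp/healthy when the container starts and deletes it 60 seconds later.
The liveness probe runs "test -e /tmp/healthy" to check whether /tmp/healthy exists. While the file exists, the command returns status code 0 and the check passes. Once the file has been deleted, the command returns a non-zero status, and after the configured number of consecutive failures the kubelet restarts the container according to the restart policy.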
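The fragment above omits the top-level fields a manifest needs before it can be applied. A complete version might look like the sketch below; the apiVersion, kind, and metadata lines are additions, and the Pod name simply reuses the container name as an assumption.

```yaml
# Sketch only: apiVersion, kind, and metadata are added so the fragment can be applied.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-demo         # assumed Pod name, reusing the container name
spec:
  containers:
  - name: liveness-exec-demo
    image: busybox
    # Create the marker file, remove it after 60 seconds, then keep running.
    args: ["/bin/sh", "-c", "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600"]
    livenessProbe:
      exec:
        # Exit status 0 while the file exists, non-zero once it is gone.
        command: ["test", "-e", "/tmp/healthy"]
```

With the default probe settings (a check every 10 seconds, three consecutive failures tolerated), the container should be restarted roughly 30 seconds after the file is removed. Because the restarted container creates the file again, the cycle then repeats. Assuming the manifest is saved under a name such as liveness-exec.yaml, "kubectl apply -f liveness-exec.yaml" followed by "kubectl describe pod liveness-exec-demo" should show the probe failures and restarts in the Pod's events.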
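2. Configuring an HTTP probe
An HTTP-based probe (HTTPGetAction) sends an HTTP request to the target container and judges the result by the response code: a 2xx or 3xx response means the check passed. The "spec.containers.livenessProbe.httpGet" field defines this type of check, and it accepts the following configuration fields: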
    kubectl explain deployment.spec.template.spec.containers.livenessProbe.httpGet
    KIND:     Deployment
    VERSION:  apps/v1

    RESOURCE: httpGet <Object>

    DESCRIPTION:
         HTTPGet specifies the http request to perform.

         HTTPGetAction describes an action based on HTTP Get requests.

    FIELDS:
       host <string>
         Host name to connect to, defaults to the pod IP. You probably want to set
         "Host" in httpHeaders instead.

       httpHeaders <[]Object>
         Custom headers to set in the request. HTTP allows repeated headers.

       path <string>
         Path to access on the HTTP server.

       port <string> -required-
         Name or number of the port to access on the container. Number must be in
         the range 1 to 65535. Name must be an IANA_SVC_NAME.

       scheme <string>
         Scheme to use for connecting to the host. Defaults to HTTP.
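As a rough illustration of how these fields fit together, the fragment below sets each of them except "host". The path, port, and header values are invented for this sketch and do not come from the original manifests.

```yaml
# Sketch only: every value below is a hypothetical placeholder.
livenessProbe:
  httpGet:
    path: /healthz                 # hypothetical health endpoint
    port: 8080                     # hypothetical port; a named container port also works
    scheme: HTTPS                  # defaults to HTTP when omitted
    httpHeaders:                   # optional custom request headers
    - name: Host
      value: app.example.com       # hypothetical virtual host, set here rather than via "host"
```

When the scheme is HTTPS, the kubelet sends the request without verifying the server certificate, so a self-signed certificate is normally enough for the probe to pass.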
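The following example is taken from the manifest file livenesshttp.yaml. The manifest is described as using a postStart hook under lifecycle to create a page file, healthz, dedicated to httpGet checks; the fragment shown here contains only the livenessProbe section, which requests /actuator/health on port 8291: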
    livenessProbe:
      httpGet:
        path: /actuator/health
        port: 8291
        scheme: HTTP
      initialDelaySeconds: 60
      timeoutSeconds: 1
      periodSeconds: 5
      successThreshold: 1
      failureThreshold: 30
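The postStart approach mentioned above is not visible in that fragment. One way it could look is sketched below. This is not the original livenesshttp.yaml: the Pod name, the nginx image, the page content, and the document-root path are all assumptions.

```yaml
# Sketch only: not the original livenesshttp.yaml; names and paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-http-demo         # hypothetical name
spec:
  containers:
  - name: liveness-http-demo
    image: nginx:1.18-alpine
    ports:
    - name: http
      containerPort: 80
    lifecycle:
      postStart:
        exec:
          # Create the page that the probe will request.
          command: ["/bin/sh", "-c", "echo Healthy > /usr/share/nginx/html/healthz"]
    livenessProbe:
      httpGet:
        path: /healthz
        port: http                 # refers to the named container port above
        scheme: HTTP
```

Deleting the healthz file inside the running container should make the request return 404. Once the failure threshold is reached, the kubelet would restart the container, and the hook would create the page again.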
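3. Configuring a TCP probe
A TCP-based liveness probe (TCPSocketAction) tries to open a TCP connection to a specific port of the container; the check passes if the connection is established. Compared with an HTTP probe it is more efficient and uses fewer resources, but it is somewhat less precise, because a successful connection does not necessarily mean that the page resources are available. The "spec.containers.livenessProbe.tcpSocket" field defines this type of check; its main attributes are as follows: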
    kubectl explain deployment.spec.template.spec.containers.livenessProbe.tcpSocket
    KIND:     Deployment
    VERSION:  apps/v1

    RESOURCE: tcpSocket <Object>

    DESCRIPTION:
         TCPSocket specifies an action involving a TCP port. TCP hooks not yet
         supported

         TCPSocketAction describes an action based on opening a socket

    FIELDS:
       host <string>
         Optional: Host name to connect to, defaults to the pod IP.

       port <string> -required-
         Number or name of the port to access on the container. Number must be in
         the range 1 to 65535. Name must be an IANA_SVC_NAME.
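The following example, defined in the manifest file liveness-tcp.yaml, attempts a connection to port 80/tcp on the Pod IP and judges the result by whether the connection is established: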
    spec:
      containers:
      - name: liveness-tcp-demo
        image: nginx:1.18-alpine
        ports:
        - name: http
          containerPort: 80
        livenessProbe:
          tcpSocket:
            port: http
4. Liveness probe behavior attributes
    kubectl explain deployment.spec.template.spec.containers.livenessProbe
    KIND:     Deployment
    VERSION:  apps/v1

    RESOURCE: livenessProbe <Object>

    DESCRIPTION:
         Periodic probe of container liveness. Container will be restarted if the
         probe fails. Cannot be updated. More info:
         https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

         Probe describes a health check to be performed against a container to
         determine whether it is alive or ready to receive traffic.

    FIELDS:
       exec <Object>
         One and only one of the following should be specified. Exec specifies the
         action to take.

       failureThreshold <integer>
         Minimum consecutive failures for the probe to be considered failed after
         having succeeded. Defaults to 3. Minimum value is 1.

       httpGet <Object>
         HTTPGet specifies the http request to perform.

       initialDelaySeconds <integer>
         Number of seconds after the container has started before liveness probes
         are initiated. More info:
         https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

       periodSeconds <integer>
         How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
         value is 1.

       successThreshold <integer>
         Minimum consecutive successes for the probe to be considered successful
         after having failed. Defaults to 1. Must be 1 for liveness and startup.
         Minimum value is 1.

       tcpSocket <Object>
         TCPSocket specifies an action involving a TCP port. TCP hooks not yet
         supported

       timeoutSeconds <integer>
         Number of seconds after which the probe times out. Defaults to 1 second.
         Minimum value is 1. More info:
         https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
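To show how these behavior attributes combine, the sketch below writes every timing field out explicitly at its default value. The tcpSocket check and the named port "http" are assumed here only to give the fragment a probe action.

```yaml
# Sketch only: all timing values are the defaults, written out for illustration.
livenessProbe:
  tcpSocket:
    port: http                     # assumes a container port named "http"
  initialDelaySeconds: 0           # wait after container start before the first probe (default 0)
  periodSeconds: 10                # interval between probes (default 10, minimum 1)
  timeoutSeconds: 1                # time allowed for each probe (default 1, minimum 1)
  successThreshold: 1              # must be 1 for liveness probes
  failureThreshold: 3              # consecutive failures before a restart (default 3)
```

A rough rule of thumb is that a container that stops responding is restarted after about failureThreshold × periodSeconds, which is 30 seconds with these defaults. By the same reasoning, the HTTP fragment in section 2 (periodSeconds: 5, failureThreshold: 30) tolerates about 150 seconds of consecutive failures, and it starts probing only after its 60-second initial delay.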