k8s--Container Probes

Introduction to Container Probes

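Container probes check whether the application instance in a container is working properly; they are a long-established mechanism for keeping a service available. If a probe finds that an instance is not in the expected state, k8s takes that instance out of rotation so that it carries no business traffic. The two kinds of probes covered here are:

  • liveness probes: check whether the application instance is currently running normally; if not, k8s restarts the container
  • readiness probes: check whether the application instance can currently accept requests; if not, k8s does not forward traffic to it

Both kinds of probes support the following three probe methods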

  • Exec: runs a command inside the container; if the command exits with code 0, the program is considered healthy, otherwise unhealthy
……
  livenessProbe:
    exec:
      command:
      - cat
      - /tmp/healthy
……
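  • TCPSocket: tries to connect to a port of the user container; if the connection can be established, the program is considered healthy, otherwise unhealthy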
……      
  livenessProbe:
    tcpSocket:
      port: 8080
……
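  • HTTPGet: requests a URL of the web application in the container; if the returned status code is at least 200 and below 400, the program is considered healthy, otherwise unhealthy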
……
  livenessProbe:
    httpGet:
      path: / # URI path
      port: 80 # port number
      host: 127.0.0.1 # host address
      scheme: HTTP # protocol, HTTP or HTTPS
……
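
A readiness probe is declared the same way, under readinessProbe instead of livenessProbe. The fragment below is only a sketch and is not one of the demos in this article: the container name, image, and port mirror the nginx examples used later, and the probe path / is an assumption.

```yaml
# Hypothetical container spec fragment: the same nginx container as in the
# demos, with a readiness probe next to a liveness probe.
containers:
- name: nginx
  image: nginx:1.14
  ports:
  - name: nginx-port
    containerPort: 80
  readinessProbe:      # on failure: no traffic is forwarded, container is NOT restarted
    httpGet:
      path: /
      port: 80
  livenessProbe:       # on failure: the kubelet restarts the container
    tcpSocket:
      port: 80
```

While the readiness probe is failing, kubectl get pod shows the pod as 0/1 in the READY column even though its STATUS can still be Running.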

The demos below use liveness probes as the example

Method 1: Exec

Create pod-liveness-exec.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec
  namespace: zouzou
spec:
  containers:
  - name: nginx
    image: nginx:1.14
    ports: 
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        command: ["/bin/cat","/tmp/hello.txt"] # cat a file inside the container; the file does not exist there, so the probe fails

Create the pod and observe the result

# Create the pod
[root@dce-10-6-215-215 tmp]# kubectl apply -f pod-liveness-exec.yaml
pod/pod-liveness-exec created

# Check the pod: the container has already restarted 2 times
[root@dce-10-6-215-215 tmp]# kubectl get pod pod-liveness-exec -n zouzou
NAME                READY   STATUS    RESTARTS   AGE
pod-liveness-exec   1/1     Running   2          70s

# Check the pod details: the events show  Liveness probe failed: /bin/cat: /tmp/hello.txt: No such file or directory
[root@dce-10-6-215-215 tmp]# kubectl describe pod pod-liveness-exec -n zouzou
Name:         pod-liveness-exec
Namespace:    zouzou
Priority:     0
.....
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  85s                default-scheduler  Successfully assigned zouzou/pod-liveness-exec to dce-10-6-215-200
  Normal   Pulled     22s (x3 over 82s)  kubelet            Container image "nginx:1.14" already present on machine
  Normal   Created    22s (x3 over 81s)  kubelet            Created container nginx
  Normal   Killing    22s (x2 over 52s)  kubelet            Container nginx failed liveness probe, will be restarted
  Normal   Started    21s (x3 over 81s)  kubelet            Started container nginx
  Warning  Unhealthy  2s (x8 over 72s)   kubelet            Liveness probe failed: /bin/cat: /tmp/hello.txt: No such file or directory # the error message

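As shown, the container keeps restarting because the file does not exist inside it. Next, delete this pod, then change the file to use a command that succeeds

kubectl delete -f pod-liveness-exec.yaml # delete the pod

Modify pod-liveness-exec.yaml so that it reads as follows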

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec
  namespace: zouzou
spec:
  containers:
  - name: nginx
    image: nginx:1.14
    ports: 
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        command: ["/bin/ls","/tmp"] # list a directory inside the container; this succeeds because /tmp exists

Create the pod and check the result

# Create the pod
kubectl apply -f pod-liveness-exec.yaml

Check the pod and its events

# No restarts this time
[root@dce-10-6-215-215 tmp]# kubectl get pod pod-liveness-exec -n zouzou
NAME                READY   STATUS    RESTARTS   AGE
pod-liveness-exec   1/1     Running   0          50s

# Check the events: everything is normal
[root@dce-10-6-215-215 tmp]# kubectl describe pod pod-liveness-exec -n zouzou
Name:         pod-liveness-exec
Namespace:    zouzou
Priority:     0
Node:         dce-10-6-215-200/10.6.215.200
......
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  61s   default-scheduler  Successfully assigned zouzou/pod-liveness-exec to dce-10-6-215-200
  Normal  Pulled     59s   kubelet            Container image "nginx:1.14" already present on machine
  Normal  Created    59s   kubelet            Created container nginx
  Normal  Started    58s   kubelet            Started container nginx

Method 2: TCPSocket

A tcpSocket probe connects to a given port to see whether it is reachable

Create pod-liveness-tcpsocket.yaml with the following content

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket
  namespace: zouzou
spec:
  containers:
  - name: nginx
    image: nginx:1.14
    ports: 
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 8080 # try port 8080; the only container is nginx, which listens on port 80 only, so port 8080 is unreachable

Create the pod and observe the result

# Create the pod
[root@dce-10-6-215-215 tmp]# kubectl apply -f pod-liveness-tcpsocket.yaml
pod/pod-liveness-tcpsocket created

# Check the pod: RESTARTS is non-zero
[root@dce-10-6-215-215 tmp]# kubectl get pod pod-liveness-tcpsocket -n zouzou
NAME                     READY   STATUS    RESTARTS   AGE
pod-liveness-tcpsocket   1/1     Running   1          34s

# Check the events: the error shows that the port is unreachable
[root@dce-10-6-215-215 tmp]# kubectl describe pod pod-liveness-tcpsocket -n zouzou
Name:         pod-liveness-tcpsocket
Namespace:    zouzou
Priority:     0
Node:         dce-10-6-215-200/10.6.215.200
Start Time:   Fri, 15 Apr 2022 18:25:10 +0800
......
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  48s                default-scheduler  Successfully assigned zouzou/pod-liveness-tcpsocket to dce-10-6-215-200
  Normal   Pulled     18s (x2 over 45s)  kubelet            Container image "nginx:1.14" already present on machine
  Normal   Killing    18s                kubelet            Container nginx failed liveness probe, will be restarted
  Normal   Created    17s (x2 over 45s)  kubelet            Created container nginx
  Normal   Started    17s (x2 over 44s)  kubelet            Started container nginx
  Warning  Unhealthy  8s (x4 over 38s)   kubelet            Liveness probe failed: dial tcp 172.29.34.232:8080: connect: connection refused # connection to port 8080 failed

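As shown, the container keeps restarting because nothing is listening on port 8080 inside it. Next, delete this pod, then change the file to use a port that works

# Delete the pod
kubectl delete -f pod-liveness-tcpsocket.yaml

Modify pod-liveness-tcpsocket.yaml so that it reads as follows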

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket
  namespace: zouzou
spec:
  containers:
  - name: nginx
    image: nginx:1.14
    ports: 
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 80 # port 80 is reachable

Create the pod and check the result

# Create the pod
kubectl apply -f pod-liveness-tcpsocket.yaml

Check the pod and its events

# Check the pod: no restarts
[root@dce-10-6-215-215 tmp]# kubectl get pod pod-liveness-tcpsocket -n zouzou
NAME                     READY   STATUS    RESTARTS   AGE
pod-liveness-tcpsocket   1/1     Running   0          93s

# Check the events: everything is normal
[root@dce-10-6-215-215 tmp]# kubectl describe pod pod-liveness-tcpsocket -n zouzou
Name:         pod-liveness-tcpsocket
Namespace:    zouzou
Priority:     0
Node:         dce-10-6-215-200/10.6.215.200
Start Time:   Fri, 15 Apr 2022 18:30:32 +0800
.....
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  100s  default-scheduler  Successfully assigned zouzou/pod-liveness-tcpsocket to dce-10-6-215-200
  Normal  Pulled     97s   kubelet            Container image "nginx:1.14" already present on machine
  Normal  Created    97s   kubelet            Created container nginx
  Normal  Started    96s   kubelet            Started container nginx

Method 3: HTTPGet

An httpGet probe requests a URL to see whether the request succeeds

Create pod-liveness-httpget.yaml with the following content

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: zouzou
spec:
  containers:
  - name: nginx
    image: nginx:1.14
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet:  # effectively requests http://<Pod IP>:80/hello; nginx does not serve this path
        scheme: HTTP # protocol, HTTP or HTTPS
        port: 80 # port number
        path: /hello # URI path
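
The httpGet fields are:

  • host: the host name to connect to; defaults to the Pod IP. Alternatively, set "Host" in the HTTP headers instead.
  • scheme: the scheme used to connect to the host (HTTP or HTTPS). Defaults to "HTTP".
  • path: the path to access on the HTTP server. Defaults to "/".
  • httpHeaders: custom headers to set in the request. HTTP allows repeated headers.
  • port: the number or name of the container port to access. A number must be in the range 1 to 65535.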
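
Because port accepts a port name as well as a number, a probe can refer to a named containerPort instead of repeating the number. A small sketch, assuming the nginx-port name declared in the manifest above:

```yaml
# Fragment of a container spec; only the port-related fields are shown.
ports:
- name: nginx-port
  containerPort: 80
livenessProbe:
  httpGet:
    scheme: HTTP
    port: nginx-port   # resolved through the containerPort name, i.e. port 80
    path: /
```

With a named port, changing containerPort later does not require touching the probe.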

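For an HTTP probe, the kubelet sends an HTTP request to the specified path and port to perform the check. Unless the host field in httpGet is set, the kubelet sends the probe to the Pod's IP address. If the scheme field is set to HTTPS, the kubelet sends an HTTPS request and skips certificate verification. In most cases you do not need to set the host field. One scenario where you do: the container listens on 127.0.0.1 and the Pod's hostNetwork field is set to true; the host field in httpGet should then be set to 127.0.0.1. In the probably more common case where the Pod relies on virtual hosts, do not set the host field; set the Host header in httpHeaders instead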
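
The virtual-host case described above can be sketched as follows. The host name www.example.com is a placeholder chosen for illustration and does not come from the demos in this article:

```yaml
# Fragment of a container spec: the probe still connects to the Pod IP,
# but the request carries a Host header for the virtual host.
livenessProbe:
  httpGet:
    path: /
    port: 80
    httpHeaders:
    - name: Host
      value: www.example.com   # hypothetical virtual host name
```

The host field is left unset here, so the kubelet keeps connecting to the Pod's IP address and only the Host header changes.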

For an HTTP probe, the kubelet sends two request headers in addition to the mandatory Host header: User-Agent and Accept. Their default values are kube-probe/1.24 (where 1.24 is the kubelet version) and */* respectively

Create the pod and observe the result

# Create the pod
[root@dce-10-6-215-215 tmp]# kubectl apply -f pod-liveness-httpget.yaml
pod/pod-liveness-httpget created

# Check the pod: RESTARTS is non-zero
[root@dce-10-6-215-215 tmp]# kubectl get pod pod-liveness-httpget -n zouzou
NAME                   READY   STATUS    RESTARTS   AGE
pod-liveness-httpget   1/1     Running   1          37s

# Check the events: the returned status code is 404
[root@dce-10-6-215-215 tmp]# kubectl describe pod pod-liveness-httpget -n zouzou
Name:         pod-liveness-httpget
Namespace:    zouzou
Priority:     0
Node:         dce-10-6-215-200/10.6.215.200
Start Time:   Fri, 15 Apr 2022 18:36:18 +0800
.....
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  49s                default-scheduler  Successfully assigned zouzou/pod-liveness-httpget to dce-10-6-215-200
  Normal   Killing    16s                kubelet            Container nginx failed liveness probe, will be restarted
  Normal   Pulled     15s (x2 over 46s)  kubelet            Container image "nginx:1.14" already present on machine
  Normal   Created    15s (x2 over 46s)  kubelet            Created container nginx
  Normal   Started    15s (x2 over 46s)  kubelet            Started container nginx
  Warning  Unhealthy  6s (x4 over 36s)   kubelet            Liveness probe failed: HTTP probe failed with statuscode: 404 # status code 404

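As shown, the container keeps restarting because nginx does not serve that path. Next, delete this pod, then change the file to use a path that works

# Delete the pod
kubectl delete -f pod-liveness-httpget.yaml

Modify pod-liveness-httpget.yaml so that it reads as follows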

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: zouzou
spec:
  containers:
  - name: nginx
    image: nginx:1.14
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet:  # effectively requests http://<Pod IP>:80/; nginx serves this path
        scheme: HTTP # protocol, HTTP or HTTPS
        port: 80 # port number
        path: / # URI path

Create the pod and check the result

# Create the pod
[root@dce-10-6-215-215 tmp]# kubectl apply -f pod-liveness-httpget.yaml
pod/pod-liveness-httpget created

# Check the pod: no restarts
[root@dce-10-6-215-215 tmp]# kubectl get pod pod-liveness-httpget -n zouzou
NAME                   READY   STATUS    RESTARTS   AGE
pod-liveness-httpget   1/1     Running   0          220s

# Check the events: the container started normally
[root@dce-10-6-215-215 tmp]# kubectl describe pod pod-liveness-httpget -n zouzou
Name:         pod-liveness-httpget
Namespace:    zouzou
Priority:     0
Node:         dce-10-6-215-200/10.6.215.200
Start Time:   Fri, 15 Apr 2022 18:41:35 +0800
......
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  27s   default-scheduler  Successfully assigned zouzou/pod-liveness-httpget to dce-10-6-215-200
  Normal  Pulled     24s   kubelet            Container image "nginx:1.14" already present on machine
  Normal  Created    24s   kubelet            Created container nginx
  Normal  Started    23s   kubelet            Started container nginx

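Other Probe Parameters

So far, livenessProbe has been used to demonstrate the three probe methods. Listing the sub-fields of livenessProbe, however, shows that there are other settings besides these three methods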

[root@dce-10-6-215-215 tmp]# kubectl explain pod.spec.containers.livenessProbe
FIELDS:
   exec <Object>  
   tcpSocket    <Object>
   httpGet      <Object>
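   initialDelaySeconds  <integer>  # seconds to wait after the container starts before the first probe
   timeoutSeconds       <integer>  # probe timeout. Default 1 second, minimum 1 second
   periodSeconds        <integer>  # how often to probe. Default 10 seconds, minimum 1 second
   failureThreshold     <integer>  # consecutive failures needed to count as failed. Default 3, minimum 1
   successThreshold     <integer>  # consecutive successes needed to count as successful. Default 1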

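Note: if initialDelaySeconds is set, no probe runs until that delay has elapsed. For a readiness probe, the container counts as not ready during this window and becomes ready only once the delay has passed and a probe succeeds. For a liveness probe, the container counts as healthy during this window and is not restarted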
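
Putting these parameters together, a probe with explicit timing might look like the sketch below. The initialDelaySeconds value of 5 is an arbitrary assumption; the other values just spell out the defaults listed above.

```yaml
# Fragment of a container spec with the timing fields written out explicitly.
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5   # assumed value: wait 5s after the container starts
  periodSeconds: 10        # default: probe every 10s
  timeoutSeconds: 1        # default: each probe times out after 1s
  failureThreshold: 3      # default: 3 consecutive failures trigger a restart
  successThreshold: 1      # default; must be 1 for a liveness probe
```

With periodSeconds 10 and failureThreshold 3, a container that always fails its liveness probe is restarted after roughly 30 seconds of consecutive failures. That is consistent with the failing Exec demo above, where the two Killing events are about 30 seconds apart (x2 over 52s, the latest 22s ago).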

 

posted @ 2022-07-25 22:38  邹邹很busy。