Kubernetes concepts -- Pod Lifecycle

Pod Lifecycle

This page describes the lifecycle of a Pod.

Pod phase

A Pod’s status field is a PodStatus object, which has a phase field.

The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. The phase is not intended to be a comprehensive rollup of observations of Container or Pod state, nor is it intended to be a comprehensive state machine.

The number and meanings of Pod phase values are tightly guarded. Other than what is documented here, nothing should be assumed about Pods that have a given phase value.

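A Pod’s status field is a PodStatus object that contains a phase field. The phase describes the state of the Pod:

  • Pending: Kubernetes has accepted the request, but the containers have not all been created yet, possibly because the Pod is still being scheduled or images are still being downloaded.
  • Running: the Pod has been created, all of its containers have been created, and at least one container is still running or restarting.
  • Succeeded: all containers in the Pod have terminated successfully and will not be restarted.
  • Failed: all containers in the Pod have terminated, and at least one terminated abnormally, that is, it exited with a non-zero status or was terminated by the system.
  • Unknown: the state of the Pod cannot be obtained, usually because of a communication failure with the host the Pod runs on.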

Here are the possible values for phase:

  • Pending: The Pod has been accepted by the Kubernetes system, but one or more of the Containers has not been created. This includes time before being scheduled as well as time spent downloading images over the network, which could take a while.

  • Running: The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting.

  • Succeeded: All Containers in the Pod have terminated in success, and will not be restarted.

  • Failed: All Containers in the Pod have terminated, and at least one Container has terminated in failure. That is, the Container either exited with non-zero status or was terminated by the system.

  • Unknown: For some reason the state of the Pod could not be obtained, typically due to an error in communicating with the host of the Pod.

Pod conditions

A Pod has a PodStatus, which has an array of PodConditions. Each element of the PodCondition array has a type field and a status field. The type field is a string, with possible values PodScheduled, Ready, Initialized, and Unschedulable. The status field is a string, with possible values True, False, and Unknown.

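PodStatus contains an array of PodConditions. Each PodCondition has a type field and a status field. type is a string with possible values PodScheduled, Ready, Initialized, and Unschedulable; status is a string with possible values True, False, and Unknown.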
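For orientation, the status section of a Pod (as printed, for example, by kubectl get pod with -o yaml) might look roughly like the fragment below. This is only a sketch: the timestamps are made-up placeholders, and the exact set of conditions reported can differ between Kubernetes versions.

```yaml
# Illustrative fragment of a Pod's status; all values are placeholders.
status:
  phase: Running                 # high-level summary of the lifecycle
  conditions:
  - type: Initialized
    status: "True"
    lastProbeTime: null
    lastTransitionTime: "2018-03-17T06:00:00Z"
  - type: Ready
    status: "True"
    lastProbeTime: null
    lastTransitionTime: "2018-03-17T06:00:05Z"
  - type: PodScheduled
    status: "True"
    lastProbeTime: null
    lastTransitionTime: "2018-03-17T06:00:00Z"
```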

Container probes

A Probe is a diagnostic performed periodically by the kubelet on a Container. To perform a diagnostic, the kubelet calls a Handler implemented by the Container. There are three types of handlers:

  • ExecAction: Executes a specified command inside the Container. The diagnostic is considered successful if the command exits with a status code of 0.

  • TCPSocketAction: Performs a TCP check against the Container’s IP address on a specified port. The diagnostic is considered successful if the port is open.

  • HTTPGetAction: Performs an HTTP Get request against the Container’s IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.

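A probe is a diagnostic that the kubelet runs periodically against a container by calling a Handler implemented by the container. There are three kinds of handlers:

  • ExecAction: runs a specific command inside the container. The diagnostic succeeds if the command returns 0.
  • TCPSocketAction: performs a TCP check against a specific port on the container’s IP address. The diagnostic succeeds if the port is being listened on.
  • HTTPGetAction: performs an HTTP GET request against a specific port and path of the container. The diagnostic succeeds if the returned status code is greater than or equal to 200 and less than 400.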
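The three handler types can be sketched in a single manifest, one container per handler. The Pod and container names are hypothetical, and the public busybox, redis, and nginx images are used only as convenient stand-ins; treat this as an illustration rather than a recommended setup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-handlers-demo      # hypothetical name
spec:
  containers:
  # ExecAction: succeeds when the command exits with status 0.
  - name: exec-check
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 3600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
  # TCPSocketAction: succeeds when the port is open.
  - name: tcp-check
    image: redis
    livenessProbe:
      tcpSocket:
        port: 6379
  # HTTPGetAction: succeeds when the status code is >= 200 and < 400.
  - name: http-check
    image: nginx
    livenessProbe:
      httpGet:
        path: /
        port: 80
```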

Each probe has one of three results:

  • Success: The Container passed the diagnostic.
  • Failure: The Container failed the diagnostic.
  • Unknown: The diagnostic failed, so no action should be taken.

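Each probe has one of three results:

  • Success: the container passed the diagnostic.
  • Failure: the container did not pass the diagnostic.
  • Unknown: the diagnostic itself failed, so no action should be taken.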

The kubelet can optionally perform and react to two kinds of probes on running Containers:

  • livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.

  • readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.

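The kubelet can optionally run two kinds of probes and act on their results:

  • livenessProbe: checks whether the container is running. If the diagnostic fails, the kubelet kills the container, which is then handled according to the restart policy. Without a liveness probe, the default state is Success.
  • readinessProbe: checks whether the container can handle service requests. If the readiness probe fails, the endpoints controller removes the Pod from the endpoints of every matching Service. Before the initial delay, the default readiness state is Failure. If the container has no readiness probe, the default is Success.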
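To make the readiness behavior concrete, here is a minimal sketch of a Pod with a readiness probe and a Service whose selector matches it. The names and the app: web label are hypothetical; nginx is used only because it answers HTTP on port 80 out of the box.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                      # hypothetical name
  labels:
    app: web                     # matched by the Service selector below
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5     # readiness counts as Failure before this delay
      periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                     # the Pod is an endpoint only while it is ready
  ports:
  - port: 80
    targetPort: 80
```

While the probe fails, the Pod’s IP address is left out of the Service’s endpoints, so the Service sends it no traffic.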

When should you use liveness or readiness probes?

If the process in your Container is able to crash on its own whenever it encounters an issue or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet will automatically perform the correct action in accordance with the Pod’s restartPolicy.

If you’d like your Container to be killed and restarted if a probe fails, then specify a liveness probe, and specify a restartPolicy of Always or OnFailure.

If you’d like to start sending traffic to a Pod only when a probe succeeds, specify a readiness probe. In this case, the readiness probe might be the same as the liveness probe, but the existence of the readiness probe in the spec means that the Pod will start without receiving any traffic and only start receiving traffic after the probe starts succeeding.

If you want your Container to be able to take itself down for maintenance, you can specify a readiness probe that checks an endpoint specific to readiness that is different from the liveness probe.

Note that if you just want to be able to drain requests when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether the readiness probe exists. The Pod remains in the unready state while it waits for the Containers in the Pod to stop.

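If the process in your container can crash on its own whenever it hits a problem or error, you do not need a liveness probe; the kubelet acts automatically according to the Pod’s restart policy.

If the container should be restarted when a probe fails, set a liveness probe and a restartPolicy of Always or OnFailure.

If requests should reach the Pod only once a probe succeeds, set a readiness probe. It may be the same check as the liveness probe, but having a readiness probe in the spec also means the Pod starts receiving requests only after that probe succeeds.

If you want the container to be able to take itself down for maintenance, set a readiness probe that checks a readiness-specific endpoint, different from the one the liveness probe uses.

If you only want a Pod that is being deleted to stop receiving requests, no readiness probe is needed. On deletion the Pod becomes unready whether or not a readiness probe exists, and it stays unready while waiting for its containers to stop.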
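A sketch of the "separate readiness endpoint" case follows. The image name and the /healthz and /ready paths are assumptions made for illustration; the application would have to serve both on port 8080 and stop answering /ready successfully when it wants to go into maintenance.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-probes              # hypothetical name
spec:
  restartPolicy: Always
  containers:
  - name: app
    image: example.com/my-app:1.0    # hypothetical image
    # Failing this probe gets the container killed and restarted.
    livenessProbe:
      httpGet:
        path: /healthz               # assumed liveness endpoint
        port: 8080
      initialDelaySeconds: 15
    # Failing this probe only stops traffic; the container keeps running.
    readinessProbe:
      httpGet:
        path: /ready                 # assumed readiness-only endpoint
        port: 8080
      periodSeconds: 5
```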

Pod and Container status

For detailed information about Pod Container status, see PodStatus and ContainerStatus. Note that the information reported as Pod status depends on the current ContainerState.

For container status, see PodStatus and ContainerStatus. The Pod status depends on the current state of its containers.

Restart policy

A PodSpec has a restartPolicy field with possible values Always, OnFailure, and Never. The default value is Always. restartPolicy applies to all Containers in the Pod. restartPolicy only refers to restarts of the Containers by the kubelet on the same node. Failed Containers that are restarted by the kubelet are restarted with an exponential back-off delay (10s, 20s, 40s …) capped at five minutes; the delay is reset after ten minutes of successful execution. As discussed in the Pods document, once bound to a node, a Pod will never be rebound to another node.

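The Pod spec has a restartPolicy field with possible values Always, OnFailure, and Never; the default is Always. restartPolicy applies to all containers in the Pod and only covers restarts by the kubelet on the same node. When the kubelet restarts a failed container, the delay grows exponentially (10s, 20s, 40s ...), capped at 5 minutes, and is reset after the container has run successfully for 10 minutes. Once a Pod is bound to a node, it is never bound to another node.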
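As a small illustration, the Pod below sets restartPolicy explicitly and runs a container that always fails. The names are hypothetical and the failing command exists only to trigger restarts.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restart-demo             # hypothetical name
spec:
  restartPolicy: OnFailure       # applies to every container in the Pod
  containers:
  - name: flaky
    image: busybox
    # Exits with a non-zero status after 5 seconds, so the kubelet restarts
    # the container on the same node with a growing back-off delay.
    command: ["/bin/sh", "-c", "sleep 5; exit 1"]
```

With restartPolicy set to Never instead, the same container would not be restarted after its first failure.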

Pod lifetime

In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by the master) will expire and be automatically destroyed.

Three types of controllers are available:

  • Use a Job for Pods that are expected to terminate, for example, batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.

  • Use a ReplicationController, ReplicaSet, or Deployment for Pods that are not expected to terminate, for example, web servers. ReplicationControllers are appropriate only for Pods with a restartPolicy of Always.

  • Use a DaemonSet for Pods that need to run one per machine, because they provide a machine-specific system service.

All three types of controllers contain a PodTemplate. It is recommended to create the appropriate controller and let it create Pods, rather than directly create Pods yourself. That is because Pods alone are not resilient to machine failures, but controllers are.

If a node dies or is disconnected from the rest of the cluster, Kubernetes applies a policy for setting the phase of all Pods on the lost node to Failed.

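In general, a Pod never disappears unless a person or a controller deletes it. The exception is Pods in the Succeeded or Failed phase: once they have been in that phase longer than a certain duration (determined by the master), they expire and are destroyed automatically.

There are three kinds of controllers:

  • Job: for Pods that are expected to terminate, such as batch computations. Requires a restartPolicy of OnFailure or Never.
  • ReplicationController, ReplicaSet, Deployment: for Pods that are not expected to terminate, such as web servers. A ReplicationController requires a restartPolicy of Always.
  • DaemonSet: runs one Pod per machine, because the Pod provides a machine-specific system service.

All of these controllers contain a Pod template. It is recommended to create a suitable controller and let it create the Pods rather than creating Pods directly. If a node dies or loses contact with the cluster, Kubernetes sets the phase of all Pods on that node to Failed.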
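The three controller types might be declared roughly as follows. This is a sketch under assumptions: every name and label is hypothetical, the busybox and nginx images are stand-ins, and the apiVersion values (batch/v1, apps/v1) should be checked against the API versions your cluster serves.

```yaml
# Job: Pods expected to terminate; restartPolicy must be OnFailure or Never.
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-demo
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: work
        image: busybox
        command: ["/bin/sh", "-c", "echo done"]
---
# Deployment: Pods not expected to terminate, for example web servers.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-demo
  template:
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
      - name: web
        image: nginx
---
# DaemonSet: one Pod per machine, for a machine-specific system service.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent-demo
spec:
  selector:
    matchLabels:
      app: node-agent-demo
  template:
    metadata:
      labels:
        app: node-agent-demo
    spec:
      containers:
      - name: agent
        image: busybox
        # Stand-in for a real per-node agent.
        command: ["/bin/sh", "-c", "while true; do sleep 3600; done"]
```

Each manifest embeds a Pod template under spec.template, which is what the controller uses to create its Pods.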

Examples

Advanced liveness probe example

Liveness probes are executed by the kubelet, so all requests are made in the kubelet network namespace.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - args:
    - /server
    image: k8s.gcr.io/liveness
    livenessProbe:
      httpGet:
        # when "host" is not defined, "PodIP" will be used
        # host: my-host
        # when "scheme" is not defined, "HTTP" scheme will be used. Only "HTTP" and "HTTPS" are allowed
        # scheme: HTTPS
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: liveness

Example states

  • Pod is running and has one Container. Container exits with success.
    • Log completion event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Pod phase becomes Succeeded.
      • Never: Pod phase becomes Succeeded.
  • Pod is running and has one Container. Container exits with failure.
    • Log failure event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Pod phase becomes Failed.
  • Pod is running and has two Containers. Container 1 exits with failure.
    • Log failure event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Do not restart Container; Pod phase stays Running.
    • If Container 1 is not running, and Container 2 exits:
      • Log failure event.
      • If restartPolicy is:
        • Always: Restart Container; Pod phase stays Running.
        • OnFailure: Restart Container; Pod phase stays Running.
        • Never: Pod phase becomes Failed.
  • Pod is running and has one Container. Container runs out of memory.
    • Container terminates in failure.
    • Log OOM event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Log failure event; Pod phase becomes Failed.
  • Pod is running, and a disk dies.
    • Kill all Containers.
    • Log appropriate event.
    • Pod phase becomes Failed.
    • If running under a controller, Pod is recreated elsewhere.
  • Pod is running, and its node is segmented out.
    • Node controller waits for timeout.
    • Node controller sets Pod phase to Failed.
    • If running under a controller, Pod is recreated elsewhere.
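The first scenario in the list above (one Container that exits with success) can be tried with a Pod along these lines. The names are hypothetical and the command only prints a line and exits with status 0.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: exit-success-demo        # hypothetical name
spec:
  # Per the list above: with Never or OnFailure the phase becomes Succeeded;
  # with Always the Container is restarted and the phase stays Running.
  restartPolicy: Never
  containers:
  - name: once
    image: busybox
    command: ["/bin/sh", "-c", "echo finished; exit 0"]
```

Changing the command to end with exit 1 would correspond to the second scenario, where Never leads to a phase of Failed.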

posted @ 2018-03-17 14:38 by 翠绿的柠檬树