Kubernetes的pod控制器及ReplicaSet控制器类型的pod的定义

为什么需要Pod

Kubernetes项目之所以这么做的原因;
因为Kubernetes是谷歌公司基于Borg项目做出来的,谷歌工程师发现,他们部署的应用往往存在这进程与进程组的关系。具体说呢,就是这些应用之间有着密切的协作关系,使得他们必须部署在同一台机器上
而如果事先没有组的概念,像这样的运维关系很难处理;举个例子
rsyslogd是由3个进程组成的:一个imklog模块,一个imuxsock模块,一个rsyslogd自己的9函数主进程。这三个进程一定要运行在同一台机器上否则,他们之间基于 Socket的通信和文件交换,都会出现问题;现在,我要把rsyslogd这个应用给容器化,由于受限于的单进程模型,这三个模块必须被分别制作成三个不同容器运行,他们设置的内存配额都是1GB
注意:强调一下容器的“单进程模型”,并不是指容器里只能运行一个进程,而是容器没有管理多个进程的能力。这是因为容器里PID=1的进程就是应用本身,其他进程都是这个PID=1进程的子进程。可是,用户编写的应用,并不能够像正常操作系统里的init进程或者systemd那样拥有进程管理功能。比如,你的应用是一个java Web程序(PID=1),然后你执行docker exec在后台启动了一个nginx进程(PID=3)。可是,当nginx进程异常退出的时候,你怎么知道呢?这个进程退出后的垃圾收集工作,由谁做
假设我们Kubernetes集群上有两个节点:node-1上有3GB可用内存,node-2上有2.5GB可用内存
这时,假设我要用Docker Swarm来运行这个rsyslogd程序。为了能够让着三个容器都运行在同一台机器上,就必须在两个容器上设置一个affinity=main(与main容器有亲密性)的约束,即:它俩必须和main容器运行在同一台机器上
然后,我依次执行:“docker run main” “docker run imklog” 和“docker run imuxsock”,创建这三个容器;这样,这三个容器都进入Swarm的代调度队列。然后,main容器和imklog容器都先后出队列并被调度到node-2节点上(这个情况完全有可能的)
可是,当imuxsock容器出队列调度时,Swarm就有的懵了:node-2上的可用资源只有0.5GB了,并不足运行imuxsock容器;可是,根据affinity=main的约束,imuxsock容器有只能运行在node-2上;这就是一个典型的成组调度没有被妥善处理的例子
工业界与学术界,关于这个问题的讨论可谓旷日持久,也产生很多可选方案
比如,Mesos中就有一个资源囤积的机制,会在所有设置了Affinity约束的任务都到达时,才开始对它们统一进行调度。而谷歌在Omege论文中提出使用乐观调度处理冲突方法,即:先不管这些冲突,而是通过精心设计的回滚机制在出现冲突之后解决
以上的方法都谈不上完美。资源囤积带来了不可避免的调度效率失所与死锁的可能性;而乐观调度的复杂程度,不是常规技术团结队所能驾驭的。
但是,到了Kubernetes项目里,这样的问题迎刃而解:Pod是Kubernetes的最小调度单位,这就意味着,Kubernetes项目在调度时,自然就会去选择可用内存等于3GB的node-1节点进行绑定,而根本就不会考虑nod-2
像这样的容器间紧密协作,我们称为“超亲密关系”。这些具有“超亲密关系”容器的典型特征包括但不限于:互相之间发生直接的文件交换,使用localhost或者Socket文件进行本地通信、会发生非常频繁的远程调用、需要共享这些Linux Namespace(比如,一个容器要加入另一个容器的Linux Namespace)等等
这也就意味着,并不是所有有关系的容器都属于同一个Pod。比如,PHP容器和Mysql虽然会发生访问关系,但并不需要、也不应该部署同一个Pod里,更适合做成两个pod

如果只是处理这种超亲密关系这样的调度问题,有Borg和Omega论文珠玉在前,Kubernetes项目肯定可以在调度器层面把它解决掉
不过,pod在Kubernetes项目里还有更重要的意义,那就是容器设计模式
为理解这一层含义,就必须介绍一下Pod的实现原理
首先关于Pod最重要的一个事实是:它就是一个逻辑概念;也就是说Kubernetes真正处理的,还是宿主机操作系统是上的Linux容器的Namespace与Cgroups,而并不存在所谓的Pod边界或者隔离排环境;pod其实就是一组共享某些资源的容器
具体说:pod里所有容器都是共享同一个Network Namespace,并且可以声明挂载同一个Volume
那这么来看的话,一个有A、B两个容器的pod,不就等同一个容器(容器A)共享另一个容器(容器B)的网络和Volume的玩法了;这个好像通过 Docker run --net --volumes-from这样的命令就可以实现
docker run --net=B --volumes-from=B --name=A image-A ...
但是,你没有考虑过,如果真的这样的话,容器B就必须比容器A先启动,这样一个Pod里的多个容器就不是对等关系,而是拓扑关系;
所以在Kubernetes项目里,pod的实现需要一个中间容器,这个容器叫Infra容器。在这个pod中,infra容器永远都是第一个被创建出来的容器,而其他用户定义的容器,则通过join Network Namespace 的方式,与Infra容器关联在一起的

 

这个pod 里有两个用户容器A和B,还有一个Infra容器,很容易理解,在Kubernetes项目里,Infra容器一定占有少量的资源,所有它使用的是一个非常特殊的镜像叫做:k8s.gcr.io/pause。这个镜像时一个汇编语言编写的、永远处于暂停装态的容器,解压后的大小也只有100~200kb左右
而Infra容器Hold住network Namespace后,用户容器

pod 控制器

  ReplicaSet:代用户创建指定数量的副本,并控制副本数量一直处于用户期望的数量状态;多退少补,支持滚动更新;自动扩缩容机制;不建议直接使用

 顶级帮助

[root@master manifests]# kubectl explain rs
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

DESCRIPTION:
     DEPRECATED - This group version of ReplicaSet is deprecated by
     apps/v1beta2/ReplicaSet. See the release notes for more information.
     ReplicaSet ensures that a specified number of pod replicas are running at
     any given time.

FIELDS:
   apiVersion	<string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#resources

   kind	<string>
     Kind is a string value representing the REST resource this object
     represents. Servers may infer this from the endpoint the client submits
     requests to. Cannot be updated. In CamelCase. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#types-kinds

   metadata	<Object>  控制器元数据
     If the Labels of a ReplicaSet are empty, they are defaulted to be the same
     as the Pod(s) that the ReplicaSet manages. Standard object's metadata. More
     info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

   spec	<Object>  控制器的定义
     Spec defines the specification of the desired behavior of the ReplicaSet.
     More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

   status	<Object>
     Status is the most recently observed status of the ReplicaSet. This data
     may be out of date by some window of time. Populated by the system.
     Read-only. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

  控制器元数据定义参数

[root@master manifests]# kubectl explain rs.metadata
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: metadata <Object>

DESCRIPTION:
     If the Labels of a ReplicaSet are empty, they are defaulted to be the same
     as the Pod(s) that the ReplicaSet manages. Standard object's metadata. More
     info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

     ObjectMeta is metadata that all persisted resources must have, which
     includes all objects users must create.

FIELDS:
   annotations	<map[string]string>
     Annotations is an unstructured key value map stored with a resource that
     may be set by external tools to store and retrieve arbitrary metadata. They
     are not queryable and should be preserved when modifying objects. More
     info: http://kubernetes.io/docs/user-guide/annotations

   clusterName	<string>
     The name of the cluster which the object belongs to. This is used to
     distinguish resources with same name and namespace in different clusters.
     This field is not set anywhere right now and apiserver is going to ignore
     it if set in create or update request.

   creationTimestamp	<string>
     CreationTimestamp is a timestamp representing the server time when this
     object was created. It is not guaranteed to be set in happens-before order
     across separate operations. Clients may not set this value. It is
     represented in RFC3339 form and is in UTC. Populated by the system.
     Read-only. Null for lists. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata

   deletionGracePeriodSeconds	<integer>
     Number of seconds allowed for this object to gracefully terminate before it
     will be removed from the system. Only set when deletionTimestamp is also
     set. May only be shortened. Read-only.

   deletionTimestamp	<string>
     DeletionTimestamp is RFC 3339 date and time at which this resource will be
     deleted. This field is set by the server when a graceful deletion is
     requested by the user, and is not directly settable by a client. The
     resource is expected to be deleted (no longer visible from resource lists,
     and not reachable by name) after the time in this field, once the
     finalizers list is empty. As long as the finalizers list contains items,
     deletion is blocked. Once the deletionTimestamp is set, this value may not
     be unset or be set further into the future, although it may be shortened or
     the resource may be deleted prior to this time. For example, a user may
     request that a pod is deleted in 30 seconds. The Kubelet will react by
     sending a graceful termination signal to the containers in the pod. After
     that 30 seconds, the Kubelet will send a hard termination signal (SIGKILL)
     to the container and after cleanup, remove the pod from the API. In the
     presence of network partitions, this object may still exist after this
     timestamp, until an administrator or automated process can determine the
     resource is fully terminated. If not set, graceful deletion of the object
     has not been requested. Populated by the system when a graceful deletion is
     requested. Read-only. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata

   finalizers	<[]string>
     Must be empty before the object is deleted from the registry. Each entry is
     an identifier for the responsible component that will remove the entry from
     the list. If the deletionTimestamp of the object is non-nil, entries in
     this list can only be removed.

   generateName	<string>
     GenerateName is an optional prefix, used by the server, to generate a
     unique name ONLY IF the Name field has not been provided. If this field is
     used, the name returned to the client will be different than the name
     passed. This value will also be combined with a unique suffix. The provided
     value has the same validation rules as the Name field, and may be truncated
     by the length of the suffix required to make the value unique on the
     server. If this field is specified and the generated name exists, the
     server will NOT return a 409 - instead, it will either return 201 Created
     or 500 with Reason ServerTimeout indicating a unique name could not be
     found in the time allotted, and the client should retry (optionally after
     the time indicated in the Retry-After header). Applied only if Name is not
     specified. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#idempotency

   generation	<integer>
     A sequence number representing a specific generation of the desired state.
     Populated by the system. Read-only.

   initializers	<Object>
     An initializer is a controller which enforces some system invariant at
     object creation time. This field is a list of initializers that have not
     yet acted on this object. If nil or empty, this object has been completely
     initialized. Otherwise, the object is considered uninitialized and is
     hidden (in list/watch and get calls) from clients that haven't explicitly
     asked to observe uninitialized objects. When an object is created, the
     system will populate this list with the current set of initializers. Only
     privileged users may set or modify this list. Once it is empty, it may not
     be modified further by any user. DEPRECATED - initializers are an alpha
     field and will be removed in v1.15.

   labels	<map[string]string>
     Map of string keys and values that can be used to organize and categorize
     (scope and select) objects. May match selectors of replication controllers
     and services. More info: http://kubernetes.io/docs/user-guide/labels

   managedFields	<[]Object>
     ManagedFields maps workflow-id and version to the set of fields that are
     managed by that workflow. This is mostly for internal housekeeping, and
     users typically shouldn't need to set or understand this field. A workflow
     can be the user's name, a controller's name, or the name of a specific
     apply path like "ci-cd". The set of fields is always in the version that
     the workflow used when modifying the object. This field is alpha and can be
     changed or removed without notice.

   name	<string>  名字
     Name must be unique within a namespace. Is required when creating
     resources, although some resources may allow a client to request the
     generation of an appropriate name automatically. Name is primarily intended
     for creation idempotence and configuration definition. Cannot be updated.
     More info: http://kubernetes.io/docs/user-guide/identifiers#names

   namespace	<string>  属于哪个名称空间
     Namespace defines the space within each name must be unique. An empty
     namespace is equivalent to the "default" namespace, but "default" is the
     canonical representation. Not all objects are required to be scoped to a
     namespace - the value of this field for those objects will be empty. Must
     be a DNS_LABEL. Cannot be updated. More info:
     http://kubernetes.io/docs/user-guide/namespaces

   ownerReferences	<[]Object>
     List of objects depended by this object. If ALL objects in the list have
     been deleted, this object will be garbage collected. If this object is
     managed by a controller, then an entry in this list will point to this
     controller, with the controller field set to true. There cannot be more
     than one managing controller.

   resourceVersion	<string>
     An opaque value that represents the internal version of this object that
     can be used by clients to determine when objects have changed. May be used
     for optimistic concurrency, change detection, and the watch operation on a
     resource or set of resources. Clients must treat these values as opaque and
     passed unmodified back to the server. They may only be valid for a
     particular resource or set of resources. Populated by the system.
     Read-only. Value must be treated as opaque by clients and . More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#concurrency-control-and-consistency

   selfLink	<string>
     SelfLink is a URL representing this object. Populated by the system.
     Read-only.

   uid	<string>
     UID is the unique in time and space value for this object. It is typically
     generated by the server on successful creation of a resource and is not
     allowed to change on PUT operations. Populated by the system. Read-only.
     More info: http://kubernetes.io/docs/user-guide/identifiers#uids

  控制器状态定义参数

[root@master manifests]# kubectl explain rs.spec
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: spec <Object>

DESCRIPTION:
     Spec defines the specification of the desired behavior of the ReplicaSet.
     More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

     ReplicaSetSpec is the specification of a ReplicaSet.

FIELDS:
   minReadySeconds	<integer>
     Minimum number of seconds for which a newly created pod should be ready
     without any of its container crashing, for it to be considered available.
     Defaults to 0 (pod will be considered available as soon as it is ready)

   replicas	<integer>  副本个数
     Replicas is the number of desired replicas. This is a pointer to
     distinguish between explicit zero and unspecified. Defaults to 1. More
     info:
     https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/#what-is-a-replicationcontroller

   selector	<Object>  标签选择器
     Selector is a label query over pods that should match the replica count. If
     the selector is empty, it is defaulted to the labels present on the pod
     template. Label keys and values that must match in order to be controlled
     by this replica set. More info:
     https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors

   template	<Object>  pod的定义
     Template is the object that describes the pod that will be created if
     insufficient replicas are detected. More info:
     https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller#pod-template

  pod的定义参数介绍

[root@master manifests]# kubectl explain rs.spec.template
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: template <Object>

DESCRIPTION:
     Template is the object that describes the pod that will be created if
     insufficient replicas are detected. More info:
     https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller#pod-template

     PodTemplateSpec describes the data a pod should have when created from a
     template

FIELDS:
   metadata	<Object>  元数据
     Standard object's metadata. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata

   spec	<Object>  pod 的目标状态定义
     Specification of the desired behavior of the pod. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#spec-and-status

  pod元数据的定义

[root@master manifests]# kubectl explain rs.spec.template.metadata
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: metadata <Object>

DESCRIPTION:
     Standard object's metadata. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata

     ObjectMeta is metadata that all persisted resources must have, which
     includes all objects users must create.

FIELDS:
   annotations	<map[string]string>
     Annotations is an unstructured key value map stored with a resource that
     may be set by external tools to store and retrieve arbitrary metadata. They
     are not queryable and should be preserved when modifying objects. More
     info: http://kubernetes.io/docs/user-guide/annotations

   clusterName	<string>
     The name of the cluster which the object belongs to. This is used to
     distinguish resources with same name and namespace in different clusters.
     This field is not set anywhere right now and apiserver is going to ignore
     it if set in create or update request.

   creationTimestamp	<string>
     CreationTimestamp is a timestamp representing the server time when this
     object was created. It is not guaranteed to be set in happens-before order
     across separate operations. Clients may not set this value. It is
     represented in RFC3339 form and is in UTC. Populated by the system.
     Read-only. Null for lists. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata

   deletionGracePeriodSeconds	<integer>
     Number of seconds allowed for this object to gracefully terminate before it
     will be removed from the system. Only set when deletionTimestamp is also
     set. May only be shortened. Read-only.

   deletionTimestamp	<string>
     DeletionTimestamp is RFC 3339 date and time at which this resource will be
     deleted. This field is set by the server when a graceful deletion is
     requested by the user, and is not directly settable by a client. The
     resource is expected to be deleted (no longer visible from resource lists,
     and not reachable by name) after the time in this field, once the
     finalizers list is empty. As long as the finalizers list contains items,
     deletion is blocked. Once the deletionTimestamp is set, this value may not
     be unset or be set further into the future, although it may be shortened or
     the resource may be deleted prior to this time. For example, a user may
     request that a pod is deleted in 30 seconds. The Kubelet will react by
     sending a graceful termination signal to the containers in the pod. After
     that 30 seconds, the Kubelet will send a hard termination signal (SIGKILL)
     to the container and after cleanup, remove the pod from the API. In the
     presence of network partitions, this object may still exist after this
     timestamp, until an administrator or automated process can determine the
     resource is fully terminated. If not set, graceful deletion of the object
     has not been requested. Populated by the system when a graceful deletion is
     requested. Read-only. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata

   finalizers	<[]string>
     Must be empty before the object is deleted from the registry. Each entry is
     an identifier for the responsible component that will remove the entry from
     the list. If the deletionTimestamp of the object is non-nil, entries in
     this list can only be removed.

   generateName	<string>
     GenerateName is an optional prefix, used by the server, to generate a
     unique name ONLY IF the Name field has not been provided. If this field is
     used, the name returned to the client will be different than the name
     passed. This value will also be combined with a unique suffix. The provided
     value has the same validation rules as the Name field, and may be truncated
     by the length of the suffix required to make the value unique on the
     server. If this field is specified and the generated name exists, the
     server will NOT return a 409 - instead, it will either return 201 Created
     or 500 with Reason ServerTimeout indicating a unique name could not be
     found in the time allotted, and the client should retry (optionally after
     the time indicated in the Retry-After header). Applied only if Name is not
     specified. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#idempotency

   generation	<integer>
     A sequence number representing a specific generation of the desired state.
     Populated by the system. Read-only.

   initializers	<Object>
     An initializer is a controller which enforces some system invariant at
     object creation time. This field is a list of initializers that have not
     yet acted on this object. If nil or empty, this object has been completely
     initialized. Otherwise, the object is considered uninitialized and is
     hidden (in list/watch and get calls) from clients that haven't explicitly
     asked to observe uninitialized objects. When an object is created, the
     system will populate this list with the current set of initializers. Only
     privileged users may set or modify this list. Once it is empty, it may not
     be modified further by any user. DEPRECATED - initializers are an alpha
     field and will be removed in v1.15.

   labels	<map[string]string>  标签定义
     Map of string keys and values that can be used to organize and categorize
     (scope and select) objects. May match selectors of replication controllers
     and services. More info: http://kubernetes.io/docs/user-guide/labels

   managedFields	<[]Object>
     ManagedFields maps workflow-id and version to the set of fields that are
     managed by that workflow. This is mostly for internal housekeeping, and
     users typically shouldn't need to set or understand this field. A workflow
     can be the user's name, a controller's name, or the name of a specific
     apply path like "ci-cd". The set of fields is always in the version that
     the workflow used when modifying the object. This field is alpha and can be
     changed or removed without notice.

   name	<string>  
     Name must be unique within a namespace. Is required when creating
     resources, although some resources may allow a client to request the
     generation of an appropriate name automatically. Name is primarily intended
     for creation idempotence and configuration definition. Cannot be updated.
     More info: http://kubernetes.io/docs/user-guide/identifiers#names

   namespace	<string>
     Namespace defines the space within each name must be unique. An empty
     namespace is equivalent to the "default" namespace, but "default" is the
     canonical representation. Not all objects are required to be scoped to a
     namespace - the value of this field for those objects will be empty. Must
     be a DNS_LABEL. Cannot be updated. More info:
     http://kubernetes.io/docs/user-guide/namespaces

   ownerReferences	<[]Object>
     List of objects depended by this object. If ALL objects in the list have
     been deleted, this object will be garbage collected. If this object is
     managed by a controller, then an entry in this list will point to this
     controller, with the controller field set to true. There cannot be more
     than one managing controller.

   resourceVersion	<string>
     An opaque value that represents the internal version of this object that
     can be used by clients to determine when objects have changed. May be used
     for optimistic concurrency, change detection, and the watch operation on a
     resource or set of resources. Clients must treat these values as opaque and
     passed unmodified back to the server. They may only be valid for a
     particular resource or set of resources. Populated by the system.
     Read-only. Value must be treated as opaque by clients and . More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#concurrency-control-and-consistency

   selfLink	<string>
     SelfLink is a URL representing this object. Populated by the system.
     Read-only.

   uid	<string>
     UID is the unique in time and space value for this object. It is typically
     generated by the server on successful creation of a resource and is not
     allowed to change on PUT operations. Populated by the system. Read-only.
     More info: http://kubernetes.io/docs/user-guide/identifiers#uids

  pod的状态定义的参数

[root@master manifests]# kubectl explain rs.spec.template.spec
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: spec <Object>

DESCRIPTION:
     Specification of the desired behavior of the pod. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#spec-and-status

     PodSpec is a description of a pod.

FIELDS:
   activeDeadlineSeconds	<integer>
     Optional duration in seconds the pod may be active on the node relative to
     StartTime before the system will actively try to mark it failed and kill
     associated containers. Value must be a positive integer.

   affinity	<Object>
     If specified, the pod's scheduling constraints

   automountServiceAccountToken	<boolean>
     AutomountServiceAccountToken indicates whether a service account token
     should be automatically mounted.

   containers	<[]Object> -required-  容器的定义
     List of containers belonging to the pod. Containers cannot currently be
     added or removed. There must be at least one container in a Pod. Cannot be
     updated.

   dnsConfig	<Object>
     Specifies the DNS parameters of a pod. Parameters specified here will be
     merged to the generated DNS configuration based on DNSPolicy.

   dnsPolicy	<string>
     Set DNS policy for the pod. Defaults to "ClusterFirst". Valid values are
     'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'. DNS
     parameters given in DNSConfig will be merged with the policy selected with
     DNSPolicy. To have DNS options set along with hostNetwork, you have to
     specify DNS policy explicitly to 'ClusterFirstWithHostNet'.

   enableServiceLinks	<boolean>
     EnableServiceLinks indicates whether information about services should be
     injected into pod's environment variables, matching the syntax of Docker
     links. Optional: Defaults to true.

   hostAliases	<[]Object>
     HostAliases is an optional list of hosts and IPs that will be injected into
     the pod's hosts file if specified. This is only valid for non-hostNetwork
     pods.

   hostIPC	<boolean>
     Use the host's ipc namespace. Optional: Default to false.

   hostNetwork	<boolean>
     Host networking requested for this pod. Use the host's network namespace.
     If this option is set, the ports that will be used must be specified.
     Default to false.

   hostPID	<boolean>
     Use the host's pid namespace. Optional: Default to false.

   hostname	<string>
     Specifies the hostname of the Pod If not specified, the pod's hostname will
     be set to a system-defined value.

   imagePullSecrets	<[]Object>
     ImagePullSecrets is an optional list of references to secrets in the same
     namespace to use for pulling any of the images used by this PodSpec. If
     specified, these secrets will be passed to individual puller
     implementations for them to use. For example, in the case of docker, only
     DockerConfig type secrets are honored. More info:
     https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

   initContainers	<[]Object>
     List of initialization containers belonging to the pod. Init containers are
     executed in order prior to containers being started. If any init container
     fails, the pod is considered to have failed and is handled according to its
     restartPolicy. The name for an init container or normal container must be
     unique among all containers. Init containers may not have Lifecycle
     actions, Readiness probes, or Liveness probes. The resourceRequirements of
     an init container are taken into account during scheduling by finding the
     highest request/limit for each resource type, and then using the max of of
     that value or the sum of the normal containers. Limits are applied to init
     containers in a similar fashion. Init containers cannot currently be added
     or removed. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

   nodeName	<string>
     NodeName is a request to schedule this pod onto a specific node. If it is
     non-empty, the scheduler simply schedules this pod onto that node, assuming
     that it fits resource requirements.

   nodeSelector	<map[string]string>
     NodeSelector is a selector which must be true for the pod to fit on a node.
     Selector which must match a node's labels for the pod to be scheduled on
     that node. More info:
     https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

   preemptionPolicy	<string>
     PreemptionPolicy is the Policy for preempting pods with lower priority. One
     of Never, PreemptLowerPriority. Defaults to PreemptLowerPriority if unset.
     This field is alpha-level and is only honored by servers that enable the
     NonPreemptingPriority feature.

   priority	<integer>
     The priority value. Various system components use this field to find the
     priority of the pod. When Priority Admission Controller is enabled, it
     prevents users from setting this field. The admission controller populates
     this field from PriorityClassName. The higher the value, the higher the
     priority.

   priorityClassName	<string>
     If specified, indicates the pod's priority. "system-node-critical" and
     "system-cluster-critical" are two special keywords which indicate the
     highest priorities with the former being the highest priority. Any other
     name must be defined by creating a PriorityClass object with that name. If
     not specified, the pod priority will be default or zero if there is no
     default.

   readinessGates	<[]Object>
     If specified, all readiness gates will be evaluated for pod readiness. A
     pod is ready when all its containers are ready AND all conditions specified
     in the readiness gates have status equal to "True" More info:
     https://git.k8s.io/enhancements/keps/sig-network/0007-pod-ready%2B%2B.md

   restartPolicy	<string>
     Restart policy for all containers within the pod. One of Always, OnFailure,
     Never. Default to Always. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

   runtimeClassName	<string>
     RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group,
     which should be used to run this pod. If no RuntimeClass resource matches
     the named class, the pod will not be run. If unset or empty, the "legacy"
     RuntimeClass will be used, which is an implicit class with an empty
     definition that uses the default runtime handler. More info:
     https://git.k8s.io/enhancements/keps/sig-node/runtime-class.md This is a
     beta feature as of Kubernetes v1.14.

   schedulerName	<string>
     If specified, the pod will be dispatched by specified scheduler. If not
     specified, the pod will be dispatched by default scheduler.

   securityContext	<Object>
     SecurityContext holds pod-level security attributes and common container
     settings. Optional: Defaults to empty. See type description for default
     values of each field.

   serviceAccount	<string>
     DeprecatedServiceAccount is a depreciated alias for ServiceAccountName.
     Deprecated: Use serviceAccountName instead.

   serviceAccountName	<string>
     ServiceAccountName is the name of the ServiceAccount to use to run this
     pod. More info:
     https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

   shareProcessNamespace	<boolean>
     Share a single process namespace between all of the containers in a pod.
     When this is set containers will be able to view and signal processes from
     other containers in the same pod, and the first process in each container
     will not be assigned PID 1. HostPID and ShareProcessNamespace cannot both
     be set. Optional: Default to false. This field is beta-level and may be
     disabled with the PodShareProcessNamespace feature.

   subdomain	<string>
     If specified, the fully qualified Pod hostname will be
     "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>". If not
     specified, the pod will not have a domainname at all.

   terminationGracePeriodSeconds	<integer>
     Optional duration in seconds the pod needs to terminate gracefully. May be
     decreased in delete request. Value must be non-negative integer. The value
     zero indicates delete immediately. If this value is nil, the default grace
     period will be used instead. The grace period is the duration in seconds
     after the processes running in the pod are sent a termination signal and
     the time when the processes are forcibly halted with a kill signal. Set
     this value longer than the expected cleanup time for your process. Defaults
     to 30 seconds.

   tolerations	<[]Object>
     If specified, the pod's tolerations.

   volumes	<[]Object>
     List of volumes that can be mounted by containers belonging to the pod.
     More info: https://kubernetes.io/docs/concepts/storage/volumes

  pod的容器的相关定义

[root@master manifests]# kubectl explain rs.spec.template.spec.containers
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: containers <[]Object>

DESCRIPTION:
     List of containers belonging to the pod. Containers cannot currently be
     added or removed. There must be at least one container in a Pod. Cannot be
     updated.

     A single application container that you want to run within a pod.

FIELDS:
   args	<[]string>
     Arguments to the entrypoint. The docker image's CMD is used if this is not
     provided. Variable references $(VAR_NAME) are expanded using the
     container's environment. If a variable cannot be resolved, the reference in
     the input string will be unchanged. The $(VAR_NAME) syntax can be escaped
     with a double $$, ie: $$(VAR_NAME). Escaped references will never be
     expanded, regardless of whether the variable exists or not. Cannot be
     updated. More info:
     https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell

   command	<[]string>
     Entrypoint array. Not executed within a shell. The docker image's
     ENTRYPOINT is used if this is not provided. Variable references $(VAR_NAME)
     are expanded using the container's environment. If a variable cannot be
     resolved, the reference in the input string will be unchanged. The
     $(VAR_NAME) syntax can be escaped with a double $$, ie: $$(VAR_NAME).
     Escaped references will never be expanded, regardless of whether the
     variable exists or not. Cannot be updated. More info:
     https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell

   env	<[]Object>  容器里变量定义
     List of environment variables to set in the container. Cannot be updated.

   envFrom	<[]Object>
     List of sources to populate environment variables in the container. The
     keys defined within a source must be a C_IDENTIFIER. All invalid keys will
     be reported as an event when the container is starting. When a key exists
     in multiple sources, the value associated with the last source will take
     precedence. Values defined by an Env with a duplicate key will take
     precedence. Cannot be updated.

   image	<string>  使用的容器镜像
     Docker image name. More info:
     https://kubernetes.io/docs/concepts/containers/images This field is
     optional to allow higher level config management to default or override
     container images in workload controllers like Deployments and StatefulSets.

   imagePullPolicy	<string>  获取镜像的策略
     Image pull policy. One of Always, Never, IfNotPresent. Defaults to Always
     if :latest tag is specified, or IfNotPresent otherwise. Cannot be updated.
     More info:
     https://kubernetes.io/docs/concepts/containers/images#updating-images

   lifecycle	<Object>
     Actions that the management system should take in response to container
     lifecycle events. Cannot be updated.

   livenessProbe	<Object>
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   name	<string> -required-  容器名字
     Name of the container specified as a DNS_LABEL. Each container in a pod
     must have a unique name (DNS_LABEL). Cannot be updated.

   ports	<[]Object>  暴露端口的参数
     List of ports to expose from the container. Exposing a port here gives the
     system additional information about the network connections a container
     uses, but is primarily informational. Not specifying a port here DOES NOT
     prevent that port from being exposed. Any port which is listening on the
     default "0.0.0.0" address inside a container will be accessible from the
     network. Cannot be updated.

   readinessProbe	<Object>
     Periodic probe of container service readiness. Container will be removed
     from service endpoints if the probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   resources	<Object>
     Compute Resources required by this container. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/

   securityContext	<Object>
     Security options the pod should run with. More info:
     https://kubernetes.io/docs/concepts/policy/security-context/ More info:
     https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

   stdin	<boolean>
     Whether this container should allocate a buffer for stdin in the container
     runtime. If this is not set, reads from stdin in the container will always
     result in EOF. Default is false.

   stdinOnce	<boolean>
     Whether the container runtime should close the stdin channel after it has
     been opened by a single attach. When stdin is true the stdin stream will
     remain open across multiple attach sessions. If stdinOnce is set to true,
     stdin is opened on container start, is empty until the first client
     attaches to stdin, and then remains open and accepts data until the client
     disconnects, at which time stdin is closed and remains closed until the
     container is restarted. If this flag is false, a container processes that
     reads from stdin will never receive an EOF. Default is false

   terminationMessagePath	<string>
     Optional: Path at which the file to which the container's termination
     message will be written is mounted into the container's filesystem. Message
     written is intended to be brief final status, such as an assertion failure
     message. Will be truncated by the node if greater than 4096 bytes. The
     total message length across all containers will be limited to 12kb.
     Defaults to /dev/termination-log. Cannot be updated.

   terminationMessagePolicy	<string>
     Indicate how the termination message should be populated. File will use the
     contents of terminationMessagePath to populate the container status message
     on both success and failure. FallbackToLogsOnError will use the last chunk
     of container log output if the termination message file is empty and the
     container exited with an error. The log output is limited to 2048 bytes or
     80 lines, whichever is smaller. Defaults to File. Cannot be updated.

   tty	<boolean>
     Whether this container should allocate a TTY for itself, also requires
     'stdin' to be true. Default is false.

   volumeDevices	<[]Object>
     volumeDevices is the list of block devices to be used by the container.
     This is a beta feature.

   volumeMounts	<[]Object>
     Pod volumes to mount into the container's filesystem. Cannot be updated.

   workingDir	<string>
     Container's working directory. If not specified, the container runtime's
     default will be used, which might be configured in the container image.
     Cannot be updated.

  pod里容器暴露端口的参数

[root@master manifests]# kubectl explain rs.spec.template.spec.containers.ports
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: ports <[]Object>

DESCRIPTION:
     List of ports to expose from the container. Exposing a port here gives the
     system additional information about the network connections a container
     uses, but is primarily informational. Not specifying a port here DOES NOT
     prevent that port from being exposed. Any port which is listening on the
     default "0.0.0.0" address inside a container will be accessible from the
     network. Cannot be updated.

     ContainerPort represents a network port in a single container.

FIELDS:
   containerPort	<integer> -required-  容器里的端口
     Number of port to expose on the pod's IP address. This must be a valid port
     number, 0 < x < 65536.

   hostIP	<string>
     What host IP to bind the external port to.

   hostPort	<integer>
     Number of port to expose on the host. If specified, this must be a valid
     port number, 0 < x < 65536. If HostNetwork is specified, this must match
     ContainerPort. Most containers do not need this.

   name	<string>  名字
     If specified, this must be an IANA_SVC_NAME and unique within the pod. Each
     named port in a pod must have a unique name. Name for the port that can be
     referred to by services.

   protocol	<string>  协议
     Protocol for port. Must be UDP, TCP, or SCTP. Defaults to "TCP".

     pod容器状态探针的定义

[root@master manifests]# kubectl explain rs.spec.template.spec.containers.livenessProbe
KIND:     ReplicaSet
VERSION:  extensions/v1beta1

RESOURCE: livenessProbe <Object>

DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.

FIELDS:
   exec	<Object>  使用命令
     One and only one of the following should be specified. Exec specifies the
     action to take.

   failureThreshold	<integer>
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.

   httpGet	<Object> 使用http
     HTTPGet specifies the http request to perform.

   initialDelaySeconds	<integer>
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   periodSeconds	<integer>
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.

   successThreshold	<integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness. Minimum value
     is 1.

   tcpSocket	<Object>  使用tcp
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

   timeoutSeconds	<integer>
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

  编写一个rs控制器的yaml文件,并启动pod

[root@master manifests]# cat rs-01.yaml 
apiVersion: apps/v1  #API的版本
kind: ReplicaSet   #控制器对象
metadata:  #控制器元数据
  name: rs-myapp   #控制的名字
  namespace: default  #控制器的名称空间
spec:  #期望状态定义
  replicas: 3 #期望的副本数量
  selector:   #便签选择器定义
    matchLabels: #使用哪个标签选择器
      app: rs-cx   #标签的定义
      rs: cx  #标签定义
  template:  pod定义
    metadata:  pod 元数据定义
      labels:   定义pod 标签
        app: rs-cx  
        rs: cx
    spec:  pod期望状态定义
      containers:  容器定义
      - name: myapp-rs  容器的名字
        image: ikubernetes/myapp:v1  镜像的定义
        ports: 暴露端口的定义
        - name: http  端口名字定义
          containerPort: 80 容器里暴露的端口

  创建这个控制器类型的pod

 kubectl create -f rs-01.yaml 
查看创建pod
 kubectl get pods -o wide -l rs=cx
NAME             READY   STATUS    RESTARTS   AGE    IP            NODE     NOMINATED NODE   READINESS GATES
rs-myapp-f5pg7   1/1     Running   0          168m   10.244.1.43   node01   <none>           <none>
rs-myapp-wzjkz   1/1     Running   0          168m   10.244.1.45   node01   <none>           <none>
rs-myapp-z6kx4   1/1     Running   0          168m   10.244.2.21   node02   <none>           <none>

  删除一个pod 自动创建

[root@master manifests]# kubectl delete pods  rs-myapp-z6kx4 
pod "rs-myapp-z6kx4" deleted
[root@master manifests]# kubectl get pods -o wide -l rs=cx 
NAME             READY   STATUS    RESTARTS   AGE    IP            NODE     NOMINATED NODE   READINESS GATES
rs-myapp-f5pg7   1/1     Running   0          170m   10.244.1.43   node01   <none>           <none>
rs-myapp-pmvw8   1/1     Running   0          9s     10.244.2.23   node02   <none>           <none>
rs-myapp-wzjkz   1/1     Running   0          170m   10.244.1.45   node01   <none>           <none>

  查看创建的pod 的详细信息

[root@master manifests]# kubectl  describe pods rs-myapp-wzjkz
Name:           rs-myapp-wzjkz
Namespace:      default
Priority:       0
Node:           node01/192.168.183.12  运行在呢个节点
Start Time:     Sat, 10 Aug 2019 13:01:49 +0800
Labels:         app=rs-cx  标签
                rs=cx
Annotations:    <none>
Status:         Running  状态
IP:             10.244.1.45  pod的IP地址
Controlled By:  ReplicaSet/rs-myapp
Containers:
  myapp-rs:
    Container ID:   docker://42b4318ab99e8d36aa5716ae8fa459ceb70adf4b68f4d1ebbbbcd79527457175
    Image:          ikubernetes/myapp:v1  镜像
    Image ID:       docker-pullable://ikubernetes/myapp@sha256:9c3dc30b5219788b2b8a4b065f548b922a34479577befb54b03330999d30d513
    Port:           80/TCP容器端口
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 10 Aug 2019 13:04:47 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-2m2ts (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-2m2ts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-2m2ts
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

  多退的示例

[root@master manifests]# kubectl label pods pod-demo app=rs-cx --overwrite
pod/pod-demo labeled
[root@master manifests]# kubectl label pods pod-demo rs=cx
pod/pod-demo labeled
[root@master manifests]# kubectl get pods --show-labels
NAME                     READY   STATUS        RESTARTS   AGE     LABELS
myapp-84cd4b7f95-g6ldp   1/1     Running       6          16d     pod-template-hash=84cd4b7f95,run=myapp
nginx-5896f46c8-zblcs    1/1     Running       6          16d     chenxi=cx,pod-template-hash=5896f46c8,run=nginx
pod-demo                 2/2     Running       8          5d16h   app=rs-cx,rs=cx,tier=frontend
rs-myapp-f5pg7           1/1     Running       0          3h5m    app=rs-cx,rs=cx
rs-myapp-pmvw8           0/1     Terminating   0          15m     app=rs-cx,rs=cx 
rs-myapp-wzjkz           1/1     Running       0          3h5m    app=rs-cx,rs=cx
[root@master manifests]# kubectl get pods --show-labels  自动随机删除一个pod
NAME                     READY   STATUS    RESTARTS   AGE     LABELS
myapp-84cd4b7f95-g6ldp   1/1     Running   6          16d     pod-template-hash=84cd4b7f95,run=myapp
nginx-5896f46c8-zblcs    1/1     Running   6          16d     chenxi=cx,pod-template-hash=5896f46c8,run=nginx
pod-demo                 2/2     Running   8          5d16h   app=rs-cx,rs=cx,tier=frontend
rs-myapp-f5pg7           1/1     Running   0          3h5m    app=rs-cx,rs=cx
rs-myapp-wzjkz           1/1     Running   0          3h5m    app=rs-cx,rs=cx

  动态修改pod的个数

[root@master manifests]# kubectl edit  rs rs-myapp

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  creationTimestamp: "2019-08-10T05:01:49Z"
  generation: 1
  name: rs-myapp
  namespace: default
  resourceVersion: "587241"
  selfLink: /apis/extensions/v1beta1/namespaces/default/replicasets/rs-myapp
  uid: c3a57f4b-dde9-4b0c-804e-5026271b70f9
spec:
  replicas: 5
  selector:
    matchLabels:
      app: rs-cx
      rs: cx
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rs-cx
        rs: cx
    spec:
      containers:
      - image: ikubernetes/myapp:v1
        imagePullPolicy: IfNotPresent
        name: myapp-rs
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 3
  fullyLabeledReplicas: 3
  observedGeneration: 1
  readyReplicas: 3
  replicas: 3
[root@master manifests]# kubectl get pods -o wide -l rs=cx  扩容到5个
NAME             READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
rs-myapp-5z6hp   1/1     Running   0          71s     10.244.2.26   node02   <none>           <none>
rs-myapp-f5pg7   1/1     Running   0          3h17m   10.244.1.43   node01   <none>           <none>
rs-myapp-j75tx   1/1     Running   0          8m9s    10.244.2.24   node02   <none>           <none>
rs-myapp-kh659   1/1     Running   0          71s     10.244.2.25   node02   <none>           <none>
rs-myapp-wzjkz   1/1     Running   0          3h17m   10.244.1.45   node01   <none>           <none>

 Deployment:工作在ReplicaSet之上,支持滚动更新与回滚操作,支持声明式的配置

 

DaemonSet: 保证每个节点上运行一个特定的pod副本,或指定类型的节点运行一个pod副本

 job 运行一次性任务的pod控制器

CronJob:周期性任务的pod控制器

StatefulSet:有状态的pod控制

 

 

 

 

 

 

 

  

posted @ 2019-08-10 16:33  烟雨楼台,行云流水  阅读(747)  评论(0编辑  收藏  举报