Service资源及其实现模型
运行于pod中的部分容器化应用是向客户端提供服务的守护进程,例如nginx,tomcat和etcd等,它们受控于控制器资源对象,存在生命周期,在自愿或非自愿中断后,只能被重构的新pod对象所取代,属于非可再生类的组件。于是,在动态、弹性的管理模型下,Service资源用于为此类pod对象提供一个固定的、统一的访问接口及负载均衡的能力。
Service是kubernetes的核心资源类型之一,通常可看做微服务的一种实现。事实上它是一种抽象:通过规则定义出由多个pod对象组合而成的逻辑集合,以及访问这组pod的策略。Service关联pod资源的规则要借助于标签选择器来完成。
一、Service资源概述
1. 使用Service的原因
在kubernetes中,Pod是有生命周期的,Pod重启后它的IP很可能发生变化。如果客户端把Pod的IP地址写死,那么Pod挂掉或重启之后,与之关联的其他服务就再也找不到它;而pod规模的扩容又会使客户端无法感知并使用新增的pod对象,从而影响规模扩展的目的。为了解决这个问题,kubernetes定义了service资源对象:Service定义了一个服务访问的入口,客户端通过这个入口即可访问服务背后的应用集群实例。Service资源基于标签选择器将一组pod定义成一个逻辑组合,并通过自己的ip地址和端口将请求调度代理至组内的pod对象之上。如图所示,它向客户端隐藏了真实的、处理用户请求的pod资源,使得客户端的请求看上去就像是由service直接处理并响应的一样。
1)pod的ip经常变化,service是pod的代理,客户端只需要访问service,请求就会被代理到pod;
2)pod的ip在k8s集群之外无法访问,而创建的service可以在k8s集群外被访问。
2. Service描述
Service是一个固定接入层,客户端通过访问service的ip和端口,即可访问到service关联的后端pod。service的名称解析依赖于部署在kubernetes集群之上的DNS附件(不同kubernetes版本默认的dns实现不同:1.11之前的版本使用kube-dns,较新的版本使用coredns),因此在部署完k8s之后还需要再部署dns附件。此外,kubernetes要想给客户端提供网络功能,还需要依赖第三方网络插件(flannel、calico等)。
Service对象的ip地址也称为Cluster IP,它位于kubernetes集群配置的专用ip地址范围之内,是一种虚拟IP地址,在Service对象创建后即保持不变,并且能够被同一集群中的pod资源所访问。Service端口用于接收客户端请求,并将其转发至后端Pod中应用的相应端口之上,因此,这种代理机制也称为“端口代理”或四层代理,它工作于TCP/IP协议栈的传输层。
通过其标签选择器匹配到的后端pod资源不止一个时,Service资源能够以负载均衡的方式进行流量调度,实现了请求流量的分发机制。Service与Pod对象之间的关联关系通过标签选择器以松耦合的方式建立,它可以先于pod对象创建而不会发生错误,于是,创建Service与pod资源的任务可由不同的用户分别完成。
Service资源会通过API Server持续监视标签选择器匹配到的后端pod对象,并实时跟踪各对象的变动,例如IP地址变动、对象增加或减少等。不过需要说明的是,Service并不直接连接至pod对象,它们之间还有一个中间层,即Endpoints资源对象:它是一个由ip地址和端口组成的列表,这些ip地址和端口来自于由Service的标签选择器匹配到的pod资源。默认情况下,创建Service资源对象时,其关联的Endpoints对象会自动创建。
3. Service工作原理
kubernetes在创建Service时,会根据标签选择器(label selector)查找Pod,并据此创建与Service同名的endpoint对象。当Pod地址发生变化时,endpoint也会随之变化。service接收前端client请求时,会通过endpoint找到要把请求转发到哪个Pod的地址。(至于转发到哪个节点的哪个Pod,由kube-proxy实现的负载均衡机制决定。)
4. kubernetes集群中有三类IP地址
1)Node Network(节点网络):物理节点或者虚拟节点的网络,如eth0接口上的网络地址10.0.0.131/24
[root@k8s-master1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:30:2c:a8 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.131/24 brd 10.0.0.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe30:2ca8/64 scope link
valid_lft forever preferred_lft forever
2)Pod network(pod 网络),创建的Pod具有的IP地址:10.244.36.96
[root@k8s-master1 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-test 1/1 Running 0 8s 10.244.36.96 k8s-node1 <none> <none>
Node Network和Pod network这两种网络地址是实实在在配置的:节点网络地址配置在节点的网络接口上,而pod网络地址配置在pod资源上。因此,这些地址都是配置在某些设备之上的,这些设备可能是硬件,也可能是软件模拟的。
3)Cluster Network(集群地址,也称为service network):这个地址是虚拟的地址(virtual ip),没有配置在某个接口上,只出现在service的规则当中。
[root@k8s-master1 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   41d
二、Service资源清单说明
查看Service的字段定义
[root@k8s-master1 ~]# kubectl explain service
KIND:     Service
VERSION:  v1

DESCRIPTION:
     Service is a named abstraction of software service (for example, mysql) consisting of local port (for example 3306) that the proxy listens on, and the selector that determines which pods will answer requests sent through the proxy.

FIELDS:
   apiVersion   <string>    #service资源使用的api版本
     APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources

   kind         <string>    #创建的资源类型
     Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds

   metadata     <Object>    #定义元数据
     Standard object's metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

   spec         <Object>
     Spec defines the behavior of a service. https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

   status       <Object>
     Most recently observed status of the service. Populated by the system. Read-only. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
查看service的spec字段如何定义
[root@k8s-master1 ~]# kubectl explain service.spec
KIND:     Service
VERSION:  v1

RESOURCE: spec <Object>

DESCRIPTION:
     Spec defines the behavior of a service. https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

     ServiceSpec describes the attributes that a user creates on a service.

FIELDS:
   allocateLoadBalancerNodePorts        <boolean>
     allocateLoadBalancerNodePorts defines if NodePorts will be automatically allocated for services with type LoadBalancer. Default is "true". It may be set to "false" if the cluster load-balancer does not rely on NodePorts. allocateLoadBalancerNodePorts may only be set for services with type LoadBalancer and will be cleared if the type is changed to any other type. This field is alpha-level and is only honored by servers that enable the ServiceLBNodePortControl feature.

   clusterIP    <string>    #动态分配的地址,也可以在创建的时候指定,创建之后就改不了了
     clusterIP is the IP address of the service and is usually assigned randomly. If an address is specified manually, is in-range (as per system configuration), and is not in use, it will be allocated to the service; otherwise creation of the service will fail. This field may not be changed through updates unless the type field is also being changed to ExternalName (which requires this field to be blank) or the type field is being changed from ExternalName (in which case this field may optionally be specified, as describe above). Valid values are "None", empty string (""), or a valid IP address. Setting this to "None" makes a "headless service" (no virtual IP), which is useful when direct endpoint connections are preferred and proxying is not required. Only applies to types ClusterIP, NodePort, and LoadBalancer. If this field is specified when creating a Service of type ExternalName, creation will fail. This field will be wiped when updating a Service to type ExternalName. More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies

   clusterIPs   <[]string>
     ClusterIPs is a list of IP addresses assigned to this service, and are usually assigned randomly. If an address is specified manually, is in-range (as per system configuration), and is not in use, it will be allocated to the service; otherwise creation of the service will fail. This field may not be changed through updates unless the type field is also being changed to ExternalName (which requires this field to be empty) or the type field is being changed from ExternalName (in which case this field may optionally be specified, as describe above). Valid values are "None", empty string (""), or a valid IP address. Setting this to "None" makes a "headless service" (no virtual IP), which is useful when direct endpoint connections are preferred and proxying is not required. Only applies to types ClusterIP, NodePort, and LoadBalancer. If this field is specified when creating a Service of type ExternalName, creation will fail. This field will be wiped when updating a Service to type ExternalName. If this field is not specified, it will be initialized from the clusterIP field. If this field is specified, clients must ensure that clusterIPs[0] and clusterIP have the same value. Unless the "IPv6DualStack" feature gate is enabled, this field is limited to one value, which must be the same as the clusterIP field. If the feature gate is enabled, this field may hold a maximum of two entries (dual-stack IPs, in either order). These IPs must correspond to the values of the ipFamilies field. Both clusterIPs and ipFamilies are governed by the ipFamilyPolicy field. More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies

   externalIPs  <[]string>
     externalIPs is a list of IP addresses for which nodes in the cluster will also accept traffic for this service. These IPs are not managed by Kubernetes. The user is responsible for ensuring that traffic arrives at a node with this IP. A common example is external load-balancers that are not part of the Kubernetes system.

   externalName <string>
     externalName is the external reference that discovery mechanisms will return as an alias for this service (e.g. a DNS CNAME record). No proxying will be involved. Must be a lowercase RFC-1123 hostname (https://tools.ietf.org/html/rfc1123) and requires Type to be "ExternalName".

   externalTrafficPolicy        <string>
     externalTrafficPolicy denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. "Local" preserves the client source IP and avoids a second hop for LoadBalancer and Nodeport type services, but risks potentially imbalanced traffic spreading. "Cluster" obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading.

   healthCheckNodePort  <integer>
     healthCheckNodePort specifies the healthcheck nodePort for the service. This only applies when type is set to LoadBalancer and externalTrafficPolicy is set to Local. If a value is specified, is in-range, and is not in use, it will be used. If not specified, a value will be automatically allocated. External systems (e.g. load-balancers) can use this port to determine if a given node holds endpoints for this service or not. If this field is specified when creating a Service which does not need it, creation will fail. This field will be wiped when updating a Service to no longer need it (e.g. changing type).

   ipFamilies   <[]string>
     IPFamilies is a list of IP families (e.g. IPv4, IPv6) assigned to this service, and is gated by the "IPv6DualStack" feature gate. This field is usually assigned automatically based on cluster configuration and the ipFamilyPolicy field. If this field is specified manually, the requested family is available in the cluster, and ipFamilyPolicy allows it, it will be used; otherwise creation of the service will fail. This field is conditionally mutable: it allows for adding or removing a secondary IP family, but it does not allow changing the primary IP family of the Service. Valid values are "IPv4" and "IPv6". This field only applies to Services of types ClusterIP, NodePort, and LoadBalancer, and does apply to "headless" services. This field will be wiped when updating a Service to type ExternalName. This field may hold a maximum of two entries (dual-stack families, in either order). These families must correspond to the values of the clusterIPs field, if specified. Both clusterIPs and ipFamilies are governed by the ipFamilyPolicy field.

   ipFamilyPolicy       <string>
     IPFamilyPolicy represents the dual-stack-ness requested or required by this Service, and is gated by the "IPv6DualStack" feature gate. If there is no value provided, then this field will be set to SingleStack. Services can be "SingleStack" (a single IP family), "PreferDualStack" (two IP families on dual-stack configured clusters or a single IP family on single-stack clusters), or "RequireDualStack" (two IP families on dual-stack configured clusters, otherwise fail). The ipFamilies and clusterIPs fields depend on the value of this field. This field will be wiped when updating a service to type ExternalName.

   loadBalancerIP       <string>
     Only applies to Service Type: LoadBalancer. LoadBalancer will get created with the IP specified in this field. This feature depends on whether the underlying cloud-provider supports specifying the loadBalancerIP when a load balancer is created. This field will be ignored if the cloud-provider does not support the feature.

   loadBalancerSourceRanges     <[]string>
     If specified and supported by the platform, this will restrict traffic through the cloud-provider load-balancer will be restricted to the specified client IPs. This field will be ignored if the cloud-provider does not support the feature. More info: https://kubernetes.io/docs/tasks/access-application-cluster/configure-cloud-provider-firewall/

   ports        <[]Object>    #定义service端口,用来和后端pod建立联系
     The list of ports that are exposed by this service. More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies

   publishNotReadyAddresses     <boolean>
     publishNotReadyAddresses indicates that any agent which deals with endpoints for this Service should disregard any indications of ready/not-ready. The primary use case for setting this field is for a StatefulSet's Headless Service to propagate SRV DNS records for its Pods for the purpose of peer discovery. The Kubernetes controllers that generate Endpoints and EndpointSlice resources for Services interpret this to mean that all endpoints are considered "ready" even if the Pods themselves are not. Agents which consume only Kubernetes generated endpoints through the Endpoints or EndpointSlice resources can safely assume this behavior.

   selector     <map[string]string>    #通过标签选择器选择关联的pod有哪些
     Route service traffic to pods with label keys and values matching this selector. If empty or not present, the service is assumed to have an external process managing its endpoints, which Kubernetes will not modify. Only applies to types ClusterIP, NodePort, and LoadBalancer. Ignored if type is ExternalName. More info: https://kubernetes.io/docs/concepts/services-networking/service/

   sessionAffinity      <string>
     Supports "ClientIP" and "None". Used to maintain session affinity. Enable client IP based session affinity. Must be ClientIP or None. Defaults to None. More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies

   sessionAffinityConfig        <Object>
     sessionAffinityConfig contains the configurations of session affinity.
     #service在实现负载均衡的时候还支持sessionAffinity(会话联系),默认是None,即随机调度(基于iptables规则调度);如果把sessionAffinity设置为ClientIP,则表示把来自同一客户端IP的请求调度到同一个pod上

   topologyKeys <[]string>
     topologyKeys is a preference-order list of topology keys which implementations of services should use to preferentially sort endpoints when accessing this Service, it can not be used at the same time as externalTrafficPolicy=Local. Topology keys must be valid label keys and at most 16 keys may be specified. Endpoints are chosen based on the first topology key with available backends. If this field is specified and all entries have no backends that match the topology of the client, the service has no backends for that client and connections should fail. The special value "*" may be used to mean "any topology". This catch-all value, if used, only makes sense as the last value in the list. If this is not specified or empty, no topology constraints will be applied. This field is alpha-level and is only honored by servers that enable the ServiceTopology feature.

   type <string>    #定义service的类型
     type determines how the Service is exposed. Defaults to ClusterIP. Valid options are ExternalName, ClusterIP, NodePort, and LoadBalancer. "ClusterIP" allocates a cluster-internal IP address for load-balancing to endpoints. Endpoints are determined by the selector or if that is not specified, by manual construction of an Endpoints object or EndpointSlice objects. If clusterIP is "None", no virtual IP is allocated and the endpoints are published as a set of endpoints rather than a virtual IP. "NodePort" builds on ClusterIP and allocates a port on every node which routes to the same endpoints as the clusterIP. "LoadBalancer" builds on NodePort and creates an external load-balancer (if supported in the current cloud) which routes to the same endpoints as the clusterIP. "ExternalName" aliases this service to the specified externalName. Several other fields do not apply to ExternalName services. More info: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
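把上面这些常用字段组合起来,一个最小可用的Service定义大致如下(示意清单,名称和标签均为假设,仅用于说明各字段的含义):

apiVersion: v1
kind: Service
metadata:
  name: demo-svc              # 假设的service名称
spec:
  type: ClusterIP             # 省略时默认即为ClusterIP
  selector:
    app: demo                 # 假设的pod标签,匹配到的pod会被纳入endpoints
  ports:
  - name: http
    protocol: TCP
    port: 80                  # service自身暴露的端口
    targetPort: 80            # 后端pod中容器的端口
  sessionAffinity: None       # 默认None;改为ClientIP可把同一客户端的请求调度到同一pod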
1. Service的四种类型
查看定义Service.spec.type需要的字段有哪些
[root@k8s-master1 ~]# kubectl explain service.spec.type
KIND: Service
VERSION: v1
FIELD: type <string>
DESCRIPTION:
type determines how the Service is exposed. Defaults to ClusterIP. Valid
options are ExternalName, ClusterIP, NodePort, and LoadBalancer.
"ClusterIP" allocates a cluster-internal IP address for load-balancing to
endpoints. Endpoints are determined by the selector or if that is not
specified, by manual construction of an Endpoints object or EndpointSlice
objects. If clusterIP is "None", no virtual IP is allocated and the
endpoints are published as a set of endpoints rather than a virtual IP.
"NodePort" builds on ClusterIP and allocates a port on every node which
routes to the same endpoints as the clusterIP. "LoadBalancer" builds on
NodePort and creates an external load-balancer (if supported in the current
cloud) which routes to the same endpoints as the clusterIP. "ExternalName"
aliases this service to the specified externalName. Several other fields do
not apply to ExternalName services. More info:
https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
Kubernetes的Service共有四种类型:ExternalName、ClusterIP、NodePort和LoadBalancer。
1)ExternalName
其通过将Service映射至由externalName字段的内容指定的主机名来暴露服务,此主机名需要被DNS服务解析至CNAME类型的记录。换言之,此种类型并非定义由kubernetes集群提供的服务,而是把集群外的某服务以DNS CNAME记录方式映射到集群内,从而让集群内的pod资源能够访问外部的Service的一种实现方式。此种类型的Service没有ClusterIP和NodePort,也没有标签选择器用于选择pod资源,因此,也不会有Endpoints存在。
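一个最小的ExternalName示意清单如下(其中的外部主机名db.example.com为假设,仅用于演示):

apiVersion: v1
kind: Service
metadata:
  name: external-db                # 假设的service名称
spec:
  type: ExternalName
  externalName: db.example.com     # 假设的集群外主机名,DNS将以CNAME记录返回

集群内的pod解析external-db.default.svc.cluster.local时,集群DNS会返回db.example.com的CNAME记录。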
2)ClusterIP
通过集群内部IP地址暴露服务,此地址仅在集群内部可达,而无法被集群外部的客户端访问,此为默认的Service类型。
3)NodePort
这种类型建立在ClusterIP类型之上,它在每个节点ip地址的某静态端口(NodePort)上暴露服务,因此依然会为Service分配集群IP地址,并将其作为NodePort的路由目标。简单地说,NodePort类型就是在工作节点的ip地址上选择一个端口,用于将集群外部的用户请求转发至目标Service的ClusterIP和port。因此,这种类型的service既可以像ClusterIP一样被集群内部客户端pod访问,也可以被集群外部客户端通过套接字<NodeIP>:<NodePort>访问。
通过请求<NodeIP>:<NodePort>可以把请求代理到内部的pod,即:Client ---> NodeIP:NodePort ---> ServiceIP:ServicePort ---> PodIP:ContainerPort。
4)LoadBalancer
这种类型建立在NodePort类型之上,其通过cloud provider提供的负载均衡器将服务暴露到集群外部。因此,LoadBalancer一样具有NodePort和ClusterIP。简而言之,一个LoadBalancer类型的service会指向关联至kubernetes集群外部的、切实存在的某个负载均衡设备,该设备通过工作节点之上的NodePort向集群内部发送请求流量。
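在支持cloud provider的环境中,一个LoadBalancer类型的Service示意清单如下(仅为示意,名称与标签为假设,外部IP能否分配取决于云平台):

apiVersion: v1
kind: Service
metadata:
  name: demo-lb          # 假设的service名称
spec:
  type: LoadBalancer
  selector:
    app: demo            # 假设的pod标签
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

创建后,云平台会为其分配一个外部负载均衡器地址(EXTERNAL-IP),外部流量经负载均衡器到达节点的NodePort,再转发至ClusterIP和后端pod。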
2. service端口
查看service的spec.ports字段如何定义
[root@k8s-master1 ~]# kubectl explain service.spec.ports
KIND:     Service
VERSION:  v1

RESOURCE: ports <[]Object>

DESCRIPTION:
     The list of ports that are exposed by this service. More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies

     ServicePort contains information on service's port.

FIELDS:
   appProtocol  <string>
     The application protocol for this port. This field follows standard Kubernetes label syntax. Un-prefixed names are reserved for IANA standard service names (as per RFC-6335 and http://www.iana.org/assignments/service-names). Non-standard protocols should use prefixed names such as mycompany.com/my-custom-protocol. This is a beta field that is guarded by the ServiceAppProtocol feature gate and enabled by default.

   name <string>    #定义端口的名字
     The name of this port within the service. This must be a DNS_LABEL. All ports within a ServiceSpec must have unique names. When considering the endpoints for a Service, this must match the 'name' field in the EndpointPort. Optional if only one ServicePort is defined on this service.

   nodePort     <integer>    #宿主机上映射的端口
     The port on each node on which this service is exposed when type is NodePort or LoadBalancer. Usually assigned by the system. If a value is specified, in-range, and not in use it will be used, otherwise the operation will fail. If not specified, a port will be allocated if this Service requires one. If this field is specified when creating a Service which does not need it, creation will fail. This field will be wiped when updating a Service to no longer need it (e.g. changing type from NodePort to ClusterIP). More info: https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport

   port <integer> -required-    #service的端口,这个是k8s集群内部服务可访问的端口
     The port that will be exposed by this service.

   protocol     <string>
     The IP protocol for this port. Supports "TCP", "UDP", and "SCTP". Default is TCP.

   targetPort   <string>    #targetPort是pod上的端口。从port和nodePort上来的流量,经过kube-proxy流入后端pod的targetPort,最后进入容器。它与制作容器时暴露的端口一致(使用Dockerfile中的EXPOSE),例如官方的nginx暴露80端口
     Number or name of the port to access on the pods targeted by the service. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME. If this is a string, it will be looked up as a named port in the target Pod's container ports. If this is not specified, the value of the 'port' field is used (an identity map). This field is ignored for services with clusterIP=None, and should be omitted or set equal to the 'port' field. More info: https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service
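当一个service需要暴露多个端口时,必须为每个端口定义唯一的name。下面是一个多端口的示意片段(端口号和名称均为假设):

spec:
  ports:
  - name: http
    protocol: TCP
    port: 80
    targetPort: 80
  - name: https
    protocol: TCP
    port: 443
    targetPort: 443      # targetPort也可以写成容器中定义的命名端口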
三、创建Service资源
1. ClusterIP类型的Service资源
1)创建pod资源
[root@k8s-master1 ~]# mkdir service
[root@k8s-master1 ~]# cd service/
[root@k8s-master1 service]# vim deploy-demo.yaml
[root@k8s-master1 service]# cat deploy-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      run: my-nginx
      version: v1
  template:
    metadata:
      labels:
        run: my-nginx
        version: v1
    spec:
      containers:
      - name: my-nginx
        image: nginx:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
创建deployment控制器,查看pod资源的ip地址
[root@k8s-master1 service]# kubectl apply -f deploy-demo.yaml
deployment.apps/my-nginx created
[root@k8s-master1 service]# kubectl get deployments
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
my-nginx   2/2     2            2           9s
[root@k8s-master1 service]# kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
my-nginx-6f6bcdf657-6ffht   1/1     Running   0          15s   10.244.36.98   k8s-node1   <none>           <none>
my-nginx-6f6bcdf657-cxcf4   1/1     Running   0          15s   10.244.36.97   k8s-node1   <none>           <none>
请求pod ip地址,查看结果
[root@k8s-master1 service]# curl 10.244.36.98
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
进入到pod中容器内访问pod ip地址,查看结果
[root@k8s-master1 service]# kubectl exec -it my-nginx-6f6bcdf657-cxcf4 -- /bin/sh
# curl 10.244.36.97
<!DOCTYPE html>
...(输出为nginx默认欢迎页,与上文相同,略)
需要注意的是,pod虽然定义了容器端口,但并不会占用所在节点上的80端口,也不需要任何特定的NAT规则来把流量路由到pod上。这意味着可以在同一个节点上运行多个使用相同容器端口的pod,并且可以从集群中任何其他的pod或节点上通过IP方式访问到它们。
误删除其中一个Pod:my-nginx-6f6bcdf657-6ffht后,可以看到deployment重新生成了一个pod:my-nginx-6f6bcdf657-fmjmn,其ip是10.244.36.99。可见在k8s中,pod被删除后重新生成的pod的ip地址会发生变化,所以需要在pod前端加一个固定接入层。
[root@k8s-master1 service]# kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE     IP             NODE        NOMINATED NODE   READINESS GATES
my-nginx-6f6bcdf657-6ffht   1/1     Running   0          6m59s   10.244.36.98   k8s-node1   <none>           <none>
my-nginx-6f6bcdf657-cxcf4   1/1     Running   0          6m59s   10.244.36.97   k8s-node1   <none>           <none>
[root@k8s-master1 service]# kubectl delete pods my-nginx-6f6bcdf657-6ffht
pod "my-nginx-6f6bcdf657-6ffht" deleted
[root@k8s-master1 service]# kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE     IP             NODE        NOMINATED NODE   READINESS GATES
my-nginx-6f6bcdf657-cxcf4   1/1     Running   0          7m17s   10.244.36.97   k8s-node1   <none>           <none>
my-nginx-6f6bcdf657-fmjmn   1/1     Running   0          6s      10.244.36.99   k8s-node1   <none>           <none>
2)创建service
查看pod对象的标签
[root@k8s-master1 service]# kubectl get pods --show-labels
NAME                        READY   STATUS    RESTARTS   AGE     LABELS
my-nginx-6f6bcdf657-cxcf4   1/1     Running   0          9m28s   pod-template-hash=6f6bcdf657,run=my-nginx,version=v1
my-nginx-6f6bcdf657-fmjmn   1/1     Running   0          2m17s   pod-template-hash=6f6bcdf657,run=my-nginx,version=v1
根据标签选择器创建service,关联上述run=my-nginx的Pod资源对象:创建一个Service,选中具有标签run=my-nginx的Pod,目标TCP端口为80(targetPort:容器接收流量的端口),并在Service端口80(port:集群中其他Pod访问该Service所用的端口)上暴露。
[root@k8s-master1 service]# vim service-demo.yaml
[root@k8s-master1 service]# cat service-demo.yaml
kind: Service
apiVersion: v1
metadata:
  name: my-nginx
  labels:
    run: my-service
spec:
  type: ClusterIP
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  selector:
    run: my-nginx
[root@k8s-master1 service]# kubectl apply -f service-demo.yaml
service/my-nginx created
[root@k8s-master1 service]# kubectl get svc -l run=my-service
NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
my-nginx   ClusterIP   10.103.70.228   <none>        80/TCP    53s
在k8s控制节点访问service的ip:端口就可以把请求代理到后端pod
[root@k8s-master1 service]# curl 10.103.70.228
<!DOCTYPE html>
...(输出为nginx默认欢迎页,与上文相同,略)
通过上面可以看到请求service IP:port跟直接访问pod ip:port看到的结果一样,这就说明service可以把请求代理到它所关联的后端pod。
注意:上面的10.103.70.228:80地址只能是在k8s集群内部可以访问,在外部无法访问。
查看service详细信息
[root@k8s-master1 service]# kubectl describe svc my-nginx
Name:              my-nginx
Namespace:         default
Labels:            run=my-service
Annotations:       <none>
Selector:          run=my-nginx
Type:              ClusterIP
IP Families:       <none>
IP:                10.103.70.228
IPs:               10.103.70.228
Port:              <unset>  80/TCP
TargetPort:        80/TCP
Endpoints:         10.244.36.97:80,10.244.36.99:80
Session Affinity:  None
Events:            <none>
查看作为此service对象后端的endpoint对象,负责接收相应的请求流量。
[root@k8s-master1 service]# kubectl get endpoints my-nginx
NAME       ENDPOINTS                         AGE
my-nginx   10.244.36.97:80,10.244.36.99:80   4m47s
service可以对外提供统一的、固定的ip地址,并将请求重定向至集群中的pod。其中“将请求重定向至集群中的pod”是由endpoint与selector协同工作实现的:selector用来选择pod,被selector选中的pod的ip地址和端口号会被记录在endpoint对象中。当一个请求访问service的ip地址时,就会从endpoint中选出一个ip地址和端口号,将请求重定向至对应的pod。具体把请求代理到哪个pod,由kube-proxy生成的调度规则决定。也就是说,service并不直接关联到pod,而是先关联endpoint资源(即地址加端口的列表),再由endpoint关联到pod。
service只要创建完成,就可以直接解析它的服务名。每一个服务创建完成后,都会在集群dns中动态添加一条资源记录;添加完成后就可以解析了。资源记录格式是:
SVC_NAME.NS_NAME.DOMAIN.LTD.
服务名.命名空间.域名后缀
集群默认的域名后缀是svc.cluster.local.
就像上面创建的my-nginx这个服务,它的完整名称解析就是
my-nginx.default.svc.cluster.local
[root@k8s-master1 service]# kubectl exec -it my-nginx-6f6bcdf657-cxcf4 -- /bin/sh
# curl my-nginx.default.svc.cluster.local
<!DOCTYPE html>
...(输出为nginx默认欢迎页,与上文相同,略)
2. NodePort类型的Service资源
NodePort即节点Port,通常在安装k8s集群系统时会预留一个端口范围用于NodePort,默认为30000~32767之间的端口。与可以省略.spec.type属性的ClusterIP类型不同,定义NodePort类型的Service资源时,需要通过此属性明确指定其类型名称。
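顺带一提,这个预留端口范围由kube-apiserver的--service-node-port-range参数控制,默认即30000-32767。下面是kube-apiserver静态pod清单中相关参数的示意片段(参数布局以kubeadm部署为例,属假设):

spec:
  containers:
  - command:
    - kube-apiserver
    - --service-node-port-range=30000-32767   # NodePort可分配的端口范围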
1)创建pod资源
[root@k8s-master1 service]# vim deploy-nodeport.yaml
[root@k8s-master1 service]# cat deploy-nodeport.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx-nodeport
spec:
  replicas: 3
  selector:
    matchLabels:
      run: my-nginx-nodeport
  template:
    metadata:
      labels:
        run: my-nginx-nodeport
    spec:
      containers:
      - name: my-nginx-nodeport
        image: nginx:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
[root@k8s-master1 service]# kubectl apply -f deploy-nodeport.yaml
deployment.apps/my-nginx-nodeport created
[root@k8s-master1 service]# kubectl get pods -o wide -l run=my-nginx-nodeport
NAME                                 READY   STATUS    RESTARTS   AGE   IP               NODE        NOMINATED NODE   READINESS GATES
my-nginx-nodeport-689f6d5464-6mdbz   1/1     Running   0          26s   10.244.36.100    k8s-node1   <none>           <none>
my-nginx-nodeport-689f6d5464-7bzfn   1/1     Running   0          26s   10.244.169.151   k8s-node2   <none>           <none>
my-nginx-nodeport-689f6d5464-vx5f9   1/1     Running   0          26s   10.244.36.101    k8s-node1   <none>           <none>
2)创建service,代理pod,使用了NodePort类型,且人为指定其节点端口为32223
[root@k8s-master1 service]# vim service-nodeport.yaml
[root@k8s-master1 service]# cat service-nodeport.yaml
kind: Service
apiVersion: v1
metadata:
  name: my-nginx-nodeport
  labels:
    run: my-nignx-nodeport
spec:
  type: NodePort
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 32223
  selector:
    run: my-nginx-nodeport
实践中,并不鼓励用户自定义节点端口,除非能够事先确定它不会与某个现存的service资源产生冲突。只要没有特别需求,把节点端口留给系统自动分配总是较好的选择。
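例如,下面的示意片段省略了nodePort字段,让系统自动从预留端口范围中分配(仅为示意):

spec:
  type: NodePort
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80        # 不写nodePort,由系统在30000~32767范围内自动分配

回到上面人为指定节点端口32223的示例,使用创建命令创建service对象后,即可了解其运行状态。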
[root@k8s-master1 service]# kubectl apply -f service-nodeport.yaml
service/my-nginx-nodeport created
[root@k8s-master1 service]# kubectl get svc my-nginx-nodeport -o wide
NAME                TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE   SELECTOR
my-nginx-nodeport   NodePort   10.96.92.212   <none>        80:32223/TCP   24s   run=my-nginx-nodeport
命令结果显示,NodePort类型的Service资源依然会被配置ClusterIP,事实上,它会作为节点从NodePort接入流量后转发的目标地址,目标端口则是与Service资源对应的spec.ports.port属性定义的端口。
[root@k8s-master1 service]# curl 10.96.92.212
<!DOCTYPE html>
...(输出为nginx默认欢迎页,与上文相同,略)
10.96.92.212是k8s集群内部的service ip地址,只能在k8s集群内部访问,在集群外无法访问。
集群外访问service:
[root@k8s-master1 service]# curl 10.0.0.132:32223
<!DOCTYPE html>
...(输出为nginx默认欢迎页,与上文相同,略)
在浏览器中访问http://10.0.0.132:32223,同样可以看到nginx的欢迎页面。
服务请求走向:Client ---> NodeIP:32223 ---> ServiceIP:80 ---> PodIP:ContainerPort
因此,对于集群外部客户端来说,它们可经由任何一个节点的节点ip及端口访问NodePort类型的Service资源,而对于集群内的pod客户端来说,依然可以通过ClusterIP对其进行访问。
3. ExternalName类型的Service资源
ExternalName类型的Service资源用于将集群外部的服务发布到集群中以供pod中的应用程序访问。因此,它不需要使用标签选择器关联任何pod对象,但必须要使用.spec.externalName属性定义一个CNAME记录用于返回外部真正提供服务的主机的别名,而后通过CNAME记录值获取到相关主机的ip地址。
示例应用场景:跨名称空间访问。需求:default名称空间下的client服务想要访问nginx-ns名称空间下的nginx-svc服务。
1)创建default名称空间下的client 服务
[root@k8s-master1 service]# cat deploy-client.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: busybox:1.28
        imagePullPolicy: IfNotPresent
        command: ["/bin/sh", "-c", "sleep 36000"]
[root@k8s-master1 service]# kubectl apply -f deploy-client.yaml
deployment.apps/client created
[root@k8s-master1 service]# kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
client-df684dd68-86lq4   1/1     Running   0          3m30s
2)创建ExternalName类型的service资源,其externalName字段指向nginx-ns名称空间下nginx-svc服务的DNS名称,让使用者感觉就像在调用自己名称空间下的服务一样
[root@k8s-master1 service]# vim service-client.yaml
[root@k8s-master1 service]# cat service-client.yaml
kind: Service
apiVersion: v1
metadata:
  name: svc-client
spec:
  type: ExternalName
  externalName: nginx-svc.nginx-ns.svc.cluster.local
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    name: http
[root@k8s-master1 service]# kubectl apply -f service-client.yaml
service/svc-client created
[root@k8s-master1 service]# kubectl get svc
NAME         TYPE           CLUSTER-IP   EXTERNAL-IP                            PORT(S)   AGE
kubernetes   ClusterIP      10.96.0.1    <none>                                 443/TCP   42d
svc-client   ExternalName   <none>       nginx-svc.nginx-ns.svc.cluster.local   80/TCP    9s
创建nginx-ns命名空间下的nginx服务
[root@k8s-master1 service]# kubectl create ns nginx-ns
namespace/nginx-ns created
[root@k8s-master1 service]# vim deploy-nginx.yaml
[root@k8s-master1 service]# cat deploy-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: nginx-ns
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
[root@k8s-master1 service]# kubectl apply -f deploy-nginx.yaml
deployment.apps/nginx created
[root@k8s-master1 service]# kubectl get pods -n nginx-ns
NAME                     READY   STATUS    RESTARTS   AGE
nginx-6f857fcc59-4s75h   1/1     Running   0          30s
nginx-6f857fcc59-7gkm8   1/1     Running   0          30s
[root@k8s-master1 service]# vim service-nginx.yaml
[root@k8s-master1 service]# cat service-nginx.yaml
kind: Service
apiVersion: v1
metadata:
  name: nginx-svc
  namespace: nginx-ns
  labels:
    app: nginx
spec:
  type: ClusterIP
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  selector:
    app: nginx
[root@k8s-master1 service]# kubectl apply -f service-nginx.yaml
service/nginx-svc created
[root@k8s-master1 service]# kubectl get svc -n nginx-ns
NAME        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
nginx-svc   ClusterIP   10.111.96.90   <none>        80/TCP    5s
登录到client pod中,既可以直接通过nginx-svc.nginx-ns.svc.cluster.local访问nginx服务,也可以通过svc-client.default.svc.cluster.local访问:DNS会把后者解析成前者的CNAME记录。
[root@k8s-master1 service]# kubectl exec -it client-df684dd68-86lq4 -- /bin/sh
/ # wget -q -O - nginx-svc.nginx-ns.svc.cluster.local
<!DOCTYPE html>
...(输出为nginx默认欢迎页,略)
/ # wget -q -O - svc-client.default.svc.cluster.local
<!DOCTYPE html>
...(输出为nginx默认欢迎页,略)
/ # nslookup svc-client.default.svc.cluster.local
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name:      svc-client.default.svc.cluster.local
Address 1: 10.111.96.90 nginx-svc.nginx-ns.svc.cluster.local
由于ExternalName类型的Service资源实现于DNS级别,客户端将直接接入外部服务而完全不需要服务代理,因此它也无须配置ClusterIP。这一点与headless service(没有虚拟IP的Service)类似,但严格来说,headless service特指把clusterIP显式设置为None的Service。
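下面是一个headless service的示意清单(名称与标签均为假设),DNS解析它时会直接返回各后端pod的地址而非虚拟IP:

apiVersion: v1
kind: Service
metadata:
  name: demo-headless    # 假设的service名称
spec:
  clusterIP: None        # 不分配虚拟IP,DNS直接返回匹配pod的IP列表
  selector:
    app: demo            # 假设的pod标签
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80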
4. 最佳实践
自定义endpoint映射外部服务的案例分享:k8s集群引用外部的mysql数据库
1)在k8s-node1节点上安装mariadb服务,并启动服务
[root@k8s-node1 ~]# yum install mariadb-server
[root@k8s-node1 ~]# systemctl status mariadb
● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[root@k8s-node1 ~]# systemctl start mariadb
[root@k8s-node1 ~]# systemctl status mariadb
● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2022-09-12 22:45:15 CST; 1s ago
  Process: 116424 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=0/SUCCESS)
  Process: 116321 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
 Main PID: 116423 (mysqld_safe)
   Memory: 102.0M
   CGroup: /system.slice/mariadb.service
           ├─116423 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─116587 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mariadb/mariadb.log ...
Sep 12 22:45:13 k8s-node1 mysqld_safe[116423]: 220912 22:45:13 mysqld_safe Logging to '/var/log/mariadb/mariadb.log'.
Sep 12 22:45:13 k8s-node1 mysqld_safe[116423]: 220912 22:45:13 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Sep 12 22:45:15 k8s-node1 systemd[1]: Started MariaDB database server.
[root@k8s-node1 ~]# netstat -lntup | grep 3306
tcp        0      0 0.0.0.0:3306        0.0.0.0:*        LISTEN      116587/mysqld
2)创建mysql_service资源对象
[root@k8s-master1 service]# vim mysql_service.yaml
[root@k8s-master1 service]# cat mysql_service.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  type: ClusterIP
  ports:
  - port: 3306
[root@k8s-master1 service]# kubectl apply -f mysql_service.yaml
service/mysql created
[root@k8s-master1 service]# kubectl get svc -o wide
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE   SELECTOR
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP    42d   <none>
mysql        ClusterIP   10.108.32.167   <none>        3306/TCP   8s    <none>
[root@k8s-master1 service]# kubectl describe svc mysql
Name:              mysql
Namespace:         default
Labels:            <none>
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Families:       <none>
IP:                10.108.32.167
IPs:               10.108.32.167
Port:              <unset>  3306/TCP
TargetPort:        3306/TCP
Endpoints:         <none>          #无endpoint
Session Affinity:  None
Events:            <none>
3)自定义endpoint对象
[root@k8s-master1 service]# vim mysql_endpoint.yaml
[root@k8s-master1 service]# cat mysql_endpoint.yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: mysql
subsets:
- addresses:
  - ip: 10.0.0.132
  ports:
  - port: 3306
[root@k8s-master1 service]# kubectl apply -f mysql_endpoint.yaml
endpoints/mysql created
[root@k8s-master1 service]# kubectl describe svc mysql
Name:              mysql
Namespace:         default
Labels:            <none>
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Families:       <none>
IP:                10.108.32.167
IPs:               10.108.32.167
Port:              <unset>  3306/TCP
TargetPort:        3306/TCP
Endpoints:         10.0.0.132:3306   #定义的外部mysql服务
Session Affinity:  None
Events:            <none>
4)登录数据库
上面的配置就是将外部的IP地址和服务引入到k8s集群内部,由service作为代理来达到访问外部服务的目的。例如,通过service的ip地址就可以登录到数据库:
[root@k8s-node1 ~]# mysql -h10.108.32.167
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 5.5.68-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]>
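集群内的应用同样无需关心外部数据库的真实地址,直接使用服务名mysql.default.svc.cluster.local即可。下面是一个把该地址注入容器环境变量的示意片段(镜像名与变量名均为假设):

spec:
  containers:
  - name: app
    image: myapp:v1                              # 假设的应用镜像
    env:
    - name: DB_HOST
      value: mysql.default.svc.cluster.local     # 通过service名访问外部mysql
    - name: DB_PORT
      value: "3306"

之后若外部数据库的地址发生变化,只需更新Endpoints对象,应用侧无需任何改动。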
四、虚拟ip和服务代理
一个Service对象就是工作节点上的一些iptables或ipvs规则,用于将到达Service对象的IP地址的流量调度转发至相应的endpoint对象指向的ip地址和端口之上。工作于每个工作节点的kube-proxy组件将通过API Server持续监控着各Service及与其关联的pod对象,并将其创建或变动实时反映至当前工作节点上相应的iptables或ipvs规则上。
Service IP事实上只是生成iptables或ipvs规则时使用的IP地址,它仅用于实现kubernetes集群网络的内部通信,并且仅能对规则中定义的目标地址和端口的服务请求予以响应。
1. kube-proxy组件介绍
Service只是把应用对外提供服务的方式做了抽象,真正的应用跑在Pod中的container里。客户端的请求到达kubernetes node对应的nodePort之后,是如何进一步转到提供后台服务的Pod上的呢?这就是通过kube-proxy实现的:
kube-proxy部署在Kubernetes的每一个Node工作节点上,是Kubernetes的核心组件。创建一个service的时候,kube-proxy会在iptables中追加一些规则,实现路由与负载均衡的功能。在k8s 1.8之前,kube-proxy默认使用iptables模式,通过各node节点上的iptables规则实现service的负载均衡,但是随着service数量的增大,iptables模式由于线性查找匹配、全量更新等特点,其性能会显著下降。从k8s 1.8版本开始,kube-proxy引入了IPVS模式。IPVS与iptables同样基于Netfilter,但采用哈希表存储规则,因此当service数量达到一定规模时,哈希查表的速度优势就会显现出来,从而提高service的服务性能。
简单来说,Service是一组pod的服务抽象,相当于一组pod的LB,负责将请求分发给对应的pod。Service会为这个LB提供一个IP,一般称为Cluster IP。kube-proxy的作用主要是负责service的实现,具体来说,就是实现了内部从pod到service和外部的从node port向service的访问。
1)kube-proxy其实就是管理service的访问入口,包括集群内Pod到Service的访问和集群外访问service。
2)kube-proxy管理service的Endpoints。service对外暴露一个Virtual IP(也称为Cluster IP),集群内通过访问这个Cluster IP:Port,就能访问到集群内对应的service下的Pod。
2. kube-proxy三种工作模式
kube-proxy将请求代理至相应端点的方式有三种:userspace(用户空间)、iptables和ipvs
1) Userspace方式:
Client请求要访问Server Pod时,先将请求发给内核空间中的service iptables规则,再由后者经由监听的指定套接字送往用户空间的kube-proxy;kube-proxy处理请求并选定后端Server Pod后,再将请求转发给内核空间中的service ip,由service iptables规则将请求调度至后端pod。
这个模式有很大的问题:客户端请求先进入内核空间,又进入用户空间访问kube-proxy,由kube-proxy封装完成后再回到内核空间的iptables,再根据iptables规则分发给各节点用户空间的pod。由于需要在用户空间和内核空间之间来回交互通信,因此效率很差。在Kubernetes 1.1版本之前,userspace是默认的代理模型。
2) iptables方式
iptables代理模型中,kube-proxy负责跟踪API Server上Service和Endpoints对象的变动,并据此更新节点上的规则。对于每个Service,它都会自动创建iptables规则,直接捕捉到达ClusterIP和Port的流量,并将其重定向至当前Service的后端,如下图所示。对于每个Endpoints对象,kube-proxy都会创建iptables规则并关联至挑选的后端pod资源,默认算法是随机调度。iptables代理模式由kubernetes 1.1版本引入,并自1.2版本开始成为默认的类型。
在创建Service资源时,集群中每个节点上的kube-proxy都会收到通知并将其定义为当前节点上的iptables规则,用于转发工作节点接收到的与此service资源的ClusterIP和端口的相关流量。客户端发来的请求被相关的iptables规则进行调度和目标地址转换(DNAT)后再转发至集群内的pod对象之上。
相对于用户空间模型来说,iptables模型无需将流量在用户空间和内核空间来回切换,因而更加高效和可靠。不过,其缺点是iptables代理模型不会在被挑中的后端pod资源无响应时自动进行重定向,而userspace模型则可以。
3)ipvs代理模型
Kubernetes自1.9-alpha版本引入了ipvs代理模式,并在1.11版本中正式可用(GA)。客户端请求到达内核空间时,根据ipvs规则直接分发到各pod上。kube-proxy会监视Kubernetes Service对象和Endpoints对象,调用netlink接口以相应地创建ipvs规则,并定期与Service对象和Endpoints对象同步ipvs规则,以确保ipvs状态与期望一致。访问服务时,流量将被重定向到其中一个后端Pod。
与iptables类似,ipvs基于netfilter的hook功能,但使用哈希表作为底层数据结构并在内核空间中工作。这意味着ipvs可以更快地重定向流量,并且在同步代理规则时具有更好的性能。此外,ipvs为负载均衡算法提供了更多选项,例如:rr(轮询调度)、lc(最小连接数)、dh(目标哈希)、sh(源哈希)、sed(最短期望延迟)、nq(不排队调度)等。
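如果要启用ipvs模式并指定调度算法,可以修改kube-proxy的配置(KubeProxyConfiguration)。下面是一个示意片段(假设集群由kubeadm部署,该配置位于kube-system名称空间的kube-proxy ConfigMap中,且节点已加载ip_vs相关内核模块):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"           # 留空或设为"iptables"则使用iptables模式
ipvs:
  scheduler: "rr"      # 轮询;也可设为lc、dh、sh、sed、nq等

修改后需要重建kube-proxy pod才会生效,之后可用ipvsadm -Ln查看生成的ipvs规则。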
总之,如果某个服务的后端pod发生变化,例如标签选择器匹配的pod又多了一个,这一信息会立即反映到apiserver上,kube-proxy则通过watch机制感知到etcd中的信息变化,并立即将其转换为ipvs或iptables中的规则,这一切都是动态和实时的;删除一个pod也是同样的原理。如下图:
注:以上无论哪种模式,kube-proxy都通过watch方式监控着apiserver写入etcd中的关于Pod的最新状态信息,一旦检测到某个Pod资源被删除或新建,就会立即把这些变化反映在iptables或ipvs规则中,以便iptables和ipvs在调度Client Pod请求到Server Pod时,不会出现Server Pod不存在的情况。自k8s 1.11以后,若kube-proxy配置为ipvs模式,service即使用ipvs规则;若ipvs没有被激活(相关内核模块不可用),则降级使用iptables规则。
3. kube-proxy生成的iptables规则分析
1)service的type类型是ClusterIp,iptables规则分析
在k8s中创建的service虽然有ip地址,但service的ip是虚拟的,并不配置在任何物理接口上,而是存在于iptables或ipvs规则里。
[root@k8s-master1 service]# kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   42d
my-nginx     ClusterIP   10.99.159.169   <none>        80/TCP    6s
[root@k8s-master1 service]# iptables -t nat -L | grep 10.99.159.169
KUBE-MARK-MASQ             tcp  -- !10.244.0.0/16  10.99.159.169  /* default/my-nginx cluster IP */ tcp dpt:http
KUBE-SVC-L65ENXXZWWSAPRCR  tcp  --  anywhere       10.99.159.169  /* default/my-nginx cluster IP */ tcp dpt:http
[root@k8s-master1 service]# iptables -t nat -L | grep KUBE-SVC-L65ENXXZWWSAPRCR
KUBE-SVC-L65ENXXZWWSAPRCR  tcp  --  anywhere       10.99.159.169  /* default/my-nginx cluster IP */ tcp dpt:http
Chain KUBE-SVC-L65ENXXZWWSAPRCR (1 references)
[root@k8s-master1 service]# kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP              NODE        NOMINATED NODE   READINESS GATES
my-nginx-6f6bcdf657-d2vr5   1/1     Running   0          75s   10.244.36.104   k8s-node1   <none>           <none>
my-nginx-6f6bcdf657-vxntd   1/1     Running   0          75s   10.244.36.105   k8s-node1   <none>           <none>
[root@k8s-master1 service]# iptables -t nat -L | grep 10.244.36.104
KUBE-MARK-MASQ  all  --  10.244.36.104  anywhere  /* default/my-nginx */
DNAT            tcp  --  anywhere       anywhere  /* default/my-nginx */ tcp to:10.244.36.104:80
[root@k8s-master1 service]# iptables -t nat -L | grep 10.244.36.105
KUBE-MARK-MASQ  all  --  10.244.36.105  anywhere  /* default/my-nginx */
DNAT            tcp  --  anywhere       anywhere  /* default/my-nginx */ tcp to:10.244.36.105:80
通过上面可以看到,之前创建的service会由kube-proxy在iptables中生成一系列规则来实现流量路由:有一系列目标为KUBE-SVC-xxx链的规则,每条规则都会匹配某个目标ip与端口。也就是说,访问某个ip:port的请求会由KUBE-SVC-xxx链来处理,这个目标ip其实就是service ip。
2)service的type类型是nodePort,iptables规则分析
[root@k8s-master1 service]# kubectl get svc -l run=my-nignx-nodeport
NAME                TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
my-nginx-nodeport   NodePort   10.96.123.30   <none>        80:32223/TCP   2m55s
[root@k8s-master1 service]# iptables -t nat -S | grep 32223
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/my-nginx-nodeport" -m tcp --dport 32223 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/my-nginx-nodeport" -m tcp --dport 32223 -j KUBE-SVC-J5QV2XWG4FEBPH3Q
[root@k8s-master1 service]# iptables -t nat -S | grep KUBE-SVC-J5QV2XWG4FEBPH3Q
-N KUBE-SVC-J5QV2XWG4FEBPH3Q
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/my-nginx-nodeport" -m tcp --dport 32223 -j KUBE-SVC-J5QV2XWG4FEBPH3Q
-A KUBE-SERVICES -d 10.96.123.30/32 -p tcp -m comment --comment "default/my-nginx-nodeport cluster IP" -m tcp --dport 80 -j KUBE-SVC-J5QV2XWG4FEBPH3Q
-A KUBE-SVC-J5QV2XWG4FEBPH3Q -m comment --comment "default/my-nginx-nodeport" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-SUXXFNQM72XX3L5W
-A KUBE-SVC-J5QV2XWG4FEBPH3Q -m comment --comment "default/my-nginx-nodeport" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-7X7MTBS73UHMAH2M
-A KUBE-SVC-J5QV2XWG4FEBPH3Q -m comment --comment "default/my-nginx-nodeport" -j KUBE-SEP-KD6XOKD6ZB2PDMFB
[root@k8s-master1 service]# iptables -t nat -S | grep KUBE-SEP-SUXXFNQM72XX3L5W
-N KUBE-SEP-SUXXFNQM72XX3L5W
-A KUBE-SEP-SUXXFNQM72XX3L5W -s 10.244.36.106/32 -m comment --comment "default/my-nginx-nodeport" -j KUBE-MARK-MASQ
-A KUBE-SEP-SUXXFNQM72XX3L5W -p tcp -m comment --comment "default/my-nginx-nodeport" -m tcp -j DNAT --to-destination 10.244.36.106:80
-A KUBE-SVC-J5QV2XWG4FEBPH3Q -m comment --comment "default/my-nginx-nodeport" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-SUXXFNQM72XX3L5W
[root@k8s-master1 service]# iptables -t nat -S | grep KUBE-SEP-7X7MTBS73UHMAH2M
-N KUBE-SEP-7X7MTBS73UHMAH2M
-A KUBE-SEP-7X7MTBS73UHMAH2M -s 10.244.36.107/32 -m comment --comment "default/my-nginx-nodeport" -j KUBE-MARK-MASQ
-A KUBE-SEP-7X7MTBS73UHMAH2M -p tcp -m comment --comment "default/my-nginx-nodeport" -m tcp -j DNAT --to-destination 10.244.36.107:80
-A KUBE-SVC-J5QV2XWG4FEBPH3Q -m comment --comment "default/my-nginx-nodeport" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-7X7MTBS73UHMAH2M
[root@k8s-master1 service]# iptables -t nat -S | grep KUBE-SEP-KD6XOKD6ZB2PDMFB
-N KUBE-SEP-KD6XOKD6ZB2PDMFB
-A KUBE-SEP-KD6XOKD6ZB2PDMFB -s 10.244.36.108/32 -m comment --comment "default/my-nginx-nodeport" -j KUBE-MARK-MASQ
-A KUBE-SEP-KD6XOKD6ZB2PDMFB -p tcp -m comment --comment "default/my-nginx-nodeport" -m tcp -j DNAT --to-destination 10.244.36.108:80
-A KUBE-SVC-J5QV2XWG4FEBPH3Q -m comment --comment "default/my-nginx-nodeport" -j KUBE-SEP-KD6XOKD6ZB2PDMFB
与pod资源对应的ip地址一致:
[root@k8s-master1 service]# kubectl get pods --show-labels -o wide
NAME                                 READY   STATUS    RESTARTS   AGE    IP              NODE        NOMINATED NODE   READINESS GATES   LABELS
my-nginx-nodeport-689f6d5464-bqs86   1/1     Running   0          5m8s   10.244.36.106   k8s-node1   <none>           <none>            pod-template-hash=689f6d5464,run=my-nginx-nodeport
my-nginx-nodeport-689f6d5464-ld4gw   1/1     Running   0          5m8s   10.244.36.108   k8s-node1   <none>           <none>            pod-template-hash=689f6d5464,run=my-nginx-nodeport
my-nginx-nodeport-689f6d5464-nc6v9   1/1     Running   0          5m8s   10.244.36.107   k8s-node1   <none>           <none>            pod-template-hash=689f6d5464,run=my-nginx-nodeport