HPA: Automatic Elastic Scaling
Controller resources such as Deployment, ReplicaSet, ReplicationController, and StatefulSet support manually adjusting the number of Pod replicas they manage at runtime, so that capacity can better match actual business demand. Manual adjustment, however, requires users to closely monitor the resource pressure on their containerized applications and work out reasonable target values themselves, which introduces a degree of lag. Automatic scaling mechanisms address this by, for example, monitoring Pod resource utilization or request QPS and scaling elastically in response. Kubernetes provides several such auto-scaling (Auto Scaling) tools:
1)HPA
HPA, short for Horizontal Pod Autoscaling, is a tool that elastically scales the number of Pods managed by a controller object. Its basic principle is to monitor changes in the load of all Pods controlled by an RC or Deployment and decide from them whether the Pod replica count needs to be adjusted. The HPA controller is illustrated in the figure below:
2)CA
CA, short for Cluster Autoscaler, automatically scales the cluster itself, adding or removing nodes of a Kubernetes cluster deployed on GCP, AWS, or Azure.
3) VPA
VPA, short for Vertical Pod Autoscaler, scales Pods vertically: it expands or shrinks an application by adjusting the CPU and memory resource requests of its Pod objects.
4)AR
AR, short for Addon Resizer, is a simplified vertical Pod autoscaler that adjusts the resource requests of add-on components based on the number of nodes in the cluster.
Kubernetes supports automatic elastic scaling along two dimensions:
•Cluster Autoscaler: scales the Kubernetes cluster's nodes; it depends heavily on the cloud-host and resource-monitoring services provided by the IaaS vendor
•HPA (Horizontal Pod Autoscaler): scales Pod replica sets automatically; it depends on the resource metrics collected by a monitoring service
I. HPA Overview
Although the Cluster Autoscaler depends heavily on the underlying capabilities of the cloud computing environment, HPA, VPA, and AR can run independently of any IaaS or PaaS cloud. HPA is implemented as a Kubernetes API resource and a controller: based on collected resource metrics, the controller periodically adjusts the replica count of a ReplicaSet or Deployment object so that the observed average CPU utilization matches the target specified by the user.
Within a Kubernetes cluster, HPA is designed as a controller, and an HPA resource object can be created simply with the kubectl autoscale command. HPA itself implements a control loop whose period is set by the --horizontal-pod-autoscaler-sync-period option of the controller-manager (15 seconds by default). In each period, the controller-manager queries the relevant resource utilization according to the metrics specified in each HPA definition, fetching the data from the resource metrics API (for per-Pod resource metrics) or the custom metrics API (for all other metrics).
1. Resource metrics
Resource metrics are collected from the Summary API exposed by the kubelet on each node and usually cover only CPU and memory usage. For each per-Pod resource metric (such as CPU), the controller fetches the metric for every Pod targeted by the HPA from the resource metrics API. If a target utilization is set, the HPA controller computes the actual utilization; if a target raw value is set, the raw metric value is used directly. The controller then takes the mean utilization or raw value (depending on the target type) across all targeted Pods and derives a ratio used to scale the desired replica count. Note that for Pods whose containers do not define resource requests, the HPA controller cannot determine CPU utilization and will take no action for that metric.
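The ratio-based calculation described above can be sketched in a few lines of Python. This is an illustrative simplification of the documented HPA scaling rule (it ignores min/max bounds, tolerance windows, and stabilization behavior):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """Core HPA rule: scale the replica count by the ratio of the
    observed average metric value to the target value, rounding up."""
    ratio = current_metric / target_metric
    return math.ceil(current_replicas * ratio)

# e.g. 2 replicas at 48% average CPU utilization against a 20% target:
print(desired_replicas(2, 48, 20))  # → 5
```

The result is then clamped to the HPA's minReplicas/maxReplicas range before being applied.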
2. Custom metrics
Custom metrics allow users to collect application-defined metrics, such as QPS, from an external monitoring system. For each per-Pod custom metric, the HPA controller works much as it does for per-Pod resource metrics, except that it handles only raw values, not utilization.
3. How HPA obtains metrics
Once an HPA is created, it obtains the average utilization or raw value of each Pod from Heapster or from a user-defined REST client, compares it with the metric target defined in the HPA, computes the concrete number of replicas to scale by, and acts accordingly. The HPA controller can obtain metrics in two different ways:
1) Heapster: when querying Heapster directly, the HPA issues requests to Heapster through the API server's service proxy subresource, so Heapster must already be deployed in the cluster and running in the kube-system namespace.
2) REST client interface: when obtaining metrics through a REST client, the resource metrics API and its API server must be deployed in advance; where necessary, the custom metrics API and its associated API server should be deployed as well.
Applications are usually scaled in and out based on CPU or memory usage. In fact, early versions of Kubernetes supported HPA scaling only on CPU utilization, and that metric came from Heapster, the monitoring system then bundled with Kubernetes. Since Kubernetes 1.6, HPA supports scaling on multiple metrics: users can configure several metrics for an HPA through the autoscaling/v2beta1 API version. At run time, the HPA controller evaluates each metric, computes a new Pod count for each one separately, and adopts the largest result as the new scale. Resource usage metrics are now obtained through the Metrics API, and Heapster has since been deprecated.
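As a sketch of what a multi-metric HPA might look like under the autoscaling/v2beta1 API (the memory target here is illustrative, not taken from the example later in this article):

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 5
  metrics:
  # the controller evaluates each metric and adopts the largest result
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 20
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 200Mi
```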
II. Deploying Metrics Server
Resource metrics are collected through the Metrics API, which Metrics Server implements as the Resource Metrics API (metrics.k8s.io). Through this API you can query some monitoring metrics of Pods and Nodes: the Pod metrics are used by HPA, VPA, and the "kubectl top pods" command, while the Node metrics are currently used only by the kubectl top nodes command.
Metrics Server is a cluster-level aggregator of resource utilization data. It registers itself with the main API server through the Kubernetes aggregation layer (kube-aggregator), then collects metrics from each node via the kubelet's Summary API, stores them in memory, and serves them in the Metrics API format. In other words, if you want to use Kubernetes' HPA feature, you must install Metrics Server first.
Because Metrics Server stores data in memory, all data is lost on restart, and it retains only the most recently collected metrics. Users who need access to historical data must rely on a third-party monitoring system such as Prometheus.
Metrics Server is a foundational dependency of several Kubernetes core components and should therefore run in every cluster by default. Typically only one instance runs per cluster; on startup it automatically initializes connections to every node. For security reasons, it should run on a regular worker node rather than on a master host.
1. Download the deployment YAML
# Download the official metrics-server deployment YAML
[root@k8s-master1 hpa]# wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
--2022-12-10 22:40:43--  https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/92132038/... [following]
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4115 (4.0K) [application/octet-stream]
Saving to: ‘components.yaml’
100%[======================================>] 4,115  --.-K/s  in 0s
2022-12-10 22:40:45 (37.9 MB/s) - ‘components.yaml’ saved [4115/4115]
2. Modify the YAML file
metrics-server fetches monitoring data by calling the kubelet endpoint on every node. That endpoint is exposed over HTTPS, but the kubelet on Kubernetes nodes uses a self-signed certificate, so requesting it directly would fail certificate verification. The --kubelet-insecure-tls startup argument therefore needs to be added in components.yaml.
The official metrics-server images are hosted on k8s.gcr.io, which may not be directly reachable from mainland China, so a self-mirrored image is used instead: willdockerhub/metrics-server:v0.5.0.
The corresponding modifications to components.yaml are shown below:
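A minimal sketch of the relevant part of the metrics-server Deployment in components.yaml after modification; only the image and the added --kubelet-insecure-tls argument differ from the released file, and the other args shown are assumed to match the v0.5.0 defaults:

```yaml
    spec:
      containers:
      - name: metrics-server
        # mirrored image replacing k8s.gcr.io/metrics-server/metrics-server:v0.5.0
        image: willdockerhub/metrics-server:v0.5.0
        args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        # skip verification of the kubelet's self-signed certificate
        - --kubelet-insecure-tls
```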
3. Deploy metrics-server
[root@k8s-master1 hpa]# kubectl apply -f components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
4. Verify the deployment
1) Run the following command to check whether metrics-server started successfully:
[root@k8s-master1 hpa]# kubectl get pod -n kube-system -o wide| grep metrics-server
metrics-server-778d9477f6-m5j52   1/1   Running   0   66s   10.244.36.79   k8s-node1   <none>   <none>
2) Run the following command to check that a new API service appears in the apiservice list:
[root@k8s-master1 hpa]# kubectl get apiservice |grep metrics
v1beta1.metrics.k8s.io   kube-system/metrics-server   True   5m20s
3) Run the following command to query the Metrics API directly.
If the metrics.k8s.io endpoint responds normally, the installation was successful:
[root@k8s-master1 hpa]# kubectl get --raw /apis/metrics.k8s.io/v1beta1 | jq
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "nodes",
      "singularName": "",
      "namespaced": false,
      "kind": "NodeMetrics",
      "verbs": [
        "get",
        "list"
      ]
    },
    {
      "name": "pods",
      "singularName": "",
      "namespaced": true,
      "kind": "PodMetrics",
      "verbs": [
        "get",
        "list"
      ]
    }
  ]
}
4) Run the following command to check resource usage on the nodes:
[root@k8s-master1 hpa]# kubectl top nodes
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master1   891m         22%    2025Mi          52%
k8s-node1     658m         16%    1297Mi          33%
k8s-node2     228m         11%    815Mi           43%
III. The HPA Controller
An HPA is itself a kind of Kubernetes resource object. It periodically checks the relevant monitoring metrics of the target Pods managed by a Deployment and decides whether the target Pod replica count needs adjustment.
As a standard Kubernetes API resource, an HPA is managed through resource configuration manifests in the same way as any other resource. In addition, the special "kubectl autoscale" command can create an HPA controller quickly. As an example, first create a Deployment named myapp, then let an HPA controller manage its Pod replica scale automatically.
1. Create a Deployment named myapp
[root@k8s-master1 deployment]# cat myapp.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      version: v1
  template:
    metadata:
      labels:
        app: myapp
        version: v1
    spec:
      containers:
      - name: myapp
        resources:
          requests:
            memory: "256Mi"
            cpu: "50m"
          limits:
            memory: "256Mi"
            cpu: "50m"
        image: janakiramm/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
[root@k8s-master1 deployment]# kubectl apply -f myapp.yaml
deployment.apps/myapp created
[root@k8s-master1 deployment]# kubectl get pods -o wide |grep myapp
myapp-56767f6f74-nmf8h   1/1   Running   0   15s   10.244.36.77   k8s-node1   <none>   <none>
myapp-56767f6f74-tt6vh   1/1   Running   0   15s   10.244.36.78   k8s-node1   <none>   <none>
2. Create an HPA named myapp
An HPA can be created with the kubectl autoscale command. An HPA object created this way belongs to the "autoscaling/v1" API group, so it supports only CPU-utilization-based elastic scaling, with the metric data obtained from Metrics Server.
[root@k8s-master1 deployment]# kubectl autoscale deploy myapp --min=2 --max=5 --cpu-percent=20
horizontalpodautoscaler.autoscaling/myapp autoscaled
[root@k8s-master1 deployment]# kubectl get hpa
NAME    REFERENCE          TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   <unknown>/20%   2         5         0          6s
[root@k8s-master1 deployment]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   0%/20%    2         5         2          22s
This command creates an HPA bound to the myapp resource, with a minimum of 2 Pod replicas and a maximum of 5. The HPA will dynamically add or remove Pods according to the configured target CPU utilization (20%).
The following command displays the current state of the HPA controller:
[root@k8s-master1 deployment]# kubectl get hpa myapp -o yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2022-12-10T15:17:11Z","reason":"ScaleDownStabilized","message":"recent recommendations were higher than current one, applying the highest recent recommendation"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2022-12-10T15:17:11Z","reason":"ValidMetricFound","message":"the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2022-12-10T15:17:11Z","reason":"DesiredWithinRange","message":"the desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"0"}}]'
  creationTimestamp: "2022-12-10T15:16:55Z"
  managedFields:
  - apiVersion: autoscaling/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:maxReplicas: {}
        f:minReplicas: {}
        f:scaleTargetRef:
          f:apiVersion: {}
          f:kind: {}
          f:name: {}
        f:targetCPUUtilizationPercentage: {}
    manager: kubectl-autoscale
    operation: Update
    time: "2022-12-10T15:16:55Z"
  - apiVersion: autoscaling/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:autoscaling.alpha.kubernetes.io/conditions: {}
          f:autoscaling.alpha.kubernetes.io/current-metrics: {}
      f:status:
        f:currentCPUUtilizationPercentage: {}
        f:currentReplicas: {}
        f:desiredReplicas: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-12-10T15:17:11Z"
  name: myapp
  namespace: default
  resourceVersion: "1641787"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/myapp
  uid: dfc5d7d0-f5a3-405e-bf12-10de4287ac90
spec:
  maxReplicas: 5
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  targetCPUUtilizationPercentage: 20
status:
  currentCPUUtilizationPercentage: 0
  currentReplicas: 2
  desiredReplicas: 2
3. Load testing
The HPA controller tries to keep the Pods' resource usage as close as possible to the configured target. For example, under sustained stress-test requests against myapp-svc's NodePort, the CPU utilization of the Pods keeps rising until it crosses the 20% target boundary, which triggers an increase in the Pod replica count. When utilization later drops far enough that the Pod count must be reduced to bring usage back toward the target, Pod replicas are terminated.
To increase the load for testing, create a busybox Pod and loop requests against the service created above:
[root@k8s-master1 hpa]# kubectl get pods -o wide |grep myapp
myapp-56767f6f74-hqldq   1/1   Running   0   11m   10.244.169.146   k8s-node2   <none>   <none>
myapp-56767f6f74-nmf8h   1/1   Running   0   45m   10.244.36.77     k8s-node1   <none>   <none>
[root@k8s-master1 deployment]# kubectl run -i --tty load-generator --image=busybox:1.28 -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://10.244.36.77:80; done
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Sample Deployment</title>
    <style>
    body { color: #ffffff; background-color: blue; font-family: Arial, sans-serif; font-size: 14px; }
    h1 { font-size: 500%; font-weight: normal; margin-bottom: 0; }
    h2 { font-size: 200%; font-weight: normal; margin-bottom: 0; }
    </style>
</head>
<body>
<div align="center">
    <h1>Welcome to V1 of the web application</h1>
    <h2>This application will be deployed on Kubernetes.</h2>
</div>
</body>
</html>
....
The following output shows that the HPA has started working:
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   0%/20%    2         5         2          3m46s
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   48%/20%   2         5         2          3m53s
# The Deployment's Pod count grows from 2 to 4
[root@k8s-master1 hpa]# kubectl get deployment myapp
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
myapp   4/5     5            4           34m
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   46%/20%   2         5         4          4m10s
# The Deployment's Pod count grows from 4 to 5
[root@k8s-master1 hpa]# kubectl get deployment myapp
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
myapp   5/5     5            5           34m
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   46%/20%   2         5         4          4m16s
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   25%/20%   2         5         5          4m21s
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   23%/20%   2         5         5          4m45s
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   20%/20%   2         5         5          4m53s
After the replicas were added, checking the HPA again shows utilization holding at around 20% and the number of myapp Pods stable at 5.
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   20%/20%   2         5         5          5m
[root@k8s-master1 hpa]# kubectl get deployment myapp
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
myapp   5/5     5            5           35m
Likewise, stop the busybox loop to reduce the load, wait a while, and then observe the HPA and Deployment objects:
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   19%/20%   2         5         5          5m21s
[root@k8s-master1 hpa]# kubectl get deployment myapp
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
myapp   5/5     5            5           35m
[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp   Deployment/myapp   0%/20%    2         5         2          11m
[root@k8s-master1 hpa]# kubectl get deployment myapp
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
myapp   2/2     2            2           41m
As shown, the replica count has dropped from 5 back to 2.
Besides the kubectl autoscale command, of course, an HPA resource object can also be created from a YAML manifest.
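For instance, a manifest equivalent to the kubectl autoscale command used above (autoscaling/v1, CPU utilization only) would look like this, matching the spec shown earlier in the kubectl get hpa -o yaml output:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 20
```

Applying it with kubectl apply -f produces the same HPA object as the autoscale command.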