52、K8S-Pod弹性伸缩-HPA-HorizontalPodAutoscaler、metrics-server
1、基础知识
1.1、监控指标
在k8s的系统上包含了各种各样的指标数据,早期的k8s系统,为kubelet集成了一个CAdvsior工具可以获取kubelet所在节点上的相关指标,包括容器指标。但是CAdvsior的缺陷在于,我们仅能够获取,指定
节点上的指标信息,而无法获取集群管理的统一指标。
比如,在k8s集群中,提供了一些监控用的命令,比如top,通过它可以汇总节点上的相关统计信息。由于没有默认情况下,k8s没有提供专用的metrics接口,所以这个命令无法正常使用。
~]# kubectl top node
error: Metrics API not available
虽然k8s已经内嵌了CAdvsior,但是没有集群级别的资源对象能够汇总这些所有的指标数据,并通过相关资源对象的api接口暴露出去。
kubectl api-resources | grep metrics
1.2、指标类型
早期的k8s提供了一个专用的metrics的指标服务器heapster,用于采集所有的监控数据,只不过这个软件因为某些原因被弃用了。
其中一部分原因就是,k8s的指标分化成了两种不同的类别:
核心资源指标
- k8s系统内部默认就应用的指标
- 它通过metrics-server服务来进行采集所有kubelet的数据,进而通过/metrics接口输出数据/api/metrics.k8s.io/v1bata1该服务默认情况下,是仅仅获取指标数据,然后保存到内存中。
自定义指标
- 用户根据情况自定义的指标
- 通过专用的监控平台软件来实现,或者通过对项目接口改造,暴露相关数据
注意:
默认情况下,k8s的指标格式与prometheus的指标格式不兼容。 如果我们需要通过prometheus来监控k8s的话,需要添加适配器。
1.3、数据采集汇总
1.3.1、监控代理程序
如node_exporter,收集标准的主机指标数据,包括平均负载、CPU、Memory、Disk、Network及诸多其他维度的数据
1.3.2、kubelet
收集容器指标数据,它们也是Kubernetes“核心指标”,每个容器的相关指标数据主要有CPU利用率(user和system)及限额、文件系统读/写/限额、内存利用率及限额、网络报文发送/接收/丢弃速率等。
1.3.3、APIServer
收集API Server的性能指标数据,包括控制工作队列的性能、请求速率与延迟时长、etcd缓存工作队列及缓存性能、普通进程状态(文件描述符、内存、CPU等)、Golang状态(GC、内存和线程等)
1.3.4、etcd
收集etcd存储集群的相关指标数据,包括领导节点及领域变动速率、提交/应用/挂起/错误的提案次数、磁盘写入性能、网络与gRPC计数器等
1.3.5、kube-state-metrics
该组件用于根据Kubernetes API Server中的资源派生出多种资源指标,它们主要是资源类型相关的计数器和元数据信息,包括指定类型的对象总数、资源限额、容器状态
(ready/restart/running/terminated/waiting)以及Pod资源的标签系列等
1.4、metrics-server流程图
1.4.1、集群整体流程图
1.4.2、metrics-server与cAdvisor关系图
2、安装metrics-server
2.1、项目地址
https://github.com/kubernetes-sigs/metrics-server 当前版本:v0.6.3 主要用于获取资源的参数,不然HPA无法使用
2.2、下载yaml资源配置清单
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml
2.3、修改配置文件
2.3.1、准备离线镜像并且修改yaml
docker pull registry.k8s.io/metrics-server/metrics-server:v0.6.3 docker tag registry.k8s.io/metrics-server/metrics-server:v0.6.3 192.168.10.33:80/k8s/hpa/metrics-server:v0.6.3 docker push 192.168.10.33:80/k8s/hpa/metrics-server:v0.6.3 ]# sed -i 's#registry.k8s.io/metrics-server/metrics-server:v0.6.3#192.168.10.33:80/k8s/hpa/metrics-server:v0.6.3#g' components.yaml ]# grep 'image' components.yaml image: 192.168.10.33:80/k8s/hpa/metrics-server:v0.6.3 imagePullPolicy: IfNotPresent
2.3.2、修改启动命令
]# vi components.yaml - args: - --cert-dir=/tmp - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port - --metric-resolution=15s - --kubelet-insecure-tls # 增加此行参数 image: 192.168.10.33:80/k8s/hpa/metrics-server:v0.6.3 imagePullPolicy: IfNotPresent
2.4、应用资源配置清单
]# kubectl apply -f components.yaml serviceaccount/metrics-server created clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created clusterrole.rbac.authorization.k8s.io/system:metrics-server created rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created service/metrics-server created deployment.apps/metrics-server created apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
2.5、查询运行状态
~]# kubectl api-resources | egrep "metr|NAME" NAME SHORTNAMES APIVERSION NAMESPACED KIND nodes metrics.k8s.io/v1beta1 false NodeMetrics pods metrics.k8s.io/v1beta1 true PodMetrics ]# kubectl -n kube-system get pods -o wide | grep metrics metrics-server-5b7dc79d77-6cfzv 1/1 Running 0 4m29s 10.244.4.124 node2 <none> <none> ]# kubectl -n kube-system get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 7d13h metrics-server ClusterIP 10.97.76.234 <none> 443/TCP 86m
2.6、验证kubectl top是否生效
2.6.1、node
~]# kubectl top node NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% master1 168m 8% 1694Mi 62% node1 84m 2% 1019Mi 27% node2 98m 3% 1075Mi 29%
2.6.2、pod
~]# kubectl -n kube-system top pod NAME CPU(cores) MEMORY(bytes) calico-kube-controllers-74846594dd-lf7p5 1m 28Mi calico-node-brxqj 53m 132Mi calico-node-ckxjg 49m 137Mi calico-node-hgk89 39m 90Mi coredns-c676cc86f-7k94l 3m 15Mi coredns-c676cc86f-qbdpw 3m 20Mi etcd-master1 28m 231Mi kube-apiserver-master1 59m 428Mi kube-controller-manager-master1 23m 60Mi kube-proxy-7zxhp 1m 18Mi kube-proxy-959kc 1m 30Mi kube-proxy-9sv7f 1m 18Mi kube-scheduler-master1 4m 23Mi metrics-server-5b7dc79d77-s24rc 6m 22Mi
3、HPA-实践
3.1、基本知识
3.1.1、HPA工作机制
3.1.2、前提需知
1、报错信息 ]# kubectl describe hpa scalar Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedGetResourceMetric 12s (x2 over 27s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io) Warning FailedComputeMetricsReplicas 12s (x2 over 27s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu resource metric value: failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not
find the requested resource (get pods.metrics.k8s.io) 2、报错信息,TARGETS显示unknown ]# kubectl get hpa NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE deployment-hpa Deployment/deployment-hpa <unknown>/50% 1 10 1 106m 以上原因:没有安装metrics-server导致的。
3.2、创建deployment资源
3.2.1、定义资源配置清单且应用
kubectl apply -f - <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deployment-hpa namespace: default spec: replicas: 1 selector: matchLabels: app: rs-test template: metadata: labels: app: rs-test spec: containers: - name: nginxpod-test image: 192.168.10.33:80/k8s/pod_test:v0.1 resources: limits: cpu: 100m memory: 200Mi requests: cpu: 100m memory: 200Mi EOF
3.2.2、查询运行情况
]# kubectl get deployments.apps NAME READY UP-TO-DATE AVAILABLE AGE deployment-hpa 1/1 1 1 114m ]# kubectl get pods NAME READY STATUS RESTARTS AGE deployment-hpa-88878d778-d8hn9 1/1 Running 0 114m
3.3、创建HPA资源
3.3.1、参考官方文档
3.3.2、v2版本语法-定义资源配置清单且应用
kubectl apply -f - <<EOF apiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: name: scale namespace: default spec: maxReplicas: 10 minReplicas: 1 metrics: - type: Resource resource: name: cpu target: type: Utilization averageUtilization: 50 scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: deployment-hpa EOF # 注意:autoscaling/v2beta2 在v1.23之后不可以用,1.23+需要使用autoscaling/v2
3.3.3、v1版本语法-定义资源配置清单且应用
kubectl apply -f - <<EOF apiVersion: autoscaling/v1 kind: HorizontalPodAutoscaler metadata: name: deployment-hpa namespace: default spec: maxReplicas: 10 minReplicas: 1 scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: deployment-hpa targetCPUUtilizationPercentage: 50 EOF
3.3.4、怎么写hpa yaml
]# kubectl autoscale deployment deployment-hpa --cpu-percent=50 --min=1 --max=10 --dry-run=client -o yaml apiVersion: autoscaling/v1 kind: HorizontalPodAutoscaler metadata: creationTimestamp: null name: deployment-hpa spec: maxReplicas: 10 minReplicas: 1 scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: deployment-hpa targetCPUUtilizationPercentage: 50 status: currentReplicas: 0 desiredReplicas: 0
3.3.5、查询运行结果
]# kubectl get hpa NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE scale Deployment/deployment-hpa 1%/50% 1 10 1 3m49s
3.4、压力测试pod
3.4.1、开启监控hpa
]# kubectl get hpa -w NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE scale Deployment/deployment-hpa 1%/50% 1 10 1 4m32s
3.4.2、进入pod进行压测
]# kubectl exec -it deployment-hpa-88878d778-d8hn9 -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
3.4.3、观察hpa状态
]# kubectl get hpa -w NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE scale Deployment/deployment-hpa 1%/50% 1 10 1 4m32s scale Deployment/deployment-hpa 50%/50% 1 10 1 6m46s scale Deployment/deployment-hpa 100%/50% 1 10 1 7m1s
# 此时cpu跑到50+
3.4.4、查询deployment状态
]# kubectl get deployments.apps NAME READY UP-TO-DATE AVAILABLE AGE deployment-hpa 2/2 2 2 140m
3.4.5、查询pod
]# kubectl get pods NAME READY STATUS RESTARTS AGE deployment-hpa-88878d778-d8hn9 1/1 Running 0 140m deployment-hpa-88878d778-s5p5f 1/1 Running 0 19s
# 发现pod已经自动创建多一个出来了
3.5、自动缩容
# CPU负载下来,隔一段时间后,会自动缩容 ]# kubectl get hpa NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE scale Deployment/deployment-hpa 1%/50% 1 10 1 13m