k8s metrics-server
一.
1.报错:
在使用kubectl get nodes时会报错,Metrics-Server API is not found
2.排查:
查看kubectl get pods -n kube-system | grep metrics ,pod是在的
查看kubectl get apiservice -n kube-system | grep metrics ,apiservice中没有metrics注册
查看pod日志 kubectl logs --tail=3000 -f podname -n kube-system 发现只有两行日志没有进行工作
可以得出结论pod没有注册到metrics
3.解决方法:
重装apiservice
Metrics Server Metrics API | group/version | 支持k8s版本 |
0.6.x | metrics.k8s.io/v1beta1 | 1.19+ |
0.5.x | metrics.k8s.io/v1beta1 | *1.8+ |
0.4.x metrics.k8s. | metrics.k8s.io/v1beta1 | *1.8+ |
0.3.x | metrics.k8s.io/v1beta1 | 1.8-1.21 |
本集群的k8s版本是1.20.4,之前装的是0.3.6版本,这个版本在github仓库已查不到,此次装0.6.2版本
下载地址:https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.2/components.yaml
主要 改两个地方
134行添加参数 “--kubelet-insecure-tls” 用以禁用证书验证
140行更改镜像地址为阿里云:“registry.cn-hangzhou.aliyuncs.com/chenby/metrics-server:v0.6.2”
apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: "true" rbac.authorization.k8s.io/aggregate-to-edit: "true" rbac.authorization.k8s.io/aggregate-to-view: "true" name: system:aggregated-metrics-reader rules: - apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server name: system:metrics-server rules: - apiGroups: - "" resources: - nodes/metrics verbs: - get - apiGroups: - "" resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-reader subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server:system:auth-delegator roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:auth-delegator subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: system:metrics-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:metrics-server subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: v1 kind: Service metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: ports: - name: https port: 443 protocol: TCP targetPort: https selector: k8s-app: metrics-server --- apiVersion: apps/v1 kind: Deployment metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 0 template: metadata: labels: k8s-app: metrics-server spec: containers: - args: - --kubelet-insecure-tls - --cert-dir=/tmp - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port - --metric-resolution=15s image: registry.cn-hangzhou.aliyuncs.com/chenby/metrics-server:v0.6.2 #image: k8s.gcr.io/metrics-server/metrics-server:v0.6.2 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 4443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS initialDelaySeconds: 20 periodSeconds: 10 resources: requests: cpu: 100m memory: 200Mi securityContext: allowPrivilegeEscalation: false readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server volumes: - emptyDir: {} name: tmp-dir --- apiVersion: apiregistration.k8s.io/v1 kind: APIService metadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.io spec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 100
执行kubectl apply -f components.yaml,在metrics-server部署成功后,查看kubectl top nodes,可以显示资源使用情况了。
二.
1.报错:
在使用kubectl get nodes时会报错,Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
2.排查:
查看kubectl get pods -n kube-system | grep metrics ,pod是在的
查看kubectl get apiservice -n kube-system | grep metrics ,metrics已经注册到apiservice中
查看pod日志 kubectl logs --tail=3000 -f podname -n kube-system 发现日志正常输出
3.解决方法:
尝试使用重装metrics-server进行问题解决,解决方法同上,重装后查看kubectl top nodes,可以显示资源使用情况了。
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· 单线程的Redis速度为什么快?