kube-prometheus (prometheus-operator) Monitoring (Part 2): Fixing Missing kube-controller-manager Data

Problem description

After the monitoring stack was set up, kube-controller-manager and kube-scheduler showed no data in Prometheus.

Background

Kubernetes version: 1.20.1
kube-prometheus version: 0.7
Version notes: https://github.com/prometheus-operator/kube-prometheus/tree/release-0.7#apply-the-kube-prometheus-stack

Troubleshooting

Check the kube-controller-manager and kube-scheduler service ports

Run on k8s-master01:
# netstat -nltp|grep sche
# netstat -nltp|grep contr


The output shows that both ports are already listening on 0.0.0.0, that is, on all interfaces.
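
When no screenshot is at hand, the bind address can be pulled out of the netstat output directly. The following is a minimal sketch and not part of the original procedure: the helper name `listen_addrs` is hypothetical, and it assumes the usual `netstat -nltp` column layout (local address in column 4, state in column 6, PID/program in column 7).

```shell
# listen_addrs PATTERN
# Reads `netstat -nltp` output on stdin and prints the local address of
# every LISTEN socket whose PID/program column matches PATTERN.
# Hypothetical helper; assumes the standard netstat column layout.
listen_addrs() {
  awk -v pat="$1" '$6 == "LISTEN" && $7 ~ pat { print $4 }'
}

# Example:
#   netstat -nltp | listen_addrs sche
#   netstat -nltp | listen_addrs contr
```

An address of the form 127.0.0.1:PORT would mean the component only accepts local connections, so Prometheus could not reach it from another node. 0.0.0.0:PORT or :::PORT means it listens on all interfaces.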

Check the ServiceMonitor labels

The ServiceMonitor for kube-controller-manager is defined in kube-prometheus/manifests/prometheus-serviceMonitorKubeControllerManager.yaml
The ServiceMonitor for kube-scheduler is defined in kube-prometheus/manifests/prometheus-serviceMonitorKubeScheduler.yaml

# cd kube-prometheus/manifests/
# grep "selector" prometheus-serviceMonitorKubeControllerManager.yaml -A 3
# grep "selector" prometheus-serviceMonitorKubeScheduler.yaml -A 3


Alternatively, inspect the live objects with kubectl:

kubectl get servicemonitors.monitoring.coreos.com -n monitoring kube-controller-manager -oyaml
kubectl get servicemonitors.monitoring.coreos.com -n monitoring kube-scheduler -oyaml

The output shows that the kube-controller-manager ServiceMonitor selects the label k8s-app: kube-controller-manager
In the same way, the kube-scheduler ServiceMonitor selects the label k8s-app: kube-scheduler
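
To read the selector without scanning the full YAML, kubectl's `-o jsonpath` output can print just that field. A small sketch: the function name `sm_selector` is hypothetical, and the `monitoring` namespace is the one used in the commands above.

```shell
# sm_selector NAME
# Prints the matchLabels selector of the named ServiceMonitor in the
# monitoring namespace. Hypothetical wrapper around kubectl.
sm_selector() {
  kubectl get servicemonitors.monitoring.coreos.com -n monitoring "$1" \
    -o jsonpath='{.spec.selector.matchLabels}'
}

# Example:
#   sm_selector kube-controller-manager
#   sm_selector kube-scheduler
```

The exact formatting of the printed map may differ between kubectl versions, but the label key and value are what need to match the Service.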

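Check the Services for kube-controller-manager and kube-scheduler

Note: in this cluster kube-controller-manager and kube-scheduler are installed from binaries, so their Services and Endpoints do not exist yet and have to be created by hand. The labels must match the ones the ServiceMonitors select.

Add Services for kube-controller-manager and kube-scheduler

File prometheus-kubeControllerManagerService.yaml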

apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
spec:
  type: ClusterIP
  sessionAffinity: None
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
    targetPort: 10252
---
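# For a binary deployment you also have to create matching Endpoints objects to attach the two components to the cluster, then expose them through a Service, before Prometheus can scrape them.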
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.10.246.36
  - ip: 10.10.246.37
  - ip: 10.10.246.38
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP

File prometheus-kubeSchedulerService.yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
spec:
  type: ClusterIP
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
    targetPort: 10251
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.10.246.36
  - ip: 10.10.246.37
  - ip: 10.10.246.38
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP

Create them:

kubectl create -f prometheus-kubeControllerManagerService.yaml
kubectl create -f prometheus-kubeSchedulerService.yaml

Check that the Service labels match the ServiceMonitor labels

# kubectl get svc -n kube-system -l k8s-app=kube-controller-manager
# kubectl get svc -n kube-system -l k8s-app=kube-scheduler
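
Matching labels only show that the ServiceMonitor can find the Service. Because the Endpoints here were written by hand, it can also help to confirm that each listed address really answers on the metrics port. A rough sketch, with `probe_metrics` as a hypothetical helper name and the IPs and ports taken from the manifests above; plain HTTP is assumed.

```shell
# probe_metrics PORT IP...
# Requests http://IP:PORT/metrics from each address and prints the HTTP
# status code next to it. Hypothetical helper; plain HTTP is assumed,
# matching the http-metrics ports defined in the manifests.
probe_metrics() {
  port=$1
  shift
  for ip in "$@"; do
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://$ip:$port/metrics")
    printf '%s:%s %s\n' "$ip" "$port" "$code"
  done
}

# Example:
#   kubectl get endpoints -n kube-system kube-controller-manager kube-scheduler
#   probe_metrics 10252 10.10.246.36 10.10.246.37 10.10.246.38
#   probe_metrics 10251 10.10.246.36 10.10.246.37 10.10.246.38
```

A status of 200 for every address suggests the endpoints themselves are fine and that any remaining problem lies in how Prometheus is told to scrape them.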

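Once they are created, wait a short while and check on the Prometheus Targets page whether kube-scheduler metrics are now being scraped.
In our case, however, the problem described above was still not resolved.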

Reconfigure the ServiceMonitors

Comparing the two sides, we found that the ServiceMonitor resources in v0.7.0 scrape over HTTPS.
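
One way to see which port name and scheme a deployed ServiceMonitor uses is a jsonpath query over its endpoints. This is only a sketch: `sm_endpoints` is a hypothetical helper name, and it has to be run while the original ServiceMonitors still exist.

```shell
# sm_endpoints NAME
# Prints "<port name><TAB><scheme>" for each endpoint of the named
# ServiceMonitor in the monitoring namespace. Hypothetical wrapper.
sm_endpoints() {
  kubectl get servicemonitors.monitoring.coreos.com -n monitoring "$1" \
    -o jsonpath='{range .spec.endpoints[*]}{.port}{"\t"}{.scheme}{"\n"}{end}'
}

# Example:
#   sm_endpoints kube-controller-manager
#   sm_endpoints kube-scheduler
```

An empty scheme column means the field is unset, in which case Prometheus scrapes over HTTP.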

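Our binary deployment serves metrics over plain HTTP, so I created two resource definitions in the style of the older release.
First delete the original ones: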

cd kube-prometheus/manifests
kubectl delete -f prometheus-serviceMonitorKubeControllerManager.yaml
kubectl delete -f prometheus-serviceMonitorKubeScheduler.yaml

File prometheus-serviceMonitorKubeControllerManager.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    metricRelabelings:
    - action: drop
      regex: etcd_(debugging|disk|request|server).*
      sourceLabels:
      - __name__
    port: http-metrics
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-controller-manager

File prometheus-serviceMonitorKubeScheduler.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s  # scrape every 30s
    port: http-metrics  # name of the port on the Service
  jobLabel: k8s-app
  namespaceSelector:  # namespaces to search for Services; use any: true to match all namespaces
    matchNames:
    - kube-system
  selector:  # labels of the Service to select; every entry under matchLabels must match, and any matchExpressions requirements are ANDed in as well
    matchLabels:
      k8s-app: kube-scheduler

Apply the modified configuration:

kubectl create -f prometheus-serviceMonitorKubeControllerManager.yaml
kubectl create -f prometheus-serviceMonitorKubeScheduler.yaml

Verify:
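
Besides the Targets page, the result can be checked from the command line through the Prometheus HTTP API. This is a sketch under assumptions not stated in the original: Prometheus is reachable at the base URL you pass in (for example through a port-forward to the `prometheus-k8s` Service that kube-prometheus creates in `monitoring`), and the job names equal the `k8s-app` label values because of `jobLabel: k8s-app`. The helper name `up_query` is hypothetical.

```shell
# up_query BASE_URL JOB
# Asks Prometheus for the `up` series of one job via /api/v1/query.
# Hypothetical helper; BASE_URL is wherever Prometheus is reachable.
up_query() {
  curl -sG "$1/api/v1/query" --data-urlencode "query=up{job=\"$2\"}"
}

# Example (assumes a port-forward is running in another terminal):
#   kubectl -n monitoring port-forward svc/prometheus-k8s 9090
#   up_query http://localhost:9090 kube-controller-manager
#   up_query http://localhost:9090 kube-scheduler
```

A value of 1 for each instance means the last scrape succeeded. An empty result means no target with that job name has been discovered yet.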

Posted 2022-11-24 12:27 by 邹姣姣