kube-prometheus (prometheus-operator) Monitoring, Part 2: Fixing Missing kube-controller-manager Data
Problem
After the monitoring stack was set up, kube-controller-manager and kube-scheduler reported no data.
Background
Kubernetes version: 1.20.1
kube-prometheus version: 0.7
Version compatibility notes: https://github.com/prometheus-operator/kube-prometheus/tree/release-0.7#apply-the-kube-prometheus-stack
Troubleshooting
Check the kube-controller-manager and kube-scheduler service ports
Run on k8s-master01:
# netstat -nltp|grep sche
# netstat -nltp|grep contr
The output shows that both ports are already listening on 0.0.0.0.
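The bind address matters because Prometheus scrapes the node IPs listed in the Endpoints object, so a component listening only on 127.0.0.1 would never be reachable. The following is a minimal sketch of how that check could be scripted; `listen_scope` is a hypothetical helper name, and it assumes the `netstat -nltp` column layout in which the fourth field is the local address.

```shell
# Hypothetical helper: read `netstat -nltp` output on stdin and report how
# the given port is bound.
#   all        -> listening on 0.0.0.0 or :: (reachable from other hosts)
#   restricted -> listening only on a specific address such as 127.0.0.1
#   none       -> no listener found for that port
listen_scope() {
  awk -v port="$1" '
    {
      addr = $4                          # local address column
      n = split(addr, parts, ":")
      if (parts[n] != port) next         # last ":"-separated piece is the port
      host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
      if (host == "0.0.0.0" || host == "::") all = 1
      else found = 1
    }
    END {
      if (all) print "all"
      else if (found) print "restricted"
      else print "none"
    }
  '
}

# Example usage on a master node (not executed here):
#   netstat -nltp | listen_scope 10252
#   netstat -nltp | listen_scope 10251
```

If either call prints `restricted`, the component's bind address would need to be changed before any Service or ServiceMonitor fix can take effect.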
Check the ServiceMonitor labels
The ServiceMonitor manifest for kube-controller-manager is kube-prometheus/manifests/prometheus-serviceMonitorKubeControllerManager.yaml
The ServiceMonitor manifest for kube-scheduler is kube-prometheus/manifests/prometheus-serviceMonitorKubeScheduler.yaml
# cd kube-prometheus/manifests/
# grep "selector" prometheus-serviceMonitorKubeControllerManager.yaml -A 3
# grep "selector" prometheus-serviceMonitorKubeScheduler.yaml -A 3
Alternatively, inspect the live objects with kubectl:
kubectl get servicemonitors.monitoring.coreos.com -n monitoring kube-controller-manager -oyaml
kubectl get servicemonitors.monitoring.coreos.com -n monitoring kube-scheduler -oyaml
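The output shows that the kube-controller-manager selector label is: k8s-app: kube-controller-manager
In the same way, the kube-scheduler selector label is: k8s-app: kube-scheduler
Check the Services for kube-controller-manager and kube-scheduler
Note: kube-controller-manager and kube-scheduler were installed from binaries in this cluster, so their Service and Endpoints objects do not exist and must be created by hand. The labels on them must match the labels selected by the ServiceMonitors.
Add the Services for kube-controller-manager and kube-scheduler
File prometheus-kubeControllerManagerService.yaml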
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
spec:
  type: ClusterIP
  sessionAffinity: None
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
    targetPort: 10252
---
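# With a binary deployment you also need to create the matching Endpoints object so that the component is attached to the cluster and reachable through the Service; only then can Prometheus scrape it.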
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.10.246.36
  - ip: 10.10.246.37
  - ip: 10.10.246.38
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
File prometheus-kubeSchedulerService.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
spec:
  type: ClusterIP
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
    targetPort: 10251
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.10.246.36
  - ip: 10.10.246.37
  - ip: 10.10.246.38
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
Create the objects:
kubectl create -f prometheus-kubeControllerManagerService.yaml
kubectl create -f prometheus-kubeSchedulerService.yaml
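The two manifests differ only in the component name and the port, so they could also be generated from a single template. The sketch below is one way to do that; `render_metrics_svc` is a hypothetical helper, not part of kube-prometheus, and it deliberately omits the `sessionAffinity: None` line because that is the default for a Service.

```shell
# Hypothetical helper: print a Service plus Endpoints manifest for one
# control-plane component.
# usage: render_metrics_svc <name> <port> <master-ip>...
render_metrics_svc() {
  name="$1"; port="$2"; shift 2
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: ${name}
  name: ${name}
  namespace: kube-system
spec:
  type: ClusterIP
  ports:
  - name: http-metrics
    port: ${port}
    protocol: TCP
    targetPort: ${port}
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: ${name}
  name: ${name}
  namespace: kube-system
subsets:
- addresses:
EOF
  # One address entry per master node passed on the command line.
  for ip in "$@"; do
    printf '  - ip: %s\n' "$ip"
  done
  cat <<EOF
  ports:
  - name: http-metrics
    port: ${port}
    protocol: TCP
EOF
}

# Example usage (not executed here):
#   render_metrics_svc kube-scheduler 10251 10.10.246.36 10.10.246.37 10.10.246.38 | kubectl apply -f -
```

Generating both objects from the same arguments keeps the Service name, the Endpoints name, and the port name in step, which is what lets the Service pick up the manually created Endpoints.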
Check that the Service labels match the labels selected by the ServiceMonitors:
# kubectl get svc -n kube-system -l k8s-app=kube-controller-manager
# kubectl get svc -n kube-system -l k8s-app=kube-scheduler
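After creating them, wait a moment and check the targets page in Prometheus to see whether kube-scheduler metrics are now being scraped.
In this case the problem was still not resolved.
Reconfigure the ServiceMonitors
Comparing the configurations shows that the ServiceMonitor resources shipped with v0.7.0 scrape over HTTPS, while our binary-deployed components serve metrics over plain HTTP. So I created two resource definitions based on the older version, which scrape the http-metrics port.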
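To confirm which scheme a given ServiceMonitor uses before replacing it, its YAML can be inspected for a `scheme:` field. The sketch below assumes, as far as I know, that prometheus-operator falls back to plain HTTP when an endpoint omits `scheme`; `sm_scheme` is a hypothetical helper name.

```shell
# Hypothetical helper: read a ServiceMonitor manifest on stdin and print the
# scrape scheme of its endpoints ("http" is assumed when no scheme is set).
sm_scheme() {
  awk '
    # Match "scheme: x" as a plain key or as the first key of a list item.
    $1 == "scheme:" || ($1 == "-" && $2 == "scheme:") { s = $NF }
    END { print (s == "" ? "http" : s) }
  '
}

# Example usage (not executed here):
#   kubectl get servicemonitors.monitoring.coreos.com -n monitoring kube-scheduler -o yaml | sm_scheme
#   sm_scheme < kube-prometheus/manifests/prometheus-serviceMonitorKubeScheduler.yaml
```

If this prints `https` while the component only exposes an HTTP metrics port, the target stays down no matter how the Service and Endpoints are defined.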
Delete the original ServiceMonitors:
cd kube-prometheus/manifests
kubectl delete -f prometheus-serviceMonitorKubeControllerManager.yaml
kubectl delete -f prometheus-serviceMonitorKubeScheduler.yaml
File prometheus-serviceMonitorKubeControllerManager.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    metricRelabelings:
    - action: drop
      regex: etcd_(debugging|disk|request|server).*
      sourceLabels:
      - __name__
    port: http-metrics
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-controller-manager
File prometheus-serviceMonitorKubeScheduler.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s        # scrape every 30 seconds
    port: http-metrics   # name of the port on the Service
  jobLabel: k8s-app
  namespaceSelector:     # namespaces to search for Services; use "any: true" to match all namespaces
    matchNames:
    - kube-system
  # Labels of the Services to select. With matchLabels, a Service is selected
  # only if it carries every listed label. matchExpressions are also ANDed
  # together; a single In expression can list several acceptable values.
  selector:
    matchLabels:
      k8s-app: kube-scheduler
Create the modified ServiceMonitors:
kubectl create -f prometheus-serviceMonitorKubeControllerManager.yaml
kubectl create -f prometheus-serviceMonitorKubeScheduler.yaml
Then check the targets page in Prometheus again.