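Kubernetes monitoring with Prometheus

I. Stack: Prometheus + Alertmanager + Grafana + assorted metrics collectors (exporters)

II. Deployment

Reference: https://github.com/prometheus-operator/kube-prometheus

1. Clone the project

Note: each kube-prometheus release targets specific Kubernetes versions, so check out the release that matches your cluster (the project README lists the compatibility matrix).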

git clone -b v0.10.0 https://github.com/prometheus-operator/kube-prometheus.git

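2. Deploy the project

Note: before deploying, consider setting the replica count to 1 in alertmanager-alertmanager.yaml and prometheus-prometheus.yaml under the manifests directory. Skip this if the cluster has resources to spare.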
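As a rough sketch of that edit, the replica count can be changed with sed instead of by hand. The helper name set_replicas is made up for illustration, and the pattern assumes each manifest carries a two-space-indented replicas: line under spec; check the files before relying on it.

```shell
# Hypothetical helper: set the two-space-indented "replicas:" field of a manifest.
# Usage: set_replicas <manifest.yaml> <count>
set_replicas() {
  manifest="$1"
  count="$2"
  # -i.bak edits in place and keeps the original as <manifest>.bak
  sed -i.bak -E 's/^(  replicas:)[[:space:]]*[0-9]+[[:space:]]*$/\1 '"$count"'/' "$manifest"
}

# Example (run from the kube-prometheus directory):
#   set_replicas manifests/alertmanager-alertmanager.yaml 1
#   set_replicas manifests/prometheus-prometheus.yaml 1
```

The .bak copies can stay in place: kubectl apply -f on a directory only reads .yaml, .yml and .json files.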

cd kube-prometheus
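# --server-side avoids the "metadata.annotations: Too long" error that client-side apply can hit on the large CRDs
kubectl apply --server-side -f manifests/setup/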
kubectl apply -f manifests/

3. Create Ingress resources for Prometheus and Grafana. To deploy ingress-nginx, refer to the ingress article.

prometheus-ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ingress
  namespace: monitoring 
spec:
  ingressClassName: nginx
  rules:
  - host: "test.prometheus.com"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090

grafana-ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring 
spec:
  ingressClassName: nginx
  rules:
  - host: "test.grafana.com"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000

4. Add entries to the local hosts file so the browser can resolve the hostnames

192.168.152.10 test.grafana.com
192.168.152.10 test.prometheus.com
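If the entries are added by script rather than by hand, an idempotent append avoids duplicate lines. The following is a minimal sketch; add_host_entry is a hypothetical helper name, and writing to /etc/hosts is assumed to need root.

```shell
# Hypothetical helper: append "<ip> <hostname>" to a hosts file unless the
# hostname is already mapped there. Defaults to /etc/hosts.
# Usage: add_host_entry <ip> <hostname> [hosts_file]
add_host_entry() {
  ip="$1"
  name="$2"
  hosts_file="${3:-/etc/hosts}"
  # Scan non-comment lines for an exact hostname match in fields 2..NF.
  if awk -v n="$name" '
       /^[[:space:]]*#/ { next }
       { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit(found ? 0 : 1) }
     ' "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Example:
#   add_host_entry 192.168.152.10 test.grafana.com
#   add_host_entry 192.168.152.10 test.prometheus.com
```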

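III. Open http://test.grafana.com in a browser

Note: the controller-manager, scheduler and etcd dashboards show no data. This is because the cluster was deployed with kubeadm, which by default binds the controller-manager and scheduler to 127.0.0.1, creates no Services through which Prometheus could discover them, and requires client certificates for etcd.

IV. Fixing the missing controller-manager, scheduler and etcd dashboard data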

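Reference: https://cloud.tencent.com/developer/article/1807805

1. Fix the missing controller-manager dashboard data

In /etc/kubernetes/manifests/kube-controller-manager.yaml, change bind-address=127.0.0.1 to bind-address=0.0.0.0. After saving, wait for the kubelet to pick up the changed static pod manifest and restart the component automatically.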
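The same change can be scripted. This is only a sketch: open_bind_address is a hypothetical helper name, the default backup directory /root is an assumption, and the manifest is assumed to contain the flag literally as --bind-address=127.0.0.1.

```shell
# Hypothetical helper: switch --bind-address from 127.0.0.1 to 0.0.0.0 in a
# kubeadm static pod manifest.
# Usage: open_bind_address <manifest> [backup_dir]
open_bind_address() {
  manifest="$1"
  backup_dir="${2:-/root}"
  # Keep the backup OUTSIDE /etc/kubernetes/manifests: the kubelet may try to
  # load any non-hidden file it finds there as a static pod definition.
  cp "$manifest" "$backup_dir/$(basename "$manifest").bak" || return 1
  tmp="$(mktemp)" || return 1
  # Write through a temp file, then overwrite in place so ownership and
  # permissions of the manifest are preserved.
  sed 's/--bind-address=127\.0\.0\.1/--bind-address=0.0.0.0/' "$manifest" > "$tmp" &&
    cat "$tmp" > "$manifest"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example (on the control-plane node, as root):
#   open_bind_address /etc/kubernetes/manifests/kube-controller-manager.yaml
```

The helper takes the manifest path as an argument, so it also fits kube-scheduler.yaml. Binding to 0.0.0.0 makes the metrics port reachable on every interface of the node, which is worth weighing against your network policy.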

 

Create a Service and Endpoints for the controller-manager

kube-controller-manager-svc-ep.yml

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    app.kubernetes.io/name: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10257
    targetPort: 10257
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  # Adjust to the IP(s) of your control-plane node(s)
  - ip: 192.168.152.10
  ports:
  - name: https-metrics
    port: 10257
    protocol: TCP

kubectl apply -f kube-controller-manager-svc-ep.yml

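2. Fix the missing scheduler dashboard data

In /etc/kubernetes/manifests/kube-scheduler.yaml, change bind-address=127.0.0.1 to bind-address=0.0.0.0. After saving, wait for the kubelet to restart the component automatically.

Create a Service and Endpoints for the scheduler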

kube-scheduler-svc-ep.yml

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    app.kubernetes.io/name: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10259
    targetPort: 10259
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  # Adjust to the IP(s) of your control-plane node(s)
  - ip: 192.168.152.10
  ports:
  - name: https-metrics
    port: 10259
    protocol: TCP

kubectl apply -f kube-scheduler-svc-ep.yml

3. Fix the missing etcd dashboard data

Create a Service and Endpoints for etcd

kube-etcd-svc-ep.yml

apiVersion: v1
kind: Service
metadata:
  name: etcd-k8s
  namespace: kube-system
  labels:
    k8s-app: etcd
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: etcd
    port: 2379
    protocol: TCP

---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: etcd
  name: etcd-k8s
  namespace: kube-system
subsets:
- addresses:
  # Adjust to the IP(s) of your etcd node(s)
  - ip: 192.168.152.10
  ports:
  - name: etcd
    port: 2379
    protocol: TCP

kubectl apply -f kube-etcd-svc-ep.yml

Create a Secret holding the certificates used to connect to etcd

kubectl -n monitoring create secret generic etcd-certs \
  --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  --from-file=/etc/kubernetes/pki/etcd/ca.crt
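The create command fails if any of the three files is missing, which typically happens when it is run on a machine other than a control-plane node. A small pre-check could look like the sketch below; check_etcd_certs is a hypothetical helper name, and the default directory is the kubeadm path used in the command above.

```shell
# Hypothetical helper: confirm the three etcd client files are readable.
# Usage: check_etcd_certs [cert_dir]
check_etcd_certs() {
  cert_dir="${1:-/etc/kubernetes/pki/etcd}"
  for f in ca.crt healthcheck-client.crt healthcheck-client.key; do
    if [ ! -r "$cert_dir/$f" ]; then
      echo "missing or unreadable: $cert_dir/$f" >&2
      return 1
    fi
  done
  return 0
}

# Example:
#   check_etcd_certs && echo "certificates found"
```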

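Mount the certificates into Prometheus: edit the Prometheus object and list the Secret under spec.secrets (add etcd-certs as an entry). The operator then mounts it at /etc/prometheus/secrets/etcd-certs/.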

kubectl edit prometheus k8s -n monitoring
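As a non-interactive alternative to kubectl edit, the same field can be set with a merge patch. This is a sketch under assumptions: both helper names are hypothetical, and it presumes spec.secrets is not already used for other Secrets, because a merge patch replaces the whole list.

```shell
# Hypothetical helper: build a JSON merge patch that sets spec.secrets to a
# single Secret name.
etcd_secret_patch() {
  printf '{"spec":{"secrets":["%s"]}}' "$1"
}

# Hypothetical helper: apply the patch to the Prometheus object named k8s.
# Note: a merge patch replaces the entire spec.secrets list.
mount_etcd_certs() {
  kubectl -n monitoring patch prometheus k8s --type merge \
    -p "$(etcd_secret_patch etcd-certs)"
}

# Example (needs kubectl access to the cluster):
#   mount_etcd_certs
```

After the object changes, the operator rolls the Prometheus pods, and the files should appear under /etc/prometheus/secrets/etcd-certs/ inside the prometheus container.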

 

Create a ServiceMonitor for etcd

kube-etcd-sm.yml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: etcd-k8s
  namespace: monitoring
  labels:
    k8s-app: etcd
spec:
  jobLabel: k8s-app
  endpoints:
  - port: etcd
    interval: 30s
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/secrets/etcd-certs/ca.crt
      certFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.crt
      keyFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.key
      insecureSkipVerify: true
  selector:
    matchLabels:
      k8s-app: etcd
  namespaceSelector:
    matchNames:
    - kube-system

kubectl apply -f kube-etcd-sm.yml

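4. Verification

Open test.grafana.com in a browser

controller-manager dashboard

scheduler dashboard

etcd dashboard: the default dashboards do not include one for etcd, so find one at https://grafana.com/grafana/dashboards/ and import it.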

 

posted @ 2022-09-29 14:07 屠夫2022