Monitoring - Prometheus 09 - Monitoring Kubernetes

  • Over the past few years, cloud and distributed computing have become some of the hottest areas in the industry, and the growth of open-source projects such as Docker, Kubernetes, and Prometheus has greatly accelerated the adoption of cloud computing.
  • Kubernetes uses Docker for container management. If the combination of Docker and Kubernetes is the cornerstone of the cloud-native era, then Prometheus gives cloud native its wings. As the cloud-native community has grown and application scenarios have become more complex, a complete and open monitoring platform designed for cloud-native environments has become necessary. Prometheus emerged in this context, with native support for Kubernetes.
  • The traditional deployment procedure is relatively complex. As the Operator pattern has matured, deploying Prometheus via an Operator has become the recommended approach: more of the operational logic is folded into the Operator itself, which simplifies both day-to-day operations and the initial deployment.

1. Introduction to the Prometheus Operator

  • The Prometheus Operator for Kubernetes provides simple monitoring definitions for Kubernetes services and for the deployment and management of Prometheus instances.
  • The Prometheus Operator (hereafter simply "the Operator") provides the following features:
    • Create/destroy: easily launch a Prometheus instance in a Kubernetes namespace, so that a specific application or team can adopt the Operator with little effort.
    • Simple configuration: configure the fundamentals of Prometheus, such as version, storage, retention policy, and replicas, through Kubernetes resources.
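  • To illustrate the second point, here is a minimal, hypothetical Prometheus custom resource that sets only those fundamentals (names and values are placeholders, not part of the deployment described later):

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 2                  # number of identical Prometheus pods
  retention: 15d               # how long samples are kept
  version: v2.39.1             # Prometheus version the Operator deploys
  serviceAccountName: prometheus
  serviceMonitorSelector: {}   # empty selector: pick up all ServiceMonitors
  storage:
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi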

1.1. Prometheus Operator Architecture

  • The Prometheus Operator architecture is shown in Figure 11-1:

  • The components of this architecture run in the Kubernetes cluster as Kubernetes custom resources, each with a different role.
    • Operator: deploys and manages Prometheus Server according to custom resources (Custom Resource Definitions, CRDs), watches these custom resources for change events, and reacts accordingly; it is the control center of the whole system.
    • Prometheus resource: declares the desired state of the Prometheus StatefulSet; the Prometheus Operator ensures the running StatefulSet always matches this definition.
    • Prometheus Server: the Operator deploys the Prometheus Server cluster according to the Prometheus resource; these custom resources can be viewed as the handle for managing the StatefulSets that back the Prometheus Server cluster.
    • Alertmanager resource: declares the desired state of the Alertmanager StatefulSet; the Prometheus Operator ensures the running StatefulSet always matches this definition.
    • ServiceMonitor resource: declares the list of targets Prometheus should monitor. It selects the corresponding Service Endpoints via labels, and Prometheus Server scrapes metrics through the selected Services.
    • Service: simply put, the object that Prometheus monitors.

1.2. The Prometheus Operator's Custom Resources

  • The Prometheus Operator has four kinds of custom resources:
    • Prometheus
    • ServiceMonitor
    • Alertmanager
    • PrometheusRule
  • List all resources in the namespace:
]# kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n monitoring
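  • The four CRDs themselves can also be listed directly (newer Operator releases additionally register podmonitors, probes, and thanosrulers):
]# kubectl get crd | grep monitoring.coreos.com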

1. The Prometheus Resource

  • The Prometheus custom resource (CRD) declares the desired state of the Prometheus StatefulSet; the Prometheus Operator ensures the running StatefulSet always matches this definition. It includes configuration options such as the number of replicas, persistent storage, and the Alertmanagers to which the Prometheus instances send alerts.
  • The Prometheus Operator generates a StatefulSet from the Prometheus resource in the same namespace. The Prometheus pods mount a Secret named prometheus-<prometheus-name> that contains the Prometheus configuration. The Operator generates that configuration from the selected ServiceMonitors and keeps the Secret up to date; any change to either the ServiceMonitors or the Prometheus resource is continually reconciled through these same steps (a way to inspect the generated configuration is sketched after the resource dump below).
  • View the Prometheus resource (from a Helm deployment of kube-prometheus):
]# kubectl edit -n monitoring prometheus.monitoring.coreos.com/kube-promet
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus
    meta.helm.sh/release-namespace: monitoring
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: kube-prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-prometheus
    helm.sh/chart: kube-prometheus-8.1.11
  name: kube-prometheus-prometheus
  namespace: monitoring
spec:
  affinity:             # affinity rules
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: prometheus
              app.kubernetes.io/instance: kube-prometheus
              app.kubernetes.io/name: kube-prometheus
          namespaces:
          - monitoring
          topologyKey: kubernetes.io/hostname
        weight: 1
  alerting:            # the Alertmanagers this Prometheus sends alerts to
    alertmanagers:
    - name: kube-prometheus-alertmanager
      namespace: monitoring
      pathPrefix: /
      port: http
  containers:          # container overrides
  - name: prometheus   # the prometheus container
    livenessProbe:     # container liveness probe
      failureThreshold: 10
      httpGet:
        path: /-/healthy
        port: web
        scheme: HTTP
      initialDelaySeconds: 0
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 3
    readinessProbe:    # container readiness probe
      failureThreshold: 10
      httpGet:
        path: /-/ready
        port: web
        scheme: HTTP
      initialDelaySeconds: 0
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 3
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: false
      runAsNonRoot: true
    startupProbe:      # container startup probe
      failureThreshold: 60
      httpGet:
        path: /-/ready
        port: web
        scheme: HTTP
      initialDelaySeconds: 0
      periodSeconds: 15
      successThreshold: 1
      timeoutSeconds: 3
  - name: config-reloader    # the config-reloader container
    livenessProbe:
      failureThreshold: 6
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      tcpSocket:
        port: reloader-web
      timeoutSeconds: 5
    readinessProbe:
      failureThreshold: 6
      initialDelaySeconds: 15
      periodSeconds: 20
      successThreshold: 1
      tcpSocket:
        port: reloader-web
      timeoutSeconds: 5
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: false
      runAsNonRoot: true
  enableAdminAPI: false
  evaluationInterval: 30s
  externalUrl: http://127.0.0.1:9090/
  image: docker.io/bitnami/prometheus:2.39.1-debian-11-r1    # image
  listenLocal: false
  logFormat: logfmt    # log format
  logLevel: info
  paused: false
  podMetadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: kube-prometheus
      app.kubernetes.io/name: kube-prometheus
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  portName: web
  probeNamespaceSelector: {}
  probeSelector: {}
  replicas: 1         # number of Prometheus replicas. Prometheus itself has no native clustering; multiple replicas are simply identical copies run to avoid a single point of failure
  retention: 10d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector: {}    # which PrometheusRules this Prometheus loads, selected by label
  scrapeInterval: 30s
  securityContext:
    fsGroup: 1001
    runAsUser: 1001
  serviceAccountName: kube-prometheus-prometheus
  serviceMonitorNamespaceSelector: {}    # namespaces searched for ServiceMonitors, selected by namespace label; {} matches all namespaces, while omitting the field matches only this one
  serviceMonitorSelector: {}             # which ServiceMonitors are selected, by label; {} matches all, while omitting the field matches none
  shards: 1
  storage:             # storage volume definition
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
        storageClassName: nfs-client
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2022-10-22T21:24:42Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2022-10-22T21:12:51Z"
    status: "True"
    type: Reconciled
  paused: false
  replicas: 1
  shardStatuses:
  - availableReplicas: 1
    replicas: 1
    shardID: "0"
    unavailableReplicas: 0
    updatedReplicas: 1
  unavailableReplicas: 0
  updatedReplicas: 1
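  • As noted above, the rendered configuration lives in a Secret mounted by the pods; it is stored gzip-compressed under the key prometheus.yaml.gz (both visible in the StatefulSet below). A sketch for inspecting it, assuming the Helm release names used here:
]# kubectl get secret -n monitoring prometheus-kube-prometheus-prometheus -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gunzip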
  • View the StatefulSet generated from the Prometheus resource (from a Helm deployment of kube-prometheus):
]# kubectl edit -n monitoring statefulset.apps/prometheus-kube-prometheus-prometheus
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus
    meta.helm.sh/release-namespace: monitoring
  generation: 1
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: kube-prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-prometheus
    helm.sh/chart: kube-prometheus-8.1.11
    operator.prometheus.io/name: kube-prometheus-prometheus
    operator.prometheus.io/shard: "0"
  name: prometheus-kube-prometheus-prometheus
  namespace: monitoring
  ownerReferences:
  - apiVersion: monitoring.coreos.com/v1
    blockOwnerDeletion: true
    controller: true
    kind: Prometheus
    name: kube-prometheus-prometheus
spec:
  podManagementPolicy: Parallel
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: kube-prometheus-prometheus
      app.kubernetes.io/managed-by: prometheus-operator
      app.kubernetes.io/name: prometheus
      operator.prometheus.io/name: kube-prometheus-prometheus
      operator.prometheus.io/shard: "0"
      prometheus: kube-prometheus-prometheus
  serviceName: prometheus-operated
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: prometheus
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: prometheus
        app.kubernetes.io/instance: kube-prometheus-prometheus
        app.kubernetes.io/managed-by: prometheus-operator
        app.kubernetes.io/name: prometheus
        app.kubernetes.io/version: 2.39.0
        operator.prometheus.io/name: kube-prometheus-prometheus
        operator.prometheus.io/shard: "0"
        prometheus: kube-prometheus-prometheus
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: prometheus
                  app.kubernetes.io/instance: kube-prometheus
                  app.kubernetes.io/name: kube-prometheus
              namespaces:
              - monitoring
              topologyKey: kubernetes.io/hostname
            weight: 1
      automountServiceAccountToken: true
      containers:
      - args:
        - --web.console.templates=/etc/prometheus/consoles
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --storage.tsdb.retention.time=10d
        - --config.file=/etc/prometheus/config_out/prometheus.env.yaml
        - --storage.tsdb.path=/prometheus
        - --web.enable-lifecycle
        - --web.external-url=http://127.0.0.1:9090/
        - --web.route-prefix=/
        - --web.config.file=/etc/prometheus/web_config/web-config.yaml
        image: docker.io/bitnami/prometheus:2.39.1-debian-11-r1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 10
          httpGet:
            path: /-/healthy
            port: web
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        name: prometheus
        ports:
        - containerPort: 9090
          name: web
          protocol: TCP
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /-/ready
            port: web
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: false
          runAsNonRoot: true
        startupProbe:
          failureThreshold: 60
          httpGet:
            path: /-/ready
            port: web
            scheme: HTTP
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 3
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/prometheus/config_out
          name: config-out
          readOnly: true
        - mountPath: /etc/prometheus/certs
          name: tls-assets
          readOnly: true
        - mountPath: /prometheus
          name: prometheus-kube-prometheus-prometheus-db
          subPath: prometheus-db
        - mountPath: /etc/prometheus/rules/prometheus-kube-prometheus-prometheus-rulefiles-0
          name: prometheus-kube-prometheus-prometheus-rulefiles-0
        - mountPath: /etc/prometheus/web_config/web-config.yaml
          name: web-config
          readOnly: true
          subPath: web-config.yaml
      - args:
        - --listen-address=:8080
        - --reload-url=http://127.0.0.1:9090/-/reload
        - --config-file=/etc/prometheus/config/prometheus.yaml.gz
        - --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
        - --watched-dir=/etc/prometheus/rules/prometheus-kube-prometheus-prometheus-rulefiles-0
        command:
        - /bin/prometheus-config-reloader
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SHARD
          value: "0"
        image: docker.io/bitnami/prometheus-operator:0.60.1-debian-11-r0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: reloader-web
          timeoutSeconds: 5
        name: config-reloader
        ports:
        - containerPort: 8080
          name: reloader-web
          protocol: TCP
        readinessProbe:
          failureThreshold: 6
          initialDelaySeconds: 15
          periodSeconds: 20
          successThreshold: 1
          tcpSocket:
            port: reloader-web
          timeoutSeconds: 5
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: false
          runAsNonRoot: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/prometheus/config
          name: config
        - mountPath: /etc/prometheus/config_out
          name: config-out
        - mountPath: /etc/prometheus/rules/prometheus-kube-prometheus-prometheus-rulefiles-0
          name: prometheus-kube-prometheus-prometheus-rulefiles-0
      dnsPolicy: ClusterFirst
      initContainers:
      - args:
        - --watch-interval=0
        - --listen-address=:8080
        - --config-file=/etc/prometheus/config/prometheus.yaml.gz
        - --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
        - --watched-dir=/etc/prometheus/rules/prometheus-kube-prometheus-prometheus-rulefiles-0
        command:
        - /bin/prometheus-config-reloader
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SHARD
          value: "0"
        image: docker.io/bitnami/prometheus-operator:0.60.1-debian-11-r0
        imagePullPolicy: IfNotPresent
        name: init-config-reloader
        ports:
        - containerPort: 8080
          name: reloader-web
          protocol: TCP
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/prometheus/config
          name: config
        - mountPath: /etc/prometheus/config_out
          name: config-out
        - mountPath: /etc/prometheus/rules/prometheus-kube-prometheus-prometheus-rulefiles-0
          name: prometheus-kube-prometheus-prometheus-rulefiles-0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      serviceAccount: kube-prometheus-prometheus
      serviceAccountName: kube-prometheus-prometheus
      terminationGracePeriodSeconds: 600
      volumes:
      - name: config
        secret:
          defaultMode: 420
          secretName: prometheus-kube-prometheus-prometheus
      - name: tls-assets
        projected:
          defaultMode: 420
          sources:
          - secret:
              name: prometheus-kube-prometheus-prometheus-tls-assets-0
      - emptyDir: {}
        name: config-out
      - configMap:
          defaultMode: 420
          name: prometheus-kube-prometheus-prometheus-rulefiles-0
        name: prometheus-kube-prometheus-prometheus-rulefiles-0
      - name: web-config
        secret:
          defaultMode: 420
          secretName: prometheus-kube-prometheus-prometheus-web-config
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: prometheus-kube-prometheus-prometheus-db
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
      storageClassName: nfs-client
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  collisionCount: 0
  currentReplicas: 1
  currentRevision: prometheus-kube-prometheus-prometheus-8cbb4d97f
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updateRevision: prometheus-kube-prometheus-prometheus-8cbb4d97f
  updatedReplicas: 1

2. The ServiceMonitor Resource

  • The ServiceMonitor custom resource (CRD) declares how to monitor a dynamic group of services. It uses labels to select the set of services (targets) to be monitored.
  • For the Prometheus Operator to monitor an application in the Kubernetes cluster, the application's Endpoints must exist:
    • An Endpoints object is essentially a list of IP addresses.
    • Endpoints objects are built from Services. A Service discovers pods through its selector and adds them to its Endpoints object.
  • A Service can expose one or more ports, which are typically backed by multiple Endpoints pointing at the pods.
  • The Prometheus Operator introduces the ServiceMonitor object, which discovers those Endpoints objects and has Prometheus monitor the pods behind them:
    • The endpoints section of ServiceMonitor.Spec configures the ports and other parameters of the Endpoints whose metrics should be scraped; use it strictly when specifying endpoints.
  • Note: endpoints (lowercase) is a field in the ServiceMonitor CRD, while Endpoints (capitalized) is a Kubernetes resource type.
  • ServiceMonitors and the targets they discover may come from any namespace. This is important for cross-namespace monitoring, for example from the monitoring namespace:
    • Use serviceMonitorNamespaceSelector under Prometheus.Spec to restrict, per Prometheus server, the namespaces from which ServiceMonitors take effect.
    • Use namespaceSelector under ServiceMonitor.Spec to restrict the namespaces in which Endpoints objects may be discovered. To discover targets in all namespaces, the namespaceSelector must be empty.
  • View the ServiceMonitor resource (from a Helm deployment of kube-prometheus):
]# kubectl edit -n monitoring servicemonitor.monitoring.coreos.com/kube-prometheus-node-exporter
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus
    meta.helm.sh/release-namespace: monitoring
  generation: 1
  labels:
    app.kubernetes.io/instance: kube-prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: node-exporter
    helm.sh/chart: node-exporter-3.2.1
  name: kube-prometheus-node-exporter
  namespace: monitoring
spec:
  endpoints:
  - port: metrics
    interval: 15s            # scrape interval for these Endpoints
    relabelings:             # relabeling rules
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
  jobLabel: jobLabel
  namespaceSelector:    # namespaces in which to look for the Endpoints to monitor (matched here by name)
    matchNames:
    - monitoring
  selector:             # which Endpoints to monitor, selected by label
    matchLabels:
      app.kubernetes.io/instance: kube-prometheus
      app.kubernetes.io/name: node-exporter
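  • For this ServiceMonitor to discover anything, a Service must exist that carries the selected labels and exposes a port named metrics. A minimal sketch of such a Service (the real one is created by the node-exporter chart; this version is for illustration only):

apiVersion: v1
kind: Service
metadata:
  name: kube-prometheus-node-exporter
  namespace: monitoring
  labels:
    app.kubernetes.io/instance: kube-prometheus   # matched by selector.matchLabels above
    app.kubernetes.io/name: node-exporter
spec:
  ports:
  - name: metrics      # matched by endpoints[].port in the ServiceMonitor
    port: 9100
    targetPort: 9100
  selector:
    app.kubernetes.io/instance: kube-prometheus
    app.kubernetes.io/name: node-exporter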

3. PrometheusRule

  • The PrometheusRule CRD declares the Prometheus rules needed by one or more Prometheus instances.
  • Alerting and recording rules can be saved and applied as YAML files, and are loaded dynamically without a restart.
  • PrometheusRule examples are available at: https://github.com/prometheus-operator/kube-prometheus/tree/main/manifests
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/part-of: kube-prometheus
    prometheus: k8s
    role: alert-rules
  name: node-exporter-rules
  namespace: monitoring
spec:
  groups:
  - name: node-exporter
    rules:
    - alert: NodeFilesystemSpaceFillingUp
      annotations:
        description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
          has only {{ printf "%.2f" $value }}% available space left and is filling
          up.
        runbook_url: https://github.com/prometheus-operator/kube-prometheus/wiki/nodefilesystemspacefillingup
        summary: Filesystem is predicted to run out of space within the next 24 hours.
      expr: |
        (
          node_filesystem_avail_bytes{job="node-exporter",fstype!=""} / node_filesystem_size_bytes{job="node-exporter",fstype!=""} * 100 < 40
        and
          predict_linear(node_filesystem_avail_bytes{job="node-exporter",fstype!=""}[6h], 24*60*60) < 0
        and
          node_filesystem_readonly{job="node-exporter",fstype!=""} == 0
        )
      for: 1h
      labels:
        severity: warning
...
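  • The Operator turns matching PrometheusRule objects into rule files inside a ConfigMap that the Prometheus pods mount (the prometheus-kube-prometheus-prometheus-rulefiles-0 volume in the Helm-deployed StatefulSet shown earlier). A quick way to check that a rule was picked up, assuming those names:
]# kubectl get configmap -n monitoring prometheus-kube-prometheus-prometheus-rulefiles-0 -o yaml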

4. Alertmanager

  • The Alertmanager resource declares the desired state of the Alertmanager StatefulSet; the Prometheus Operator ensures the running StatefulSet always matches this definition. It includes options such as the number of replicas and persistent storage.
  • The Prometheus Operator generates a StatefulSet from the Alertmanager resource in the same namespace. The Alertmanager pods mount a Secret containing the Alertmanager configuration (named alertmanager-kube-prometheus-alertmanager-generated in the StatefulSet dump further below).
  • With two or more replicas configured, the Operator runs the Alertmanager instances in high-availability mode (see the minimal sketch after the resource dump below).
  • View the Alertmanager resource (from a Helm deployment of kube-prometheus):
]# kubectl edit -n monitoring alertmanager.monitoring.coreos.com/kube-prometheus-alertmanager
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus
    meta.helm.sh/release-namespace: monitoring
  generation: 1
  labels:
    app.kubernetes.io/component: alertmanager
    app.kubernetes.io/instance: kube-prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-prometheus
    helm.sh/chart: kube-prometheus-8.1.11
  name: kube-prometheus-alertmanager
  namespace: monitoring
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: alertmanager
              app.kubernetes.io/instance: kube-prometheus
              app.kubernetes.io/name: kube-prometheus
          namespaces:
          - monitoring
          topologyKey: kubernetes.io/hostname
        weight: 1
  containers:
  - livenessProbe:
      failureThreshold: 120
      httpGet:
        path: /-/healthy
        port: web
        scheme: HTTP
      initialDelaySeconds: 0
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 3
    name: alertmanager
    readinessProbe:
      failureThreshold: 120
      httpGet:
        path: /-/ready
        port: web
        scheme: HTTP
      initialDelaySeconds: 0
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 3
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: false
      runAsNonRoot: true
  - livenessProbe:
      failureThreshold: 6
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      tcpSocket:
        port: reloader-web
      timeoutSeconds: 5
    name: config-reloader
    readinessProbe:
      failureThreshold: 6
      initialDelaySeconds: 15
      periodSeconds: 20
      successThreshold: 1
      tcpSocket:
        port: reloader-web
      timeoutSeconds: 5
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: false
      runAsNonRoot: true
  externalUrl: http://127.0.0.1:9093/
  image: docker.io/bitnami/alertmanager:0.24.0-debian-11-r46
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMetadata:
    labels:
      app.kubernetes.io/component: alertmanager
      app.kubernetes.io/instance: kube-prometheus
      app.kubernetes.io/name: kube-prometheus
  portName: web
  replicas: 1
  resources: {}
  retention: 120h
  routePrefix: /
  securityContext:
    fsGroup: 1001
    runAsUser: 1001
  serviceAccountName: kube-prometheus-alertmanager
  storage:    # storage volume definition
    volumeClaimTemplate:
      metadata: {}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
        storageClassName: nfs-client
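  • As noted above, two or more replicas put Alertmanager into high-availability mode; the Operator then wires the instances together via the --cluster.peer flags visible in the StatefulSet below. A minimal, hypothetical resource for a three-replica HA setup:

apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 3      # 2+ replicas enable gossip-based HA clustering
  retention: 120h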
  • View the StatefulSet generated from the Alertmanager resource (from a Helm deployment of kube-prometheus):
]# kubectl edit -n monitoring statefulset.apps/alertmanager-kube-prometheus-alertmanager
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus
    meta.helm.sh/release-namespace: monitoring
    prometheus-operator-input-hash: "13509733468393518222"
  generation: 1
  labels:
    app.kubernetes.io/component: alertmanager
    app.kubernetes.io/instance: kube-prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-prometheus
    helm.sh/chart: kube-prometheus-8.1.11
  name: alertmanager-kube-prometheus-alertmanager
  namespace: monitoring
  ownerReferences:
  - apiVersion: monitoring.coreos.com/v1
    blockOwnerDeletion: true
    controller: true
    kind: Alertmanager
    name: kube-prometheus-alertmanager
spec:
  podManagementPolicy: Parallel
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      alertmanager: kube-prometheus-alertmanager
      app.kubernetes.io/instance: kube-prometheus-alertmanager
      app.kubernetes.io/managed-by: prometheus-operator
      app.kubernetes.io/name: alertmanager
  serviceName: alertmanager-operated
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: alertmanager
      creationTimestamp: null
      labels:
        alertmanager: kube-prometheus-alertmanager
        app.kubernetes.io/component: alertmanager
        app.kubernetes.io/instance: kube-prometheus-alertmanager
        app.kubernetes.io/managed-by: prometheus-operator
        app.kubernetes.io/name: alertmanager
        app.kubernetes.io/version: 0.24.0
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: alertmanager
                  app.kubernetes.io/instance: kube-prometheus
                  app.kubernetes.io/name: kube-prometheus
              namespaces:
              - monitoring
              topologyKey: kubernetes.io/hostname
            weight: 1
      containers:
      - args:
        - --config.file=/etc/alertmanager/config_out/alertmanager.env.yaml
        - --storage.path=/alertmanager
        - --data.retention=120h
        - --cluster.listen-address=
        - --web.listen-address=:9093
        - --web.external-url=http://127.0.0.1:9093/
        - --web.route-prefix=/
        - --cluster.peer=alertmanager-kube-prometheus-alertmanager-0.alertmanager-operated:9094
        - --cluster.reconnect-timeout=5m
        - --web.config.file=/etc/alertmanager/web_config/web-config.yaml
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        image: docker.io/bitnami/alertmanager:0.24.0-debian-11-r46
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 120
          httpGet:
            path: /-/healthy
            port: web
            scheme: HTTP
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 3
        name: alertmanager
        ports:
        - containerPort: 9093
          name: web
          protocol: TCP
        - containerPort: 9094
          name: mesh-tcp
          protocol: TCP
        - containerPort: 9094
          name: mesh-udp
          protocol: UDP
        readinessProbe:
          failureThreshold: 120
          httpGet:
            path: /-/ready
            port: web
            scheme: HTTP
          initialDelaySeconds: 3
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 3
        resources:
          requests:
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: false
          runAsNonRoot: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/alertmanager/config
          name: config-volume
        - mountPath: /etc/alertmanager/config_out
          name: config-out
          readOnly: true
        - mountPath: /etc/alertmanager/certs
          name: tls-assets
          readOnly: true
        - mountPath: /alertmanager
          name: alertmanager-kube-prometheus-alertmanager-db
          subPath: alertmanager-db
        - mountPath: /etc/alertmanager/web_config/web-config.yaml
          name: web-config
          readOnly: true
          subPath: web-config.yaml
      - args:
        - --listen-address=:8080
        - --reload-url=http://127.0.0.1:9093/-/reload
        - --config-file=/etc/alertmanager/config/alertmanager.yaml.gz
        - --config-envsubst-file=/etc/alertmanager/config_out/alertmanager.env.yaml
        command:
        - /bin/prometheus-config-reloader
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SHARD
          value: "-1"
        image: docker.io/bitnami/prometheus-operator:0.60.1-debian-11-r0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: reloader-web
          timeoutSeconds: 5
        name: config-reloader
        ports:
        - containerPort: 8080
          name: reloader-web
          protocol: TCP
        readinessProbe:
          failureThreshold: 6
          initialDelaySeconds: 15
          periodSeconds: 20
          successThreshold: 1
          tcpSocket:
            port: reloader-web
          timeoutSeconds: 5
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: false
          runAsNonRoot: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /etc/alertmanager/config
          name: config-volume
          readOnly: true
        - mountPath: /etc/alertmanager/config_out
          name: config-out
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      serviceAccount: kube-prometheus-alertmanager
      serviceAccountName: kube-prometheus-alertmanager
      terminationGracePeriodSeconds: 120
      volumes:
      - name: config-volume
        secret:
          defaultMode: 420
          secretName: alertmanager-kube-prometheus-alertmanager-generated
      - name: tls-assets
        projected:
          defaultMode: 420
          sources:
          - secret:
              name: alertmanager-kube-prometheus-alertmanager-tls-assets-0
      - emptyDir: {}
        name: config-out
      - name: web-config
        secret:
          defaultMode: 420
          secretName: alertmanager-kube-prometheus-alertmanager-web-config
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: alertmanager-kube-prometheus-alertmanager-db
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
      storageClassName: nfs-client
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  collisionCount: 0
  currentReplicas: 1
  currentRevision: alertmanager-kube-prometheus-alertmanager-b74c5965d
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updateRevision: alertmanager-kube-prometheus-alertmanager-b74c5965d
  updatedReplicas: 1

2. Deploying kube-prometheus with Helm

  • The Prometheus deployment environment is as follows:
    • Kubernetes version v1.20.14.
    • Helm version v3.8.2.
    • kube-prometheus chart version bitnami/kube-prometheus:8.1.11.

2.1. Creating a Dynamic Storage Volume

  • Create a dynamic storage volume:
    • See section 6.2, "Dynamic storage volumes", of https://www.cnblogs.com/maiblogs/p/16392831.html; you only need to follow it up to "Creating the NFS StorageClass".
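  • Before continuing, you can verify that the StorageClass referenced below (nfs-client) exists:
]# kubectl get storageclass nfs-client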

2.2. Deploying kube-prometheus

  • kube-prometheus:8.1.11 automatically installs the following components:
    • prometheus-operator
    • prometheus
    • kube-state-metrics
    • node-exporter
    • blackbox-exporter
    • alertmanager

1. Create the namespace

]# kubectl create namespace monitoring

2. Download the kube-prometheus chart

]# helm repo add bitnami https://charts.bitnami.com/bitnami
]# helm search repo prometheus
]# helm pull bitnami/kube-prometheus

3. Modify values.yaml

// Unpack the chart
]# tar zvfx kube-prometheus-8.1.11.tgz
 
// Edit values.yaml
]# vim ./kube-prometheus/values.yaml
prometheus:
  ingress:
    enabled: true
    hostname:
    annotations: {kubernetes.io/ingress.class: "nginx"}
    extraRules:
    - host: prometheus.local
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: kube-prometheus-prometheus
              port:
                number: 9090
  externalUrl: "http://127.0.0.1:9090/"
  persistence:
    enabled: true
    storageClass: "nfs-client"
 
alertmanager:
  ingress:
    enabled: true
    hostname:
    annotations: {kubernetes.io/ingress.class: "nginx"}
    extraRules:
    - host: alertmanager.local
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: kube-prometheus-alertmanager
              port:
                number: 9093
  externalUrl: "http://127.0.0.1:9093/"
  persistence:
    enabled: true
    storageClass: "nfs-client"

4. Install kube-prometheus

]# helm install kube-prometheus kube-prometheus/ -n monitoring
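  • Once the release is installed, it is worth confirming that all pods in the monitoring namespace reach the Running state (pod names vary with the release name):
]# kubectl get pods -n monitoring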

5. Access prometheus and alertmanager

// Edit the hosts file (C:\Windows\System32\drivers\etc)
10.1.1.11 prometheus.local alertmanager.local
  • Access prometheus at http://prometheus.local:32080/.

  • Access alertmanager at http://alertmanager.local:32080/.

2.3. Implementing Alerting

2.3.1. Configuring alertmanager

1. View the alertmanager configuration

]# kubectl exec alertmanager-kube-prometheus-alertmanager-0 -n monitoring -- cat /etc/alertmanager/config_out/alertmanager.env.yaml
]# kubectl get secret alertmanager-kube-prometheus-alertmanager -n monitoring -o go-template='{{ index .data "alertmanager.yaml" }}' | base64 -d
global:
  resolve_timeout: 5m
receivers:
- name: "null"
route:
  group_by:
  - job
  group_interval: 5m
  group_wait: 30s
  receiver: "null"
  repeat_interval: 12h
  routes:
  - match:
      alertname: Watchdog
    receiver: "null"

2. Modify the alertmanager configuration

  • Create alertmanager.yaml:
    • Note that this alertmanager.yaml has two extra top levels, alertmanager and config, because it is passed to Helm as a values file.
    • Note that if route has no child routes, you must still set routes: [].
      • The error seen otherwise: level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="undefined receiver \"null\" used in route"
    • Note that the pod mounts the storage volume at /alertmanager/.
]# vim alertmanager.yaml
alertmanager:
  config:
    global:
      resolve_timeout: 5m
      smtp_smarthost: 'smtp.qq.com:465'
      smtp_from: 'xxx@qq.com'
      smtp_auth_username: 'xxx@qq.com'
      smtp_auth_password: 'xxx'
      smtp_require_tls: false
    route:
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 10s
      receiver: 'email'
      routes: []
    receivers:
    - name: 'email'
      email_configs:
      - to: 'xxx@xxx.com.cn'
    templates:
    - '/alertmanager/template.tmpl'

3. Put the alert template template.tmpl on the storage volume

]# vim /data1/monitoring-alertmanager-kube-prometheus-alertmanager-db-alertmanager-kube-prometheus-alertmanager-0-pvc-85d5342a-9f8c-41c3-95fe-f11c77579b0c/alertmanager-db/template.tmpl
{{ define "__subject" }}
{{ if gt (len .Alerts.Firing) 0 -}}
{{ range .Alerts }}
{{ .Labels.alertname }}{{ .Annotations.title }}
{{ end }}{{ end }}{{ end }}
 
{{ define "email.default.html" }}
{{ range .Alerts }}
Alert name: {{ .Annotations.title }} <br>
Severity: {{ .Labels.severity }} <br>
Host: {{ .Labels.instance }} <br>
Details: {{ .Annotations.description }} <br>
Team: {{ .Labels.team }} <br>
{{/* .StartsAt is UTC; adding 28800e9 ns (8 hours) converts it to UTC+8 */}}
Time: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
{{ end }}{{ end }}

4. Rolling-update kube-prometheus

  • Apply the updated alertmanager.yaml values:
]# helm upgrade kube-prometheus kube-prometheus/ --values=alertmanager.yaml -n monitoring
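  • If alerting misbehaves after the upgrade, the config-reloader container's logs show whether the new configuration was rendered and reloaded successfully:
]# kubectl logs alertmanager-kube-prometheus-alertmanager-0 -c config-reloader -n monitoring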

2.3.2. Creating Alert Rules

1. Create a PrometheusRule resource

]# vim node-exporter-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/part-of: kube-prometheus
    prometheus: k8s
    role: alert-rules
  name: node-exporter-rules
  namespace: monitoring
spec:
  groups:
  - name: node-exporter
    rules:
    - alert: InstanceDown
      expr: up == 0    # "up" only takes the values 0 and 1, so this fires while a target is down
      for: 10s
      labels:
        severity: "critical"
        team: "OPS"
      annotations:
        title: "Instance {{ $labels.instance }} down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 3 minutes."

2. Apply the PrometheusRule resource

]# kubectl apply -f node-exporter-rules.yaml
 
// View the prometheusrule resources
]# kubectl get prometheusrule -A
NAMESPACE    NAME                  AGE
monitoring   node-exporter-rules   39s

2.4. Deploying Grafana

1. Download the grafana chart

]# helm repo add bitnami https://charts.bitnami.com/bitnami
]# helm search repo grafana
]# helm pull bitnami/grafana

2. Modify values.yaml

// Unpack the chart
]# tar zvfx grafana-8.2.12.tgz
 
// Edit values.yaml
]# vim grafana/values.yaml
admin:
  password: "admin"
persistence:
  storageClass: "nfs-client"
ingress:
  enabled: true
  hostname:
  annotations: {kubernetes.io/ingress.class: "nginx"}
  extraRules:
  - host: grafana.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000

3. Install grafana

]# helm install grafana grafana/ -n monitoring

4. Access grafana

// Edit the hosts file (C:\Windows\System32\drivers\etc)
10.1.1.11 prometheus.local alertmanager.local grafana.local
  • Access grafana at http://grafana.local:32080/.

5. Add a data source

  • Within Kubernetes, in-cluster services can reach one another through Kubernetes-internal domain names of the form Service_Name.Namespace_Name.svc.cluster.local.
  • Services in the same namespace can reach each other directly by Service_Name.
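  • So when adding the Prometheus data source in Grafana, its URL can point at the in-cluster Service created by the chart (kube-prometheus-prometheus in this deployment); since Grafana also runs in the monitoring namespace, the short form works as well:
http://kube-prometheus-prometheus.monitoring.svc.cluster.local:9090
http://kube-prometheus-prometheus:9090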

3. Deploying kube-prometheus from GitHub

  • Note: this is only a quick-start installation; it does not use persistent volumes.
  • The Prometheus deployment environment is as follows:
    • Kubernetes version v1.20.14.
    • kube-prometheus version 0.8.0.
  • Check the compatibility matrix between kube-prometheus and Kubernetes.

1. Download kube-prometheus

]# wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.8.0.tar.gz

2. Quick-deploy kube-prometheus

]# tar zvfx v0.8.0.tar.gz
]# cd kube-prometheus-0.8.0/
 
// Replace k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0 with bitnami/kube-state-metrics:2.0.0
]# vim ./manifests/kube-state-metrics-deployment.yaml
 
// First create the namespace, prometheus-operator, and other prerequisites
]# kubectl create -f manifests/setup
// Deploy all the components
]# kubectl create -f manifests/
  • If you have deployed prometheus before, clean up any possible leftovers first:
]# cd kube-prometheus-0.8.0/
]# kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
  • View the related resources:

// View the pods
]# kubectl get pods -A
NAMESPACE         NAME                                        READY   STATUS      RESTARTS   AGE
monitoring        alertmanager-main-0                         2/2     Running     0          31s
monitoring        alertmanager-main-1                         2/2     Running     0          31s
monitoring        alertmanager-main-2                         2/2     Running     0          31s
monitoring        blackbox-exporter-55c457d5fb-4jvmm          3/3     Running     0          30s
monitoring        grafana-9df57cdc4-l7gxk                     1/1     Running     0          29s
monitoring        kube-state-metrics-6cb48468f8-dbdnc         3/3     Running     0          29s
monitoring        node-exporter-6svtr                         2/2     Running     0          29s
monitoring        node-exporter-hpfw9                         2/2     Running     0          29s
monitoring        node-exporter-jksr2                         2/2     Running     0          29s
monitoring        prometheus-adapter-59df95d9f5-rxzdg         1/1     Running     0          29s
monitoring        prometheus-adapter-59df95d9f5-zs46x         1/1     Running     0          29s
monitoring        prometheus-k8s-0                            2/2     Running     1          29s
monitoring        prometheus-k8s-1                            2/2     Running     1          29s
monitoring        prometheus-operator-7775c66ccf-bfxd9        2/2     Running     0          29m
...
 
// View the services
]# kubectl get svc -A
NAMESPACE       NAME                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
monitoring      alertmanager-main                    ClusterIP   10.20.24.158    <none>        9093/TCP                       64s
monitoring      alertmanager-operated                ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP     64s
monitoring      blackbox-exporter                    ClusterIP   10.20.248.189   <none>        9115/TCP,19115/TCP             64s
monitoring      grafana                              ClusterIP   10.20.214.103   <none>        3000/TCP                       63s
monitoring      kube-state-metrics                   ClusterIP   None            <none>        8443/TCP,9443/TCP              63s
monitoring      node-exporter                        ClusterIP   None            <none>        9100/TCP                       63s
monitoring      prometheus-adapter                   ClusterIP   10.20.74.223    <none>        443/TCP                        63s
monitoring      prometheus-k8s                       ClusterIP   10.20.73.57     <none>        9090/TCP                       63s
monitoring      prometheus-operated                  ClusterIP   None            <none>        9090/TCP                       62s
monitoring      prometheus-operator                  ClusterIP   None            <none>        8443/TCP                       30m
...
 
// View the deployments
]# kubectl get deployment -A
NAMESPACE         NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
monitoring        blackbox-exporter          1/1     1            1           109s
monitoring        grafana                    1/1     1            1           108s
monitoring        kube-state-metrics         1/1     1            1           108s
monitoring        prometheus-adapter         2/2     2            2           108s
monitoring        prometheus-operator        1/1     1            1           30m
...
 
// View the statefulsets
]# kubectl get sts -A
NAMESPACE    NAME                READY   AGE
monitoring   alertmanager-main   3/3     2m1s
monitoring   prometheus-k8s      2/2     119s
...

3. Access the services

  • Access prometheus:
    • http://10.1.1.11:19090/
// Listen on 10.1.1.11:19090 and forward requests to port 9090 of the pods behind the service
]# kubectl port-forward svc/prometheus-k8s --address=10.1.1.11 19090:9090 -n monitoring
  • Access grafana:
    • http://10.1.1.11:13000/    (admin:admin)
// Listen on 10.1.1.11:13000 and forward requests to port 3000 of the pods behind the service
]# kubectl port-forward svc/grafana --address=10.1.1.11 13000:3000 -n monitoring
  • Access alertmanager:
    • http://10.1.1.11:19093/
// Listen on 10.1.1.11:19093 and forward requests to port 9093 of the pods behind the service
]# kubectl port-forward svc/alertmanager-main --address=10.1.1.11 19093:9093 -n monitoring

4. Create the ingress rules

]# vim prometheus-ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ingress
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: prometheus.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
  - host: alertmanager.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
  - host: grafana.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000
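
  • Apply the manifest and, as in section 2.2, point the hostnames at the ingress controller's entry point in your hosts file (10.1.1.11 and port 32080 are specific to this environment):
]# kubectl apply -f prometheus-ingress
 
// hosts file entry (C:\Windows\System32\drivers\etc):
10.1.1.11 prometheus.local alertmanager.local grafana.local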
