Configuring DingTalk Alerts with Prometheus Operator

Configuring DingTalk alerts

1. Register a DingTalk account -> Robot Management -> Custom (connect a custom service via webhook) -> Add -> copy the webhook

 

 


Once the group robot is configured as above, note down its Webhook address. The DingTalk alert plugin configured later needs it. The format is:
https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxx
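
Before wiring the webhook into Alertmanager, it can help to confirm that the robot itself accepts messages. The sketch below is not part of the original setup: `build_text_payload` and `send_dingtalk_text` are hypothetical helper names, and the JSON body follows DingTalk's custom-robot text message format. If the robot was created with a keyword, signature, or IP allowlist security setting, the message has to satisfy that setting or DingTalk is likely to reject it.

```shell
#!/bin/bash
# Hypothetical helpers for a direct smoke test of the robot webhook.

# Build a DingTalk "text" message body. The content is inserted as-is,
# so it must not contain double quotes or backslashes.
build_text_payload() {
  local content="$1"
  printf '{"msgtype":"text","text":{"content":"%s"}}' "$content"
}

# POST the message to the webhook URL copied from the robot settings.
send_dingtalk_text() {
  local webhook_url="$1" content="$2"
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$(build_text_payload "$content")" "$webhook_url"
}

# Example (replace xxxxxxxx with the real token before running):
# send_dingtalk_text 'https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxx' 'alert channel test'
```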


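2. Create the DingTalk alert plugin manifest (dingtalk-webhook.yaml), and replace the access_token value in its --ding.profile argument with the robot token obtained in the previous step.
Create the file under the installation package path.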

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus-webhook-dingtalk
  name: prometheus-webhook-dingtalk
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-webhook-dingtalk
  template:
    metadata:
      labels:
        app: prometheus-webhook-dingtalk
    spec:
      containers:
      - name: prometheus-webhook-dingtalk
        image: timonwong/prometheus-webhook-dingtalk:v0.3.0
        imagePullPolicy: IfNotPresent
        args:
          - --ding.profile=webhook1=https://oapi.dingtalk.com/robot/send?access_token=b6125c3c8d1d47a
          - --template.file=/usr/share/prometheus-webhook-dingtalk/template/webhook-dingtalk.tmpl
        volumeMounts:
        - mountPath: /usr/share/prometheus-webhook-dingtalk/template/
          name: webhook-dingtalk-template
        ports:
        - containerPort: 8060
          protocol: TCP
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 1000Mi
      volumes:
      - name: webhook-dingtalk-template
        configMap:
          name: webhook-dingtalk-template
          defaultMode: 420

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-webhook-dingtalk
  name: prometheus-webhook-dingtalk
  namespace: monitoring
spec:
  ports:
  - port: 8060
    protocol: TCP
    targetPort: 8060
  selector:
    app: prometheus-webhook-dingtalk
  sessionAffinity: None
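
The token replacement described in step 2 can be scripted instead of edited by hand. This is a minimal sketch: `set_access_token` is a hypothetical helper, and it assumes the token consists only of letters and digits and that the manifest contains a single access_token= occurrence per line.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the access_token value in text read from stdin.
set_access_token() {
  local token="$1"
  # Replace whatever alphanumeric value follows "access_token=" with the new token.
  sed -E "s|(access_token=)[A-Za-z0-9]+|\1${token}|"
}

# Example: write a patched copy next to the original manifest.
# set_access_token "$DINGTALK_TOKEN" < dingtalk-webhook.yaml > dingtalk-webhook.patched.yaml
```

Writing to a separate file keeps the original manifest free of the real token, which may matter if the manifest is kept in version control.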

 

Configure the DingTalk alert template
cat webhook-dingtalk.tmpl
{{ define "__subject" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }} {{ if gt (len .CommonLabels) (len .GroupLabels) }}({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }}
{{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver }}{{ end }}

{{ define "__text_alert_list" }}{{ range . }}

**Alert content**
{{ range .Labels.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}

**Alert information**
{{ range .Annotations.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}

---

{{ end }}{{ end }}

{{ define "___text_alertresovle_list" }}{{ range . }}

---
**Alert resolved**
{{ range .Labels.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}

---
 
{{ end }}{{ end }}


{{ define "ding.link.title" }}{{ template "__subject" . }}{{ end }}
{{ define "ding.link.content" }}

### k8s cluster test environment monitoring alert
---

{{ index .GroupLabels "alertname" }}

{{ if gt (len .Alerts.Firing) 0 -}}
{{ template "__text_alert_list" .Alerts.Firing }}
{{ end }}

{{ if gt (len .Alerts.Resolved) 0 -}}

{{ template "___text_alertresovle_list" .Alerts.Resolved }}

{{ end }}
{{ end }}

  

Create a ConfigMap from the alert template

kubectl create configmap webhook-dingtalk-template --from-file=webhook-dingtalk.tmpl -n monitoring
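
`kubectl create configmap` fails if the ConfigMap already exists, which gets in the way when the template is edited and published again. One common workaround, sketched here as an assumption rather than as part of the original procedure, is to render the object client-side and pipe it to `kubectl apply`. `publish_template` is a hypothetical function name.

```shell
#!/bin/bash
# Hypothetical helper: create or update the template ConfigMap in one step.
publish_template() {
  local file="${1:-webhook-dingtalk.tmpl}"
  # --dry-run=client renders the ConfigMap without sending it to the cluster;
  # kubectl apply then creates it or updates the existing object.
  kubectl create configmap webhook-dingtalk-template \
    --from-file="$file" -n monitoring --dry-run=client -o yaml \
    | kubectl apply -f -
}

# Example:
# publish_template webhook-dingtalk.tmpl
```

After an update, the webhook pod may need a restart before it picks up the new template.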

  

3. Apply dingtalk-webhook.yaml

$ kubectl apply -f dingtalk-webhook.yaml
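
After applying the manifest it is worth checking that the Deployment actually became ready and that the Service has an endpoint. A sketch of such a check, assuming the resource names from dingtalk-webhook.yaml; `check_webhook_rollout` is a hypothetical name and the 120s timeout is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical helper: confirm the webhook Deployment and Service are usable.
check_webhook_rollout() {
  # Wait until the Deployment reports its replicas ready, or give up after the timeout.
  kubectl -n monitoring rollout status deployment/prometheus-webhook-dingtalk --timeout=120s || return 1
  # An empty ENDPOINTS column would mean the Service selector matches no ready pod.
  kubectl -n monitoring get endpoints prometheus-webhook-dingtalk
}

# Example:
# check_webhook_rollout
```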

4. Add the alert receiver
Create the alert receiver configuration (alertmanager.yaml) under the installation package path.

cat alertmanager.yaml 
global:
  resolve_timeout: 5m
route:
  group_by: ['job']
  group_wait: 0s
  group_interval: 5m
  repeat_interval: 2h
  receiver: webhook
receivers:
- name: 'webhook'
  webhook_configs:
  - url: 'http://prometheus-webhook-dingtalk:8060/dingtalk/webhook1/send'
    send_resolved: true

  

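Note: the url in the configuration above is the address of the Service defined in dingtalk-webhook.yaml. Its fully qualified form is 'http://prometheus-webhook-dingtalk.monitoring.svc.cluster.local:8060/dingtalk/webhook1/send'. The short name is enough here because Alertmanager runs in the same monitoring namespace.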
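
The receiver URL follows a fixed pattern: the Service DNS name, the port, then /dingtalk/<profile>/send, where <profile> is the name given to --ding.profile in the Deployment. The helper below is a hypothetical illustration of how the pieces fit together, using the names from this walkthrough and assuming the default cluster domain cluster.local.

```shell
#!/bin/bash
# Hypothetical helper: compose the receiver URL from its parts.
dingtalk_receiver_url() {
  local service="$1" namespace="$2" port="$3" profile="$4"
  printf 'http://%s.%s.svc.cluster.local:%s/dingtalk/%s/send\n' \
    "$service" "$namespace" "$port" "$profile"
}

dingtalk_receiver_url prometheus-webhook-dingtalk monitoring 8060 webhook1
# prints http://prometheus-webhook-dingtalk.monitoring.svc.cluster.local:8060/dingtalk/webhook1/send
```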
5. Replace the existing secret

cd /k8s-cmp/yaml/prometheus_Operator/kube-prometheus/manifests
kubectl delete  secret alertmanager-main -n monitoring
kubectl create  secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
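
To confirm that the recreated secret holds the intended configuration, it can be read back and decoded. This is a sketch under the assumption that the secret key is alertmanager.yaml, as produced by the --from-file command; `show_alertmanager_config` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: print the Alertmanager configuration stored in the secret.
show_alertmanager_config() {
  # The key contains a dot, so it is escaped inside the jsonpath expression.
  kubectl get secret alertmanager-main -n monitoring \
    -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d
}

# Example:
# show_alertmanager_config
```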
6. Test the DingTalk alert channel with a script
#! /bin/bash

alert_payload='[
  {
    "labels": {
      "alertname": "Alert channel test",
      "dev": "sda1",
      "instance": "127.0.0.1",
      "severity": "critical"
    },
    "annotations": {
      "info": "Alert channel test",
      "summary": "Alert channel test"
    }
  }
]'

# Read the pod IP from the pod status instead of parsing "kubectl describe" output.
alertmanager_0_ip=$(kubectl get pod alertmanager-main-0 -n monitoring -o jsonpath='{.status.podIP}')

curl -XPOST -H "Content-Type: application/json" -d"${alert_payload}" http://${alertmanager_0_ip}:9093/api/v1/alerts
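
The script above only fires an alert. To exercise the recovery notification as well (send_resolved: true in the receiver), the same alert can be posted again with an endsAt timestamp in the past, which Alertmanager should treat as resolved. The sketch below is an assumption layered on the original script: `build_resolved_payload` is a hypothetical helper, and the labels must match those of the firing alert exactly, so the alertname is passed in as a parameter.

```shell
#!/bin/bash
# Hypothetical helper: same labels as the firing test alert, plus an endsAt time.
build_resolved_payload() {
  local alertname="$1"   # must equal the alertname used when firing
  local ends_at="$2"     # RFC 3339 timestamp, e.g. 2022-03-29T03:30:00Z
  cat <<EOF
[
  {
    "labels": {
      "alertname": "${alertname}",
      "dev": "sda1",
      "instance": "127.0.0.1",
      "severity": "critical"
    },
    "endsAt": "${ends_at}"
  }
]
EOF
}

# Example: mark the test alert as resolved one minute ago ("date -d" is GNU date syntax).
# ends_at="$(date -u -d '1 minute ago' +%Y-%m-%dT%H:%M:%SZ)"
# curl -XPOST -H "Content-Type: application/json" \
#   -d "$(build_resolved_payload '<alertname used when firing>' "$ends_at")" \
#   "http://${alertmanager_0_ip}:9093/api/v1/alerts"
```

The recovery message is not immediate: with the routing settings shown in alertmanager.yaml, it would be expected to go out on the next group_interval.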

  

 

Alert recovery message

 

posted @ 2022-03-29 11:30  Oops!#