Configuring DingTalk Alerts with Prometheus Operator
Configuring DingTalk alerts
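1. Register a DingTalk account -> Robot Management -> Custom (connect a custom service via webhook) -> Add -> copy the webhook
Once the group robot is configured, you get the Webhook address for that robot. Record it, because the DingTalk alert plugin configured later needs it. The format is as follows: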
https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxx
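Before deploying anything in the cluster, it can help to confirm that the webhook itself accepts messages. The following is a minimal sketch, assuming the robot accepts the standard DingTalk "text" message type; the function names build_text_payload and send_dingtalk_text are hypothetical helpers introduced here for illustration.

```shell
#!/bin/bash
# Build the JSON body for a DingTalk "text" message.
# The content is inserted verbatim, so in this simple sketch it must not
# contain double quotes or backslashes.
build_text_payload() {
  local content="$1"
  printf '{"msgtype": "text", "text": {"content": "%s"}}' "$content"
}

# Post a text message to the robot webhook.
# $1 = full webhook URL (including access_token), $2 = message content
send_dingtalk_text() {
  local webhook_url="$1" content="$2"
  curl -sS -X POST -H "Content-Type: application/json" \
    -d "$(build_text_payload "$content")" "$webhook_url"
}

# Example call (replace xxxxxxxx with your own token):
# send_dingtalk_text "https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxx" "webhook test"
```

If the robot was created with a keyword security setting, the message content most likely has to contain that keyword for the message to be delivered.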
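2. Create the DingTalk alert plugin (dingtalk-webhook.yaml), and change access_token=xxxxxx in the file to the robot token you obtained in the previous step
Create the manifest in the installation package directory.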
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus-webhook-dingtalk
  name: prometheus-webhook-dingtalk
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-webhook-dingtalk
  template:
    metadata:
      labels:
        app: prometheus-webhook-dingtalk
    spec:
      containers:
      - name: prometheus-webhook-dingtalk
        image: timonwong/prometheus-webhook-dingtalk:v0.3.0
        imagePullPolicy: IfNotPresent
        args:
        - --ding.profile=webhook1=https://oapi.dingtalk.com/robot/send?access_token=b6125c3c8d1d47a
        - --template.file=/usr/share/prometheus-webhook-dingtalk/template/webhook-dingtalk.tmpl
        volumeMounts:
        - mountPath: /usr/share/prometheus-webhook-dingtalk/template/
          name: webhook-dingtalk-template
        ports:
        - containerPort: 8060
          protocol: TCP
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 1000Mi
      volumes:
      - name: webhook-dingtalk-template
        configMap:
          name: webhook-dingtalk-template
          defaultMode: 420
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-webhook-dingtalk
  name: prometheus-webhook-dingtalk
  namespace: monitoring
spec:
  ports:
  - port: 8060
    protocol: TCP
    targetPort: 8060
  selector:
    app: prometheus-webhook-dingtalk
  sessionAffinity: None
Configure the DingTalk alert template
$ cat webhook-dingtalk.tmpl
{{ define "__subject" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }} {{ if gt (len .CommonLabels) (len .GroupLabels) }}({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }}
{{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver }}{{ end }}

{{ define "__text_alert_list" }}{{ range . }}
**Alert content**
{{ range .Labels.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}
**Alert information**
{{ range .Annotations.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}
---
{{ end }}{{ end }}

{{ define "___text_alertresovle_list" }}{{ range . }}
---
**Alert resolved**
{{ range .Labels.SortedPairs }}> - {{ .Name }}: {{ .Value | markdown | html }}
{{ end }}
---
{{ end }}{{ end }}

{{ define "ding.link.title" }}{{ template "__subject" . }}{{ end }}
{{ define "ding.link.content" }}
### k8s cluster test environment monitoring alert
---
{{ index .GroupLabels "alertname" }}
{{ if gt (len .Alerts.Firing) 0 -}}
{{ template "__text_alert_list" .Alerts.Firing }}
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 -}}
{{ template "___text_alertresovle_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
Generate a ConfigMap from the alert template:
$ kubectl create configmap webhook-dingtalk-template --from-file=webhook-dingtalk.tmpl -n monitoring
3. Apply dingtalk-webhook.yaml
$ kubectl apply -f dingtalk-webhook.yaml
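After applying the manifest, it is worth checking that the Deployment actually became ready and that the Service exists. A minimal sketch, assuming the names and namespace from dingtalk-webhook.yaml; the function name check_dingtalk_webhook and the 120-second timeout are illustrative choices, not part of the original setup.

```shell
#!/bin/bash
# Verify that the DingTalk webhook Deployment and Service are up.
check_dingtalk_webhook() {
  local ns="monitoring" app="prometheus-webhook-dingtalk"
  # Wait until the Deployment has finished rolling out (or time out).
  kubectl rollout status "deployment/${app}" -n "$ns" --timeout=120s
  # List the pods and the Service selected by the app label.
  kubectl get pods,svc -n "$ns" -l "app=${app}"
  # Show recent log lines, useful for spotting template or token errors.
  kubectl logs "deployment/${app}" -n "$ns" --tail=20
}

# Example call:
# check_dingtalk_webhook
```

If the pod stays in ContainerCreating, one likely cause is that the webhook-dingtalk-template ConfigMap was not created before the Deployment was applied.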
4. Add the alert receiver
Create the alert receiver configuration (alertmanager.yaml) in the installation package directory.
$ cat alertmanager.yaml
global:
  resolve_timeout: 5m
route:
  group_by: ['job']
  group_wait: 0s
  group_interval: 5m
  repeat_interval: 2h
  receiver: webhook
receivers:
- name: 'webhook'
  webhook_configs:
  - url: 'http://prometheus-webhook-dingtalk:8060/dingtalk/webhook1/send'
    send_resolved: true
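Note: the url in the configuration above, 'http://prometheus-webhook-dingtalk:8060/dingtalk/webhook1/send', is the address of the Service defined in dingtalk-webhook.yaml. The short name resolves because Alertmanager runs in the same monitoring namespace; the fully qualified form is 'http://prometheus-webhook-dingtalk.monitoring.svc.cluster.local:8060/dingtalk/webhook1/send'. The webhook1 path segment matches the profile name given in --ding.profile.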
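The receiver URL is composed of the Service name, its namespace, the port, and the profile name passed to --ding.profile (webhook1 in dingtalk-webhook.yaml, where the Service is named prometheus-webhook-dingtalk). The sketch below shows that composition, assuming the default cluster.local cluster domain; dingtalk_receiver_url is a hypothetical helper name.

```shell
#!/bin/bash
# Compose the fully qualified in-cluster URL of a DingTalk webhook profile.
# $1 = Service name, $2 = namespace, $3 = port, $4 = profile name
dingtalk_receiver_url() {
  local service="$1" namespace="$2" port="$3" profile="$4"
  printf 'http://%s.%s.svc.cluster.local:%s/dingtalk/%s/send\n' \
    "$service" "$namespace" "$port" "$profile"
}

# Values taken from dingtalk-webhook.yaml:
dingtalk_receiver_url prometheus-webhook-dingtalk monitoring 8060 webhook1
# prints http://prometheus-webhook-dingtalk.monitoring.svc.cluster.local:8060/dingtalk/webhook1/send
```

If a second profile were added to the Deployment arguments, its receiver URL would differ only in the profile segment.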
5. Replace the existing secret
cd /k8s-cmp/yaml/prometheus_Operator/kube-prometheus/manifests
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
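Deleting and recreating the secret leaves a short window in which it does not exist. As an alternative sketch, the secret can be rendered client-side and applied in place, then read back to confirm what Alertmanager will load. This assumes a kubectl version that supports --dry-run=client; the function names are hypothetical.

```shell
#!/bin/bash
# Replace the alertmanager-main secret in place instead of delete + create.
# $1 = path to the Alertmanager config file (defaults to alertmanager.yaml)
replace_alertmanager_secret() {
  local file="${1:-alertmanager.yaml}"
  # Render the secret locally, then apply it over the existing one.
  kubectl create secret generic alertmanager-main \
    --from-file=alertmanager.yaml="$file" -n monitoring \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Print the configuration currently stored in the secret.
show_alertmanager_config() {
  kubectl get secret alertmanager-main -n monitoring \
    -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d
}

# Example calls:
# replace_alertmanager_secret alertmanager.yaml
# show_alertmanager_config
```

The output of show_alertmanager_config should contain the webhook receiver defined in step 4.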
6. Test the DingTalk alert channel with a script
#!/bin/bash
# Test alert posted directly to Alertmanager.
alert_payload='[
  {
    "labels": {
      "alertname": "Alert channel test",
      "dev": "sda1",
      "instance": "127.0.0.1",
      "severity": "critical"
    },
    "annotations": {
      "info": "Alert channel test",
      "summary": "Alert channel test"
    }
  }
]'
# Look up the pod IP of the first Alertmanager replica.
alertmanager_0_ip=$(kubectl get pod alertmanager-main-0 -n monitoring -o jsonpath='{.status.podIP}')
# Alertmanager 0.27 and later removed API v1; use /api/v2/alerts there.
curl -XPOST -H "Content-Type: application/json" -d "${alert_payload}" "http://${alertmanager_0_ip}:9093/api/v1/alerts"
Alert recovery message
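Because the receiver sets send_resolved: true, a recovery message should follow once the test alert is resolved. Rather than waiting for resolve_timeout, one way to trigger it is to post the same alert again with an endsAt timestamp that is not in the future. The sketch below assumes the labels are identical to those of the test alert sent earlier (pass the same alertname); the function names and the api_version parameter are illustrative.

```shell
#!/bin/bash
# Build a payload that marks the test alert as resolved.
# $1 = alertname used when the alert was fired, $2 = endsAt (RFC 3339, UTC)
build_resolved_payload() {
  local alertname="$1" ends_at="$2"
  cat <<EOF
[
  {
    "labels": {
      "alertname": "${alertname}",
      "dev": "sda1",
      "instance": "127.0.0.1",
      "severity": "critical"
    },
    "endsAt": "${ends_at}"
  }
]
EOF
}

# Post the resolved alert to Alertmanager.
# $1 = Alertmanager pod IP, $2 = alertname, $3 = API version (v1 or v2)
send_resolved_alert() {
  local alertmanager_ip="$1" alertname="$2" api_version="${3:-v1}"
  local now
  now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  curl -sS -XPOST -H "Content-Type: application/json" \
    -d "$(build_resolved_payload "$alertname" "$now")" \
    "http://${alertmanager_ip}:9093/api/${api_version}/alerts"
}

# Example call (use the alertname from the test script in step 6):
# send_resolved_alert "$alertmanager_0_ip" "<alertname>" v1
```

The recovery notification is sent on the next group evaluation, so with group_interval: 5m it may take a few minutes to appear in DingTalk.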