Prometheus Alertmanager webhook
I. Custom email alerts
In the Alertmanager configuration file, specify where the custom alert template files live.
# vim alertmanager.yaml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: 'x34989xxx@qq.com'
  smtp_auth_username: 'x34989xx@qq.com'
  smtp_auth_password: 'lxxxxxxxxtubdfd'
  smtp_require_tls: false
  smtp_hello: 'qq.com'
templates:
  - '/etc/alertmanager/template/*.tmpl'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: '20xxxxxx@qq.com'
        html: '{{ template "email.to.html" . }}'
        headers: { Subject: "Alert" }
        send_resolved: true
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
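Write the custom template file
{{ define "email.to.html" }}
{{ if gt (len .Alerts.Firing) 0 }}
{{ range .Alerts.Firing }}
======== Alert firing ========== <br>
Alert program: Alertmanager <br>
Alert host: {{ .Annotations.summary }} <br>
Alert type: {{ .Annotations.alarmPolicyType }} <br>
Severity: {{ .Labels.severity }} <br>
Status: {{ .Status }} <br>
Details: {{ .Annotations.description }} <br>
Start time: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
==========end============= <br>
{{ end }}
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
{{ range .Alerts.Resolved }}
======== <span style=color:#00FF00;font-size:11px;font-weight:bold;> Alert resolved </span>==========<br>
Alert program: Alertmanager <br>
Alert host: {{ .Annotations.summary }} <br>
Alert type: {{ .Annotations.alarmPolicyType }} <br>
Severity: {{ .Labels.severity }} <br>
Status: {{ .Status }} <br>
Details: {{ .Annotations.description }} <br>
Start time: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
End time: {{ (.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
===========end============ <br>
{{ end }}
{{ end }}
{{ end }}
Note: the labels and annotations referenced in the template must match the built-in labels and the custom labels and annotations defined in your own alert rules. Each section ranges over its own subset (.Alerts.Firing or .Alerts.Resolved) so that a firing alert is not printed under the resolved heading, and the time is formatted with Go's reference layout "2006-01-02 15:04:05".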
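The template shifts StartsAt and EndsAt by 28800e9 before printing them. That value is a duration in nanoseconds, which works out to 8 hours, so a UTC timestamp is displayed as UTC+8. The following is a minimal Python sketch of the same arithmetic, not part of Alertmanager itself; the function name to_utc8 and the sample timestamp are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# The value passed to .Add in the template, in nanoseconds.
OFFSET_NANOSECONDS = 28800e9


def to_utc8(timestamp: str) -> str:
    """Shift a UTC timestamp such as 2022-08-11T09:30:01Z by 8 hours and format it."""
    utc_time = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    # Convert nanoseconds to seconds before building the timedelta.
    shifted = utc_time + timedelta(seconds=OFFSET_NANOSECONDS / 1e9)
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


print(to_utc8("2022-08-11T09:30:01Z"))  # prints 2022-08-11 17:30:01
```

If the mail recipients are in another time zone, the offset in the template would need to change accordingly.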
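II. Deploying WeChat Work robot alerts with Docker
Note: the custom alerts here are forwarded through a webhook.
1. Build the image
1) Prepare the webhook server code
Note: the labels and annotations read in the code must match the ones in your own configuration.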
~]# vim app.py
# -*- coding: utf-8 -*-
import os
import json

import requests
import arrow
from flask import Flask
from flask import request

app = Flask(__name__)

TIME_FORMAT = 'YYYY-MM-DD HH:mm:ss ZZ'


def bytes2json(data_bytes):
    # Alertmanager posts valid JSON, so decode and parse it directly.
    # Replacing ' with " would corrupt any text that contains an apostrophe.
    return json.loads(data_bytes.decode('utf8'))


def local_time(timestamp):
    # Convert an Alertmanager timestamp to Asia/Shanghai time.
    return arrow.get(timestamp).to('Asia/Shanghai').format(TIME_FORMAT)


def makealertdata(data):
    # Build one markdown message per alert, so that no alert in a group is lost.
    messages = []
    for output in data['alerts']:
        labels = output.get('labels', {})
        annotations = output.get('annotations', {})
        # Looked up with fallbacks but not shown in the message below;
        # add them to the content if you need them.
        pod_name = labels.get('pod', labels.get('pod_name', 'null'))
        namespace = labels.get('namespace', 'null')
        message = annotations.get('message', annotations.get('description', 'null'))
        policy = annotations.get('alarmPolicyType', 'null')

        if output['status'] == 'firing':
            status_zh = '<font color="warning">Alert</font>'
            title = '[%s] alert %s has a new alarm' % (status_zh, policy)
        elif output['status'] == 'resolved':
            status_zh = '<font color="info">Resolved</font>'
            title = '[%s] environment %s has a resolved alarm' % (status_zh, policy)
        else:
            continue

        content = (
            "## %s \n\n" % title +
            ">**Severity**: %s \n\n" % labels.get('severity', 'null') +
            ">**Alert type**: %s \n\n" % annotations.get('metricDisplayName', 'null') +
            ">**Alert host**: %s \n\n" % annotations.get('summary', 'null') +
            ">**Alert owner**: %s \n\n" % "<@v_keo>" +
            ">**Details**: %s \n\n" % message +
            ">**Status**: %s \n\n" % output['status'] +
            ">**Start time**: %s \n\n" % local_time(output['startsAt'])
        )
        if output['status'] == 'resolved':
            content += ">**End time**: %s \n" % local_time(output['endsAt'])
        messages.append({"msgtype": "markdown", "markdown": {"content": content}})
    return messages


def send_alert(data):
    token = os.getenv('ROBOT_TOKEN')
    if not token:
        print('you must set ROBOT_TOKEN env')
        return
    url = 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=%s' % token
    for send_data in makealertdata(data):
        req = requests.post(url, json=send_data)
        result = req.json()
        if result['errcode'] != 0:
            print('notify wechat error: %s' % result['errcode'])


@app.route('/', methods=['POST', 'GET'])
def send():
    if request.method == 'POST':
        post_data = request.get_data()
        send_alert(bytes2json(post_data))
        return 'success'
    else:
        return 'welcome to use prometheus alertmanager wechat webhook server!'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
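Before wiring Alertmanager to the webhook, it can help to post a hand-made payload to the server and watch what arrives in the chat. The sketch below builds a minimal payload containing only the fields that app.py reads. The label and annotation values, the helper names build_sample_payload and post_sample, and the URL http://127.0.0.1:5000/ are assumptions for illustration; a real Alertmanager payload carries more fields.

```python
import json
import urllib.request


def build_sample_payload(status="firing"):
    """Return a minimal Alertmanager-style payload with the fields app.py reads."""
    return {
        "status": status,
        "alerts": [
            {
                "status": status,
                "labels": {
                    "alertname": "HighCPU",  # hypothetical alert name
                    "severity": "warning",
                    "instance": "10.0.0.1:9100",
                },
                "annotations": {
                    "alarmPolicyType": "cpu",
                    "metricDisplayName": "CPU usage",
                    "summary": "10.0.0.1",
                    "description": "CPU usage above 90%",
                },
                "startsAt": "2022-08-11T09:30:01Z",
                "endsAt": "2022-08-11T09:35:01Z",
            }
        ],
    }


def post_sample(url="http://127.0.0.1:5000/", status="firing"):
    """POST the sample payload to the webhook server and return its reply."""
    body = json.dumps(build_sample_payload(status)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

With the server running and ROBOT_TOKEN set, calling post_sample() should return the string success from the POST handler, and post_sample(status="resolved") exercises the recovery branch.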
2) Prepare the Python dependency file
~]# vim requirements.txt
certifi==2018.10.15
chardet==3.0.4
Click==7.0
Flask==1.0.2
idna==2.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
requests==2.20.1
urllib3==1.24.1
Werkzeug==0.14.1
arrow==0.13.1
3) Write the Dockerfile
~]# vim Dockerfile
FROM python:3.6.4
# set working directory
WORKDIR /src
# add app
ADD . /src
# install requirements
# (selectivesearch is not imported by app.py)
RUN pip install selectivesearch -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
RUN pip install -r requirements.txt
# app.py listens on port 5000
EXPOSE 5000
# run server
CMD python app.py
4) Build the image
docker build -f /data/prometheus/Dockerfile2/webkook/Dockerfile -t webhook:v1 . --network=host
# -f: path to the Dockerfile
2. Start the container and point Alertmanager at the webhook
1) Start the container
docker run --name=webhook --net=host \
  -v /etc/localtime:/etc/localtime \
  -v /data/prometheus/Dockerfile2/webkook/app.py:/src/app.py \
  -d -e ROBOT_TOKEN=xxxxx4-69ac-458c-af07-a69xxebsxxd \
  webhook:v1
# -v /etc/localtime: keeps the container time consistent with the host
# -e ROBOT_TOKEN=: the robot key
2) Point the alert configuration at the webhook
]# cat alertmanager.yaml-wx
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://<IP of the host running the container>:5000'
        send_resolved: true
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
Note: if alerts cannot be forwarded, remember to publish the container port (for example, -p 5000:5000 when the container is not started with --net=host).
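When forwarding fails, a quick first check is whether the webhook port accepts TCP connections from the machine running Alertmanager. The following is a small sketch using only the Python standard library; the function name webhook_reachable and its defaults are assumptions, and a successful connection only shows that the port is open, not that the robot key is valid.

```python
import socket


def webhook_reachable(host, port=5000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

Run it from the Alertmanager host with the same IP that appears in the webhook url; a False result points at port publishing or a firewall rather than at the alert configuration.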