Integrating Alertmanager with a custom component
Alertmanager configuration

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: 'cfgitlab_admin@163.com'
  smtp_auth_username: 'cfgitlab_admin@163.com'
  smtp_auth_password: '11111111'
  smtp_require_tls: false
templates:
  - '/root/prom/alertmanager-0.26.0.linux-amd64/email.tmpl'
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'dingding.webhook1'
  routes:
    - receiver: 'dingding.webhook1'
      continue: true
    - receiver: 'email.webhook1'
receivers:
  - name: 'email'
    email_configs:
      - to: '66666666@qq.com'
        html: '{{ template "email.to.html" . }}'
        send_resolved: true
  - name: 'dingding.webhook1'
    webhook_configs:
      - url: 'http://10.30.92.71:8060/dingtalk/webhook1/send'
        send_resolved: true
  - name: 'email.webhook1'
    webhook_configs:
      # URL of the custom web service API; Alertmanager automatically POSTs the alert data to it
      - url: 'http://10.30.92.71:5000'
        send_resolved: true

Developing the custom component

# -*- coding: utf-8 -*-
import json
import smtplib
import socket
from email.mime.text import MIMEText

import arrow
import socks
from flask import Flask, request

app = Flask(__name__)

# Route all outbound sockets (including the SMTP connection) through a SOCKS5 proxy.
socks.set_default_proxy(socks.SOCKS5, "10.30.90.14", 1081)
socket.socket = socks.socksocket

smtp_server = "smtp.163.com"
smtp_port = 25
sender_email = "cfgitlab_admin@163.com"
sender_password = "222222"

TIME_FORMAT = 'YYYY-MM-DD HH:mm:ss ZZ'


def local_time(timestamp):
    """Convert an Alertmanager timestamp to Asia/Shanghai time."""
    return arrow.get(timestamp).to('Asia/Shanghai').format(TIME_FORMAT)


def makealertdata(data):
    """Render every alert in the webhook payload as one text message."""
    data = json.loads(data)
    messages = []
    for output in data['alerts']:
        labels = output['labels']
        annotations = output.get('annotations', {})
        # Prefer the "message" annotation, then "description", then 'null'.
        message = annotations.get('message', annotations.get('description', 'null'))

        if output['status'] == 'firing':
            status_zh = 'ALERT'
        elif output['status'] == 'resolved':
            status_zh = 'RESOLVED'
        else:
            continue  # unknown status: nothing to report

        lines = [
            '[%s] %s has a new alert' % (status_zh, labels['alertname']),
            '',
            '**Severity**: %s' % labels.get('severity', 'null'),
            '**Alert name**: %s' % labels['alertname'],
            '**Instance**: %s' % labels.get('instance', 'null'),
            '**Details**: %s' % message,
            '**Status**: %s' % output['status'],
            '**Started at**: %s' % local_time(output['startsAt']),
        ]
        if output['status'] == 'resolved':
            lines.append('**Ended at**: %s' % local_time(output['endsAt']))
        messages.append('\n'.join(lines))
    # Join all alerts so that none is dropped when a group contains several.
    return '\n\n'.join(messages)


def send_alert(data):
    msg = MIMEText(makealertdata(data), _charset='utf-8')
    msg['Subject'] = 'Environment monitoring alert'
    msg['From'] = sender_email
    msg['To'] = '1272724@qq.com'
    # The context manager closes the SMTP session even if sending fails.
    with smtplib.SMTP(smtp_server, smtp_port) as server:
        server.login(sender_email, sender_password)
        server.sendmail(sender_email, msg['To'], msg.as_string())
    print("Email sent successfully!")


@app.route('/', methods=['POST', 'GET'])
def send():
    if request.method == 'POST':
        post_data = request.get_data()
        send_alert(post_data)
        return 'success'
    else:
        return 'welcome to use prometheus alertmanager email webhook server!'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
Interface integration testing
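One way to exercise the webhook without waiting for a real alert is to POST a hand-built payload to it. The following is a minimal sketch, assuming the custom service above is listening at http://10.30.92.71:5000. The alert name, instance, and timestamps are made-up sample values, and the helper names `build_sample_payload` and `post_payload` are hypothetical. The payload only includes the fields the service reads, following the general shape of an Alertmanager webhook notification.

```python
import json
import urllib.request


def build_sample_payload(status="firing"):
    """Build a minimal payload shaped like an Alertmanager webhook notification."""
    resolved = status == "resolved"
    alert = {
        "status": status,
        "labels": {
            "alertname": "HostDown",      # hypothetical alert name
            "severity": "critical",
            "instance": "10.0.0.1:9100",  # hypothetical instance
        },
        "annotations": {"description": "sample alert for integration testing"},
        "startsAt": "2024-02-04T06:00:00Z",
        # Firing alerts carry a zero-valued end time in this sample.
        "endsAt": "2024-02-04T06:05:00Z" if resolved else "0001-01-01T00:00:00Z",
    }
    return {
        "version": "4",
        "status": status,
        "receiver": "email.webhook1",
        "alerts": [alert],
    }


def post_payload(url, payload):
    """POST the payload as JSON; return the HTTP status code and response body."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status, resp.read().decode("utf-8")


# Example call (this sends a real email through the service):
# post_payload("http://10.30.92.71:5000", build_sample_payload("firing"))
```

Sending one "firing" and one "resolved" payload covers both branches of the message formatting.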
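Alert statistics and analysis
Tech stack: Prometheus + Alertmanager + AlertSnitch + Grafana
AlertSnitch implements a webhook. Configure an Alertmanager receiver that points at the AlertSnitch webhook, and Alertmanager sends its alerts to AlertSnitch, which then processes the data.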
https://gitlab.com/yakshaving.art/alertsnitch
Installing and deploying AlertSnitch

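# Fetch the source. Note: this URL is the repository's web tree page, not an archive;
# the source can also be obtained by cloning the repository with git.
wget 'https://gitlab.com/yakshaving.art/alertsnitch/-/tree/master?ref_type=heads'

# Resolve Go dependencies through a module proxy
export GO111MODULE=on
export GOPROXY=https://goproxy.io
go mod tidy

# Tell AlertSnitch to use the MySQL backend
export ALERTSNITCH_BACKEND="mysql"
export ALERTSNITCH_DSN="taishi:Tdb123@2022@tcp(10.32.3.13:6306)/alertsnitch"

# Create the database and load the schema scripts
/data/app/taishi/mysql/bin/mysql -h10.32.3.13 -uroot -P6306 -p123456
create database alertsnitch;
use alertsnitch;
source /home/admin/0.0.1-bootstrap.sql
source /home/admin/0.1.0-fingerprint.sql

[root@soc alertsnitch-master]# go run main.go
2024/02/04 14:09:55 Starting listener on :9567 using mysql database driver

Configuring Alertmanager to notify AlertSnitch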

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'whole'
  routes:
    - receiver: 'whole'
      continue: true
    - receiver: 'dingding.webhook1'
      continue: true
    - receiver: 'email.webhook1'
receivers:
  - name: 'email'
    email_configs:
      - to: '1272@qq.com,4239@qq.com,28443@qq.com,276@qq.com,1567@qq.com,10765@qq.com'
        html: '{{ template "email.to.html" . }}'
        send_resolved: true
  - name: 'dingding.webhook1'
    webhook_configs:
      - url: 'http://10.30.92.71:8060/dingtalk/webhook1/send'
        send_resolved: true
  - name: 'whole'
    webhook_configs:
      - url: 'http://10.30.92.71:9567/webhook'
        send_resolved: true
  - name: 'email.webhook1'
    webhook_configs:
      - url: 'http://10.30.92.71:5000'
        send_resolved: true

Integrating AlertSnitch with Grafana
1. https://grafana.com/grafana/dashboards/15833-prometheus-alert-history/
2. Import the dashboard
When importing the dashboard, select the newly created MySQL data source, not the default Prometheus data source.
Configuring Alertmanager routing tree dispatch rules

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  # Default route: alerts that match no child route go to this receiver
  receiver: 'all'
  routes:
    - receiver: es_team
      match_re:
        job: elasticsearch_exporter
      continue: true
    - receiver: db_team
      match_re:
        job: mysql_exporter
      continue: true
    - receiver: all
      match_re:
        job: .*
      continue: true
receivers:
  - name: 'all'
    webhook_configs:
      - send_resolved: true
        http_config: {}
        url: http://127.0.0.1:5000/alert
        max_alerts: 0
  - name: "es_team"
    webhook_configs:
      - send_resolved: true
        http_config: {}
        url: http://127.0.0.1:5005/alert
        max_alerts: 0
  - name: "db_team"
    webhook_configs:
      - send_resolved: true
        http_config: {}
        url: http://127.0.0.1:5008/alert
        max_alerts: 0
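To see which receivers an alert reaches under this routing tree, the matching can be simulated in a few lines. This is a simplified sketch, not Alertmanager's implementation: it handles only one level of child routes and only `match_re` matchers, and the function names `matches` and `resolve_receivers` are hypothetical. It assumes the documented behaviour that regex matchers are fully anchored, that a missing label counts as an empty string, and that `continue: true` lets evaluation proceed to the next sibling route.

```python
import re


def matches(route, labels):
    """Return True if every match_re pattern fully matches the alert's label value."""
    for name, pattern in route.get("match_re", {}).items():
        # A missing label is treated as an empty string; patterns are anchored.
        if re.fullmatch(pattern, labels.get(name, "")) is None:
            return False
    return True


def resolve_receivers(root, labels):
    """Check child routes in order and collect the receivers that get the alert."""
    receivers = []
    for child in root.get("routes", []):
        if matches(child, labels):
            receivers.append(child["receiver"])
            if not child.get("continue", False):
                break  # without continue, the first match ends the search
    # If no child route matched, the alert stays with the root receiver.
    return receivers or [root["receiver"]]


# The routing tree from the configuration above.
ROOT = {
    "receiver": "all",
    "routes": [
        {"receiver": "es_team", "match_re": {"job": "elasticsearch_exporter"}, "continue": True},
        {"receiver": "db_team", "match_re": {"job": "mysql_exporter"}, "continue": True},
        {"receiver": "all", "match_re": {"job": ".*"}, "continue": True},
    ],
}

print(resolve_receivers(ROOT, {"job": "mysql_exporter"}))  # ['db_team', 'all']
print(resolve_receivers(ROOT, {"job": "node_exporter"}))   # ['all']
```

Because every child route sets `continue: true`, a MySQL alert is delivered both to `db_team` and to the catch-all `all` receiver.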
Configuring inhibition rules for alert notifications
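The source gives no concrete rule for this section, so the following only illustrates the general idea of inhibition: a target alert is muted while a matching source alert is firing and both carry the same values for a chosen set of labels. It is a simplified sketch with exact-match labels only. The rule (critical alerts mute warning alerts for the same alertname and instance) and the names `is_inhibited` and `RULE` are assumptions for illustration, not the author's configuration.

```python
def label_match(labels, matchers):
    """True if the alert carries every matcher label with the expected value."""
    return all(labels.get(name, "") == value for name, value in matchers.items())


def is_inhibited(target, firing_alerts, rule):
    """Decide whether `target` is muted by any firing alert under `rule`."""
    if not label_match(target, rule["target_match"]):
        return False
    for source in firing_alerts:
        if not label_match(source, rule["source_match"]):
            continue
        # Both alerts must agree on every label listed under "equal".
        if all(source.get(l, "") == target.get(l, "") for l in rule["equal"]):
            return True
    return False


# Hypothetical rule: a critical alert mutes the matching warning alert.
RULE = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "instance"],
}

critical = {"alertname": "DiskFull", "instance": "db1", "severity": "critical"}
warning_same = {"alertname": "DiskFull", "instance": "db1", "severity": "warning"}
warning_other = {"alertname": "DiskFull", "instance": "db2", "severity": "warning"}

print(is_inhibited(warning_same, [critical], RULE))   # True
print(is_inhibited(warning_other, [critical], RULE))  # False
```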
Configuring silence rules for alert notifications
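Silences are usually created in the Alertmanager web UI, but they can also be submitted as JSON to the Alertmanager v2 API (`POST /api/v2/silences`). The sketch below only builds such a request body. The helper name `build_silence`, the matcher values, the creator, and the comment text are all hypothetical, and the Alertmanager address is not given in the source, so the actual POST is left out.

```python
import json
from datetime import datetime, timedelta, timezone


def build_silence(matchers, hours, created_by, comment, now=None):
    """Build the JSON body for a silence lasting `hours` hours from `now`."""
    now = now or datetime.now(timezone.utc)
    return {
        "matchers": [
            # Exact-match (non-regex) matchers on label name and value.
            {"name": name, "value": value, "isRegex": False, "isEqual": True}
            for name, value in matchers.items()
        ],
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(hours=hours)).isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


# Hypothetical example: mute mysql_exporter alerts for two hours of maintenance.
silence = build_silence(
    {"job": "mysql_exporter"},
    hours=2,
    created_by="ops",
    comment="planned database maintenance",
    now=datetime(2024, 2, 4, 6, 0, tzinfo=timezone.utc),
)
print(json.dumps(silence, indent=2))
```

The resulting body can be sent with any HTTP client to the `/api/v2/silences` endpoint of the Alertmanager instance.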
This article is from cnblogs (博客园), author: 不懂123. When reposting, please cite the original link: https://www.cnblogs.com/yxh168/p/18003477