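Prometheus Data Flow
1. Background
The Prometheus data flow is usually:
exporter ==> prometheus ==> alertmanager ==> alert receiving channel (email / WeCom)
This article uses self-developed programs that call the official interfaces to feed data into, or read data out of, each stage: pushing custom data to Prometheus, receiving the alert data Prometheus sends out, pushing data to Alertmanager, and receiving the alert data Alertmanager sends out.
All programs in this article are written in Python.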
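2. exporter ==> prometheus
When the official exporters cannot meet actual needs, you can develop a custom exporter (for details, see the article "Prometheus自研采集器(python)", on writing a custom Prometheus exporter in Python).
(Skip this section if you do not need a custom exporter.)
The example exporter file is named test_exporter.py.
The example program is as follows: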
#!/usr/bin/python3
# coding:utf-8
import prometheus_client
import uvicorn
from fastapi import FastAPI
from fastapi.responses import PlainTextResponse
from prometheus_client import Gauge
from prometheus_client.core import CollectorRegistry

# Create the API object
app = FastAPI()


# Serve the metrics at the path /metrics
@app.get('/metrics', response_class=PlainTextResponse)
def get_data():
    '''
    Collection function. This example simulates collecting the labels
    label1, label2 and label3 and the value data1.
    '''
    # A fresh registry per request, so the Gauge is never registered twice
    REGISTRY = CollectorRegistry(auto_describe=False)
    # Gauge is one of the Prometheus metric types provided by client_python
    example_G = Gauge("this_is_a_metric_name", "this is a metric describe", ["label1", "label2", "label3"], registry=REGISTRY)
    label1 = '111'
    label2 = '222'
    label3 = '333'
    data1 = 444  # Gauge values are numeric
    # Store the collected value in example_G under the given label values
    example_G.labels(label1, label2, label3).set(data1)
    # generate_latest() returns bytes; decode them for the plain-text response
    return prometheus_client.generate_latest(REGISTRY).decode('utf-8')


# Run the API with uvicorn on port 9330
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9330, log_level="info")
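To make it easier to check what the /metrics endpoint should return, here is a minimal standard-library sketch of the Prometheus text exposition format for a single gauge sample. The helper names `escape_label_value` and `render_gauge` are hypothetical and are not part of prometheus_client; the sketch only approximates what `generate_latest()` produces for the example metric and does not escape the HELP text.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, labels: dict, value: float) -> str:
    """Render one gauge sample as HELP, TYPE and sample lines."""
    label_str = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} gauge",
        f"{name}{{{label_str}}} {float(value)}",
    ]
    return "\n".join(lines) + "\n"


# Same metric name, labels and value as the exporter example
text = render_gauge(
    "this_is_a_metric_name",
    "this is a metric describe",
    {"label1": "111", "label2": "222", "label3": "333"},
    444,
)
print(text)
```

Comparing this text with the output of `curl http://127.0.0.1:9330/metrics` is a quick way to confirm that the exporter is serving the expected sample.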
Edit the prometheus.yml configuration file to add a scrape job (Prometheus must be restarted afterwards):
scrape_configs:
  - job_name: "test_exporter"
    scrape_interval: 30s
    static_configs:
      - targets:
          - 127.0.0.1:9330
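3. prometheus ==>
- Receive alert information through the Prometheus alerts interface. Prometheus posts alerts to the path /api/v2/alerts of every configured Alertmanager target, so the program below exposes that same path.
  - In this example the alert information is received in prom_data, and the program listens on port 5005.
  - The file name is get_prom_data.py.
  - The program is as follows: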
#!/usr/bin/python3
# coding:utf-8
from fastapi import FastAPI, Request, status
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
import uvicorn

app = FastAPI()


# Log requests whose body fails validation, to help with debugging
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    print(exc.errors())
    # exc.body already holds the request body, so it is not read again
    print(exc.body)
    print(request.headers)
    return JSONResponse(status_code=status.HTTP_404_NOT_FOUND, content=jsonable_encoder(exc.body))


# Same path that Prometheus posts alerts to on an Alertmanager
@app.post("/api/v2/alerts")
async def read_unicorn(prom_data: list):
    # prom_data is the list of alerts sent by Prometheus
    print(prom_data)


if __name__ == '__main__':
    uvicorn.run('get_prom_data:app', host='0.0.0.0', port=5005, reload=True)
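Before pointing Prometheus at the receiver, it can help to post a hand-written payload to it. The following is a rough standard-library sketch; the helper `build_alert_request` and the contents of `sample_alerts` are hypothetical, and the block only builds the request without sending it.

```python
import json
import urllib.request

# Hypothetical test payload: a list of alerts, the shape the receiver expects
sample_alerts = [
    {
        "labels": {"alertname": "test_alert", "severity": "warning"},
        "annotations": {"summary": "manual test"},
        "startsAt": "2021-12-24T07:39:58.750Z",
    }
]


def build_alert_request(url: str, alerts: list) -> urllib.request.Request:
    """Build a JSON POST request carrying the given list of alerts."""
    body = json.dumps(alerts).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_alert_request("http://localhost:5005/api/v2/alerts", sample_alerts)
print(request.get_method(), request.full_url)
# To actually send it while get_prom_data.py is running:
#     urllib.request.urlopen(request, timeout=5)
```

If the receiver prints the list, the endpoint itself works and any remaining problem is on the Prometheus configuration side.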
- Edit the prometheus.yml configuration file of Prometheus so that alerts are sent to the receiving program:
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:5005
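- Start the program. Once a Prometheus alerting rule fires, the program receives the alert information.
  - The alert information is a list.
  - Alerts produced by the same alerting rule are aggregated and sent in a single message.
  - Each alert contains:
    - the annotations defined in the alerting rule (`annotations`)
    - the alert end time (`endsAt`)
    - the time the alert started firing (`startsAt`)
    - the alerting rule information (`generatorURL`)
    - the instance label information (`labels`)
- A sample alert: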
[
    {
        'annotations': {
            'alert_note': 'this is a note',
            'summary': 'dns解析失败,正确是ip为 0, 当前解析结果为 0'
        },
        'endsAt': '2021-12-26T06:52:43.750Z',
        'startsAt': '2021-12-24T07:39:58.750Z',
        'generatorURL': '/prometheus/graph?g0.expr=dns_extend_success+%3D%3D+0&g0.tab=1',
        'labels': {
            'IPADDR': '0',
            'alert_rule_id': 'dfejjer34',
            'alertname': 'dns解析失败',
            'index': 'dns_success',
            'instance_name': 'http://173.16.181.227/ips',
            'job': 'ping_status',
            'network_area': '专网',
            'platform': '信息平台',
            'room': '机房',
            'service': '获取二维码接口',
            'severity': 'warning',
            'system': '门户子系统',
            'team': '厂商',
            'url': 'http://173.16.181.227/ips'
        }
    },
]
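Once the list arrives, a receiver usually wants only a few fields from each alert. The sketch below is one possible way to pull them out; `parse_rfc3339` and `summarize` are hypothetical helper names, the sample is a trimmed copy of the alert above, and the timestamp parsing assumes the millisecond precision seen in that sample.

```python
from datetime import datetime


def parse_rfc3339(ts: str) -> datetime:
    """Parse a timestamp such as 2021-12-24T07:39:58.750Z."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def summarize(alerts: list) -> list:
    """Reduce each alert to the fields a notification would typically need."""
    rows = []
    for alert in alerts:
        labels = alert.get("labels", {})
        rows.append({
            "alertname": labels.get("alertname", ""),
            "severity": labels.get("severity", ""),
            "summary": alert.get("annotations", {}).get("summary", ""),
            "starts_at": parse_rfc3339(alert["startsAt"]),
        })
    return rows


# Trimmed copy of the sample alert
prom_data = [{
    "annotations": {"alert_note": "this is a note"},
    "endsAt": "2021-12-26T06:52:43.750Z",
    "startsAt": "2021-12-24T07:39:58.750Z",
    "labels": {"alertname": "dns解析失败", "job": "ping_status", "severity": "warning"},
}]

for row in summarize(prom_data):
    print(row["alertname"], row["severity"], row["starts_at"].isoformat())
```

The same function could be called from the `/api/v2/alerts` handler in place of the plain `print(prom_data)`.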
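4. ==> alertmanager
- Push alert information to Alertmanager
  - The pushed alert information is a list.
  - Each alert mainly consists of two parts, labels and annotations; custom labels and alert details can be defined in both.
  - Several alerts can be sent at once as JSON data.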
#!/usr/bin/python
# -*- coding: utf8 -*-
import requests


def send_alertmanager(url):
    # One dict per alert; add more dicts to the list to send several at once
    data = [
        {
            "labels": {
                "operation": "接口调用告警",
                "supplier": "创智",
                "interface": "创智-hgs-广州-人员基本信息获取",
                "time": "2021-10-01 09:26:12",
                "severity": "critical"
            },
            "annotations": {
                "transfer_amount": "(total:1000/fail:0)",
                "severity": "critical"
            }
        }
    ]
    # json= serializes the list and sets the Content-Type: application/json header
    r = requests.post(url, json=data)
    # Raise an exception if Alertmanager rejected the request
    r.raise_for_status()


if __name__ == '__main__':
    url = "http://localhost:9093/api/v2/alerts"
    send_alertmanager(url)
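The alerts accepted by the Alertmanager v2 API may also carry `startsAt` and `endsAt` timestamps in RFC 3339 format, which the example above leaves out. The following sketch shows one way to build such an entry with the standard library; the helpers `rfc3339` and `build_alert`, the label values, and the ten-minute lifetime are all assumptions made for illustration.

```python
import json
from datetime import datetime, timedelta, timezone


def rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC with millisecond precision."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def build_alert(labels: dict, annotations: dict,
                starts_at: datetime, lifetime: timedelta) -> dict:
    """Build one alert entry with explicit start and end times."""
    return {
        "labels": labels,
        "annotations": annotations,
        "startsAt": rfc3339(starts_at),
        "endsAt": rfc3339(starts_at + lifetime),
    }


# Hypothetical values for illustration
start = datetime(2021, 10, 1, 9, 26, 12, tzinfo=timezone.utc)
payload = [
    build_alert(
        {"alertname": "interface_call_alert", "severity": "critical"},
        {"transfer_amount": "(total:1000/fail:0)"},
        start,
        timedelta(minutes=10),
    )
]
print(json.dumps(payload, indent=2))
```

The resulting list can be posted the same way as `data` in `send_alertmanager`.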
- The pushed alert then appears in Alertmanager (the original screenshot is not reproduced here).
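5. alertmanager ==>
- Receive the alert information sent out by Alertmanager
  - In this example the alert information is received in data, and the program listens on port 5006.
  - The file name is get_alert.py.
  - The program is as follows: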
#!/usr/bin/python
# -*- coding: utf8 -*-
from fastapi import FastAPI, Request, status
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
from pydantic import BaseModel
import uvicorn

app = FastAPI(title='API documentation', docs_url=None, redoc_url=None)


# Handler that takes over RequestValidationError
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    print(exc.errors())
    # exc.body already holds the request body, so it is not read again
    print(exc.body)
    print(request.headers)
    return JSONResponse(status_code=status.HTTP_404_NOT_FOUND, content=jsonable_encoder(exc.body))


# Fields of the Alertmanager webhook payload used in this example
class AlertInfo(BaseModel):
    status: str
    groupLabels: dict
    alerts: list
    commonLabels: dict


@app.post('/get_alerts')
def result(info: AlertInfo):
    data = str(info)
    print(data)
    # Echo back the main fields of the received payload
    return {'status': info.status, 'groupLabels': info.groupLabels, 'alerts': info.alerts}


if __name__ == '__main__':
    uvicorn.run('get_alert:app', host='0.0.0.0', port=5006, reload=True)
- Edit the Alertmanager configuration:
# "routes" is nested under the top-level "route" block
routes:
  - receiver: "get_alerts"
    group_by: [service]
    continue: true
receivers:
  - name: "get_alerts"
    webhook_configs:
      - url: http://localhost:5006/get_alerts
        send_resolved: true
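The `group_by: [service]` setting means that alerts sharing the same value of the `service` label are delivered together in one notification. The sketch below is a simplified illustration of that idea only and is not how Alertmanager is implemented; the function `group_alerts` and the sample alerts are hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts: list, group_by: list) -> dict:
    """Group alerts by the values of the labels named in group_by."""
    groups = defaultdict(list)
    for alert in alerts:
        # Alerts missing a group_by label fall into the empty-string bucket
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts for illustration
alerts = [
    {"labels": {"alertname": "a", "service": "login"}},
    {"labels": {"alertname": "b", "service": "login"}},
    {"labels": {"alertname": "c", "service": "billing"}},
]

groups = group_alerts(alerts, ["service"])
for key, members in groups.items():
    print(key, [m["labels"]["alertname"] for m in members])
```

With this configuration the webhook at /get_alerts would be called once per group, with the group's label values in `groupLabels`.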
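- Start the program. Once the conditions for Alertmanager to send a notification are met, the program receives the alert information.
  - The alert information includes:
    - the alert status (firing/resolved)
    - the alert end time (`endsAt`)
    - the time the alert started firing (`startsAt`)
    - the alerting rule information (`generatorURL`)
    - the instance label information (`labels`)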
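Because `send_resolved: true` is set, a single webhook call can contain both firing and resolved alerts. A minimal sketch of separating them is shown below; `split_by_status` and the sample payload are hypothetical, and the payload is reduced to the fields used here.

```python
def split_by_status(payload: dict):
    """Return (firing, resolved) lists of alerts from one webhook payload."""
    firing, resolved = [], []
    for alert in payload.get("alerts", []):
        if alert.get("status") == "resolved":
            resolved.append(alert)
        else:
            firing.append(alert)
    return firing, resolved


# Hypothetical payload for illustration
payload = {
    "status": "firing",
    "groupLabels": {"service": "login"},
    "commonLabels": {"service": "login"},
    "alerts": [
        {"status": "firing", "labels": {"alertname": "a", "service": "login"}},
        {"status": "resolved", "labels": {"alertname": "b", "service": "login"}},
    ],
}

firing, resolved = split_by_status(payload)
print("firing:", [a["labels"]["alertname"] for a in firing])
print("resolved:", [a["labels"]["alertname"] for a in resolved])
```

In get_alert.py the same function could be applied to `info.alerts`, for example to send recovery notices separately from new alerts.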
This article is from cnblogs (博客园), author: liu_kx. When reposting, please cite the original link: https://www.cnblogs.com/liukx/p/15733238.html