Prometheus Data Flow

1. Background

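A typical Prometheus data flow looks like this:

exporter ==> prometheus ==> alertmanager ==> alert receiving channel (email / WeCom)

This article uses self-developed programs that call the official interfaces to feed data into, or read data out of, each stage: pushing custom data to Prometheus, receiving the alerts that Prometheus sends out, pushing alerts to alertmanager, and receiving the alerts that alertmanager sends out.
All self-developed programs in this article are written in Python.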

2. exporter ==> prometheus

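When the official exporters cannot meet your needs, you can develop a custom exporter (for details, see: Prometheus custom exporter (python)).
(Skip this section if you do not need a custom exporter.)

The exporter in this example is named test_exporter.py.
The program is as follows: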

#!/usr/bin/python3
# coding:utf-8
import prometheus_client
import uvicorn
from fastapi import FastAPI
from fastapi.responses import PlainTextResponse
from prometheus_client import Gauge
from prometheus_client.core import CollectorRegistry

# Create the API object
app = FastAPI()

# Serve the metrics at /metrics
@app.get('/metrics', response_class=PlainTextResponse)
def get_data():
    '''
    Collection function. This example simulates collecting the labels
    label1, label2 and label3 and the value data1.
    '''
    # Use a fresh registry per request so the Gauge is not registered twice
    registry = CollectorRegistry(auto_describe=False)
    # Gauge is one of the Prometheus metric types provided by client_python
    example_G = Gauge("this_is_a_metric_name", "this is a metric describe", ["label1", "label2", "label3"], registry=registry)

    label1 = '111'
    label2 = '222'
    label3 = '333'
    data1 = 444  # a Gauge value must be numeric
    # Store the collected value in example_G under the given label values
    example_G.labels(label1, label2, label3).set(data1)
    # generate_latest returns bytes; decode them for the plain-text response
    return prometheus_client.generate_latest(registry).decode('utf-8')

# Run the API with uvicorn on port 9330
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9330, log_level="info")
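
To make it clearer what the /metrics endpoint returns, here is a minimal standard-library sketch that renders one gauge sample in the Prometheus text exposition format. It only approximates what prometheus_client.generate_latest produces for the example above; the helper names escape_label_value and render_gauge are hypothetical and not part of any library.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, labels: dict, value: float) -> str:
    """Render one gauge sample as HELP, TYPE and sample lines."""
    label_str = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{{{label_str}}} {float(value)}\n"
    )


if __name__ == "__main__":
    # Same metric name, labels and value as in test_exporter.py
    print(render_gauge(
        "this_is_a_metric_name",
        "this is a metric describe",
        {"label1": "111", "label2": "222", "label3": "333"},
        444,
    ))
```

In a real exporter the client library should do this rendering; the sketch is only meant to show the shape of the text that Prometheus scrapes.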

Edit the prometheus.yml configuration file to add a scrape job (Prometheus must be restarted after the change):

scrape_configs:
  - job_name: "test_exporter"
    scrape_interval: 30s  
    static_configs:
    - targets:
      - 127.0.0.1:9330
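
Before adding the scrape job it can help to confirm that the exporter really exposes the expected metric. The following is a rough sketch using only the standard library; metric_names and exporter_exposes are hypothetical helper names, and the simple line parsing assumes the plain text format served by the exporter above.

```python
import urllib.request


def metric_names(exposition_text: str) -> set:
    """Collect metric names from text-format output, skipping comment lines."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" (labels) or at the first space (no labels)
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def exporter_exposes(url: str, metric: str, timeout: float = 5.0) -> bool:
    """Fetch the metrics page and report whether the metric is present."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        text = resp.read().decode("utf-8")
    return metric in metric_names(text)
```

With test_exporter.py running, exporter_exposes("http://127.0.0.1:9330/metrics", "this_is_a_metric_name") should return True.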

3. prometheus ==>

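  • Receive alert information through the alerts interface that Prometheus pushes to

    In this example the alert information is received in prom_data, and the program listens on port 5005.

    The file is named get_prom_data.py.

    The program is as follows: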

#!/usr/bin/python3
# coding:utf-8
from fastapi import FastAPI, Request
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
import uvicorn

app = FastAPI()

# Log requests that fail validation and return the details to the caller
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    print(exc.errors())
    print(exc.body)  # the request body that failed validation
    print(request.headers)
    return JSONResponse(
        status_code=422,
        content=jsonable_encoder({"detail": exc.errors(), "body": exc.body}),
    )

# Prometheus posts its alerts to this path as a JSON list
@app.post("/api/v2/alerts")
async def read_unicorn(prom_data: list):
    print(prom_data)

if __name__=='__main__':
    uvicorn.run('get_prom_data:app',host='0.0.0.0',port=5005,reload=True)
  • Edit prometheus.yml so that Prometheus sends its alerts to this program:
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - localhost:5005
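  • Start the program. Once a Prometheus alerting rule fires, the program receives the alert information.

    The alert information arrives as a list.
    Alerts produced by the same alerting rule are aggregated into a single message.
    Each alert contains:
    the annotations defined in the alerting rule (annotations)
    the time the alert was last triggered (endsAt)
    the time the alert first triggered (startsAt)
    the alerting rule information (generatorURL)
    the instance labels (labels)

    An example alert: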

[{'annotations': {'alert_note': 'this is a note',
                  'summary': 'DNS resolution failed, correct ip is 0, current result is 0'},
  'endsAt': '2021-12-26T06:52:43.750Z',
  'startsAt': '2021-12-24T07:39:58.750Z',
  'generatorURL': '/prometheus/graph?g0.expr=dns_extend_success+%3D%3D+0&g0.tab=1',
  'labels': {'IPADDR': '0',
             'alert_rule_id': 'dfejjer34',
             'alertname': 'DNS resolution failed',
             'index': 'dns_success',
             'instance_name': 'http://173.16.181.227/ips',
             'job': 'ping_status',
             'network_area': 'private network',
             'platform': 'information platform',
             'room': 'server room',
             'service': 'QR code retrieval API',
             'severity': 'warning',
             'system': 'portal subsystem',
             'team': 'vendor',
             'url': 'http://173.16.181.227/ips'}},
]
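
As an illustration of how a received list like this could be processed, the sketch below pulls out a few fields from each alert and parses the timestamps. The function names parse_ts and summarize_alerts are hypothetical, and the timestamp format is assumed to be exactly the one in the example (fractional seconds followed by Z); other precisions would need a more tolerant parser.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse a timestamp such as '2021-12-24T07:39:58.750Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize_alerts(alerts: list, now: datetime) -> list:
    """Reduce each alert to its name, severity, start time and activity flag."""
    summary = []
    for alert in alerts:
        labels = alert.get("labels", {})
        summary.append({
            "alertname": labels.get("alertname", ""),
            "severity": labels.get("severity", ""),
            "started": parse_ts(alert["startsAt"]),
            # An alert whose endsAt lies in the past is considered resolved
            "active": parse_ts(alert["endsAt"]) > now,
        })
    return summary
```

Inside the handler of get_prom_data.py this could be called as summarize_alerts(prom_data, datetime.now(timezone.utc)).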

4. ==> alertmanager

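  • Push alert information to alertmanager

    The pushed alert information is a list.
    Each alert mainly carries two fields, labels and annotations; custom labels and alert details can be defined under both.
    Several alerts can be sent at once in a single JSON payload.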

#!/usr/bin/python3
# -*- coding: utf8 -*-
import json

import requests

def send_alertmanager(url):
    headers = {
        "Content-Type": "application/json"
    }
    # Each list element is one alert; labels and annotations are free-form
    data = [
        {
            "labels": {
                "operation": "API call alert",
                "supplier": "创智",
                "interface": "创智-hgs-Guangzhou-basic personnel info retrieval",
                "time": "2021-10-01 09:26:12",
                "severity": "critical"
            },
            "annotations": {
                "transfer_amount": "(total:1000/fail:0)",
                "severity": "critical"
            }
        }
    ]
    r = requests.post(url, headers=headers, data=json.dumps(data), timeout=10)
    # Raise an exception if alertmanager rejected the request
    r.raise_for_status()

if __name__=='__main__':
    url = "http://localhost:9093/api/v2/alerts"
    send_alertmanager(url)
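
Besides labels and annotations, the alertmanager v2 alerts API also accepts optional startsAt and endsAt timestamps for each alert. The sketch below builds such a payload entry with the standard library only; build_alert and its ttl parameter are hypothetical names introduced for illustration, and the start time is assumed to be timezone-aware.

```python
import json
from datetime import datetime, timedelta, timezone


def build_alert(labels: dict, annotations: dict,
                starts_at: datetime, ttl: timedelta = None) -> dict:
    """Build one alert entry; endsAt is only set when a ttl is given."""
    alert = {
        "labels": labels,
        "annotations": annotations,
        "startsAt": starts_at.isoformat(),  # RFC 3339 with explicit UTC offset
    }
    if ttl is not None:
        alert["endsAt"] = (starts_at + ttl).isoformat()
    return alert


if __name__ == "__main__":
    start = datetime(2021, 10, 1, 9, 26, 12, tzinfo=timezone.utc)
    payload = [build_alert({"severity": "critical"},
                           {"transfer_amount": "(total:1000/fail:0)"},
                           start, ttl=timedelta(minutes=5))]
    # This JSON string could be posted the same way as in send_alertmanager
    print(json.dumps(payload, indent=2))
```

Setting endsAt explicitly is one way to control when a pushed alert stops being treated as firing; without it, alertmanager applies its own resolve timeout.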
  • The pushed alert can then be viewed in the alertmanager web UI.

5. alertmanager ==>

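  • Receive the alert information sent out by alertmanager

    In this example the alert information is received in data, and the program listens on port 5006.

    The file is named get_alert.py.

    The program is as follows: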

#!/usr/bin/python3
# -*- coding: utf8 -*-
from fastapi import FastAPI, Request
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
from pydantic import BaseModel
import uvicorn

app = FastAPI(title='API docs', docs_url=None, redoc_url=None)

# Handler that takes over RequestValidationError
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    print(exc.errors())
    print(exc.body)  # the request body that failed validation
    print(request.headers)
    return JSONResponse(
        status_code=422,
        content=jsonable_encoder({"detail": exc.errors(), "body": exc.body}),
    )

# Fields of the alertmanager webhook payload used in this example
class Info(BaseModel):
    status: str
    groupLabels: dict
    alerts: list
    commonLabels: dict

@app.post('/get_alerts')
def result(info: Info):
    data = str(info)
    print(data)
    return {'status': info.status, 'groupLabels': info.groupLabels, 'alerts': info.alerts}

if __name__=='__main__':
    uvicorn.run('get_alert:app',host='0.0.0.0',port=5006,reload=True)
  • Change the alertmanager configuration:
routes:
  - receiver: "get_alerts"
    group_by: [service]
    continue: true
receivers:
  - name: "get_alerts"
    webhook_configs:
    - url: http://localhost:5006/get_alerts
      send_resolved: true
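  • Start the program. Once alertmanager's conditions for sending a notification are met, the program receives the alert information.

    The alert information includes:
    the alert status (firing/resolved)
    the time the alert was last triggered
    the time the alert first triggered
    the alerting rule information
    the instance labels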
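
To forward the received payload to a channel such as email or WeCom, it usually has to be turned into readable text first. The following is a minimal sketch of that step; format_notification is a hypothetical helper, and it assumes the webhook payload carries a top-level status and an alerts list whose items have labels, annotations, status and startsAt fields.

```python
def format_notification(payload: dict) -> str:
    """Turn an alertmanager webhook payload into a short text message."""
    lines = [f"[{payload['status'].upper()}] {len(payload['alerts'])} alert(s)"]
    for alert in payload["alerts"]:
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            f"- {labels.get('alertname', 'unknown')} "
            f"({alert.get('status', payload['status'])}) "
            f"since {alert.get('startsAt', '?')}: {annotations.get('summary', '')}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "status": "firing",
        "alerts": [{
            "status": "firing",
            "labels": {"alertname": "dns_check", "severity": "warning"},
            "annotations": {"summary": "DNS resolution failed"},
            "startsAt": "2021-12-24T07:39:58.750Z",
        }],
    }
    print(format_notification(example))
```

In get_alert.py the same function could be applied to the parsed request body before handing the text to whichever notification channel is used.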

posted @ 2021-12-26 16:08  liu_kx