庄泽波's Blog

Even the best memory is no match for a worn-out pen

Prometheus Alert Rules with Some Metrics

Prometheus works well as a monitoring system. What I value most is that its alert templates are flexible: an alert message can include the values of other metrics in addition to the value of the metric that triggered the alert, which is very convenient. For example:

groups:
- name: example
  rules:
  - alert: Load alert
    expr: node_load1 > 1
    for: 5s
    labels:
      severity: page
    annotations:
      title: 'load1: {{ $value }}, load5: {{ printf `node_load5{instance="%s"}` $labels.instance | query | first | value }}, load15: {{ printf `node_load15{instance="%s"}` $labels.instance | query | first | value }}'
      summary: High load
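
The rule above only defines the alert; Alertmanager still has to route it somewhere. A minimal sketch of the Alertmanager side, assuming a hypothetical webhook listening at http://127.0.0.1:5001/ (that URL is an assumption, not part of the original setup; the receiver name `default` matches the payload captured later in this post):

```yaml
# alertmanager.yml (sketch, not the original configuration)
route:
  receiver: default              # every alert goes to the "default" receiver
receivers:
- name: default
  webhook_configs:
  - url: http://127.0.0.1:5001/  # hypothetical endpoint that receives the JSON payload
```

Alertmanager sends the grouped alerts to that URL as a JSON body in an HTTP POST, so any small HTTP server that logs the request body is enough to inspect the result.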

After configuring Alertmanager and adding webhook_configs, I can capture the alert payload, which looks like this:

{
  "receiver": "default",
  "status": "firing",
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "Load alert",
        "instance": "127.0.0.1:9100",
        "job": "prometheus",
        "severity": "page"
      },
      "annotations": {
        "summary": "High load",
        "title": "load1: 60.1494140625, load5: 38.009765625, load15: 23.18359375"
      },
      "startsAt": "2018-07-15T22:59:09.508199934+08:00",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://bogon:9090/graph?g0.expr=node_load1+%3E+1\u0026g0.tab=1"
    }
  ],
  "groupLabels": {},
  "commonLabels": {
    "alertname": "Load alert",
    "instance": "127.0.0.1:9100",
    "job": "prometheus",
    "severity": "page"
  },
  "commonAnnotations": {
    "summary": "High load",
    "title": "load1: 60.1494140625, load5: 38.009765625, load15: 23.18359375"
  },
  "externalURL": "http://bogon:9093",
  "version": "4",
  "groupKey": "{}:{}"
}

The load average values appear in the annotations:

load1: 60.1494140625, load5: 38.009765625, load15: 23.18359375

After receiving the message, we know the machine's load averages in detail.
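
One caveat with the template used in this post: in Prometheus templates, `first` returns an error when the query yields no series, so the annotation fails to render if, say, `node_load5` is missing for that instance. A more defensive variant, sketched here as an assumption rather than as the setup used above, iterates over the query result with `range`, which simply renders nothing when the result is empty:

```yaml
groups:
- name: example
  rules:
  - alert: Load alert
    expr: node_load1 > 1
    for: 5s
    labels:
      severity: page
    annotations:
      # range emits one value per returned sample and nothing for an empty result
      title: 'load1: {{ $value }}, load5: {{ range printf `node_load5{instance="%s"}` $labels.instance | query }}{{ .Value }}{{ end }}, load15: {{ range printf `node_load15{instance="%s"}` $labels.instance | query }}{{ .Value }}{{ end }}'
      summary: High load
```

The trade-off is that a missing metric shows up as an empty field in the title instead of a template error, so it is worth checking that the extra queries actually return data.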

posted on 2018-07-15 23:10 by 庄泽波
