Prometheus Alertmanager: Difference between group_wait, group_interval, and repeat_interval

Definition
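group_wait: how long Alertmanager waits before sending the first notification for a newly created alert group, so that alerts that start firing at around the same time can be batched into one notification.
group_interval: how long to wait before sending a notification about new alerts that are added to a group for which a notification has already been sent.
repeat_interval: if nothing has changed in the alert group (no new alert added and no existing alert resolved), how long to wait before the notification is sent again. In other words, it is the interval for re-sending a notification when the group has not been updated.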
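The interplay of these settings can be read as a decision made each time a group is flushed. The snippet below is a minimal sketch of that rule, assuming the behavior described in the definitions above; it is a simplified model, not Alertmanager's actual code, and the function and parameter names (should_notify, last_sent, group_changed) are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_notify(now: datetime,
                  last_sent: Optional[datetime],
                  group_changed: bool,
                  repeat_interval: timedelta) -> bool:
    """Decide, at one flush of an alert group, whether to send a notification.

    Simplified model (an assumption, not Alertmanager source code):
    - the first flush happens group_wait after the group is created
      and always notifies;
    - later flushes happen every group_interval and notify only if the
      group changed or repeat_interval has passed since the last send.
    """
    if last_sent is None:
        return True   # first notification for this group
    if group_changed:
        return True   # a new alert joined the group or an alert resolved
    return now - last_sent >= repeat_interval   # unchanged group: repeat when due
```

Because the check only runs when the group is flushed, a repeat can only go out on a group_interval tick, so in practice the effective repeat time may land later than exactly repeat_interval after the previous send.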

 

Test Setting
group_wait: 1m
group_interval: 10m
repeat_interval: 60m
keep_firing_for: 1d
Data Retention Period: 2d

 

Testing Plan and Result

Incident   | Firing Time | 1st Send Time             | 2nd Send Time         | Last Send Time
Incident A | T0          |                           |                       |
Incident B | T0 + 30s    | T1 = T0 + 1m or T0 + 10m? | T1 + 10m or T1 + 60m? |

Incident   | Firing Time | 1st Send Time             | 2nd Send Time | Last Send Time
Incident A | T0          | T0 + 1m                   |               |
Incident B | T0 + 5m     | T1 = T0 + 6m or T0 + 10m? |               |

Incident   | Firing Time | 1st Send Time             | 2nd Send Time | Last Send Time
Incident A | T0          | T1 = T0 + 1m              |               |
Incident B | T0 + 5m     | T2 = T1 + 5m or T1 + 10m? |               |
Incident C | T0 + 30m    | T3 = T0 +                 |               |
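
Before running the real test, the open questions in the three tables above can be explored with a small timeline model. This is a rough sketch under several assumptions: all incidents fall into the same alert group, each alert reaches Alertmanager exactly at its listed firing time and keeps firing, the first flush happens group_wait after the first alert, and later flushes happen every group_interval. The names simulate, m, minutes, and scenarios are hypothetical helpers, not part of any Prometheus or Alertmanager API.

```python
from datetime import timedelta


def simulate(arrivals, group_wait, group_interval, repeat_interval, horizon):
    """Return (send_time, alert_names) pairs for a single alert group.

    arrivals maps an alert name to its offset from T0. Resolved alerts
    are not modeled; every alert stays firing until the horizon.
    """
    flush = min(arrivals.values()) + group_wait   # first flush after group_wait
    notified = set()                              # alerts in the last notification
    last_sent = None
    sends = []
    while flush <= horizon:
        firing = {name for name, t in arrivals.items() if t <= flush}
        changed = firing != notified
        repeat_due = last_sent is not None and flush - last_sent >= repeat_interval
        if last_sent is None or changed or repeat_due:
            sends.append((flush, sorted(firing)))
            notified, last_sent = firing, flush
        flush += group_interval                   # later flushes every group_interval
    return sends


def m(value):
    """Shorthand for a timedelta of `value` minutes."""
    return timedelta(minutes=value)


def minutes(delta):
    """Convert a timedelta to minutes as a float."""
    return delta.total_seconds() / 60


settings = dict(group_wait=m(1), group_interval=m(10), repeat_interval=m(60))

# One entry per table above, in order.
scenarios = {
    "1": {"A": m(0), "B": timedelta(seconds=30)},
    "2": {"A": m(0), "B": m(5)},
    "3": {"A": m(0), "B": m(5), "C": m(30)},
}

results = {key: simulate(alerts, horizon=m(120), **settings)
           for key, alerts in scenarios.items()}

for key, sends in results.items():
    for when, alerts in sends:
        print(f"table {key}: T0 + {minutes(when):g}m -> {alerts}")
```

Under these assumptions the model puts A and B together in one notification at T0 + 1m for the first table, sends B at T0 + 11m (that is, T1 + 10m) for the second, and sends C at T0 + 31m for the third; an unchanged group is re-sent 60m after its last notification. These are predictions of the simplified model, not measured results, and in a real deployment the repeat may slip to a later group_interval tick.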

 

References:

Analysis of the Prometheus Alerting Model (Prometheus告警模型分析) - 姚灯灯! - cnblogs (博客园)

What’s the difference between group_interval, group_wait, and repeat_interval? – Robust Perception | Prometheus Monitoring Experts

Receiving alert more often then repeat interval · Issue #3326 · prometheus/alertmanager · GitHub

posted on 2024-11-07 14:00  hejing195