Pushing alerts to a webhook with a custom Alertmanager configuration in kube-prometheus-stack
Creating the AlertmanagerConfig resource
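Without prometheus-operator, you have to maintain alertmanager.yaml by hand in order to route and send the alerts received from Prometheus.
With prometheus-operator, things get simpler. You only need to create AlertmanagerConfig resources; prometheus-operator automatically merges all of them to generate or update alertmanager.yaml, then tells Alertmanager to reload its configuration.
By default, prometheus-operator watches every AlertmanagerConfig in every namespace: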
kubectl get -n kube-prom alertmanagers
kubectl get -n kube-prom alertmanagers/kube-promethues-stack-kube-alertmanager -o yaml
# spec.alertmanagerConfigNamespaceSelector: {} means no filtering by namespace
# spec.alertmanagerConfigSelector: {} means no filtering by label
Create a simple alert routing rule:
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: testwebhook
  namespace: kube-prom
spec:
  route:
    receiver: webhook
    groupBy: ["instance", "job"]
    groupWait: "10s"
    groupInterval: "20s"
    repeatInterval: "30s"
  receivers:
    - name: webhook
      webhookConfigs:
        - url: "http://10.0.2.11:8080/webhook/send"
          sendResolved: true
  inhibitRules:
    - sourceMatch:
        - name: severity
          value: 'critical'
      targetMatch:
        - name: severity
          value: 'warning'
      equal: ['instance']
References:
https://github.com/prometheus-community/helm-charts/issues/2224
https://kkgithub.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#alertmanagerconfig
kubectl apply -f alertmanager-config.yaml
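# Expose the Alertmanager UI outside the cluster (for example by changing spec.type to NodePort), then look up the assigned port
kubectl edit svc kube-promethues-stack-kube-alertmanager -n kube-prom
kubectl get svc kube-promethues-stack-kube-alertmanager -n kube-prom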
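After creating the resource, open the Status page of the Alertmanager UI at http://10.0.2.12:32466/#/status and confirm that Config now contains the new settings (this may take a moment).
For details on the AlertmanagerConfig resource, see: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#alertmanagerconfig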
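To make the groupBy setting in the route more concrete, here is a minimal Java sketch of how alerts that share the same instance and job labels end up in one notification group. It only illustrates the idea and is not Alertmanager's implementation; the class name GroupBySketch, the HighLoad alert, and the label values are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupBySketch {

    // Builds a group key from the labels listed in groupBy; a missing label counts as "".
    static String groupKey(Map<String, String> labels, List<String> groupBy) {
        return groupBy.stream()
                .map(name -> name + "=" + labels.getOrDefault(name, ""))
                .collect(Collectors.joining(","));
    }

    // Buckets alerts (represented by their label sets) by group key, keeping first-seen order.
    static Map<String, List<Map<String, String>>> group(
            List<Map<String, String>> alerts, List<String> groupBy) {
        return alerts.stream().collect(Collectors.groupingBy(
                labels -> groupKey(labels, groupBy),
                LinkedHashMap::new,
                Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> groupBy = List.of("instance", "job"); // same as the route above
        List<Map<String, String>> alerts = List.of(
                Map.of("alertname", "diskFree", "instance", "node1:9100", "job", "node-exporter"),
                Map.of("alertname", "HighLoad", "instance", "node1:9100", "job", "node-exporter"),
                Map.of("alertname", "diskFree", "instance", "node2:9100", "job", "node-exporter"));

        group(alerts, groupBy).forEach((key, members) ->
                System.out.println(key + " -> " + members.size() + " alert(s)"));
    }
}
```

With this sample data the two node1 alerts share a group, so they would reach the webhook in a single notification, while the node2 alert gets a group of its own.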
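Creating the PrometheusRule resource
As with AlertmanagerConfig, alerting rules are created through PrometheusRule resources; prometheus-operator automatically merges all selected rules into the rule files that Prometheus loads.
By default, prometheus-operator watches PrometheusRule resources in every namespace that carry the label release=kube-prometheus-stack: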
kubectl get -n kube-prom prometheuses
kubectl get -n kube-prom prometheuses/kube-promethues-stack-kube-prometheus -o yaml
# spec.ruleNamespaceSelector: {} means no filtering by namespace
# spec.ruleSelector:
#   matchLabels:
#     release: kube-prometheus-stack
Create a rule that triggers an alert quickly (disk usage above 80% for 1 minute):
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
    # must match the spec.ruleSelector shown earlier, otherwise the rule is not picked up
    release: kube-prometheus-stack
  name: kube-prom-kube-prom-stack-kube-prome-prometheus.rules
  namespace: kube-prom
spec:
  groups:
    - name: disk
      rules:
        - alert: diskFree
          annotations:
            value: "{{$value}}"
            summary: "{{ $labels.job }} instance {{ $labels.instance }} disk usage is above 80%"
            description: "{{ $labels.instance }} {{ $labels.mountpoint }} disk usage is above 80% (current value: {{ $value }}%), please handle it promptly"
          expr: |
            (1 - (node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint!="/boot"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint!="/boot"})) * 100 > 80
          for: 1m
          labels:
            severity: warning
kubectl apply -f prometheus-rule.yaml
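The arithmetic behind the rule's expr can be checked by hand. The following minimal Java sketch mirrors the formula (1 - free / size) * 100 > 80 for a single filesystem; the class name DiskUsageSketch and the byte counts are made up for illustration, and the real evaluation is of course done by Prometheus per time series.

```java
class DiskUsageSketch {

    // Same formula as the rule: percentage of the filesystem that is not free.
    static double usedPercent(double freeBytes, double sizeBytes) {
        return (1 - freeBytes / sizeBytes) * 100;
    }

    // The alert condition holds when usage is strictly above 80%.
    static boolean exceedsThreshold(double freeBytes, double sizeBytes) {
        return usedPercent(freeBytes, sizeBytes) > 80;
    }

    public static void main(String[] args) {
        double size = 100e9; // hypothetical 100 GB filesystem
        double free = 15e9;  // hypothetical 15 GB free, i.e. about 85% used
        System.out.printf("used = %.1f%%, condition = %b%n",
                usedPercent(free, size), exceedsThreshold(free, size));
    }
}
```

Because the rule also sets for: 1m, the condition has to hold continuously for one minute before the alert moves from pending to firing.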
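Note: the label severity: warning matches the targetMatch of the inhibitRules in the AlertmanagerConfig created earlier. Why does namespace: kube-prom matter? prometheus-operator forcibly adds a matcher on this label to the matchers generated from an AlertmanagerConfig, so only alerts whose namespace label equals the AlertmanagerConfig's own namespace are handled by it. See the discussion in this issue: https://github.com/prometheus-operator/prometheus-operator/issues/3737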
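How the inhibit rule and the enforced namespace matcher interact can be sketched in a few lines of Java. This is only an illustration of the semantics, not Alertmanager's code: it assumes the operator adds namespace="kube-prom" to both the source and the target matchers, and the class name InhibitSketch and the sample label values are made up.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

class InhibitSketch {

    // Namespace of the AlertmanagerConfig; assumed to be enforced as an extra matcher.
    static final String NAMESPACE = "kube-prom";

    // sourceMatch: severity=critical (plus the enforced namespace matcher)
    static boolean isSource(Map<String, String> alert) {
        return "critical".equals(alert.get("severity")) && NAMESPACE.equals(alert.get("namespace"));
    }

    // targetMatch: severity=warning (plus the enforced namespace matcher)
    static boolean isTarget(Map<String, String> alert) {
        return "warning".equals(alert.get("severity")) && NAMESPACE.equals(alert.get("namespace"));
    }

    // A target alert is muted when some firing source alert has the same "instance" label.
    static boolean isInhibited(Map<String, String> candidate, List<Map<String, String>> firing) {
        if (!isTarget(candidate)) {
            return false;
        }
        return firing.stream().anyMatch(source -> isSource(source)
                && Objects.equals(source.get("instance"), candidate.get("instance")));
    }

    public static void main(String[] args) {
        Map<String, String> critical =
                Map.of("severity", "critical", "instance", "node1:9100", "namespace", "kube-prom");
        Map<String, String> warning =
                Map.of("severity", "warning", "instance", "node1:9100", "namespace", "kube-prom");
        Map<String, String> otherNamespace =
                Map.of("severity", "warning", "instance", "node1:9100", "namespace", "default");

        System.out.println(isInhibited(warning, List.of(critical)));        // true
        System.out.println(isInhibited(otherNamespace, List.of(critical))); // false
    }
}
```

The second result shows why the namespace label matters: an alert from another namespace does not match the generated matchers at all, so this AlertmanagerConfig neither inhibits nor routes it.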
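# Expose the Prometheus UI outside the cluster (for example by changing spec.type to NodePort), then look up the assigned port
kubectl edit svc kube-promethues-stack-kube-prometheus -n kube-prom
kubectl get svc kube-promethues-stack-kube-prometheus -n kube-prom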
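After creating the resource, open the Rules page of the Prometheus UI at http://10.0.2.12:30133/rules and search for diskFree to confirm that the new rule is listed (this may take a moment).
For details on the PrometheusRule resource, see: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#prometheusrule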
Implementing the /webhook/send endpoint
Create a Spring Boot project and add the following dependencies:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.olive</groupId>
    <artifactId>test-promethues</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <version>3.2.0</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>2.0.49</version>
        </dependency>
    </dependencies>
</project>
Create the controller:
package com.olive;

import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import com.alibaba.fastjson2.JSON;

@RestController
public class RevcController {

    // Receives the JSON body that Alertmanager POSTs to the webhook URL
    @PostMapping("/webhook/send")
    public Map<String, String> create(@RequestBody Map<String, Object> entity) {
        // Log the arrival time and the raw payload
        System.out.println(LocalDateTime.now());
        System.out.println(JSON.toJSONString(entity));

        // A 2xx response tells Alertmanager that the notification was delivered
        Map<String, String> result = new HashMap<>();
        result.put("code", "success");
        return result;
    }
}
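The controller above only prints the raw payload. As a possible next step, the sketch below pulls a one-line summary out of each entry in the payload's alerts array, using the field names of Alertmanager's webhook format (alerts, status, labels). It uses only the JDK so it can be tried without Spring; the class name WebhookPayloadSketch, the summary format, and the sample values are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class WebhookPayloadSketch {

    // Returns one summary line per alert; tolerates missing or oddly typed fields.
    @SuppressWarnings("unchecked")
    static List<String> summarize(Map<String, Object> payload) {
        List<String> lines = new ArrayList<>();
        Object alerts = payload.get("alerts");
        if (!(alerts instanceof List)) {
            return lines;
        }
        for (Object item : (List<Object>) alerts) {
            if (!(item instanceof Map)) {
                continue;
            }
            Map<String, Object> alert = (Map<String, Object>) item;
            Object labelsObj = alert.get("labels");
            Map<String, Object> labels = labelsObj instanceof Map
                    ? (Map<String, Object>) labelsObj
                    : Map.<String, Object>of();
            lines.add(String.format("[%s] %s on %s (severity=%s)",
                    alert.get("status"),
                    labels.get("alertname"),
                    labels.get("instance"),
                    labels.get("severity")));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hand-written sample in the shape of a webhook payload; values are made up.
        Map<String, Object> payload = Map.of(
                "status", "firing",
                "alerts", List.of(Map.of(
                        "status", "firing",
                        "labels", Map.of(
                                "alertname", "diskFree",
                                "instance", "node1:9100",
                                "severity", "warning"))));
        summarize(payload).forEach(System.out::println);
        // prints: [firing] diskFree on node1:9100 (severity=warning)
    }
}
```

Inside the controller, the same method could be called with the entity parameter, since Spring hands the request body over as a Map<String, Object> there as well.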
Create the Spring Boot bootstrap class:
package com.olive;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class App {

    public static void main(String[] args) {
        SpringApplication.run(App.class, args);
    }
}
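Once the application is running, the endpoint can be exercised without waiting for a real alert. The following sketch posts a trimmed-down sample body with the JDK's built-in HttpClient (Java 11+). The class name WebhookSmokeCheck and the sample values are made up, and the default URL is simply the one configured in the AlertmanagerConfig; pass a different URL as the first argument if needed.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class WebhookSmokeCheck {

    // Minimal body in the shape of an Alertmanager webhook payload; values are made up.
    static final String SAMPLE = "{\"version\":\"4\",\"status\":\"firing\","
            + "\"alerts\":[{\"status\":\"firing\","
            + "\"labels\":{\"alertname\":\"diskFree\",\"severity\":\"warning\"}}]}";

    // Builds a JSON POST request for the given webhook URL.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(SAMPLE))
                .build();
    }

    public static void main(String[] args) {
        String url = args.length > 0 ? args[0] : "http://10.0.2.11:8080/webhook/send";
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(3))
                .build();
        try {
            HttpResponse<String> response =
                    client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            System.out.println("request failed: " + e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            System.out.println("request interrupted");
        }
    }
}
```

If the controller is reachable, the application console should show the timestamp and the posted JSON, and the client prints the HTTP status together with the body returned by the controller.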
References:
https://www.cnblogs.com/roy2220/p/14867024.html