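kube-prometheus-stack: custom Alertmanager configuration for pushing alerts to a webhook

Create an AlertmanagerConfig resource

Without prometheus-operator, you have to write alertmanager.yaml by hand to route and send the alerts received from Prometheus.

With prometheus-operator, things get simpler. You only need to create AlertmanagerConfig resources; prometheus-operator automatically merges all of them to generate or update alertmanager.yaml and notifies Alertmanager to reload its configuration.

By default, prometheus-operator watches all AlertmanagerConfig resources in all namespaces: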

kubectl get -n kube-prom alertmanagers

kubectl get -n kube-prom alertmanagers/kube-promethues-stack-kube-alertmanager -o yaml

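# spec.alertmanagerConfigNamespaceSelector: {} means no filtering by namespace
# spec.alertmanagerConfigSelector: {} means no filtering by label

Create a simple alert routing rule and save it as alertmanager-config.yaml: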

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: testwebhook
  namespace: kube-prom
spec:
  route:
    receiver: webhook
    groupBy: ["instance", "job"]
    groupWait: "10s"
    groupInterval: "20s"
    repeatInterval: "30s"
  receivers:
  - name: webhook
    webhookConfigs:
      - url: "http://10.0.2.11:8080/webhook/send"
        sendResolved: true
  inhibitRules:
  - sourceMatch:
    - name: severity
      value: 'critical'
    targetMatch:
    - name: severity
      value: 'warning'
    equal: ['instance']
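
To make the inhibitRules entry above concrete, here is a minimal Java sketch of its effect: a firing alert with severity=critical mutes an alert with severity=warning when both carry the same instance label. The class and method names are hypothetical, and this is only an illustrative model under those assumptions, not Alertmanager's actual implementation.

```java
import java.util.List;
import java.util.Map;

// Illustrative model of the inhibit rule above; not Alertmanager's actual code.
final class InhibitSketch {

    // Labels listed in the rule's "equal" field.
    static final List<String> EQUAL = List.of("instance");

    // True when a firing "critical" source alert should mute the "warning" target alert.
    static boolean inhibits(Map<String, String> source, Map<String, String> target) {
        if (!"critical".equals(source.get("severity"))) {
            return false; // sourceMatch not satisfied
        }
        if (!"warning".equals(target.get("severity"))) {
            return false; // targetMatch not satisfied
        }
        for (String label : EQUAL) {
            // A missing label is treated like an empty value.
            if (!source.getOrDefault(label, "").equals(target.getOrDefault(label, ""))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> critical = Map.of("severity", "critical", "instance", "node-1");
        Map<String, String> sameNode = Map.of("severity", "warning", "instance", "node-1");
        Map<String, String> otherNode = Map.of("severity", "warning", "instance", "node-2");
        System.out.println(inhibits(critical, sameNode));  // true
        System.out.println(inhibits(critical, otherNode)); // false
    }
}
```

In other words, the warning-level notification is suppressed only while a critical alert for the same instance is firing.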

References:

https://github.com/prometheus-community/helm-charts/issues/2224
https://kkgithub.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#alertmanagerconfig

# Apply the AlertmanagerConfig
kubectl apply -f alertmanager-config.yaml
# Edit, then inspect, the Alertmanager Service to find the address and port of the UI
kubectl edit svc kube-promethues-stack-kube-alertmanager -n kube-prom
kubectl get svc kube-promethues-stack-kube-alertmanager -n kube-prom

After creating the resource, open the Alertmanager UI at http://10.0.2.12:32466/#/status and confirm that Config already contains the new settings (this may take a moment).

For details on the AlertmanagerConfig resource, see: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#alertmanagerconfig

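Create a PrometheusRule resource

As with AlertmanagerConfig, you create alerting rules by creating PrometheusRule resources. prometheus-operator automatically collects all selected rules into the rule files that Prometheus loads.

By default, prometheus-operator watches PrometheusRule resources in all namespaces that carry the label release=kube-prometheus-stack: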

kubectl get -n kube-prom prometheuses
kubectl get -n kube-prom prometheuses/kube-promethues-stack-kube-prometheus -o yaml
# spec.ruleNamespaceSelector: {} means no filtering by namespace
# spec.ruleSelector:
#   matchLabels:
#     release: kube-prometheus-stack
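
The selector semantics shown in the output above can be sketched in a few lines of Java: a resource is selected when its labels contain every key/value pair in matchLabels, and an empty selector ({}) selects everything. The class and method names are hypothetical; this is a simplified illustration that ignores matchExpressions.

```java
import java.util.Map;

// Simplified model of a Kubernetes matchLabels selector (matchExpressions omitted).
final class LabelSelectorSketch {

    // True when every matchLabels entry is present, with the same value, in the resource labels.
    static boolean matches(Map<String, String> matchLabels, Map<String, String> resourceLabels) {
        for (Map.Entry<String, String> entry : matchLabels.entrySet()) {
            if (!entry.getValue().equals(resourceLabels.get(entry.getKey()))) {
                return false;
            }
        }
        return true; // also reached for an empty selector, which matches everything
    }

    public static void main(String[] args) {
        Map<String, String> ruleSelector = Map.of("release", "kube-prometheus-stack");
        Map<String, String> labelled = Map.of("release", "kube-prometheus-stack", "role", "alert-rules");
        Map<String, String> unlabelled = Map.of("role", "alert-rules");
        System.out.println(matches(ruleSelector, labelled));   // true
        System.out.println(matches(ruleSelector, unlabelled)); // false
        System.out.println(matches(Map.of(), unlabelled));     // true
    }
}
```

A PrometheusRule without the label required by spec.ruleSelector is silently ignored, which is a common reason for a new rule not showing up in Prometheus.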

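Create an alert rule and save it as prometheus-rule.yaml. This one fires once the usage of an ext4 or xfs filesystem has stayed above 80% for 1 minute: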

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
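  labels:
    # Labels must satisfy spec.ruleSelector of the Prometheus resource
    # (release: kube-prometheus-stack in the output shown earlier).
    release: kube-prometheus-stack
    prometheus: k8s
    role: alert-rules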
  name: kube-prom-kube-prom-stack-kube-prome-prometheus.rules
  namespace: kube-prom
spec:
  groups:
  - name: disk
    rules:
    - alert: diskFree
      annotations:
        value: "{{$value}}"
        summary: "Disk usage on instance {{ $labels.instance }} of job {{ $labels.job }} is above 80%"
        description: "Disk usage on {{ $labels.instance }} {{ $labels.mountpoint }} is above 80% (current value: {{ $value }}%); please handle it promptly"
      expr: |
        (1-(node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint!="/boot"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint!="/boot"}) )*100 > 80
      for: 1m
      labels:
        severity: warning

kubectl apply -f prometheus-rule.yaml
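
The arithmetic behind the expr in the rule can be checked with a small Java sketch: the used percentage is (1 - free / size) * 100, and the alert condition holds when that value is above 80. The class name, method names, and sample byte counts are made up for illustration.

```java
// Mirrors the arithmetic of the PromQL expression in the diskFree rule.
final class DiskUsageSketch {

    // (1 - free / size) * 100, the same formula as in the rule's expr.
    static double usedPercent(double freeBytes, double sizeBytes) {
        return (1 - freeBytes / sizeBytes) * 100;
    }

    // The alert condition: used percentage strictly greater than 80.
    static boolean exceedsThreshold(double freeBytes, double sizeBytes) {
        return usedPercent(freeBytes, sizeBytes) > 80;
    }

    public static void main(String[] args) {
        double gib = 1024.0 * 1024.0 * 1024.0;
        // A 100 GiB filesystem with 15 GiB free is about 85% used, so the condition holds.
        System.out.println(exceedsThreshold(15 * gib, 100 * gib));
        // The same filesystem with 50 GiB free is 50% used, so it does not.
        System.out.println(exceedsThreshold(50 * gib, 100 * gib));
    }
}
```

Because of for: 1m, the condition has to hold continuously for one minute before the alert moves from pending to firing.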

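Note: the severity: warning label matches the targetMatch of the inhibitRules in the AlertmanagerConfig created earlier. Why does the namespace (kube-prom) matter? prometheus-operator forcibly adds a namespace matcher, set to the AlertmanagerConfig's own namespace, to the matchers it generates from that AlertmanagerConfig, so only alerts carrying the label namespace="kube-prom" are handled by it. Issue discussion: https://github.com/prometheus-operator/prometheus-operator/issues/3737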
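
As a rough illustration of the enforced namespace matcher, the following Java sketch models the check that an alert must pass before the route of an AlertmanagerConfig in a given namespace applies to it. The names are hypothetical and the real matching happens inside Alertmanager; this only shows the effect.

```java
import java.util.Map;

// Models the effect of the namespace matcher that prometheus-operator adds to a route.
final class NamespaceMatcherSketch {

    // True when the alert carries a namespace label equal to the AlertmanagerConfig's namespace.
    static boolean routedTo(String configNamespace, Map<String, String> alertLabels) {
        return configNamespace.equals(alertLabels.get("namespace"));
    }

    public static void main(String[] args) {
        Map<String, String> inNamespace = Map.of("alertname", "diskFree", "namespace", "kube-prom");
        Map<String, String> noNamespace = Map.of("alertname", "diskFree");
        System.out.println(routedTo("kube-prom", inNamespace)); // true
        System.out.println(routedTo("kube-prom", noNamespace)); // false
    }
}
```

If an alert never reaches the webhook, checking whether it carries the expected namespace label is a good first step.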

kubectl edit svc kube-promethues-stack-kube-prometheus -n kube-prom 
kubectl get svc kube-promethues-stack-kube-prometheus -n kube-prom

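After creating the resource, open the Prometheus UI at http://10.0.2.12:30133/rules and search for diskFree to confirm that the new rule is listed (this may take a moment).

For details on the PrometheusRule resource, see: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#prometheusrule

Write the /webhook/send endpoint

Create a Spring Boot project and add the following dependencies: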

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.olive</groupId>
  <artifactId>test-promethues</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
      <version>3.2.0</version>
    </dependency>
    <dependency>
      <groupId>com.alibaba.fastjson2</groupId>
      <artifactId>fastjson2</artifactId>
      <version>2.0.49</version>
    </dependency>
  </dependencies>
</project>

Create the controller:

package com.olive;

import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import com.alibaba.fastjson2.JSON;

@RestController
public class RevcController {

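	// Receives the JSON payload that Alertmanager POSTs to the webhook receiver.
	@PostMapping("/webhook/send")
	public Map<String, String> create(@RequestBody Map<String, Object> entity) {
		// Log the arrival time and the raw alert payload.
		System.out.println(LocalDateTime.now());
		System.out.println(JSON.toJSONString(entity));
		// Any 2xx response tells Alertmanager the notification was delivered.
		Map<String, String> result = new HashMap<>();
		result.put("code", "success");
		return result;
	}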

}
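
The controller above only prints the raw payload. As a possible next step, the sketch below pulls the per-alert status and alertname out of the same Map<String, Object> structure, relying on the alerts, status and labels fields of Alertmanager's webhook JSON. The class and method names are hypothetical, and it uses only the JDK so it can be tried without Spring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Extracts "<status> <alertname>" for each alert in a webhook payload parsed into maps.
final class WebhookPayloadSketch {

    @SuppressWarnings("unchecked")
    static List<String> summarize(Map<String, Object> payload) {
        List<String> lines = new ArrayList<>();
        Object alerts = payload.get("alerts");
        if (!(alerts instanceof List)) {
            return lines; // no alerts array, nothing to summarize
        }
        for (Object item : (List<Object>) alerts) {
            if (!(item instanceof Map)) {
                continue; // skip malformed entries
            }
            Map<String, Object> alert = (Map<String, Object>) item;
            Object labels = alert.get("labels");
            Object name = labels instanceof Map ? ((Map<String, Object>) labels).get("alertname") : null;
            lines.add(alert.get("status") + " " + name);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample values only; a real payload carries more fields.
        Map<String, Object> payload = Map.of(
                "status", "firing",
                "alerts", List.of(Map.of(
                        "status", "firing",
                        "labels", Map.of("alertname", "diskFree", "severity", "warning"))));
        System.out.println(summarize(payload)); // [firing diskFree]
    }
}
```

Inside the controller, the same method could be called as summarize(entity) before returning the response.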

Create the Spring Boot bootstrap class:

package com.olive;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class App {

    public static void main(String[] args) {
        SpringApplication.run(App.class, args);
    }
}
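
Waiting for a real alert to fire is a slow way to test the endpoint. The following sketch posts a hand-written sample payload to it with the JDK's built-in HttpClient (Java 11+). The class name, the payload values, and the http://localhost:8080 address in the usage hint are assumptions; the payload is a trimmed-down imitation of what Alertmanager sends, not a captured message.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sends a made-up Alertmanager-style payload to the webhook endpoint for a quick smoke test.
final class WebhookSmokeTest {

    // Sample values only; real payloads contain more fields.
    static final String SAMPLE_PAYLOAD =
            "{\"version\":\"4\",\"status\":\"firing\",\"receiver\":\"webhook\","
            + "\"alerts\":[{\"status\":\"firing\","
            + "\"labels\":{\"alertname\":\"diskFree\",\"severity\":\"warning\"},"
            + "\"annotations\":{}}]}";

    // Builds the POST request without sending it.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(SAMPLE_PAYLOAD))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No target given: show what would be sent instead of touching the network.
            System.out.println("usage: java WebhookSmokeTest.java http://localhost:8080/webhook/send");
            System.out.println(SAMPLE_PAYLOAD);
            return;
        }
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(args[0]), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

With the Spring Boot application running, a 200 response and a printed payload in the application's console indicate that the endpoint is ready to receive notifications from Alertmanager.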

References:

https://www.cnblogs.com/roy2220/p/14867024.html
posted @ 2024-10-27 09:41  BUG弄潮儿