1. Download the installation package

https://prometheus.io/download/

 

2. Upload the archive to the server and extract it

-rwxr-xr-x. 1 3434 3434 26971621 Dec 11 22:13 alertmanager
-rw-r--r--. 1 3434 3434      380 Dec 11 22:51 alertmanager.yml
-rwxr-xr-x. 1 3434 3434 22458246 Dec 11 22:14 amtool
-rw-r--r--. 1 3434 3434    11357 Dec 11 22:51 LICENSE
-rw-r--r--. 1 3434 3434      457 Dec 11 22:51 NOTICE
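
The listing above shows the contents of the extracted directory. A minimal sketch of the extraction step follows, assuming the release archive uses the usual alertmanager-<version>.linux-amd64.tar.gz naming. The file name below is a placeholder, not a real version:

```shell
# Placeholder archive name (assumption): substitute the file you actually downloaded.
ARCHIVE="alertmanager-VERSION.linux-amd64.tar.gz"
# The archive unpacks into a directory named after itself, minus the .tar.gz suffix.
TARGET_DIR="${ARCHIVE%.tar.gz}"

if [ -f "$ARCHIVE" ]; then
  tar -xzf "$ARCHIVE"
  ls -l "$TARGET_DIR"
else
  echo "archive not found: $ARCHIVE"
fi
```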

 

3. Edit the configuration file alertmanager.yml

vim alertmanager.yml

global:
  resolve_timeout: 5m
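  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: '13551031535@163.com'
  smtp_auth_password: 'xxx'  # placeholder; use the real credential
  smtp_require_tls: false
  smtp_auth_username: '13551081535@163.com'  # note: differs from smtp_from by one digit; these are normally the same account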


route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
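  repeat_interval: 2m  # how long to wait before re-sending a notification for an alert that is still firing
  receiver: 'email'  # default receiver

receivers:
- name: 'email'      # must match the receiver value in route
  email_configs:     # documented in the official Alertmanager configuration reference
  - to: 'zhengqinfeng09@163.com'  # recipient address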

#inhibit_rules:
#  - source_match:
#      severity: 'critical'
#    target_match:
#      severity: 'warning'
#    equal: ['alertname', 'dev', 'instance']
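
Before starting the service it may be worth validating the file. amtool, which ships in the same directory (see the listing in step 2), has a check-config subcommand. A small sketch, assuming it is run from the extracted Alertmanager directory:

```shell
# Assumes the current directory is the extracted Alertmanager directory.
CONFIG_FILE="alertmanager.yml"

if [ -x ./amtool ] && [ -f "$CONFIG_FILE" ]; then
  # Exits non-zero and prints the parse error if the file is invalid.
  ./amtool check-config "$CONFIG_FILE"
else
  echo "amtool or $CONFIG_FILE not found in $(pwd)"
fi
```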

 

4. Start the alertmanager service

./alertmanager --config.file=alertmanager.yml
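
The command above runs in the foreground, so a quick health check has to be issued from a second terminal. The sketch below assumes Alertmanager is listening on its default port 9093 on the local host, and uses its /-/healthy endpoint:

```shell
# Assumed address: Alertmanager's default listen port 9093 on the local host.
AM_URL="http://127.0.0.1:9093"

# /-/healthy is Alertmanager's health-check endpoint.
if curl -fsS "$AM_URL/-/healthy" >/dev/null 2>&1; then
  AM_STATUS="up"
else
  AM_STATUS="unreachable"
fi
echo "Alertmanager at $AM_URL: $AM_STATUS"
```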

 

5. Edit prometheus.yml to configure communication with alertmanager

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 127.0.0.1:9093   # address of the alertmanager instance

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/node_rules.yml"   # alerting rule file

 

6. Configure the alerting rules

vim rules/node_rules.yml

groups:
- name: general-instance-monitoring
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 1m  # the alert fires only if up == 0 holds continuously for 1m
    labels:
      severity: error
    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
      summary: 'Instance {{ $labels.instance }} is down, please investigate.'
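
The rule file can be checked before Prometheus loads it. The sketch below uses promtool, which ships with Prometheus, and assumes it is run from the Prometheus directory with the rule file saved at rules/node_rules.yml, the path referenced under rule_files in prometheus.yml:

```shell
# Assumes the current directory is the Prometheus directory, which ships promtool.
RULES_FILE="rules/node_rules.yml"

if [ -x ./promtool ] && [ -f "$RULES_FILE" ]; then
  # Validates the rule file syntax and the PromQL expressions in it.
  ./promtool check rules "$RULES_FILE"
  # Validates prometheus.yml, including the rule files it references.
  ./promtool check config prometheus.yml
else
  echo "promtool or $RULES_FILE not found in $(pwd)"
fi
```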

 

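7. Apply the prometheus configuration

Send SIGHUP to the prometheus process so that it reloads its configuration, where <pid> is the prometheus process ID:

kill -HUP <pid>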
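
If the process ID is not at hand, it can be looked up first. A minimal sketch, assuming a single process named exactly prometheus is running on the host:

```shell
# Look up the PID of a process named exactly "prometheus" (assumption: one instance).
PROM_PID="$(pgrep -x prometheus 2>/dev/null | head -n 1)"

if [ -n "$PROM_PID" ]; then
  kill -HUP "$PROM_PID"   # SIGHUP makes Prometheus reload its configuration and rule files
  RELOAD_RESULT="sent SIGHUP to $PROM_PID"
else
  RELOAD_RESULT="no prometheus process found"
fi
echo "$RELOAD_RESULT"
```

Prometheus also exposes an HTTP reload endpoint at /-/reload, but only when it was started with --web.enable-lifecycle, which the steps here do not assume.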

 

8. Verify
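
One way to verify the whole chain is to stop a monitored target and wait: after the 1m "for" period the alert should show as firing in Prometheus, and an email should arrive. To test only the alertmanager-to-email path, a test alert can be posted by hand. This is a sketch, assuming alertmanager is reachable at the address configured in step 5; the alert name and label values are made up for illustration:

```shell
# Assumed address, matching the target configured in prometheus.yml.
AM_URL="http://127.0.0.1:9093"

# A hand-written test alert; the label and annotation values are illustrative.
PAYLOAD='[{"labels":{"alertname":"ManualTest","severity":"error","instance":"test-host"},"annotations":{"summary":"manual test alert"}}]'

# POST /api/v2/alerts accepts a JSON array of alerts.
if curl -fsS -X POST -H 'Content-Type: application/json' -d "$PAYLOAD" "$AM_URL/api/v2/alerts"; then
  echo "test alert accepted; list current alerts with: curl $AM_URL/api/v2/alerts"
else
  echo "could not reach Alertmanager at $AM_URL"
fi
```

With the route from step 3, the notification is sent to the 'email' receiver once group_wait (10s) has elapsed.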

Posted on 2020-04-21 22:02