[Original] Configuring and Using ElastAlert
About using ElastAlert
1. Installation
Reference links:
- https://elastalert.readthedocs.io/en/latest/running_elastalert.html
- https://github.com/Yelp/elastalert
- https://marchal.tech/blog/2019/08/27/elastalert-enable-alert-only-in-specific-hour-range/
- https://github.com/0xSeb/elastalert_hour_range
- https://github.com/xuyaoqiang/elastalert-dingtalk-plugin
Runtime requirement:
Python 3.6
1.1 Base environment
The local environment is CentOS 7.
1.1.1 Prepare the yum repository
Aliyun mirror site: https://developer.aliyun.com/mirror/
1. Back up the old repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
2. Download the new CentOS-Base.repo into /etc/yum.repos.d/
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
or
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
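3. Run yum makecache to build the cache
4. Other notes
Users who are not on Aliyun ECS will see the message Couldn't resolve host 'mirrors.cloud.aliyuncs.com'; it does not affect usage. You can also remove the related entries yourself, e.g.: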
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
1.1.2 Install Python 3.6
# The Aliyun repo must be set up first
yum -y install epel-release
yum -y install python36 python36-devel python36-pip gcc gcc-c++
1.2 Install ElastAlert
Install with pip3
pip3 install elastalert
or install from source
$ git clone https://github.com/Yelp/elastalert.git
$ cd elastalert
$ pip3 install "setuptools>=11.3"
$ python3 setup.py install
Install the elasticsearch client library that matches your Elasticsearch version
Elasticsearch 5.0+:
$ pip3 install "elasticsearch>=5.0.0"
Elasticsearch 2.X:
$ pip3 install "elasticsearch<3.0.0"
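The version choice above can be written down as a small lookup. This is only a sketch: `elasticsearch_requirement` is a hypothetical helper, not part of ElastAlert, and it covers only the two cases this guide lists.

```python
def elasticsearch_requirement(es_major: int) -> str:
    """Return the pip requirement string for a given Elasticsearch major version.

    Hypothetical helper; it only encodes the two cases listed in this guide.
    """
    if es_major >= 5:
        return "elasticsearch>=5.0.0"
    if es_major == 2:
        return "elasticsearch<3.0.0"
    raise ValueError(f"Elasticsearch {es_major}.x is not covered by this guide")


# Example: build the install command for an Elasticsearch 7.x cluster.
command = 'pip3 install "{}"'.format(elasticsearch_requirement(7))
print(command)
```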
2. Prepare the configuration files
2.1 Directory layout
[root@liyongjian5179 ~]# tree elastalert
elastalert
├── config.yaml
├── elastalert_modules
│ ├── dingtalk_alert.py
│ ├── hour_range_enhancement.py
│ ├── __init__.py
│ └── Readme.md
├── rules
│ └── rule.yaml
├── smtp_auth.yaml
└── start.sh
2.2 Configuration file config.yaml
You can download the example file directly: https://github.com/Yelp/elastalert/blob/master/config.yaml.example
[root@liyongjian5179 elastalert]# cat config.yaml
# This is the folder that contains the rule yaml files
# Any .yaml file will be loaded as a rule
rules_folder: rules
# How often ElastAlert will query Elasticsearch
# The unit can be anything from weeks to seconds
run_every:
  minutes: 1
# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 15
# The Elasticsearch hostname for metadata writeback
# Note that every rule can have its own Elasticsearch host
es_host: 192.168.100.129
# The Elasticsearch port
es_port: 9200
# The AWS region to use. Set this when using AWS-managed elasticsearch
#aws_region: us-east-1
# The AWS profile to use. Use this if you are using an aws-cli profile.
# See http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
# for details
#profile: test
# Optional URL prefix for Elasticsearch
#es_url_prefix: elasticsearch
# Connect with TLS to Elasticsearch
#use_ssl: True
# Verify TLS certificates
#verify_certs: True
# GET request with body is the default option for Elasticsearch.
# If it fails for some reason, you can pass 'GET', 'POST' or 'source'.
# See http://elasticsearch-py.readthedocs.io/en/master/connection.html?highlight=send_get_body_as#transport
# for details
#es_send_get_body_as: GET
# Option basic-auth username and password for Elasticsearch
es_username: elastic
es_password: XXXXXXXXXXXXXXXXXXXXX
# Use SSL authentication with client certificates client_cert must be
# a pem file containing both cert and key for client
#verify_certs: True
#ca_certs: /path/to/cacert.pem
#client_cert: /path/to/client_cert.pem
#client_key: /path/to/client_key.key
# The index on es_host which is used for metadata storage
# This can be a unmapped index, but it is recommended that you run
# elastalert-create-index to set a mapping
writeback_index: elastalert_status
writeback_alias: elastalert_alerts
# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2
# Custom logging configuration
# If you want to setup your own logging configuration to log into
# files as well or to Logstash and/or modify log levels, use
# the configuration below and adjust to your needs.
# Note: if you run ElastAlert with --verbose/--debug, the log level of
# the "elastalert" logger is changed to INFO, if not already INFO/DEBUG.
#logging:
#  version: 1
#  incremental: false
#  disable_existing_loggers: false
#  formatters:
#    logline:
#      format: '%(asctime)s %(levelname)+8s %(name)+20s %(message)s'
#
#  handlers:
#    console:
#      class: logging.StreamHandler
#      formatter: logline
#      level: DEBUG
#      stream: ext://sys.stderr
#
#    file:
#      class: logging.FileHandler
#      formatter: logline
#      level: DEBUG
#      filename: elastalert.log
#
#  loggers:
#    elastalert:
#      level: WARN
#      handlers: []
#      propagate: true
#
#    elasticsearch:
#      level: WARN
#      handlers: []
#      propagate: true
#
#    elasticsearch.trace:
#      level: WARN
#      handlers: []
#      propagate: true
#
#    '':  # root logger
#      level: WARN
#      handlers:
#        - console
#        - file
#      propagate: false
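The duration settings in config.yaml (run_every, buffer_time, alert_time_limit) are mappings of a time unit to a number, which fit naturally into Python's `datetime.timedelta`. The sketch below shows that mapping and a basic presence check. The `config` dict repeats the values from the file above as a YAML loader would return them; `REQUIRED_KEYS` and `check_config` are assumptions made for this illustration, not ElastAlert's own validation code.

```python
from datetime import timedelta

# Values copied from config.yaml above, as the dict a YAML loader would produce.
config = {
    "rules_folder": "rules",
    "run_every": {"minutes": 1},
    "buffer_time": {"minutes": 15},
    "es_host": "192.168.100.129",
    "es_port": 9200,
    "writeback_index": "elastalert_status",
    "alert_time_limit": {"days": 2},
}

# Assumption for this sketch: the keys we want to insist on.
REQUIRED_KEYS = ("rules_folder", "run_every", "buffer_time",
                 "es_host", "es_port", "writeback_index")

DURATION_KEYS = ("run_every", "buffer_time", "alert_time_limit")


def check_config(conf):
    """Fail early on missing keys and convert duration mappings to timedelta."""
    missing = [key for key in REQUIRED_KEYS if key not in conf]
    if missing:
        raise KeyError("missing config keys: " + ", ".join(missing))
    # {"minutes": 1} becomes timedelta(minutes=1), and so on.
    return {key: timedelta(**conf[key]) for key in DURATION_KEYS if key in conf}


durations = check_config(config)
for name, value in durations.items():
    print(name, "=", value)
```

A misspelled unit such as `minute: 1` raises a `TypeError` from `timedelta`, which makes this kind of check a cheap way to catch typos before starting the service.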
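2.3 Add the module that restricts alerts to a time window
Location: place the module in the directory from which the start command is run, otherwise it cannot be found.
Reference: https://marchal.tech/blog/2019/08/27/elastalert-enable-alert-only-in-specific-hour-range/
Original GitHub repository: https://github.com/0xSeb/elastalert_hour_range
See the alert rule file for detailed usage.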
[root@liyongjian5179 elastalert]# mkdir elastalert_modules/
[root@liyongjian5179 elastalert_modules]# cat __init__.py
[root@liyongjian5179 elastalert_modules]# cat hour_range_enhancement.py
#!/usr/bin/python3
import dateutil.parser
from elastalert.enhancements import BaseEnhancement
from elastalert.enhancements import DropMatchException


class HourRangeEnhancement(BaseEnhancement):
    def process(self, match):
        timestamp = None
        try:
            timestamp = dateutil.parser.parse(match['@timestamp']).time()
        except Exception:
            try:
                timestamp = dateutil.parser.parse(match[self.rule['timestamp_field']]).time()
            except Exception:
                pass
        if timestamp is not None:
            time_start = dateutil.parser.parse(self.rule['start_time']).time()
            time_end = dateutil.parser.parse(self.rule['end_time']).time()
            if(self.rule['drop_if'] == 'outside'):
                if timestamp < time_start or timestamp > time_end:
                    raise DropMatchException()
            elif(self.rule['drop_if'] == 'inside'):
                if timestamp >= time_start and timestamp <= time_end:
                    raise DropMatchException()
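Note that the comparisons in process() assume start_time is earlier than end_time. For a window that crosses midnight, such as 22:00 to 06:00, no time of day satisfies `timestamp >= time_start and timestamp <= time_end`, so the "inside" branch never drops a match. A minimal sketch of a wrap-aware check follows. `in_window` and `should_drop` are hypothetical helpers written for this illustration; they are not part of the upstream module.

```python
from datetime import time


def in_window(ts: time, start: time, end: time) -> bool:
    """Return True if ts lies in [start, end], allowing the window to wrap past midnight."""
    if start <= end:
        # Ordinary same-day window, e.g. 09:00 -> 18:00.
        return start <= ts <= end
    # Wrapping window, e.g. 22:00 -> 06:00: late evening OR early morning.
    return ts >= start or ts <= end


def should_drop(ts: time, start: time, end: time, drop_if: str) -> bool:
    """Mirror the drop_if semantics of the rule file: 'inside' or 'outside'."""
    inside = in_window(ts, start, end)
    if drop_if == "inside":
        return inside
    if drop_if == "outside":
        return not inside
    return False


# A match at 23:30 with a 22:00 -> 06:00 window and drop_if "inside" is dropped.
print(should_drop(time(23, 30), time(22, 0), time(6, 0), "inside"))
```

Inside process(), the two comparison branches could call such a helper and raise DropMatchException when it returns True.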
2.4 Alert rule rule.yaml
2.4.1 Email alerts
[root@liyongjian5179 elastalert]# mkdir rules
[root@liyongjian5179 elastalert]# cat ./rules/rule.yaml
# Alert when the rate of events exceeds a threshold
# (Optional)
# Elasticsearch host
es_host: 192.168.100.129
# (Optional)
# Elasticsearch port
es_port: 9200
# (Optional) Connect with SSL to Elasticsearch
#use_ssl: True
# (Optional) basic-auth username and password for Elasticsearch
es_username: elastic
es_password: XXXXXXXXXXXXXXXXXXXXX
# (Required)
# Rule name, must be unique
name: "[lyj] upstream_status=200"
# (Required)
# Type of alert.
# the frequency rule type alerts when num_events events occur within timeframe
type: frequency
# (Required)
# Index to search, wildcard supported
index: filebeat-*
# (Required, frequency specific)
# Alert when this many documents matching the query occur within a timeframe
num_events: 1
# (Required, frequency specific)
# num_events must occur within this amount of time to trigger an alert
timeframe:
  #hours: 1
  minutes: 3
# (Required)
# A list of Elasticsearch filters used to find events
# These filters are joined with AND and nested in a filtered query
# For more info: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
filter:
- query:
    query_string:
      query: "upstream_status: 200"
      #query: "upstream_status: 200 AND upstream_http_custom_status: E*"
# 24-hour format
# Without "elastalert.enhancements.TimeEnhancement" configured, the times are UTC
start_time: "22:00"
end_time: "06:00"
# Drop match and cancel alert if (inside/outside) range
# inside: within the time range above
# outside: outside the time range above
drop_if: "inside"
# Add enhancements so that alerting is restricted to a time range
match_enhancements:
- "elastalert.enhancements.TimeEnhancement" # time zone
- "elastalert_modules.hour_range_enhancement.HourRangeEnhancement"
smtp_host: smtp.qq.com
smtp_port: 465
smtp_ssl: true
smtp_auth_file: /root/elastalert/smtp_auth.yaml
# Reply-to address
email_reply_to: 632712943@qq.com
# Address the mail is sent from
from_addr: 632712943@qq.com
# (Required)
# The alert is used when a match is found
alert:
- "email"
# (required, email specific)
# a list of email addresses to send alerts to
email:
- "liyongjian5179@163.com"
alert_subject: "Alert: API errors found, {} log entries matched, {} matches"
alert_subject_args:
- num_hits
- num_matches
alert_text_type: alert_text_only
### Error frequency exceeds
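alert_text: |
  Hello, the submit API has exceeded its error limit. Please check the API status!
  > Requests matched before this email was sent: {}
  > Matches before this email was sent: {}
  > Rule name: {}
  > API: {}
  > timestamp: {}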
alert_text_args:
- num_hits
- num_matches
- name
- request
- "@timestamp"
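# The same alert is not sent again within 5 minutes
realert:
  minutes: 5
# Grow the realert interval exponentially: while alerts keep firing,
# the interval increases 5 > 10 > 20 > 40 > 60 minutes up to the configured maximum;
# if alerts later become less frequent, it gradually returns to the original realert value
exponential_realert:
  hours: 1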
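The 5 > 10 > 20 > 40 > 60 progression described in the comments is a doubling of the base realert interval, capped at exponential_realert. The sketch below is a simplified model of that arithmetic only; `realert_wait` is a hypothetical function, and how ElastAlert raises or lowers the exponent over time is not modelled here.

```python
from datetime import timedelta


def realert_wait(realert: timedelta, exponential_realert: timedelta, exponent: int) -> timedelta:
    """Double the base interval `exponent` times, never exceeding the cap."""
    wait = realert * (2 ** exponent)
    return min(wait, exponential_realert)


base = timedelta(minutes=5)   # realert from the rule above
cap = timedelta(hours=1)      # exponential_realert from the rule above

# Intervals for exponents 0..4, expressed in minutes.
minutes = [int(realert_wait(base, cap, n).total_seconds() // 60) for n in range(5)]
print(minutes)  # [5, 10, 20, 40, 60]
```

The last step would be 80 minutes without the cap, which is why the sequence ends at 60.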
2.4.2 DingTalk alerts
git clone https://github.com/xuyaoqiang/elastalert-dingtalk-plugin.git
cd elastalert-dingtalk-plugin
pip3 install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
cp -r elastalert_modules/dingtalk_alert.py /root/elastalert/elastalert_modules/
Then add the following to the alert rule file
alert:
- "elastalert_modules.dingtalk_alert.DingTalkAlerter"
#- "email"
dingtalk_webhook: "https://oapi.dingtalk.com/robot/send?access_token=xxxxxxx"
dingtalk_msgtype: "text"
The contents of dingtalk_alert.py are as follows
#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""
@author: xuyaoqiang
@contact: xuyaoqiang@gmail.com
@date: 2017-09-14 17:35
@version: 0.0.0
@license:
@copyright:
"""
import json
import requests
from elastalert.alerts import Alerter, DateTimeEncoder
from requests.exceptions import RequestException
from elastalert.util import EAException


class DingTalkAlerter(Alerter):
    required_options = frozenset(['dingtalk_webhook', 'dingtalk_msgtype'])

    def __init__(self, rule):
        super(DingTalkAlerter, self).__init__(rule)
        self.dingtalk_webhook_url = self.rule['dingtalk_webhook']
        self.dingtalk_msgtype = self.rule.get('dingtalk_msgtype', 'text')
        self.dingtalk_isAtAll = self.rule.get('dingtalk_isAtAll', False)
        self.digtalk_title = self.rule.get('dingtalk_title', '')

    def format_body(self, body):
        return body.encode('utf8')

    def alert(self, matches):
        headers = {
            "Content-Type": "application/json",
            "Accept": "application/json;charset=utf-8"
        }
        body = self.create_alert_body(matches)
        payload = {
            "msgtype": self.dingtalk_msgtype,
            "text": {
                "content": body
            },
            "at": {
                # Honor the dingtalk_isAtAll rule option (defaults to False)
                "isAtAll": self.dingtalk_isAtAll
            }
        }
        try:
            response = requests.post(self.dingtalk_webhook_url,
                                     data=json.dumps(payload, cls=DateTimeEncoder),
                                     headers=headers)
            response.raise_for_status()
        except RequestException as e:
            raise EAException("Error request to Dingtalk: {0}".format(str(e)))

    def get_info(self):
        return {
            "type": "dingtalk",
            "dingtalk_webhook": self.dingtalk_webhook_url
        }
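To inspect what the alerter sends without running ElastAlert or calling the webhook, the payload can be built on its own. This sketch mirrors the dictionary constructed in alert() above; `build_dingtalk_payload` is a hypothetical helper for this illustration and performs no network request.

```python
import json


def build_dingtalk_payload(content, msgtype="text", is_at_all=False):
    """Build the same JSON structure that DingTalkAlerter.alert() posts."""
    return {
        "msgtype": msgtype,
        "text": {"content": content},
        "at": {"isAtAll": is_at_all},
    }


payload = build_dingtalk_payload("test alert from ElastAlert")
# ensure_ascii=False keeps non-ASCII alert text readable in the printed JSON.
body = json.dumps(payload, ensure_ascii=False, indent=2)
print(body)
```

The printed JSON can then be posted by hand to the dingtalk_webhook URL from the rule file to confirm that the robot accepts it.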
2.5 SMTP username and password file smtp_auth.yaml
[root@liyongjian5179 elastalert]# cat smtp_auth.yaml
user: "632712943@qq.com"
password: "XXXX"
2.6 Create the index
elastalert-create-index
2.7 Start script
[root@liyongjian5179 elastalert]# cat start.sh
#!/bin/bash
> ./nohup.out
nohup elastalert --config config.yaml &
# Verbose output
# nohup elastalert --config config.yaml --verbose &
# Check that the rule file matches as expected
# elastalert-test-rule --config config.yaml ./rules/rule.yaml
2.8 Validate the rule file
[root@liyongjian5179 elastalert]# elastalert-test-rule --config config.yaml ./rules/rule.yaml
INFO:elastalert:Note: In debug mode, alerts will be logged to console but NOT actually sent.
To send them but remain verbose, use --verbose instead.
Didn't get any results.
INFO:elastalert:Note: In debug mode, alerts will be logged to console but NOT actually sent.
To send them but remain verbose, use --verbose instead.
1 rules loaded
INFO:apscheduler.scheduler:Adding job tentatively -- it will be properly scheduled when the scheduler starts
INFO:elastalert:Queried rule [lyj] upstream_status=200 from 2020-11-29 18:10 CST to 2020-11-29 18:13 CST: 0 / 0 hits
Would have written the following documents to writeback index (default is elastalert_status):
elastalert_status - {'rule_name': '[lyj] upstream_status=200', 'endtime': datetime.datetime(2020, 11, 29, 10, 13, 40, 204145, tzinfo=tzutc()), 'starttime': datetime.datetime(2020, 11, 29, 10, 10, 38, 404145, tzinfo=tzutc()), 'matches': 0, 'hits': 0, '@timestamp': datetime.datetime(2020, 11, 29, 10, 13, 40, 241449, tzinfo=tzutc()), 'time_taken': 0.009022235870361328}
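In the output above, the query window is logged in local time (18:10 to 18:13 CST) while the writeback document stores UTC (10:10 to 10:13). The sketch below reproduces that conversion for the endtime value, assuming CST here means China Standard Time (UTC+8), which matches the eight-hour difference in the log.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in the log is China Standard Time, UTC+8.
CST = timezone(timedelta(hours=8), "CST")

# endtime from the writeback document, truncated to whole seconds.
endtime_utc = datetime(2020, 11, 29, 10, 13, 40, tzinfo=timezone.utc)
endtime_local = endtime_utc.astimezone(CST)

print(endtime_local.strftime("%Y-%m-%d %H:%M %Z"))  # 2020-11-29 18:13 CST
```

The same offset matters for start_time and end_time in the rule file, which are compared against UTC unless TimeEnhancement is enabled.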
2.9 Alert results
DingTalk
Email