elastalert + supervisor
Deploying and installing elastalert
Download the source code from https://github.com/Yelp/elastalert. elastalert monitors the volume of error logs in ES and pushes alerts.
Previous article: https://www.cnblogs.com/evescn/p/13098343.html
elastalert requires Python 3, so install Python 3 first.
# cd /data/
# git clone https://github.com/Yelp/elastalert.git
# cd elastalert/
# python3 ./setup.py install --dry-run ## check whether the install would succeed
# python3 ./setup.py install
Configuring elastalert
- Configure config.yaml
# cp config.yaml.example config.yaml
# vim config.yaml
rules_folder: /data/elastalert/rules
run_every:
  seconds: 30
buffer_time:
  minutes: 10
es_host: ES_IP
es_port: 9200
es_username: username
es_password: password
writeback_index: elastalert_status
writeback_alias: elastalert_alerts
alert_time_limit:
  days: 2
- Configure the rule
# mkdir rules
# cp example_rules/example_frequency.yaml /data/elastalert/rules/test_nginx.yaml
# vim /data/elastalert/rules/test_nginx.yaml
name: Nginx not 200
type: frequency
index: test_nginx-*
num_events: 2
timeframe:
  minutes: 10
filter:
- query:
    query_string:
      query: "NOT status: [200 TO 299]"
- query:
    query_string:
      query: "NOT status: [300 TO 399]"
alert:
- "command"
command: ["python3", "/data/elastalert/weixin.py", "Test environment Nginx alert:", "Endpoint {url} is returning status {status} at a high rate!", "Client IP: {clientip}"]
For other configuration options, see the official documentation:
https://elastalert.readthedocs.io/en/latest/
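To make the rule above concrete, here is a minimal sketch of what a frequency rule with these two filters expresses: only events whose status falls outside 200-399 are counted, and a match is reported once num_events of them land inside one timeframe. It is an illustration only, not elastalert's implementation; the names `is_error` and `frequency_matches` and the sample events are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta


def is_error(status):
    """Mirror the two filters: NOT [200 TO 299] AND NOT [300 TO 399]."""
    return not (200 <= status <= 299) and not (300 <= status <= 399)


def frequency_matches(events, num_events=2, timeframe=timedelta(minutes=10)):
    """Return the timestamps at which num_events filtered events fall within timeframe.

    events: iterable of (timestamp, status) tuples sorted by timestamp.
    """
    window = deque()
    matches = []
    for ts, status in events:
        if not is_error(status):
            continue
        window.append(ts)
        # Drop events older than the timeframe relative to the newest event.
        while ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            matches.append(ts)
            window.clear()  # start counting afresh after a match
    return matches


if __name__ == "__main__":
    t0 = datetime(2020, 6, 12, 10, 0, 0)
    sample = [
        (t0, 200),
        (t0 + timedelta(minutes=1), 502),
        (t0 + timedelta(minutes=3), 301),
        (t0 + timedelta(minutes=5), 404),
        (t0 + timedelta(minutes=30), 500),
    ]
    for ts in frequency_matches(sample):
        print("match at", ts.isoformat())
```

With the sample events, the 502 and the 404 are four minutes apart, so they produce a single match; the lone 500 half an hour later does not.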
Writing the alert script (/data/elastalert/weixin.py, the path used in the rule's command)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import urllib.request
import json
import sys

import simplejson  # third-party package: pip3 install simplejson


# WeChat Work application alerts: fetch an access token
def gettoken(corpid, corpsecret):
    gettoken_url = ('https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=' + corpid
                    + '&corpsecret=' + corpsecret)
    try:
        token_file = urllib.request.urlopen(gettoken_url)
    except urllib.request.HTTPError as e:
        # Print the HTTP status and response body, then exit with a non-zero code.
        print(e.code)
        print(e.read().decode("utf8"))
        sys.exit(1)
    token_data = token_file.read().decode('utf-8')
    token_json = json.loads(token_data)
    return token_json['access_token']


# WeChat Work application alerts: send a text message through the application
def senddata(access_token, subject, content, server):
    send_url = 'https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=' + access_token
    send_values = {
        # User accounts in the corp, as configured in the zabbix user Media;
        # if this is misconfigured, the message is sent by department instead.
        "touser": "@all",
        "toparty": "2PURL",    # department id in the corp
        "msgtype": "text",     # message type
        "agentid": "1000038",  # application id in the corp
        "text": {
            "content": str(subject + '\n\n' + content + '\n' + server)
        },
        "safe": "0",
    }
    send_data = simplejson.dumps(send_values, ensure_ascii=False).encode('utf-8')
    send_request = urllib.request.Request(send_url, send_data)
    response = json.loads(urllib.request.urlopen(send_request).read())
    print(str(response))


# WeChat Work alert bot (group webhook)
def senddata_report(subject, content, server):
    send_url = 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ'
    send_values = {
        "msgtype": "text",
        "text": {
            "content": str(subject + '\n\n' + content + '\n' + server)
        }
    }
    send_data = simplejson.dumps(send_values, ensure_ascii=False).encode('utf-8')
    send_request = urllib.request.Request(send_url, send_data)
    response = json.loads(urllib.request.urlopen(send_request).read())
    print(str(response))


if __name__ == '__main__':
    try:
        subject = str(sys.argv[1])
        content = str(sys.argv[2])
        server = str(sys.argv[3])
    except IndexError:
        print('Three arguments are required')
        sys.exit(1)
    else:
        corpid = 'XXXXXXXXXXXXXXXXXXXXXXXX'  # corp ID
        corpsecret = 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY'  # management group credential secret
        # accesstoken = gettoken(corpid, corpsecret)
        # senddata(accesstoken, subject, content, server)
        senddata_report(subject, content, server)
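Because the script posts to a live webhook, it can help to inspect the message body offline first. The sketch below rebuilds the same text payload that senddata_report sends and prints it instead of posting it. `build_text_payload` is a hypothetical helper, the sample arguments merely stand in for the three values passed by the rule's command, and the standard-library json module is used here in place of simplejson.

```python
import json


def build_text_payload(subject, content, server):
    """Build the JSON body for a WeChat Work text message (nothing is sent)."""
    body = {
        "msgtype": "text",
        "text": {"content": subject + "\n\n" + content + "\n" + server},
    }
    # ensure_ascii=False keeps non-ASCII text readable instead of \uXXXX escapes.
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


if __name__ == "__main__":
    payload = build_text_payload(
        "Test environment Nginx alert:",
        "Endpoint /api/demo is returning status 502 at a high rate!",
        "Client IP: 192.0.2.10",
    )
    print(payload.decode("utf-8"))
```

The same three strings can also be passed to the script by hand (three positional arguments) to confirm end-to-end delivery once the payload looks right.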
Starting the service
- Create the elastalert index in ES
# elastalert-create-index --config /data/elastalert/config.yaml
- Before starting the service, test that the configuration is valid
# elastalert-test-rule --config /data/elastalert/config.yaml /data/elastalert/rules/test_nginx.yaml
- Before starting the service, test that alerting works
# elastalert-test-rule --config /data/elastalert/config.yaml /data/elastalert/rules/test_nginx.yaml --alert
- Start the service in the background
nohup python3 -m elastalert.elastalert --config /data/elastalert/config.yaml --rule /data/elastalert/rules/test_nginx.yaml >> /data/elastalert/elastalert.log 2>&1 &
Installing supervisor
Once supervisor is deployed it manages the elastalert service, so elastalert is no longer started with nohup.
- Install
pip3 install supervisor
- Create the directory supervisor needs
# mkdir /data/supervisor/ -pv
- Generate the supervisor configuration file
# echo_supervisord_conf > /data/supervisor/supervisord.conf
- Edit supervisord.conf
# vim /data/supervisor/supervisord.conf
[unix_http_server]
file=/data/supervisor/supervisor.sock ; the path to the socket file ; changed: socket file location
[supervisord]
logfile=/tmp/supervisord.log ; main log file; default $CWD/supervisord.log
logfile_maxbytes=50MB ; max main logfile bytes b4 rotation; default 50MB
logfile_backups=10 ; # of main logfile backups; 0 means none, default 10
loglevel=info ; log level; default info; others: debug,warn,trace
pidfile=/tmp/supervisord.pid ; supervisord pidfile; default supervisord.pid
nodaemon=false ; start in foreground if true; default false
silent=false ; no logs to stdout if true; default false
minfds=1024 ; min. avail startup file descriptors; default 1024
minprocs=200 ; min. avail process descriptors;default 200
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///data/supervisor/supervisor.sock ; use a unix:// URL for a unix socket ; changed: socket file location
[include]
files = /data/supervisor/conf/*.ini ; changed: location of the included config files
- Start supervisord
supervisord -c /data/supervisor/supervisord.conf
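The two socket settings changed in supervisord.conf have to agree, otherwise supervisorctl cannot reach supervisord. A minimal sketch of that check with Python's standard configparser follows; `socket_paths_agree` is a hypothetical helper, and the embedded sample only mirrors the lines changed in this article rather than the full generated file.

```python
import configparser

SAMPLE = """
[unix_http_server]
file=/data/supervisor/supervisor.sock ; the path to the socket file

[supervisorctl]
serverurl=unix:///data/supervisor/supervisor.sock ; use a unix:// URL for a unix socket

[include]
files = /data/supervisor/conf/*.ini
"""


def socket_paths_agree(conf_text):
    """Return True when supervisorctl's serverurl points at the server's socket file."""
    # inline_comment_prefixes strips the trailing "; ..." comments from values.
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(conf_text)
    socket_file = parser.get("unix_http_server", "file")
    server_url = parser.get("supervisorctl", "serverurl")
    return server_url == "unix://" + socket_file


if __name__ == "__main__":
    print("socket paths agree:", socket_paths_agree(SAMPLE))
```

To check the real file, read /data/supervisor/supervisord.conf into a string and pass it to the function instead of SAMPLE.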
Managing the elastalert service
# mkdir /data/supervisor/conf/
# cat /data/supervisor/conf/elastalert.ini
[program:elastalert_nginx]
directory = /data/elastalert
command = /data/python36/bin/elastalert --config /data/elastalert/config.yaml --rule /data/elastalert/rules/test_nginx.yaml
autostart = True
autorestart = True
user = root
- Reload the configuration
# supervisorctl -c /data/supervisor/supervisord.conf update
# supervisorctl -c /data/supervisor/supervisord.conf status
elastalert_nginx RUNNING pid 28685, uptime 1 day, 1:31:51
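For scripted health checks, the status output above can be parsed rather than read by eye. A minimal sketch, assuming each line consists of the program name, then the state, then free-form detail, separated by whitespace as in the sample above; `parse_status_line` is a hypothetical helper.

```python
def parse_status_line(line):
    """Split one `supervisorctl status` line into (name, state, detail)."""
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("unexpected status line: %r" % line)
    name, state = parts[0], parts[1]
    detail = parts[2].strip() if len(parts) > 2 else ""
    return name, state, detail


if __name__ == "__main__":
    sample = "elastalert_nginx RUNNING pid 28685, uptime 1 day, 1:31:51"
    name, state, detail = parse_status_line(sample)
    if state != "RUNNING":
        print("%s is %s (%s)" % (name, state, detail))
    else:
        print("%s is running: %s" % (name, detail))
```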
Running supervisord as a systemd service
- Kill the old process
kill PID
- Write the supervisord.service file
# vim /usr/lib/systemd/system/supervisord.service
[Unit]
Description=Supervisor daemon
[Service]
Type=forking
ExecStart=/data/python36/bin/supervisord -c /data/supervisor/supervisord.conf
ExecStop=/data/python36/bin/supervisorctl -c /data/supervisor/supervisord.conf shutdown
ExecReload=/data/python36/bin/supervisorctl -c /data/supervisor/supervisord.conf reload
KillMode=process
Restart=on-failure
RestartSec=30s
[Install]
WantedBy=multi-user.target
- Start the service
systemctl start supervisord
systemctl enable supervisord
- Check the service
# supervisorctl -c /data/supervisor/supervisord.conf status
elastalert_nginx RUNNING pid 28685, uptime 1 day, 1:31:51