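Zabbix series (9): automatically triggering shell script tasks on the zabbix-agent side with Zabbix 3.0
Using Zabbix to automatically trigger remote script execution
When a Zabbix trigger reaches its threshold, an action is executed: it sends an alert message or runs a remote command.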
Environment
Server: CentOS 6.5 Final x86_64
Zabbix: zabbix-3.0.4 server/agent
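Notes
1. Remote commands are sent from the server to the agent, so agents running in active mode are not supported.
2. Proxy mode is not supported.
3. The zabbix user must have permission to run the command; sudo can be used to grant root privileges (configure sudo to work without a password).
4. Zabbix only launches the remote command and does not verify its result. To see when the action ran, check "Monitoring-->Events"; to see whether the remote command was executed, check "Reports-->Action log" (a successful run shows "Executed").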
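One way to check from the server that a passive agent accepts remote commands is to request the `system.run` item with `zabbix_get`. The helper below is only a sketch: the function name `agent_run`, the `ZABBIX_GET` override variable and the address 192.0.2.10 are hypothetical, and 10050 is assumed to be the agent's listen port (its default).

```shell
# Hypothetical helper: ask a passive agent to run a command via system.run.
# ZABBIX_GET can be overridden (set it to "echo" for a dry run that only
# prints the arguments that would be passed to zabbix_get).
agent_run() {
    local host="$1" cmd="$2" port="${3:-10050}"
    "${ZABBIX_GET:-zabbix_get}" -s "$host" -p "$port" -k "system.run[$cmd]"
}

# Example (placeholder address):
#   agent_run 192.0.2.10 id
```

If remote commands are still disabled on the agent, the request is expected to be refused rather than to return the command's output.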
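Operations on the zabbix-agent side:
Scenario: monitor the HTTP response code of a web page on a server; if it is not 200 four times in a row, trigger an action that restarts the related service.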
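The condition in this scenario (the last four checks all returned something other than 200) can be illustrated in plain shell. This is only a sketch of the logic, not how Zabbix evaluates triggers internally; the function name `should_restart` is hypothetical. A status code could be collected with, for example, `curl -s -o /dev/null -w '%{http_code}' URL`.

```shell
# Hypothetical illustration: succeed (return 0) only when at least four
# status codes are given and the last four all differ from 200.
should_restart() {
    [ "$#" -ge 4 ] || return 1
    # Drop older samples so that only the last four remain.
    shift $(( $# - 4 ))
    local code
    for code in "$@"; do
        [ "$code" != "200" ] || return 1
    done
    return 0
}

# Example:
#   should_restart 200 500 500 502 503 && echo "restart needed"
```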
1. Enable remote commands on the agent, and remember to restart the zabbix-agent service afterwards.
vim /etc/zabbix/zabbix_agentd.conf
EnableRemoteCommands = 1
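Before restarting the agent, it can help to confirm that the option is really active and not left commented out. A minimal sketch, assuming the config path used above; the function name `remote_commands_enabled` is hypothetical.

```shell
# Hypothetical helper: succeed if the given agent config file contains an
# uncommented EnableRemoteCommands option set to 1.
remote_commands_enabled() {
    local conf="${1:-/etc/zabbix/zabbix_agentd.conf}"
    grep -Eq '^[[:space:]]*EnableRemoteCommands[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf"
}

# Example:
#   remote_commands_enabled && echo "remote commands are enabled"
```

After the check passes, restart the agent. On CentOS 6 this is typically `service zabbix-agent restart`, but the service name depends on how the agent was installed.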
2. Use visudo to allow the zabbix user to run the required commands
① Add the following:
# allows 'zabbix' user to run all commands without password.
zabbix ALL=NOPASSWD: ALL
# allows 'zabbix' user to run the ad-server restart script without password.
zabbix ALL=NOPASSWD: /bin/bash /usr/local/zabbix-agent/scripts/restart_ad_server.sh
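② Comment out the following line, otherwise the command cannot be executed (the agent runs commands without a terminal, and requiretty makes sudo refuse such calls):
# Defaults requiretty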
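To double-check the result, one option is to look for a still-active requiretty line. The sketch below assumes the default sudoers location and must be run as root to read it; the function name `requiretty_active` is hypothetical.

```shell
# Hypothetical helper: succeed if the sudoers file still contains an
# active (uncommented) "Defaults requiretty" line.
requiretty_active() {
    local file="${1:-/etc/sudoers}"
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$file"
}

# Example:
#   requiretty_active && echo "requiretty is still enabled"
```

`visudo -c` can additionally be used to check the file's syntax, and `sudo -l -U zabbix` to list what the zabbix user is allowed to run.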
Add the script:
vim /usr/local/zabbix-agent/scripts/restart_ad_server.sh
#!/bin/bash
# kill the ad-server process if it is running
pids=$(ps -ef | grep ad-server-1.0.0.jar | grep -v grep | awk '{print $2}')
if [ -n "$pids" ]; then
    # $pids is left unquoted on purpose so that several PIDs are passed separately
    kill -9 $pids
    sleep 2
fi
# start ad-server
cd /data/ad-push/ && /bin/bash /data/ad-push/start.sh start
Make it executable:
chmod +x /usr/local/zabbix-agent/scripts/restart_ad_server.sh
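The PID lookup used by the script can be isolated into a small filter, which makes it easy to try against sample `ps -ef` output before relying on it. This is a sketch; the function name `pids_matching` is hypothetical.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every line that contains the given string,
# skipping lines that belong to a grep process.
pids_matching() {
    awk -v pat="$1" 'index($0, pat) > 0 && $0 !~ /grep/ { print $2 }'
}

# Example:
#   ps -ef | pids_matching ad-server-1.0.0.jar
```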
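Operations on the zabbix-server side
Set up the Action
Configuration-->Actions-->Create action
Action
On the Action tab,
set Name: adpush_not_200_restart_ad_server
#The action name can be anything; the remaining fields on the Action tab can keep their default values.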
Conditions
On the Conditions tab, add new conditions so that the action matches more precisely, for example:
New condition: Trigger severity = Warning
New condition: Trigger name like ads_9010_status_not 200
#The trigger name corresponds to the trigger name defined in step 1.
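Operations
On the Operations tab, add a new "Action operation" by clicking "New":
Operation type: select "Remote Command"
Target list: add "Current host" as the target
#the agent runs on the host itself
Type: select "Custom script"
Execute on: select "Zabbix agent"; the command is "sudo /bin/bash /usr/local/zabbix-agent/scripts/restart_ad_server.sh"
Example of running a remote command to restart the nginx service:
#The command runs as the zabbix account, not root; without sudo, the command runs but has no effect.
#Also note: I tried entering a specific command instead of a script; the command was executed but had no effect, and since there was no failure log, I could not determine why.
#Leave everything else at its default value and click "Add".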
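Because a remote command leaves no failure log by default, one possible way to make such problems diagnosable is to wrap the command so that its output and exit status are appended to a file. This is a sketch, not part of the original setup; the function name `run_logged` and the log path in the example are hypothetical.

```shell
# Hypothetical wrapper: run a command, append its output and exit status
# to a log file, and return the command's exit status.
run_logged() {
    local log="$1"; shift
    local status
    {
        echo "$(date '+%Y-%m-%d %H:%M:%S') running: $*"
        "$@" 2>&1
        status=$?
        echo "exit=$status"
    } >> "$log"
    return "$status"
}

# Example (log path is a placeholder):
#   run_logged /tmp/restart_ad_server.log /bin/bash /data/ad-push/start.sh start
```

The log file must be writable by whichever user ends up running the wrapper (zabbix, or root when called through sudo).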
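Problem:
The command failed to execute and the service was not restarted automatically, so the zabbix user was granted full sudo privileges:
# allows 'zabbix' user to run all commands as any user without password.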
zabbix ALL=(ALL) NOPASSWD: ALL
Modify restart_ad_server.sh:
/usr/local/zabbix-agent/scripts/restart_ad_server.sh
#!/bin/bash
# kill the ad-server process (the grep process itself is excluded)
ps -ef | grep ad-server-1.0.0.jar | grep -v grep | awk '{print $2}' | xargs -r /bin/kill
sleep 2
# start ad-server
# start.sh is called by its absolute path
cd /home/ad-push/ && /bin/bash /home/ad-push/start.sh start
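Version of the script that restarts two ad-server instances:
#!/bin/bash
# kill the ad-server processes (the grep process itself is excluded)
ps -ef | grep ad-server-1.0.0.jar | grep -v grep | awk '{print $2}' | xargs -r /bin/kill
sleep 2
ps -ef | grep ad-server-1.1.0.jar | grep -v grep | awk '{print $2}' | xargs -r /bin/kill
sleep 2
# start the ad-server instances
cd /data/ad-push/ && /bin/bash /data/ad-push/start.sh start
cd /data/ad-push2/ && /bin/bash /data/ad-push2/start.sh start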
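The restart scripts rely on absolute paths, most likely because commands started by the agent do not inherit the PATH of an interactive login shell. A small guard can make that requirement explicit before a command is used. This is a sketch; the function name `is_abs_executable` is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument is an absolute path
# to an existing executable file (directories are rejected).
is_abs_executable() {
    case "$1" in
        /*) [ -x "$1" ] && [ ! -d "$1" ] ;;
        *)  return 1 ;;
    esac
}

# Example:
#   is_abs_executable /bin/bash || echo "not an absolute executable path"
```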
Modify the start.sh script as follows:
Invoke java by its absolute path as well:
/home/java/jdk1.8.0_40/bin/java
#!/bin/bash
LANG="zh_CN.UTF-8"
APP_HOME=$(echo `pwd` | sed 's/bin//')
APPPIDFILE=$APP_HOME/app.pid
case $1 in
start)
    echo "Starting server... "
    HEAP_MEMORY=1024m
    PERM_MEMORY=64m
    JMX_PORT=1111
    JMX_HOST=1.1.1.1
    JAVA_OPTS="-server -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=jvm.log -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dsun.net.inetaddr.ttl=15 "
    shift
    ARGS=($*)
    for ((i=0; i<${#ARGS[@]}; i++)); do
        case "${ARGS[$i]}" in
        -D*) JAVA_OPTS="${JAVA_OPTS} ${ARGS[$i]}" ;;
        -Heap*) HEAP_MEMORY="${ARGS[$i+1]}" ;;
        -Perm*) PERM_MEMORY="${ARGS[$i+1]}" ;;
        -JmxPort*) JMX_PORT="${ARGS[$i+1]}" ;;
        -JmxHost*) JMX_HOST="${ARGS[$i+1]}" ;;
        esac
    done
    JAVA_OPTS="${JAVA_OPTS} -Xms${HEAP_MEMORY} -Xmx${HEAP_MEMORY} -XX:PermSize=${PERM_MEMORY} -XX:MaxPermSize=${PERM_MEMORY} -XX:MaxDirectMemorySize=128m -Dcom.sun.management.jmxremote.port=${JMX_PORT} -Djava.rmi.server.hostname=${JMX_HOST} -Dapp.home=${APP_HOME}"
    echo "start jvm args ${JAVA_OPTS}"
    # java is invoked by its absolute path
    nohup /home/java/jdk1.8.0_40/bin/java $JAVA_OPTS -cp .:./ad-server-1.0.0.jar org.springframework.boot.loader.JarLauncher > /dev/null &
    echo $! > $APPPIDFILE
    echo STARTED
    ;;
stop)
    echo "Stopping server ... "
    if [ ! -f $APPPIDFILE ]
    then
        echo "error: could not find file $APPPIDFILE"
        exit 1
    else
        kill -9 $(cat $APPPIDFILE)
        rm $APPPIDFILE
        echo STOPPED
    fi
    ;;
*)
    echo "Please enter start|stop ... "
    ;;
esac
exit 0
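Since start.sh records the PID in app.pid, that file can also be used to check whether the server actually came up after a restart. A minimal sketch, assuming the PID file layout of the script above; the function name `is_running` is hypothetical.

```shell
# Hypothetical helper: succeed if the PID file exists and the process it
# names is still alive. `kill -0` sends no signal; it only checks that the
# process exists and that the caller is allowed to signal it.
is_running() {
    local pidfile="$1" pid
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example:
#   is_running /data/ad-push/app.pid && echo "ad-server is up"
```

Note that a stale PID file whose number has been reused by another process would still pass this check.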