Process monitoring with monit
1. Installing monit
yum -y install pam* openssl flex openssl-devel
tar -xf monit.tar.gz && cd monit && ./configure && make && make install
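Once the build is installed, monit still needs a control file before any of the checks in the usage section take effect. The following is a minimal sketch of one way to create it, not the author's setup: the helper name `write_monitrc`, the 60-second poll interval, the localhost-only web interface on port 2812, and the `/etc/monit.d` include directory are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal monit control file to the given path.
# All values inside the heredoc are illustrative assumptions.
write_monitrc() {
    rc_path="$1"
    cat > "$rc_path" <<'EOF'
# Poll every 60 seconds; one poll is one "cycle" in the check rules.
set daemon 60
# Web interface reachable from this machine only.
set httpd port 2812 and
    use address localhost
    allow localhost
# Load per-service check files (directory name is an assumption).
include /etc/monit.d/*
EOF
    # monit rejects control files that group or others can read.
    chmod 600 "$rc_path"
}
```

Run as root, something like `write_monitrc /etc/monitrc && monit -t` would write the file and check its syntax, and `monit reload` would make a running daemon pick up later changes. The path monit actually reads depends on how `./configure` was invoked.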
2. Usage reference
2.1 Monitoring processes
# Monitor a process by its PID file
check process DD_CRON with pidfile /var/run/crond.pid
    start program = "/etc/init.d/crond start"
    stop program = "/etc/init.d/crond stop"

# Monitor a process by name (the pattern is matched against the command line)
check process OPENSIPS matching "/data/opensips/sbin/opensips -P /var/run/opensips.pid -m 2048 -M 16"
    start program = "/data/opensips/sbin/opensipsctl start"
    stop program = "/data/opensips/sbin/opensipsctl stop"
2.2 Monitoring ports
# Monitor a port and start the service as a specific user
# TCP port
check host tomcat_service with address 127.0.0.1
    start program = "/etc/init.d/tomcat start" as uid "user" and gid "user" and with timeout 60 seconds
    stop program = "/etc/init.d/tomcat stop" as uid "user" and gid "user" and with timeout 60 seconds
    if failed port 8088 with timeout 60 seconds then restart
    if 2 restarts within 2 cycles then exec "/usr/bin/python /home/shell-scripts/send_messages/send_meassage.py tomcat_service"

# UDP port
check host devhost-udp with address XXX.XXX.XXX.XXX
    if failed port 3001 type udp then exec "/etc/init.d/app restart"
2.3 Monitoring memory
# Restart the process if its total memory use (including child processes) exceeds 200 MB
check process NXLOG with pidfile /var/run/graylog/collector-sidecar/nxlog.pid
    start program = "/etc/init.d/collector-sidecar start"
    stop program = "/etc/init.d/collector-sidecar stop"
    if totalmem > 200 MB then restart
2.4 Monitoring disks
# Alert when disk space usage exceeds 65%
check filesystem ROOT with path /dev/vda1
    if space usage > 65% then alert
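Before settling on a threshold such as 65%, it can help to see what the filesystem currently reports. The sketch below uses `df` to print a used-space percentage that is roughly comparable to the value the `space usage` test looks at. The function name `fs_usage_percent` is hypothetical, and monit's own calculation may differ slightly from what `df` shows.

```shell
#!/bin/sh
# Print the used-space percentage (an integer without the % sign) of the
# filesystem that holds the given path or device.
fs_usage_percent() {
    # -P forces the POSIX one-line-per-filesystem layout; column 5 is "NN%".
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Example: current usage of the root filesystem.
fs_usage_percent /
```

Passing the device from the check above (`/dev/vda1`) instead of `/` should report the same filesystem that monit watches.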
2.5 Monitoring the network (ping/ICMP)
# Use ping (ICMP echo) to detect network problems
check host example with address 0.0.0.0
    if failed icmp type echo count 3 with timeout 20 seconds then alert
    if failed ping then alert
    if failed port 5060 type udp protocol sip then alert
2.7 Monitoring by time window
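# Skip this check during hours 0-2 (00:00 to 02:59), Monday through Sunday
# Note: monit documents the day-of-week field as 0-6 (0 = Sunday); "*" selects every day
check program PROC with path "/root/check_sip/check_status.sh"
    if status = 0 for 3 times within 5 cycles then exec "/bin/sh /root/call_mobile.sh" repeat every 4 cycles
    alert 648813099@qq.com with reminder on 4 cycles
    not every "* 0-2 * * 1-7"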
2.8 Custom monitoring scripts
# Check the state of the web port
check program webstatus with path "/home/check_http_status.sh"
    if status = 0 then exec "/home/check_http_status.sh start" repeat every 5 cycles
    # While the problem persists, "/home/check_http_status.sh start" is run again every 5 cycles
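The notes do not show what a script used by `check program` looks like. As an illustration only, here is a sketch of a small HTTP probe. It is not the author's `check_http_status.sh`: the function names, the default URL, and the `check` argument are hypothetical, and it assumes the conventional exit-code meaning where 0 is healthy and non-zero is a fault.

```shell
#!/bin/sh
# Hypothetical sketch of a check script for monit's "check program".
# Exit-code convention here (an assumption): 0 = healthy, 1 = fault.

# Return 0 for 2xx/3xx HTTP status codes, 1 for anything else.
http_code_ok() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print only the HTTP status code; curl prints 000 if the request fails.
probe() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Run the probe only when invoked as: script.sh check [URL]
if [ "${1:-}" = "check" ]; then
    code=$(probe "${2:-http://127.0.0.1:8088/}")
    echo "HTTP status: $code"   # monit keeps the program output for its status view
    http_code_ok "$code"
    exit $?
fi
```

With this convention a fault produces a non-zero exit, so a matching rule would test `if status != 0`. The rule in this section triggers on `status = 0`, which suggests the original script signals a fault by exiting 0.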
2.9 Monitoring files
check file http_log with path /var/log/httpd/error_log
    ignore match "doorduweb"
    ignore match "% Total"
    ignore match "Dload Upload"
    ignore match "Permission denied"
    ignore match "favicon.ico"
    # Brackets are escaped so the pattern matches the literal text [error]
    # instead of being read as a regex character class
    if match "\[error\]" then alert
    alert a@email.com
    alert b@email.com
4. References
https://mmonit.com/monit/documentation/monit.html
https://mmonit.com/wiki/Monit/ConfigurationExamples#syslogd