Monit configuration file
Monitoring mode (MONITORING MODE)
Monit supports three monitoring modes:
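active -- Monit monitors a service and, when a problem occurs, acts on it: it sends alerts and stops, starts, or restarts the service. This is the default mode.
passive -- Monit monitors a service but does not try to fix the problem; it still sends alerts.
manual -- Monit enters active mode for a service only once that service has been brought under Monit's control, for example by running this command in the console:
    monit start sybase
(Monit will call sybase's start method and enable monitoring)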
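As a rough sketch of how a mode is selected, the mode statement goes inside a service entry. The pidfile and init-script paths below are hypothetical, and this reflects the Monit versions that document the manual mode; newer releases may handle it differently.

```
# Hypothetical service entry; the paths are placeholders, not from the original text
check process sybase with pidfile /var/run/sybase.pid
    start program = "/etc/init.d/sybase start"
    stop program  = "/etc/init.d/sybase stop"
    # Monitor only after "monit start sybase" has been run
    mode manual
```

Leaving out the mode statement gives the default, active.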
ALERT MESSAGES
Monit sends an email alert in the following situations:
o A service timed out
o A service does not exist
o A service related data access problem
o A service related program execution problem
o A service is of invalid object type
o A program status failed
o A icmp problem
o A port connection problem
o A resource statement match
o A file checksum problem
o A file size problem
o A file/directory timestamp problem
o A file/directory/filesystem permission problem
o A file/directory/filesystem uid problem
o A file/directory/filesystem gid problem
o An action is done per administrator's request
Monit also sends an alert whenever a monitored object changes. These changes include:
o Monit started, stopped or reloaded
o A file checksum changed
o A file size changed
o A file content match
o A file/directory timestamp changed
o A filesystem mount flags changed
o A process PID changed
o A process PPID changed
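Alert statements come in two forms:
Global -- common for all services
Local -- per service
In either form you can define several alert statements. In other words, you can send different emails to different addresses.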
Setting a global alert statement
When a monitored service changes, Monit sends an alert to every recipient in the global list. The syntax of the global alert statement is:
SET ALERT mail-address [[NOT]{EVENTS}] [MAIL-FORMAT {mail-format}] [REMINDER number]
Simple usage: set alert foo@bar
EVENTS, MAIL-FORMAT and REMINDER are described below.
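A minimal sketch of the global form with its optional parts filled in. The address ops@bar is a made-up placeholder, and the "only on", "with" and "cycles" wording follows the style of the Monit manual's examples; treat it as an illustration to check against your Monit version.

```
# Send every alert to foo@bar
set alert foo@bar

# Send only timeout and nonexist alerts to a second, hypothetical address,
# with a reminder every 10 cycles while the problem persists
set alert ops@bar only on { timeout, nonexist } with reminder on 10 cycles
```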
Setting a local alert statement
Each service can have its own recipient list:
ALERT mail-address [[NOT]{EVENTS}] [MAIL-FORMAT {mail-format}] [REMINDER number]
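Dropping SET makes the statement local to the enclosing service. The other local form is:
NOALERT mail-address
If you only want to receive certain alerts for a service, for example only timeout or nonexist events, you can write: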
check process myproc with pidfile /var/run/my.pid
alert foo@bar only on { timeout, nonexist }
...
You can also send alerts for every event except a few. For example, to be notified of all events except instance, you can write:
check system myserver
    alert foo@bar but not on { instance }
    ...
This is equivalent to:
alert foo@bar on { action checksum connection content data exec fsflags gid icmp invalid nonexist permission pid ppid resource size status timeout timestamp uid uptime }
An instance event is raised when the Monit program itself starts or stops.
You can also route different events to different addresses:
alert foo@bar { nonexist, timeout, resource, icmp, connection }
alert security@bar on { checksum, permission, uid, gid }
alert manager@bar
The following events can be used in an alert filter:
action, checksum, connection, content, data, exec, fsflags, gid, icmp, instance, invalid, nonexist, permission, pid, ppid, resource, size, status, timeout, timestamp, uid, uptime
To stop an address from receiving alerts, use noalert:
noalert appadmin@bar
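As a hedged illustration of where noalert fits: it is a local statement, so it belongs inside a service entry, typically to switch off a globally configured recipient for that one service. The service name and pidfile are reused from the earlier example.

```
# appadmin@bar receives alerts for all services by default
set alert appadmin@bar

check process myproc with pidfile /var/run/my.pid
    # ...except for this service
    noalert appadmin@bar
```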
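When the same address appears in both a global and a local alert statement, the local one applies to that service. In this example foo@bar receives all alerts for myfoo but only timeout alerts for mybar:
set alert foo@bar

check process myfoo with pidfile /var/run/myfoo.pid
    ...
check process mybar with pidfile /var/run/mybar.pid
    alert foo@bar only on { timeout }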
$EVENT
    A string describing the event that occurred. The values are fixed and are:

    Event:     | Failure state:             | Success state:
    -----------------------------------------------------------------------
    ACTION     | "Action done"              | "Action done"
    CHECKSUM   | "Checksum failed"          | "Checksum succeeded"
    CONNECTION | "Connection failed"        | "Connection succeeded"
    CONTENT    | "Content failed"           | "Content succeeded"
    DATA       | "Data access error"        | "Data access succeeded"
    EXEC       | "Execution failed"         | "Execution succeeded"
    FSFLAG     | "Filesystem flags failed"  | "Filesystem flags succeeded"
    GID        | "GID failed"               | "GID succeeded"
    ICMP       | "ICMP failed"              | "ICMP succeeded"
    INSTANCE   | "Monit instance changed"   | "Monit instance changed not"
    INVALID    | "Invalid type"             | "Type succeeded"
    NONEXIST   | "Does not exist"           | "Exists"
    PERMISSION | "Permission failed"        | "Permission succeeded"
    PID        | "PID failed"               | "PID succeeded"
    PPID       | "PPID failed"              | "PPID succeeded"
    RESOURCE   | "Resource limit matched"   | "Resource limit succeeded"
    SIZE       | "Size failed"              | "Size succeeded"
    STATUS     | "Status failed"            | "Status succeeded"
    TIMEOUT    | "Timeout"                  | "Timeout recovery"
    TIMESTAMP  | "Timestamp failed"         | "Timestamp succeeded"
    UID        | "UID failed"               | "UID succeeded"
    UPTIME     | "Uptime failed"            | "Uptime succeeded"

$SERVICE
    The service entry name in monitrc
$DATE
    The current time and date (RFC 822 date style).
$HOST
    The name of the host Monit is running on
$ACTION
    The name of the action which was done. Action names are fixed and are:

    Action:   | Name:
    ------------------------
    ALERT     | "alert"
    EXEC      | "exec"
    RESTART   | "restart"
    START     | "start"
    STOP      | "stop"
    UNMONITOR | "unmonitor"

$DESCRIPTION
    The description of the error condition
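These variables are meant for the MAIL-FORMAT part of an alert statement. The sketch below shows one plausible way to combine them; the sender address and the wording of the subject and message are illustrative assumptions, not taken from the original text.

```
check process myproc with pidfile /var/run/my.pid
    alert foo@bar with mail-format {
        # monit@$HOST is a placeholder sender address
        from: monit@$HOST
        subject: $SERVICE $EVENT at $DATE
        message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION
    }
```

Monit substitutes each variable when it builds the mail, so the subject would carry the service name, one of the fixed event strings from the table, and the date.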