涛子 (Taozi) - Simple Is Beautiful

Q: Are there any documented real-world deployments?
A:
http://wiki.lustre.org/Check_MK/Graphite/Graphios_Setup_Guide
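Q: How do I install check-mk-agent?
A:
Install it from the EPEL repository: yum -y install check-mk-agent xinetd && /etc/init.d/xinetd start && chkconfig xinetd on. Also allow inbound connections on TCP port 6556. At the time of writing, the newest version in EPEL is 1.2.6p16-3.el6.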
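Once xinetd is running, the agent prints its data as plain text to anyone who connects on port 6556, with each block introduced by a <<<section>>> header. The following is a minimal sketch for verifying an installation from another machine. The function names are illustrative and not part of Check_MK, and the host passed to fetch_agent_output is whatever machine you installed the agent on.

```python
import socket


def fetch_agent_output(host, port=6556, timeout=10.0):
    """Connect to the agent and read until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def section_names(agent_output):
    """Return the text inside each <<<...>>> header, in order of appearance."""
    names = []
    for line in agent_output.splitlines():
        line = line.strip()
        if line.startswith("<<<") and line.endswith(">>>"):
            names.append(line[3:-3])
    return names
```

Calling section_names(fetch_agent_output("kvm-48-113")) should return a non-empty list if the port is open and the agent is working; an empty list or a connection error points at xinetd or the firewall.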
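Q: How does the agent collect application metrics, such as MySQL?
A:
Place the plugin in /usr/share/check-mk-agent/plugins. To run a plugin asynchronously, create a subdirectory under plugins named after the interval in seconds, for example /usr/share/check-mk-agent/plugins/60.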
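A plugin is simply an executable whose standard output is appended to the agent output. A minimal sketch of one is shown here; the section name myapp_status and the collect function are hypothetical placeholders, and the server still needs a matching check that understands the section.

```python
#!/usr/bin/env python
import time

SECTION = "myapp_status"  # hypothetical section name, not a shipped check


def collect():
    """Placeholder for real data collection; returns (key, value) pairs."""
    return [("collected_at", int(time.time()))]


def render(section, rows):
    """Format rows as an agent section: a <<<name>>> header, then one line per row."""
    lines = ["<<<%s>>>" % section]
    lines.extend("%s %s" % (key, value) for key, value in rows)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(SECTION, collect()))
```

Saved as an executable file in the plugins directory, this runs on every agent call. Saved under plugins/60 instead, it would run in the background at most once every 60 seconds, which suits slow collectors.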
Q: How do I filter out unwanted check types on the server?
A:
Edit /opt/omd/sites/monitor/etc/check_mk/main.mk and add the following:
ignored_checktypes = ["chrony", "kernel", "cpu.threads", "logwatch", "megaraid_pdisks", "ipmi", "ipmi_sensors", "mounts", "ps", "ps.perf", "postfix_mailq", "postfix_mailq_status", "logins", "omd_apache", "omd_status", "livestatus_status"]
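The intended effect of this setting can be pictured as a filter over discovered services. The sketch below is only an illustration and not Check_MK's own code; it assumes that names are compared exactly, so ignoring "cpu.threads" leaves "cpu.loads" untouched. The function name and the shortened list are made up for the example.

```python
# Shortened copy of the list above, for illustration only.
IGNORED_CHECKTYPES = ["chrony", "kernel", "cpu.threads", "logwatch", "ps", "ps.perf"]


def filter_discovered(services, ignored):
    """Drop (check_type, item) pairs whose check type is in the ignore list."""
    ignored_set = set(ignored)
    return [(check_type, item) for check_type, item in services
            if check_type not in ignored_set]


discovered = [("df", "/"), ("ps", "sshd"), ("cpu.loads", None), ("cpu.threads", None)]
kept = filter_discovered(discovered, IGNORED_CHECKTYPES)
```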
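Q: How do I optimize when the volume of monitoring data is large?
A:
On a single server, optimize by organizing with directories. Across multiple servers, distributed monitoring is also supported.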
Q: How do I change the check interval and the number of retries?
A:
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Normal check interval for service checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Retry check interval for service checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Maximum number of check attempts for service: set to 1

Host & Service Parameters -> Monitoring Configuration -> Host Checks -> Normal check interval for host checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Host Checks -> Retry check interval for host checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Host Checks -> Maximum number of check attempts for host: set to 1
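To see what these three values mean for alerting delay, here is a small worked calculation. It assumes Nagios-style scheduling: a problem is first seen at the next regular check, then rechecked at the retry interval until the maximum number of attempts is reached. The function name is illustrative.

```python
def minutes_until_hard_state(normal_interval, retry_interval, max_attempts):
    """Worst-case minutes between a failure and the problem being confirmed.

    The failure may happen just after a regular check, so the first
    detection can take up to one normal interval; each further attempt
    adds one retry interval.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return normal_interval + (max_attempts - 1) * retry_interval


# With the settings above (1 minute, 1 minute, 1 attempt):
print(minutes_until_hard_state(1, 1, 1))
```

With a single attempt the retry interval never comes into play, so a problem is confirmed on the first failed check. The trade-off is that a single transient failure also triggers an alert.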
Q: How do I automatically add newly discovered services?
A:
Host & Service Parameters -> Monitoring Configuration -> Inventory and Check_MK settings -> Periodic service discovery: add a rule
Perform service discovery every: set to 1h
Severity of unmonitored services: set to warning
Severity of vanished services: set to ok
Under Automatically update service configuration, set Mode to Add unmonitored services & remove vanished services
Group discovery and activation for up to: set to 10m
Q: Is bulk host import supported?
A: Yes. Host -> Bulk host import
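Before pasting a long host list into the import page, it can help to clean it up first. The sketch below assumes the page accepts one host name per line, which may differ between Check_MK versions. The function name and the allowed-character rule are assumptions made for the example.

```python
import re

# Assumed rule for acceptable host names: letters, digits, dot, dash, underscore.
_HOSTNAME_RE = re.compile(r"^[A-Za-z0-9_.-]+$")


def build_import_list(raw_names):
    """Strip whitespace, drop blanks and duplicates, and reject odd names."""
    seen = set()
    result = []
    for name in raw_names:
        name = name.strip()
        if not name or name in seen:
            continue
        if not _HOSTNAME_RE.match(name):
            raise ValueError("unexpected characters in host name: %r" % name)
        seen.add(name)
        result.append(name)
    return "\n".join(result)


# kvm-48-114 is a made-up second host for illustration.
print(build_import_list(["kvm-48-113", " kvm-48-114 ", "kvm-48-113", ""]))
```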
Q: How do I create a host group and assign hosts to it?
A: 
Host & Service Groups  -> new host group
Host Tags -> new tag group
Host & Service Parameters -> Grouping -> Assignment of hosts to host groups
Create a new rule: for Assignment of hosts to host groups select group_bj, and under Conditions set the host tag to location is group_bj
Q: How do I switch the disk IO check from summary mode to one check per disk?
A:
Host & Service Parameters -> Parameters for discovered services -> Storage, Filesystems and Files -> Discovery mode for Disk IO check 
Create a new rule and select Create a separate check for each physical disk
Q: How do I name network device interfaces by description instead of by ID?
A:
Host & Service Parameters -> Parameters for discovered services -> Discovery - automatic service detection -> Network Interface and Switch Port Discovery: create a new rule, select Use description as service name for network interface checks, and choose use description
Q: How do I change monitoring thresholds?
A:
#load
Host & Service Parameters -> Parameters for discovered services -> Operating System Resources -> CPU load (not utilization!): add a rule, select fixed level, and set warning and critical to 10

#cpu
Host & Service Parameters -> Parameters for discovered services -> Operating System Resources -> CPU utilization on Linux/UNIX: add a rule, select Alert on too high CPU utilization and set warning and critical to 95, then select Alert on too high disk wait (IO wait) and set warning and critical to 50

#memory
Host & Service Parameters -> Parameters for discovered services -> Operating System Resources -> Memory and Swap usage on Linux: add a rule, select Levels for RAM, and set warning and critical to 95

#storage
Host & Service Parameters -> Parameters for discovered services -> Storage, Filesystems and Files -> Filesystems (used space and growth)
Add a rule, select Levels for filesystem free space, choose dynamic level, and enter:
Filesystem larger than 1T -> Absolute free space -> set warning and critical to 20000MB
Filesystem larger than 300G -> Absolute free space -> set warning and critical to 10000MB
Filesystem larger than 100G -> Absolute free space -> set warning and critical to 5000MB
Filesystem larger than 10G -> Absolute free space -> set warning and critical to 1000MB
Filesystem larger than 1G -> Absolute free space -> set warning and critical to 100MB
Filesystem larger than 300M -> Absolute free space -> set warning and critical to 20MB
Filesystem larger than 100M -> Absolute free space -> set warning and critical to 10MB
Filesystem larger than 0B -> Absolute free space -> set warning and critical to 0MB
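The dynamic levels above amount to a lookup: the rule for the largest size the filesystem exceeds is the one that applies. This sketch expresses that lookup, assuming binary units (1G = 1024^3 bytes) and a strict "larger than" comparison. Both are assumptions about how the sizes are interpreted, and the names are illustrative.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# (minimum filesystem size in bytes, free-space level in MB), largest first.
DYNAMIC_LEVELS = [
    (1 * TB, 20000),
    (300 * GB, 10000),
    (100 * GB, 5000),
    (10 * GB, 1000),
    (1 * GB, 100),
    (300 * MB, 20),
    (100 * MB, 10),
    (0, 0),
]


def free_space_level_mb(fs_size_bytes):
    """Return the free-space threshold in MB for a filesystem of the given size."""
    for min_size, level_mb in DYNAMIC_LEVELS:
        if fs_size_bytes > min_size:
            return level_mb
    return 0  # an empty (0-byte) filesystem matches no rule


print(free_space_level_mb(500 * GB))
```

A 500G filesystem therefore alerts when less than 10000MB is free, while anything of 100M or smaller never alerts on free space.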

#traffic
Host & Service Parameters -> Parameters for discovered services -> Networking -> Network interfaces and switch ports: create one new rule per link speed, as follows.

1g
Operating speed -> ignore speed, Operational state -> 1 - up, Assumed input speed -> 1g, Assumed output speed -> 1g, Measurement unit -> Bits, Used bandwidth (minimum or maximum traffic) -> in/out & upper & percent level & warning 95 & critical 95, Average values -> 10mins, Port Specification -> em1 em2 em3 em4 eth0 eth1 eth2 eth3

2g
Operating speed -> ignore speed, Operational state -> 1 - up, Assumed input speed -> 2000000000 bit, Assumed output speed -> 2000000000 bit, Measurement unit -> Bits, Used bandwidth (minimum or maximum traffic) -> in/out & upper & percent level & warning 95 & critical 95, Average values -> 10mins, Port Specification -> bond0 bond1

10g
Operating speed -> ignore speed, Operational state -> 1 - up, Assumed input speed -> 10g, Assumed output speed -> 10g, Measurement unit -> Bits, Used bandwidth (minimum or maximum traffic) -> in/out & upper & percent level & warning 95 & critical 95, Average values -> 10mins, Port Specification -> p1p1 p1p2 p2p1 p2p2
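Because the levels are percentages of the assumed speed, the absolute bit rate at which each rule fires follows directly. A short worked calculation, assuming decimal units (1g = 1,000,000,000 bit/s, consistent with the 2000000000 bit entered for the bonded ports); the names are illustrative.

```python
# Assumed link speeds in bit/s for the three rules above.
ASSUMED_SPEED_BITS = {
    "1g": 1000000000,
    "2g": 2000000000,
    "10g": 10000000000,
}


def bandwidth_threshold_bits(speed_bits, percent=95):
    """Bit rate at which a percentage level of the assumed speed is reached."""
    return speed_bits * percent // 100


for label in ("1g", "2g", "10g"):
    print(label, bandwidth_threshold_bits(ASSUMED_SPEED_BITS[label]))
```

Since Average values is set to 10 minutes, it is the averaged rate over that window, not a momentary spike, that has to reach these figures.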
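
# List the available check types and confirm that redis.info is present
cmk -L | grep redis
redis.info                                   tcp    (no man page present)

# Discover services on host kvm-48-113 using only the redis.info check
cmk --checks=redis.info -I kvm-48-113

# Run the checks for kvm-48-113 verbosely in debug mode, without submitting results
cmk --debug -nv kvm-48-113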
Posted on 2017-02-20 12:57 by 北京涛子