11 - Monitoring per-process CPU and memory usage with Zabbix low-level discovery (LLD) + combining it with active mode should work even better

Original article: https://blog.csdn.net/u013272009/article/details/90486079

Zabbix low-level discovery (LLD)
LLD: low-level discovery

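Official documentation: https://www.zabbix.com/documentation/4.0/manual/discovery/low_level_discovery

Purpose: you define a rule, and Zabbix automatically generates the configuration for a set of monitored items whose number is not known in advance.

For custom LLD rules, see the "Creating custom LLD rules" section of the official documentation linked above; it is quite useful.

For example, LLD is a good fit when Zabbix has to monitor the CPU and memory usage of server processes.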
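To make the idea concrete, here is a minimal sketch of the JSON that a custom LLD rule returns in Zabbix 4.0: an object with a `data` array whose elements map LLD macros to values. The helper name `lld_json` and the macro `{#NAME}` are assumptions chosen for illustration and do not appear in the scripts of this article. The sketch also assumes the values need no JSON escaping.

```shell
#!/bin/bash
# lld_json (hypothetical helper): read one name per line on stdin and print
# Zabbix LLD JSON with a single {#NAME} macro per entry.
# Assumes the names contain no double quotes or backslashes.
lld_json() {
    awk '
        BEGIN { printf "{\"data\":[" }
        NF    { printf "%s{\"{#NAME}\":\"%s\"}", sep, $0; sep = "," }
        END   { print "]}" }
    '
}
```

For instance, `printf 'a\nb\n' | lld_json` prints `{"data":[{"{#NAME}":"a"},{"{#NAME}":"b"}]}`. Writing the comma before every entry except the first means the entries do not have to be counted beforehand.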

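A practical example

[Figure: the Server CPU all chart]

In the Server CPU all chart above, the number of service processes is only known once the services are running, and it may be different the next time they are started.

The following steps use LLD to build this chart.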

1. Write a script that collects the service names
For example, kgetserver.sh:

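#!/bin/bash
# LLD rule: one {#PROCESSNAME} entry per process whose command line contains "Server".
# Take a single ps snapshot (minus grep, this script, the helper scripts and tail)
# so that the count and the listing cannot disagree.
procs=$(ps aux | grep Server | grep -v grep | grep -v "$0" | grep -v kgetcpu | grep -v kgetmem | grep -v tail)
n0=$(printf '%s\n' "$procs" | grep -c .)
echo '{"data":['
# Fields 11..NF of `ps aux` are the command line; it is wrapped in escaped quotes.
printf '%s\n' "$procs" | awk -v n="$n0" 'NF{printf "{\"{#PROCESSNAME}\":\"\\\"";for(i=11;i<=NF;i++){printf "%s", $i;if(i<NF)printf " "};printf "\\\"\"}";if(NR<n)printf ",";printf "\n"}'
echo ']}'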

Running it prints:

{"data":[
{"{#PROCESSNAME}":"\"./MgrServer.dbg\""},
{"{#PROCESSNAME}":"\"./LogServer_Ex.dbg\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 1\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 2\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 3\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 4\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 5\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 6\""},
{"{#PROCESSNAME}":"\"./ProxyServer_Ex.dbg\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 1\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 2\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 3\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 4\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 5\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 6\""},
{"{#PROCESSNAME}":"\"./RobotServer_Ex.dbg\""},
{"{#PROCESSNAME}":"\"./LoginServer.dbg --stderrthreshold 0 --log_dir ../log -s 1\""},
{"{#PROCESSNAME}":"\"./LoginServer.dbg --stderrthreshold 0 --log_dir ../log -s 2\""},
{"{#PROCESSNAME}":"\"./LoginServer.dbg --stderrthreshold 0 --log_dir ../log -s 3\""}
]}

This script is the rule: it finds the services that should be monitored.

Another example, kgetproc.sh:
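One caveat: the script prints the command line into the JSON as is, so a command line that itself contains a double quote or a backslash would produce invalid JSON. The following is a minimal sketch of an escaping step, assuming such characters can occur. The helper name `json_escape` is hypothetical, and control characters are not handled.

```shell
#!/bin/bash
# json_escape (hypothetical helper): escape backslashes and double quotes so a
# string can be placed inside a JSON string literal. Control characters are
# not handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}
```

Backslashes are escaped first so that the backslashes added in front of the quotes are not doubled again.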

#!/bin/bash
# LLD rule: one entry per process whose command line matches $1, with the
# macros {#PROCESSNAME}, {#PROCESSPID} and {#PROCESSNO} (1-based index).
# A single ps snapshot keeps the count and the listing consistent; defunct
# (zombie) processes are skipped as well.
procs=$(ps aux | grep "$1" | grep -v grep | grep -v "$0" | grep -v kgetcpu | grep -v kgetmem | grep -v tail | grep -v defunct)
n0=$(printf '%s\n' "$procs" | grep -c .)
echo '{"data":['
printf '%s\n' "$procs" | awk -v n="$n0" 'NF{printf "{\"{#PROCESSNAME}\":\"\\\"";for(i=11;i<=NF;i++){printf "%s", $i;if(i<NF)printf " "};printf "\\\"\", \"{#PROCESSPID}\":%s,\"{#PROCESSNO}\":%d}", $2, NR;if(NR<n)printf ",";printf "\n"}'
echo ']}'

Running it prints:

[root@host-192-168-21-36 opt]# ./kgetproc.sh codis-server
{"data":[
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23790\"", "{#PROCESSPID}":28381,"{#PROCESSNO}":1},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23791\"", "{#PROCESSPID}":28486,"{#PROCESSNO}":2},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23792\"", "{#PROCESSPID}":28523,"{#PROCESSNO}":3},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23793\"", "{#PROCESSPID}":28576,"{#PROCESSNO}":4},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23794\"", "{#PROCESSPID}":28597,"{#PROCESSNO}":5},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23795\"", "{#PROCESSPID}":28633,"{#PROCESSNO}":6},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23796\"", "{#PROCESSPID}":28671,"{#PROCESSNO}":7},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23797\"", "{#PROCESSPID}":28707,"{#PROCESSNO}":8},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23798\"", "{#PROCESSPID}":28735,"{#PROCESSNO}":9},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23799\"", "{#PROCESSPID}":28780,"{#PROCESSNO}":10}
]}

2. Write scripts that report the CPU and memory usage of a process
For example, kgetcpu.sh:

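#!/bin/bash
# Print the %CPU of the process whose command line matches $1, or 0 if none is found.

# Look up the PID, skipping grep, this script, tail, zombies and vi sessions;
# keep only the first match so that awk receives exactly one PID.
mypid=$(ps aux | grep -- "$1" | grep -v grep | grep -v "$0" | grep -v tail | grep -v defunct | grep -v vi | awk 'NR==1{print $2}')
# Column 9 of top's batch output is %CPU.
getactive=$(top -b -n 1 | awk -v v="$mypid" 'v!="" && $1==v{print $9}')
if [[ -n $getactive ]]; then
    echo "$getactive"
else
    echo "0"
fi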
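The script reads %CPU from a fixed column position, which holds only for top's default column layout. As a sketch of a slightly more defensive variant, the function below finds the column by its header name instead. The name `cpu_from_top` is hypothetical, and the function reads `top -b -n 1` output from stdin so that it can be tried on captured output.

```shell
#!/bin/bash
# cpu_from_top (hypothetical helper): read `top -b -n 1` output on stdin and
# print the %CPU value of the given PID, or 0 if the PID is not listed.
# The column is located by its header name instead of a fixed position.
cpu_from_top() {
    awk -v pid="$1" '
        $1 == "PID" { for (i = 1; i <= NF; i++) if ($i == "%CPU") col = i; next }
        col && $1 == pid { print $col; found = 1; exit }
        END { if (!found) print 0 }
    '
}
```

A call could look like `top -b -n 1 | cpu_from_top "$mypid"`.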

 

For example, kgetmem.sh:

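#!/bin/bash
# Print the resident memory (RES) of the process matching $1 in bytes, or 0 if none is found.

# Keep only the first matching PID so that awk receives exactly one value.
mypid=$(ps aux | grep -- "$1" | grep -v grep | grep -v "$0" | grep -v tail | grep -v defunct | grep -v vi | awk 'NR==1{print $2}')
# Column 6 of top's batch output is RES: KiB by default, with an "m" or "g" suffix once the value is scaled.
getactive=$(top -b -n 1 | awk -v v="$mypid" 'v!="" && $1==v{print $6}')
if [[ -n $getactive ]]; then
    if [[ $getactive == *g ]]; then
        # GiB to bytes; "/1" makes bc drop the fractional part.
        echo "${getactive%g}*1024*1024*1024/1" | bc
    elif [[ $getactive == *m ]]; then
        # MiB to bytes.
        echo "${getactive%m}*1024*1024/1" | bc
    else
        # Plain number: KiB to bytes.
        echo $((getactive * 1024))
    fi
else
    echo "0"
fi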
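The unit handling is the fragile part of this script, because top prints RES in KiB by default and switches to a suffixed, scaled value for large processes. Here is a small sketch of that conversion on its own, done in awk so that bc is not needed. The name `res_to_bytes` is hypothetical, and the set of suffixes (m, g, t) is an assumption about what the local top version prints.

```shell
#!/bin/bash
# res_to_bytes (hypothetical helper): convert a RES value as printed by top
# ("12345" in KiB, or scaled like "512m", "1.5g", "0.5t") to whole bytes.
res_to_bytes() {
    awk -v v="$1" 'BEGIN {
        unit = tolower(substr(v, length(v)))      # last character
        if      (unit == "t") mult = 1024 ^ 4
        else if (unit == "g") mult = 1024 ^ 3
        else if (unit == "m") mult = 1024 ^ 2
        else                  mult = 1024         # plain number: KiB
        printf "%.0f\n", (v + 0) * mult           # v + 0 ignores the suffix
    }'
}
```

Returning bytes in every branch lets the Zabbix item use a single unit (B) regardless of how top scaled the value.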

 

The scripts above define what each monitored item measures.

3. Configure the items
For example, /etc/zabbix/zabbix_agentd.d/userparameter_mygraph.conf:

# Quote $1 so that a command line containing spaces reaches the script as a single argument.
UserParameter=myGraph.server_cpu[*],sudo /opt/kgetcpu.sh "$1"
UserParameter=myGraph.server_mem[*],sudo /opt/kgetmem.sh "$1"
UserParameter=myGraph.server_process[*],sudo /opt/kgetserver.sh
UserParameter=myGraph.proc[*],sudo /opt/kgetproc.sh "$1"

Restart the agent service:

systemctl restart zabbix-agent.service
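Before moving on to the frontend, it can help to check that the agent resolves the new keys. The sketch below wraps `zabbix_agentd -t`, which evaluates a single item key and exits. The function name `check_key` is hypothetical, and the sketch assumes `zabbix_agentd` is in PATH on the monitored host.

```shell
#!/bin/bash
# check_key (hypothetical helper): ask the local agent binary to evaluate one
# item key and print the result. `zabbix_agentd -t` tests a single item and exits.
check_key() {
    if ! command -v zabbix_agentd >/dev/null 2>&1; then
        echo "zabbix_agentd not found in PATH" >&2
        return 127
    fi
    zabbix_agentd -t "$1"
}
```

For example, `check_key 'myGraph.proc[codis-server]'` should print the discovery JSON shown earlier. Because the commands go through sudo, the zabbix user presumably needs a passwordless sudoers entry for these scripts; that setup is not shown in the original.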

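Everything else is done in the Zabbix frontend, on its web pages.

4. Create a Discovery rule on the template (Templates)

[Figure: Discovery rule settings]

The settings are similar to those in the figure above.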

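5. Create Item prototypes on the template (Templates)

[Figure: Item prototype settings]

The settings are similar to those in the figure above.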

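6. Create a Graph prototype on the template (Templates)

The settings are similar to those in the figure above.

At this point, all monitored items are generated automatically.

Create the Server CPU all graph on the host
(For now it is created by hand; in principle it could also be generated automatically. I will look through the documentation when I have time and add that.)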

posted @ 2020-05-29 17:11  番茄土豆西红柿