Zabbix custom monitoring items

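Monitoring metrics

  • System metrics
    • Memory
    • CPU
    • Disk
  • File monitoring
  • Network monitoring
  • Hardware monitoring (via IPMI)
    • Disk temperature
    • Power supply faults
    • CPU temperature
  • Business (application-level) monitoring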

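Custom monitoring workflow

  • Enable custom monitoring

    • Edit zabbix_agentd.conf and set the following two directives
      • UnsafeUserParameters=1 (allows special characters in the arguments passed to user parameters)
      • UserParameter=key,command, that is, UserParameter=<key>,<command>
  • Write the script

  • Configure the item and the trigger in the web interface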
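
To make the UserParameter=<key>,<command> format concrete, here is a minimal Python sketch that splits such a line into its two parts. The helper name parse_user_parameter is hypothetical and not part of Zabbix; the sketch assumes the first comma is the separator between key and command.

```python
# Minimal sketch: split a UserParameter line into its key and command.
# parse_user_parameter is a hypothetical helper, not part of Zabbix.

def parse_user_parameter(line):
    """Return (key, command) for a 'UserParameter=<key>,<command>' line."""
    name, _, value = line.strip().partition("=")
    if name.strip() != "UserParameter":
        raise ValueError("not a UserParameter line: %r" % line)
    # Assumption: the first comma separates the key from the command,
    # so the command itself may contain further commas.
    key, sep, command = value.partition(",")
    if not sep or not key.strip() or not command.strip():
        raise ValueError("expected UserParameter=<key>,<command>")
    return key.strip(), command.strip()


if __name__ == "__main__":
    print(parse_user_parameter(
        "UserParameter=check_apache,/scripts/check_process.sh httpd"))
```

The key is what the server asks for (for example with zabbix_get -k), and the command is what the agent runs to produce the value.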

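Environment

Role     IP address        Hostname             Software to install                        OS
Server   192.168.110.30    zabbix.example.com   LAMP stack, zabbix_server, zabbix_agent    Red Hat 8
Client   192.168.110.40    zabbix-agent         zabbix_agent                               Red Hat 8

For the server and client setup, see the post "zabbix监控配置流程+实例演示" (Zabbix monitoring configuration workflow with a worked example).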

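1. Custom monitoring of a process

This example monitors the httpd service process. httpd is compiled from source; for the steps, see the post "利用shell脚本实现安装httpd服务" (installing httpd with a shell script).

Preparation:

//Client
#Stop the firewall and switch SELinux to permissive mode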
[root@zabbix-agent ~]# systemctl stop firewalld
[root@zabbix-agent ~]# setenforce 0

#Compile and install Apache with the script from the post referenced above
[root@zabbix-agent ~]# bash httpd.sh

#Uncomment ServerName to silence the startup warning, then symlink apachectl into /usr/bin
[root@zabbix-agent ~]# sed -i '/#ServerName/s/#//g' /etc/httpd24/httpd.conf
[root@zabbix-agent ~]# ln -s /usr/local/apache/bin/apachectl /usr/bin/apachectl

#Start the Apache service
[root@zabbix-agent ~]# apachectl start
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:* 

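Step 1: write the script

//Client
#Create the scripts directory
[root@zabbix-agent ~]# mkdir /scripts

#Script contents
[root@zabbix-agent ~]# vim /scripts/check_process.sh 

#The script filters the process list for the given name (httpd here): it prints 1 if no such process is found and 0 if the process is running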
#!/bin/bash

count=$(ps -ef | grep -Ev "grep|$0" | grep -c "$1")
if [ $count -eq 0 ];then
        echo "1"
else
        echo "0"
fi

#Make the script executable
[root@zabbix-agent ~]# chmod +x /scripts/check_process.sh 
[root@zabbix-agent ~]# ll /scripts/
total 4
-rwxr-xr-x. 1 root root 118 Apr 29 00:02 check_process.sh

#Test
[root@zabbix-agent ~]# apachectl start
[root@zabbix-agent ~]# bash /scripts/check_process.sh httpd
0
[root@zabbix-agent ~]# apachectl stop
[root@zabbix-agent ~]# bash /scripts/check_process.sh httpd
1
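
The filtering logic of check_process.sh can also be sketched as a pure Python function over the text of ps -ef, which makes the 0/1 convention easy to test without starting or stopping Apache. The function name process_status and its parameters are hypothetical; unlike grep, this sketch matches plain substrings rather than regular expressions.

```python
# Sketch of the check_process.sh logic as a pure function over `ps -ef` text.
# process_status and its parameters are hypothetical names for illustration.

def process_status(ps_output, name, script_name="check_process.sh"):
    """Return "0" if a process line mentions `name`, else "1"."""
    count = 0
    for line in ps_output.splitlines():
        # Mirror `grep -Ev "grep|$0"`: ignore grep itself and the checking script,
        # both of which carry the process name on their own command line.
        if "grep" in line or script_name in line:
            continue
        if name in line:
            count += 1
    return "1" if count == 0 else "0"


if __name__ == "__main__":
    sample = (
        "root 1234 1 0 00:01 ? 00:00:00 /usr/local/apache/bin/httpd -k start\n"
        "root 2000 1999 0 00:02 pts/0 00:00:00 bash /scripts/check_process.sh httpd\n"
    )
    print(process_status(sample, "httpd"))
```

Excluding the script's own line matters: without it, the check would always find "httpd" in its own arguments and never report a failure.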

Step 2: enable custom monitoring

//Client
#Enable custom monitoring and add the item key
[root@zabbix-agent ~]# vim /usr/local/etc/zabbix_agentd.conf

# Mandatory: no
# Default:
# TLSCipherAll=

#Append the following at the end of the file
UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd

#Restart the Zabbix agent
[root@zabbix-agent ~]# pkill zabbix
[root@zabbix-agent ~]# zabbix_agentd 
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:*   

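#From the server, check that the client's item value can be fetched (Apache was stopped in the test above, so the value is 1)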
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k check_apache
1
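
For readers curious what travels over port 10050 when zabbix_get asks for a key, the following is a rough Python sketch of the message framing only, with no networking. It assumes the header layout described in the Zabbix protocol documentation (the signature "ZBXD", a flags byte of 0x01, a 4-byte little-endian data length and 4 reserved bytes); the function names frame and unframe are hypothetical, and newer agents may wrap the key in additional structure that is not shown here.

```python
import struct

# Assumed header layout: "ZBXD" + flags (0x01) + data length (uint32 LE) + reserved (uint32 LE).
SIGNATURE = b"ZBXD"
FLAGS = b"\x01"


def frame(payload):
    """Prefix a payload (bytes) with the assumed ZBXD header."""
    return SIGNATURE + FLAGS + struct.pack("<II", len(payload), 0) + payload


def unframe(packet):
    """Strip the assumed ZBXD header and return the payload bytes."""
    if packet[:4] != SIGNATURE or packet[4:5] != FLAGS:
        raise ValueError("not a ZBXD packet")
    length, _reserved = struct.unpack("<II", packet[5:13])
    body = packet[13:13 + length]
    if len(body) != length:
        raise ValueError("truncated packet")
    return body


if __name__ == "__main__":
    request = frame(b"check_apache")
    print(request)
    print(unframe(request))
```

In practice zabbix_get handles all of this; the sketch is only meant to show that the agent reply "1" above is a small framed message rather than raw text.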

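Step 3: configure the web interface

  1. Add the item

Go to Configuration ---> Hosts ---> Items of the client (192.168.110.40) ---> Create item (top right)

image

  2. Add the trigger

Go to Configuration ---> Hosts ---> Triggers of the client (192.168.110.40) ---> Create trigger (top right)

image

  3. Configure the media type and the action

For configuring media types and actions, see the post "zabbix监控服务-邮箱告警的三种配置方式" (three ways to configure email alerting in Zabbix).

  4. Fire the trigger
//Client
#Stop the Apache service to raise the alert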
[root@zabbix-agent ~]# apachectl stop
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:*  
  5. Verify the alert email

image

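2. Custom monitoring of a log file

Download log.py to the local machine

log.py is available at: leidazhuang_Github

Write the script

What log.py does: checks whether a log file contains a given keyword

  • The first argument is the log file name (required; a relative or absolute path)
  • The second argument is the path of the seek position file (optional; defaults to /tmp/logseek; a relative or absolute path)
  • The third argument is the search keyword (defaults to Error)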
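
The three arguments and their defaults can be summarized in a few lines of Python. This is a simplified sketch of the argument handling described above, not the code of log.py itself; the function name parse_log_args is hypothetical.

```python
# Sketch of the argument defaults described above.
# parse_log_args is a hypothetical helper; log.py uses separate functions instead.

def parse_log_args(argv):
    """Return (logfile, seekfile, keyword) from an argv-style list."""
    if len(argv) < 2:
        raise ValueError("the log file name is required")
    logfile = argv[1]
    seekfile = argv[2] if len(argv) > 2 else "/tmp/logseek"  # default seek position file
    keyword = argv[3] if len(argv) > 3 else "Error"          # default search keyword
    return logfile, seekfile, keyword


if __name__ == "__main__":
    print(parse_log_args(["log.py", "/usr/local/apache/logs/error_log"]))
```

Because the arguments are positional, a custom keyword can only be given together with an explicit seek position file.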
//Client
#Install python36
[root@zabbix-agent ~]# yum -y install python36

#This script checks whether the log file contains the given keyword
[root@zabbix-agent scripts]# cat log.py 
#!/usr/bin/env python3
import sys
import re

def prePos(seekfile):
    global curpos
    try:
        cf = open(seekfile)
    except IOError:
        curpos = 0
        return curpos
    except FileNotFoundError:
        curpos = 0
        return curpos
    else:
        try:
            curpos = int(cf.readline().strip())
        except ValueError:
            curpos = 0
            cf.close()
            return curpos
        cf.close()
    return curpos

def lastPos(filename):
    with open(filename) as lfile:
        if lfile.readline():
            lfile.seek(0,2)
        else:
            return 0
        lastPos = lfile.tell()
    return lastPos

def getSeekFile():
    try:
        seekfile = sys.argv[2]
    except IndexError:
        seekfile = '/tmp/logseek'
    return seekfile

def getKey():
    try:
        tagKey = str(sys.argv[3])
    except IndexError:
        tagKey = 'Error'
    return tagKey

def getResult(filename,seekfile,tagkey):
    destPos = prePos(seekfile)
    curPos = lastPos(filename)

    # The log is shorter than the saved offset (rotated or truncated): rescan from the start
    if curPos < destPos:
        destPos = 0

    result = 0
    try:
        f = open(filename)
    except IOError:
        # FileNotFoundError is a subclass of IOError (OSError), so one handler covers both
        print('Could not open file: %s' % filename)
        return result

    with f:
        f.seek(destPos)

        # Scan only the lines appended since the previous run
        while curPos != 0 and f.tell() < curPos:
            rresult = f.readline().strip()
            if re.search(tagkey, rresult):
                result = 1
                break

    # Save the current end of the log so the next run starts from there
    with open(seekfile,'w') as sf:
        sf.write(str(curPos))
    return result

if __name__ == "__main__":
    result = 0
    curpos = 0
    tagkey = getKey()
    seekfile = getSeekFile()
    result = getResult(sys.argv[1],seekfile,tagkey)
    print(result)

Add the item key

//Client
#Enable custom monitoring and add the item key
[root@zabbix-agent ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=

UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
#Append the following at the end of the file
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3

#Restart the Zabbix agent
[root@zabbix-agent ~]# pkill zabbix
[root@zabbix-agent ~]# zabbix_agentd 
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         80                   0.0.0.0:3306               0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:* 
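
The check_logs[*] entry above is a flexible user parameter: the values given in square brackets on the server side replace $1, $2 and $3 in the command. The Python sketch below imitates that substitution to show what command the agent ends up running. The helper names split_key and expand_command are hypothetical, and the quoting rules are simplified (the csv module stands in for the agent's own parser).

```python
import csv
import re

# Sketch of flexible user parameter expansion; helper names are hypothetical
# and the quoting rules are simplified compared to the real agent.


def split_key(item_key):
    """Split 'name[p1,p2,...]' into (name, [p1, p2, ...])."""
    match = re.fullmatch(r"([^\[]+)\[(.*)\]", item_key)
    if not match:
        return item_key, []
    name, inner = match.group(1), match.group(2)
    params = next(csv.reader([inner], skipinitialspace=True)) if inner else []
    return name, params


def expand_command(template, params):
    """Replace $1..$9 in the command template with the matching parameter."""
    def substitute(match):
        index = int(match.group(1)) - 1
        return params[index] if index < len(params) else ""
    return re.sub(r"\$([1-9])", substitute, template)


if __name__ == "__main__":
    name, params = split_key(
        'check_logs["/usr/local/apache/logs/error_log","/tmp/seek","Error"]')
    print(name, params)
    print(expand_command("/scripts/log.py $1 $2 $3", params))
```

Parameters that are left out expand to an empty string in this sketch, which is why log.py falls back to its own defaults when only the log file is given.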

#Append an Error line by hand to trigger the alert
[root@zabbix-agent ~]# echo 'Error' >> /usr/local/apache/logs/error_log

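//Server
#From the server, check that the client's item value can be fetched
#Monitor /usr/local/apache/logs/error_log, with /tmp/seek as the seek position file and Error as the keyword

#The first result is 1: Error was found in the newly appended lines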
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k  check_logs["/usr/local/apache/logs/error_log","/tmp/seek","Error"]
1

#The second result is 0: no new Error line has been written since the first check
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k  check_logs["/usr/local/apache/logs/error_log","/tmp/seek","Error"]
0
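
The 1-then-0 behaviour above follows from the seek position file: each run records where the log ended, and the next run only reads what was appended after that point. The sketch below is a simplified stand-in for log.py that shows the same technique on temporary files. The function name check_new_lines is hypothetical, and it matches a plain substring instead of a regular expression.

```python
import os
import tempfile

# Simplified stand-in for the seek-position technique used by log.py.
# check_new_lines is a hypothetical name; it matches substrings, not regexes.


def check_new_lines(logfile, seekfile, keyword="Error"):
    """Return 1 if keyword appears in data appended since the last call, else 0."""
    try:
        with open(seekfile) as sf:
            start = int(sf.read().strip() or 0)
    except (OSError, ValueError):
        start = 0  # no usable saved offset: scan from the beginning
    size = os.path.getsize(logfile)
    if size < start:
        start = 0  # the log was truncated or rotated: rescan from the top
    with open(logfile, "rb") as lf:
        lf.seek(start)
        found = keyword.encode() in lf.read(size - start)
    with open(seekfile, "w") as sf:
        sf.write(str(size))  # remember where this run stopped
    return 1 if found else 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "error_log")
        seek = os.path.join(tmp, "seek")
        with open(log, "w") as f:
            f.write("Error\n")
        print(check_new_lines(log, seek))  # first call reads the new line
        print(check_new_lines(log, seek))  # second call has nothing new to read
```

One consequence for the trigger design: the item returns 1 only for the polling interval in which the keyword first appears, so the alert condition is a momentary event rather than a persistent state.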

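Configure the web interface

  1. Add the item

Go to Configuration ---> Hosts ---> Items of the client (192.168.110.40) ---> Create item (top right)

image

  2. Add the trigger

Go to Configuration ---> Hosts ---> Triggers of the client (192.168.110.40) ---> Create trigger (top right)

image

  3. Configure the media type and the action

For configuring media types and actions, see the post "zabbix监控服务-邮箱告警的三种配置方式" (three ways to configure email alerting in Zabbix).

  4. Fire the trigger
//Client
#Append an Error line by hand to raise the alert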
[root@zabbix-agent ~]# echo 'Error' >> /usr/local/apache/logs/error_log 

//Server
#The value is 1
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k  check_logs["/usr/local/apache/logs/error_log","/tmp/logseek","Error"]
1
  5. Verify the alert email

image

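3. Custom monitoring of MySQL replication status

Environment

Add one more machine, 192.168.110.50, to act as the master database

Role     IP address        Software to install   OS
Master   192.168.110.50    mariadb               Red Hat 8
Slave    192.168.110.40    mariadb               Red Hat 8

Preparation

//master
#Install the database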
[root@master ~]# yum -y install mariadb*

#Start the service
[root@master ~]# systemctl enable --now mariadb

#Disable the firewall and SELinux
[root@master ~]# systemctl disable --now firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@master ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@master ~]# setenforce 0

//slave
#Install the database
[root@slave ~]# yum -y install mariadb*

#Start the service
[root@slave ~]# systemctl enable --now mariadb

#Disable the firewall and SELinux
[root@slave ~]# systemctl disable --now firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@slave ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@slave ~]# setenforce 0

Configure the master database

//master
#Test the database connection
[root@master ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 22
Server version: 10.3.28-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
#Grant the replication privilege to the repl user
MariaDB [(none)]> grant replication slave on *.* to 'repl'@'192.168.110.40' identified by 'repl123!';
Query OK, 0 rows affected (0.001 sec)
#Reload the privilege tables
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> exit
Bye

#Edit my.cnf
[root@master ~]# vim /etc/my.cnf
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
#Append the following at the end of the file
[mysqld]
log-bin=mysql-bin
server-id=1

#Restart MariaDB and check the master status
[root@master ~]# systemctl restart mariadb
[root@master ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.28-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |      328 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.000 sec)

MariaDB [(none)]> exit
Bye

Configure the slave database

//slave
#Test the database connection
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.3.28-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> exit
Bye

#Edit my.cnf
[root@slave ~]# vim /etc/my.cnf
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
#Append the following at the end of the file
[mysqld]
server-id=20
relay-log=myrelay

#Restart MariaDB, then configure and start replication
[root@slave ~]# systemctl restart mariadb
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.28-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> change master to \
    -> master_host='192.168.110.50',
    -> master_user='repl',
    -> master_password='repl123!',
    -> master_log_file='mysql-bin.000001',
    -> master_log_pos=328;
    
Query OK, 0 rows affected (0.003 sec)

MariaDB [(none)]> start slave;

Query OK, 0 rows affected (0.002 sec)

MariaDB [(none)]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000003
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
    Slave_Transactional_Groups: 0
1 row in set (0.000 sec)

MariaDB [(none)]> exit
Bye

Write the script

//slave
#Script contents
[root@slave ~]# vim /scripts/check_mysql_repl.sh 

#!/bin/bash
  
count=$(mysql -uroot -e 'show slave status\G'|grep 'Running:'|awk '{print $2}'|grep -c 'Yes')

if [ $count -ne 2 ];then
        echo '1'
else
        echo '0'
fi

#Make the script executable
[root@slave ~]# chmod +x /scripts/check_mysql_repl.sh 
[root@slave ~]# ll /scripts/
total 12
-rwxr-xr-x. 1 root root  179 Apr 29 15:33 check_mysql_repl.sh
-rwxr-xr-x. 1 root root  118 Apr 29 00:02 check_process.sh

#Test the script
[root@slave ~]# bash /scripts/check_mysql_repl.sh 
0
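
The shell script counts how many "Running:" lines say Yes. As an illustration of the same check with the two fields named explicitly, here is a Python sketch that works on the text of show slave status\G. The function name replication_status is hypothetical; the 0/1 convention is the one used by check_mysql_repl.sh.

```python
# Sketch: derive the 0/1 replication status from `show slave status\G` text.
# replication_status is a hypothetical name; 0 means healthy, 1 means broken.

def replication_status(status_text):
    """Return 0 when both replication threads report Yes, else 1."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    healthy = (fields.get("Slave_IO_Running") == "Yes"
               and fields.get("Slave_SQL_Running") == "Yes")
    return 0 if healthy else 1


if __name__ == "__main__":
    sample = (
        "              Slave_IO_Running: Yes\n"
        "             Slave_SQL_Running: Yes\n"
    )
    print(replication_status(sample))
```

Looking the fields up by name means an empty result (for example when the mysql command fails) is also reported as 1, the same outcome the shell script produces when the count is not 2.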

Add the item key

//slave
#Enable custom monitoring and add the item key
[root@slave ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=

UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3
#Append the following at the end of the file
UserParameter=check_mysql_repl,/scripts/check_mysql_repl.sh

#Restart the Zabbix agent
[root@slave ~]# pkill zabbix
[root@slave ~]# zabbix_agentd 
[root@slave ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         80                   0.0.0.0:3306               0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:* 

#From the server, check that the client's item value can be fetched
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k check_mysql_repl
0

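Configure the web interface

  1. Add the item

Go to Configuration ---> Hosts ---> Items of the client (192.168.110.40) ---> Create item (top right)

image

  2. Add the trigger

Go to Configuration ---> Hosts ---> Triggers of the client (192.168.110.40) ---> Create trigger (top right)

image

  3. Configure the media type and the action

For configuring media types and actions, see the post "zabbix监控服务-邮箱告警的三种配置方式" (three ways to configure email alerting in Zabbix).

  4. Fire the trigger
//slave
#Stop the slave threads to raise the alert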
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 70
Server version: 10.3.28-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> stop slave;
Query OK, 0 rows affected, 1 warning (0.015 sec)

MariaDB [(none)]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State: 
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000006
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
1 row in set (0.000 sec)

MariaDB [(none)]> exit
Bye
  5. Verify the alert email

image

4. Custom monitoring of MySQL replication lag

Write the script

//slave
#Script contents
[root@slave ~]# vim /scripts/check_mysql_delay.sh 

#!/bin/bash
  
mysql -uroot -e 'show slave status \G'|grep 'Seconds_Behind_Master:'|awk '{print $2}'

#Make the script executable
[root@slave ~]# chmod +x /scripts/check_mysql_delay.sh
[root@slave ~]# ll /scripts/
total 16
-rwxr-xr-x. 1 root root  100 Apr 29 17:11 check_mysql_delay.sh
-rwxr-xr-x. 1 root root  179 Apr 29 15:33 check_mysql_repl.sh
-rwxr-xr-x. 1 root root  118 Apr 29 00:02 check_process.sh

#Test the script
[root@slave ~]# mysql -uroot -e 'show slave status \G'
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000007
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
#Number of seconds the slave is behind the master
         Seconds_Behind_Master: 0
[root@slave ~]# bash /scripts/check_mysql_delay.sh 
0
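
One detail the shell script does not handle: when replication is stopped, Seconds_Behind_Master is reported as NULL rather than a number, so a numeric item would receive the text NULL. The Python sketch below extracts the value and maps anything non-numeric to a sentinel. The function name seconds_behind_master and the sentinel value -1 are assumptions made for this illustration, not part of the original script.

```python
# Sketch: extract Seconds_Behind_Master from `show slave status\G` text.
# The function name and the -1 sentinel for NULL are assumptions for illustration.

def seconds_behind_master(status_text, unknown=-1):
    """Return the lag in seconds, or `unknown` when it is NULL or missing."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Seconds_Behind_Master":
            value = value.strip()
            return int(value) if value.isdigit() else unknown
    return unknown


if __name__ == "__main__":
    print(seconds_behind_master("         Seconds_Behind_Master: 0\n"))
    print(seconds_behind_master("         Seconds_Behind_Master: NULL\n"))
```

With a sentinel in place, a trigger can treat a negative value as "replication not running" and a large positive value as "replication lagging".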

Add the item key

//slave
#Enable custom monitoring and add the item key
[root@slave ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=

UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3
UserParameter=check_mysql_repl,/scripts/check_mysql_repl.sh
#Append the following at the end of the file
UserParameter=check_mysql_delay,/scripts/check_mysql_delay.sh

#Restart the Zabbix agent
[root@slave ~]# pkill zabbix
[root@slave ~]# zabbix_agentd 

#From the server, check that the client's item value can be fetched
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k check_mysql_delay
0

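Configure the web interface

  1. Add the item

Go to Configuration ---> Hosts ---> Items of the client (192.168.110.40) ---> Create item (top right)

image

  2. Add the trigger

Go to Configuration ---> Hosts ---> Triggers of the client (192.168.110.40) ---> Create trigger (top right)

image

  3. Configure the media type and the action

For configuring media types and actions, see the post "zabbix监控服务-邮箱告警的三种配置方式" (three ways to configure email alerting in Zabbix).

  4. Fire the trigger
//slave
#Start the slave threads; the lag is 0, which raises the alert in this test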
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 407
Server version: 10.3.28-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000008
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
#The lag is 0
         Seconds_Behind_Master: 0
1 row in set (0.000 sec)

MariaDB [(none)]> exit
Bye

For this test, the trigger is temporarily changed to fire when the lag equals 0

image

  5. Verify the alert email

image

All done. Those are the detailed steps for setting up custom monitoring.

posted @ 2021-04-29 07:46  我爱吃芹菜~