Implementing High Availability for Zabbix Server

Architecture diagram of the Zabbix Server high-availability setup:



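Outline of the approach:
1) Prepare two zabbix-server hosts, 10.0.0.200 and 10.0.0.201, both connected to the database at 10.0.0.40. (Installation is omitted; it is outside the scope of this post.)
2) Install Keepalived on both zabbix-server hosts, with 10.0.0.200 as the MASTER node and 10.0.0.201 as the BACKUP node, to provide high availability. (Installation omitted.)
3) Modify the zabbix-server and zabbix-agent configuration files so that zabbix-agent accepts the list of monitored items sent by zabbix-server.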

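Notes:
1) When an application service (here, zabbix) is made highly available and writes to a database, the service must be stopped on any node that does not hold the VIP. This can be done with notify (Keepalived's node state-change hook) together with a script, or by sending start/stop commands to the service remotely over ssh. Otherwise, if the service runs on several nodes and they write to the database at the same time, primary key conflicts occur.
2) Repetitive configuration can be copied over from MASTER and adjusted slightly.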

1. After installing Keepalived, edit its configuration files.
MASTER:
1) Global configuration


cat /etc/keepalived/keepalived.conf

! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id node1.qinglin.com  
   vrrp_skip_check_adv_addr
   #vrrp_strict
   #vrrp_garp_interval 0
   #vrrp_gna_interval 0
   vrrp_mcast_group4 233.6.6.6  
}

vrrp_script check_zabbix {
    script "/etc/keepalived/check_zabbix.sh"
    interval 1
    weight -30
    fall 3
    rise 2
    timeout 2
}


include /etc/keepalived/conf.d/*.conf

2) The VRRP script used to check zabbix-server.

cat /etc/keepalived/check_zabbix.sh
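#!/bin/bash
# Signal 0 sends nothing; the exit status is 0 only if a zabbix_server process exists
killall -0 zabbix_server

# Make the script executable (run once in the shell; not part of the script)
chmod +x /etc/keepalived/check_zabbix.sh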
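The check script depends entirely on its exit status: Keepalived treats 0 as healthy and anything else as a failure. The sketch below only illustrates the signal-0 trick on a throwaway background process; `pid_alive` is a made-up helper name, and it uses the shell's built-in `kill` instead of `killall` so it runs without zabbix installed.

```shell
#!/bin/bash
# pid_alive PID: succeeds if the process exists.
# Signal 0 delivers nothing; it only performs the existence check.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# Throwaway background process standing in for zabbix_server
sleep 30 &
bg_pid=$!

if pid_alive "$bg_pid"; then
    echo "process $bg_pid is running (the check would exit 0)"
fi

# Stop it and reap it, then check again
kill "$bg_pid"
wait "$bg_pid" 2>/dev/null || true

if ! pid_alive "$bg_pid"; then
    echo "process $bg_pid is gone (the check would exit non-zero)"
fi
```

On a real node, running `/etc/keepalived/check_zabbix.sh; echo $?` by hand shows the status Keepalived will see.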

3) Virtual router configuration (VIP)

cat /etc/keepalived/conf.d/vip_zabbix.conf
vrrp_instance VI_1 {  
    state MASTER      
    interface eth0    
    virtual_router_id 50  
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.100/24 dev eth0 label eth0:1
    }

    track_script {
        check_zabbix
    }

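# The following three lines use Keepalived's state-change notification: when the node's state changes, the new state is passed to the notify script, which starts or stops the zabbix-server service accordingly. (This avoids the primary key conflicts caused by two zabbix-server instances writing to MySQL at the same time.)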
notify_master "/etc/keepalived/notify.sh master"   
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}

 

4) Configure the notify script

cat /etc/keepalived/notify.sh

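#!/bin/bash

# Random ANSI color (codes 31-37) for the fault message
COLOR="\E[1;$((RANDOM % 7 + 31))m"
END="\E[0m"

case "$1" in
master)
        # This node now holds the VIP: (re)start zabbix-server
        systemctl restart zabbix-server.service
        ;;
backup)
        # This node does not hold the VIP: stop zabbix-server to avoid concurrent database writes
        systemctl stop zabbix-server.service
        ;;
*)
        echo -e "${COLOR}Node fault, fix it immediately!!!${END}"
        ;;
esac

# Make the script executable (run once in the shell; not part of the script)
chmod +x /etc/keepalived/notify.sh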

BACKUP:

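1) Global configuration. Note: BACKUP must not configure the script that checks the zabbix-server service. Reason: zabbix-server on the BACKUP node has to stay stopped by default to avoid the database primary key conflicts. If the VRRP script were configured there, its logic would subtract 30 from BACKUP's priority whenever zabbix-server is stopped. Then, if zabbix-server on MASTER went down, MASTER's priority after its own -30 would still be higher than BACKUP's, (100-30) > (80-30), so the VIP could not float to the BACKUP node and high availability would not work.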
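The priority arithmetic above can be replayed with a few lines of shell. This is only a sketch: `effective_priority` is a made-up helper, not part of Keepalived, and it assumes the documented behaviour of a negative `weight` (it is added to the base priority only while the check is failing).

```shell
#!/bin/bash
# effective_priority BASE WEIGHT STATE
# Prints the priority a node would advertise with a negative-weight
# track script: the weight applies only while the check is "down".
effective_priority() {
    local base=$1 weight=$2 state=$3
    if [ "$state" = "down" ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

master=$(effective_priority 100 -30 down)   # MASTER after its zabbix-server fails
backup=$(effective_priority 80 -30 down)    # BACKUP if it also tracked the (stopped) service

# Hypothetical setup: track_script on both nodes, so MASTER still wins
echo "tracked on both nodes:  MASTER=$master BACKUP=$backup"

# Setup used in this post: BACKUP has no track_script and stays at 80
echo "tracked on MASTER only: MASTER=$master BACKUP=80"
```

With the check on MASTER only, MASTER drops to 70, below BACKUP's fixed 80, which is what lets the VIP move.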

cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id node2.qinglin.com
   vrrp_skip_check_adv_addr
   #vrrp_strict
   #vrrp_garp_interval 0
   #vrrp_gna_interval 0
   vrrp_mcast_group4 233.6.6.6
}

include /etc/keepalived/conf.d/*.conf

2) Virtual router configuration (VIP)

cat /etc/keepalived/conf.d/vip_zabbix.conf

vrrp_instance VI_1 {  
    state BACKUP      
    interface eth0    
    virtual_router_id 50  
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.100/24 dev eth0 label eth0:1
    }



notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}

3) Configure the notify script

cat /etc/keepalived/notify.sh

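#!/bin/bash

# Random ANSI color (codes 31-37) for the fault message
COLOR="\E[1;$((RANDOM % 7 + 31))m"
END="\E[0m"

case "$1" in
master)
        # This node now holds the VIP: (re)start zabbix-server
        systemctl restart zabbix-server.service
        ;;
backup)
        # This node does not hold the VIP: stop zabbix-server to avoid concurrent database writes
        systemctl stop zabbix-server.service
        ;;
*)
        echo -e "${COLOR}Node fault, fix it immediately!!!${END}"
        ;;
esac

# Make the script executable (run once in the shell; not part of the script)
chmod +x /etc/keepalived/notify.sh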

2. Restart the Keepalived service on both MASTER and BACKUP.

systemctl restart keepalived.service

1) The VIP has been created and has floated to MASTER.
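One way to confirm this from the command line is to look for the VIP in the output of `ip`. A minimal sketch, assuming the iproute2 `ip` command is available; `has_vip` is a made-up helper, and the sample text is invented for illustration rather than captured from the nodes.

```shell
#!/bin/bash
# has_vip VIP: reads `ip -4 -o addr show` output on stdin and succeeds
# if the given address is configured on the interface.
has_vip() {
    grep -qF "inet $1/"
}

# Usage on a node:
#   ip -4 -o addr show dev eth0 | has_vip 10.0.0.100 && echo "VIP is here"

# Self-contained demonstration with invented sample output
sample='2: eth0    inet 10.0.0.200/24 brd 10.0.0.255 scope global eth0
2: eth0    inet 10.0.0.100/24 scope global secondary eth0:1'

if printf '%s\n' "$sample" | has_vip 10.0.0.100; then
    echo "VIP 10.0.0.100 is on this node"
fi
```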

 

2) On BACKUP, the zabbix-server that had been running was stopped by the notify script that ran when the Keepalived service started. (Port 10051 is no longer open.)
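The port check can be scripted as well. A rough sketch, assuming `ss` from iproute2 and its usual column layout (state in column 1, local address:port in column 4); `port_listening` is a made-up helper and the sample text is invented for illustration.

```shell
#!/bin/bash
# port_listening PORT: reads `ss -lnt` output on stdin and succeeds if
# a TCP socket is listening on the given port.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on a node:
#   ss -lnt | port_listening 10051 || echo "zabbix-server is not listening"

# Self-contained demonstration with invented sample output
sample='State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      128          0.0.0.0:22        0.0.0.0:*'

if ! printf '%s\n' "$sample" | port_listening 10051; then
    echo "nothing is listening on 10051"
fi
```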

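Testing:
1) Stop zabbix-server on MASTER.

Result: the VIP has floated to BACKUP and zabbix-server has started there, while zabbix-server on MASTER is stopped. This shows that both the VRRP script and the notify script do their jobs.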
BACKUP:

MASTER:

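Potential problem: because BACKUP cannot be configured with the VRRP script that monitors zabbix-server, a new issue arises. Suppose the VIP has floated to BACKUP and zabbix-server then dies there as well. MASTER is in non-preemptive mode, so even after zabbix-server on MASTER is repaired, the VIP does not float back to MASTER. This is because Keepalived by default only watches the state of the node itself (on BACKUP only the application service has died, not the node); it watches the application service, and fails over based on it, only when a VRRP script is configured.
Conclusion: in production, MASTER should be set to preemptive or delayed-preemptive mode.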

3. Modify the zabbix-server and zabbix-agent configuration files.
agent: point the Server option at the VIP.

 

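server: on both zabbix-server hosts, 10.0.0.200 and 10.0.0.201, enable the SourceIP option and set it to the VIP.
Purpose: although the agent points at the VIP, the packets the server actually sends to the agent for its monitored items carry the server's physical NIC IP as the source address, so the agent rejects the server's connection requests (visible in the agent log). The server therefore needs this option to change the source address to the VIP; only then can server and agent talk to each other. (Try it yourself if you doubt it.)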
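The two settings described above can be applied with a small helper. This is a sketch, not the exact commands used here: `set_conf_key` is a made-up function, it relies on GNU `sed -i`, and the file paths in the comments are the usual package defaults, which may differ on your system. The demonstration works on scratch files only.

```shell
#!/bin/bash
# set_conf_key FILE KEY VALUE
# Rewrites an existing uncommented "KEY=..." line to "KEY=VALUE",
# or appends the line if the key is not set yet.
set_conf_key() {
    local file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s\n' "${key}=${value}" >> "$file"
    fi
}

# Scratch files standing in for the real configs
# (assumed paths: /etc/zabbix/zabbix_agentd.conf and /etc/zabbix/zabbix_server.conf)
agent_conf=$(mktemp)
server_conf=$(mktemp)
printf 'Server=127.0.0.1\nServerActive=127.0.0.1\n' > "$agent_conf"
printf 'DBHost=10.0.0.40\n' > "$server_conf"

set_conf_key "$agent_conf" Server 10.0.0.100      # agent: accept connections from the VIP
set_conf_key "$server_conf" SourceIP 10.0.0.100   # server: send from the VIP

echo "--- agent ---";  cat "$agent_conf"
echo "--- server ---"; cat "$server_conf"
```

Only an exact `Server=` line is rewritten, so `ServerActive=` is left alone. Restart the corresponding service after changing a real file.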

systemctl restart zabbix-server.service

4. Access zabbix-web through the VIP and add the agent, 10.0.0.202, to the host list (with a generic template). The availability indicator turns green.

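5. Now stop zabbix-server on MASTER again: the web page remains accessible and is unaffected. That concludes the experiment. (Screenshots are omitted, but that is what happens.)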

posted on 2021-09-16 14:23  1251618589