Getting Started with Nginx (7): Nginx + keepalived High-Availability Cluster

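  • I. Introduction to Keepalived

  Keepalived was originally designed specifically for the LVS load balancer, to manage and monitor the state of each service node in an LVS cluster. VRRP support for high availability was added later. So besides managing LVS, Keepalived can also serve as the high-availability layer for other services (such as Nginx, HAProxy, and MySQL). Its checks work much like layer 3, layer 4, and layer 7 switching.

  Keepalived implements high availability through VRRP. VRRP (Virtual Router Redundancy Protocol) was designed to remove the single point of failure of static routing: it keeps the network as a whole running when individual nodes go down. Keepalived can therefore, on one hand, configure and manage LVS and health-check the nodes behind it, and on the other hand provide high availability for system network services.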

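  • II. The Three Functions of Keepalived

1. Managing the LVS load balancer

2. Health-checking the nodes of an LVS cluster

3. Providing high availability for system network services (the focus here)

  Keepalived monitors the state of the servers behind it. If a web or MySQL server goes down or malfunctions, Keepalived detects it and removes the faulty server from the cluster. Once the server recovers, Keepalived automatically adds it back. None of this needs manual intervention; the only manual work is repairing the failed server.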

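  • III. How Keepalived Works

  Keepalived high-availability nodes communicate with each other via VRRP. So what is VRRP?

  (1) VRRP stands for Virtual Router Redundancy Protocol. It was created to remove the single point of failure of static routing.

  (2) VRRP uses an election mechanism to decide which VRRP router takes on the routing task.

  (3) VRRP uses IP multicast (default multicast address: 224.0.0.18) for communication between the nodes of a high-availability pair.

  (4) In operation, the master sends advertisement packets and the backups receive them. When a backup stops receiving packets from the master, it starts the takeover procedure and takes over the master's resources. There can be several backups, which compete by priority, but a typical Keepalived deployment runs as a single pair.

  (5) VRRP advertisements can carry authentication data (Keepalived offers the PASS and AH types), but the official recommendation is still to configure the authentication type and password in plain text.

  With VRRP understood, here is how Keepalived itself works: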

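  A Keepalived high-availability pair communicates via VRRP, and VRRP elects the master and the backup. The master has a higher priority than the backup, so in normal operation the master holds all resources and the backup waits. When the master goes down, the backup takes over the master's resources and serves traffic in its place.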
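The priority election described above can be modelled in a few lines of shell. This is only a toy sketch: `elect_master` is a hypothetical helper, not part of Keepalived, and it ignores timers and preemption settings. It simply picks the advertised node with the highest priority.

```shell
#!/bin/bash
# Toy model of the priority election: each argument is "name:priority".
# Prints the name with the highest priority. On a tie the first entry wins,
# which is a simplification (real VRRP breaks ties by IP address).
elect_master() {
    local best_name="" best_prio=-1 entry name prio
    for entry in "$@"; do
        name=${entry%%:*}
        prio=${entry##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master lb01:150 lb02:100   # both nodes up: prints lb01
elect_master lb02:100            # lb01 silent: prints lb02
```

The node names and priorities match the lb01/lb02 pair used later in this post.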

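  Within a Keepalived pair, only the master keeps sending VRRP advertisement packets (multicast) to tell the backup it is still alive, and the backup does not preempt it. When the master becomes unavailable, that is, when the backup no longer hears the master's advertisements, the backup starts the relevant services and takes over the resources to keep the business running. At best, takeover completes in under 1 second.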
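How long a backup waits before declaring the master dead can be estimated with the VRRPv2 timing formula. The formula below comes from RFC 3768 as I read it, not from this post, and `master_down_interval` is a hypothetical helper name. It suggests that the delay scales with the advertisement interval, so sub-second takeover presumably needs an interval shorter than the 1 second used in the configs here.

```shell
#!/bin/bash
# Master_Down_Interval = 3 * advert_int + skew
# skew                 = (256 - priority) / 256
# (VRRPv2 timing per RFC 3768; all times in seconds)
master_down_interval() {
    local advert_int=$1 priority=$2
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 1 100   # backup with priority 100, advert_int 1: prints 3.609
```

A higher-priority backup has a smaller skew and therefore times out slightly earlier, which is how several backups avoid taking over at the same moment.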

  • IV. Deploying the Keepalived High-Availability Service

1. Environment

Hostname  IP             Role
lb01      192.168.56.12  keepalived MASTER
lb02      192.168.56.13  keepalived BACKUP

 

2. Deploying Keepalived

(1) Install keepalived

[root@lb01 ~]# yum install -y keepalived
[root@lb02 ~]# yum install -y keepalived
[root@lb01 ~]# rpm -qa keepalived
keepalived-1.3.5-6.el7.x86_64
[root@lb02 ~]# rpm -qa keepalived
keepalived-1.3.5-6.el7.x86_64

(2) Walkthrough of the high-availability part of keepalived.conf

[root@lb01 ~]# cat /etc/keepalived/keepalived.conf 
! Configuration File for keepalived

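global_defs {    # Global settings; the notification_email block below lists the alert e-mail addresses for service failures (several allowed, optional)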
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
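   notification_email_from Alexandre.Cassen@firewall.loc    # Sender address for alert e-mails (optional)
   smtp_server 192.168.200.1    # SMTP server used to send the mail; with sendmail or postfix running locally, the default address above can be used (optional)
   smtp_connect_timeout 30        # SMTP connection timeout (optional)
   router_id LVS_DEVEL            # Router identifier of this Keepalived server; must be unique within the same LAN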
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

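vrrp_instance VI_1 {        # VRRP instance block defining an instance named VI_1. Each vrrp_instance can be seen as one Keepalived service instance or one business service. Every instance on the master must also exist on the backup, otherwise failover cannot take over.
    state MASTER            # Initial role; only MASTER or BACKUP, written in uppercase
    interface eth0            # Network interface Keepalived uses
    virtual_router_id 51    # Virtual router ID: a number, unique per instance. MASTER and BACKUP must use the same ID for the same instance, otherwise split-brain occurs.
    priority 100            # Priority: the higher the number, the higher the priority. A gap of 50 or more between MASTER and BACKUP is recommended.
    advert_int 1            # Advertisement interval, i.e. how often MASTER and BACKUP check on each other, in seconds; default 1.
    authentication {        # Authentication settings
        auth_type PASS        # Two types exist, PASS and AH; PASS is officially recommended. The password is at most 8 characters, and MASTER and BACKUP of the same instance must share it to communicate.
        auth_pass 1111        # Authentication password
    }
    virtual_ipaddress {        # Virtual IP addresses; several can be listed, one per line. It is best to state the netmask and the interface the VIP binds to explicitly.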
        192.168.200.16
        192.168.200.17
        192.168.200.18
    }
}
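Since a mismatched virtual_router_id or auth_pass between the two nodes breaks the pair, a quick consistency check before starting the service may help. The sketch below is an assumption-laden helper, not a Keepalived tool: `get_directive` and `check_pair` are hypothetical names, the parser only looks at the first occurrence of each directive, and it therefore only suits single-instance configs.

```shell
#!/bin/bash
# Print the value of the first occurrence of a directive
# (e.g. virtual_router_id) in a keepalived.conf file.
get_directive() {
    local file=$1 name=$2
    awk -v n="$name" '$1 == n { print $2; exit }' "$file"
}

# Succeed only if both files agree on virtual_router_id and auth_pass,
# and auth_pass is at most 8 characters long.
check_pair() {
    local a=$1 b=$2 key pass
    for key in virtual_router_id auth_pass; do
        if [ "$(get_directive "$a" "$key")" != "$(get_directive "$b" "$key")" ]; then
            echo "mismatch: $key"
            return 1
        fi
    done
    pass=$(get_directive "$a" auth_pass)
    if [ "${#pass}" -gt 8 ]; then
        echo "auth_pass longer than 8 characters"
        return 1
    fi
    echo "ok"
}
```

A possible use is `check_pair lb01.conf lb02.conf`, where the two file names stand for copies of each node's /etc/keepalived/keepalived.conf.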

3. Keepalived single-instance high-availability demo

(1) Configure the Keepalived master, lb01 (MASTER)

[root@lb01 keepalived]# cp keepalived.conf keepalived.conf.bak
[root@lb01 keepalived]# vim keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
        123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb01
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 55
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 dev eth0 label eth0:1
    }
}
[root@lb01 keepalived]# systemctl start keepalived    # start keepalived once the config is in place
[root@lb01 keepalived]# ip addr |grep 192.168.56.20    # check that the configured virtual IP 192.168.56.20 is present
    inet 192.168.56.20/24 scope global secondary eth0:1

(2) Configure the Keepalived backup, lb02 (BACKUP)

[root@lb02 keepalived]# cp keepalived.conf keepalived.conf.bak
[root@lb02 keepalived]# vim keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
        123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb02
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 dev eth0 label eth0:1
    }
}
[root@lb02 keepalived]# systemctl start keepalived    # start keepalived once the config is in place
[root@lb02 keepalived]# ip addr |grep 192.168.56.20    # the backup should NOT hold the virtual IP 192.168.56.20; any output here means split-brain

(3) Master/backup failover test

1) Stop keepalived on the master, then check the virtual IP on lb01 and lb02
[root@lb01 keepalived]# systemctl stop keepalived     # stop keepalived on the master
[root@lb01 keepalived]# ip addr |grep 192.168.56.20    # after the stop, lb01 no longer holds the virtual IP 192.168.56.20
[root@lb02 keepalived]# ip addr |grep 192.168.56.20    # lb02 now shows the virtual IP 192.168.56.20: the VIP has moved over
    inet 192.168.56.20/24 scope global secondary eth0:1

2) Restart keepalived on the master, then check the virtual IP on lb01 and lb02
[root@lb01 keepalived]# systemctl start keepalived     # restart keepalived on lb01
[root@lb01 keepalived]# ip addr |grep 192.168.56.20    # the virtual IP is back on lb01
    inet 192.168.56.20/24 scope global secondary eth0:1
[root@lb02 keepalived]# ip addr |grep 192.168.56.20    # lb02 no longer holds the virtual IP
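One caveat with the plain `grep 192.168.56.20` used in these checks: it would also match an address such as 192.168.56.200. A slightly stricter check is sketched below. `holds_vip` is a hypothetical helper that reads `ip addr` output on stdin; the sample line is copied from the output above so the block runs anywhere.

```shell
#!/bin/bash
# Read `ip addr` output on stdin; succeed only if the exact VIP is present.
# Matching the fixed string "inet <vip>/" avoids false hits such as
# 192.168.56.200 when looking for 192.168.56.20.
holds_vip() {
    grep -qF "inet $1/"
}

# On a real node this would be: ip addr | holds_vip 192.168.56.20
sample='    inet 192.168.56.20/24 scope global secondary eth0:1'
if echo "$sample" | holds_vip 192.168.56.20; then
    echo "VIP present"
else
    echo "VIP absent"
fi
```

Running it on both nodes after each stop and start gives a yes/no answer per node, which is easier to script than reading the grep output by eye.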

4. Keepalived dual-instance, dual-master demo

(1) Edit the main configuration file on lb01 and lb02 to add a second instance, VI_2

[root@lb01 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
    123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb01
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 55
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 dev eth0 label eth0:1
    }
}

vrrp_instance VI_2 {        # add a second VRRP instance, VI_2
    state BACKUP
    interface eth0
    virtual_router_id 56
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.30/24 dev eth0 label eth0:2    # virtual IP is 192.168.56.30
    }
}

[root@lb02 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
        123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb02
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 dev eth0 label eth0:1
    }
}

vrrp_instance VI_2 {        # add a second VRRP instance, VI_2
    state MASTER
    interface eth0
    virtual_router_id 56
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.30/24 dev eth0 label eth0:2    # virtual IP is 192.168.56.30
    }
}

(2) Restart Keepalived on lb01 and lb02 and observe the initial VIP assignment

[root@lb01 keepalived]# systemctl restart keepalived
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
    inet 192.168.56.20/24 scope global secondary eth0:1

[root@lb02 keepalived]# systemctl restart keepalived
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
    inet 192.168.56.30/24 scope global secondary eth0:2

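After Keepalived starts on lb01, the node initially brings up the VIP 192.168.56.20, so the VIP configured in instance VI_1 serves traffic there.

After Keepalived starts on lb02, the node initially brings up the VIP 192.168.56.30, so the VIP configured in instance VI_2 serves traffic there.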
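The two configuration files are mirror images: each node is MASTER for one instance and BACKUP for the other. A small generator makes that symmetry explicit. `render_instance` is a hypothetical helper written for this illustration; the interface name, advert_int, and password are hard-coded to the values used in this post.

```shell
#!/bin/bash
# Print one vrrp_instance block.
# Arguments: name state virtual_router_id priority vip label
render_instance() {
    cat <<EOF
vrrp_instance $1 {
    state $2
    interface eth0
    virtual_router_id $3
    priority $4
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        $5 dev eth0 label $6
    }
}
EOF
}

# lb01: MASTER for VI_1, BACKUP for VI_2.
# lb02 uses the same calls with the states and priorities swapped.
render_instance VI_1 MASTER 55 150 192.168.56.20/24 eth0:1
render_instance VI_2 BACKUP 56 100 192.168.56.30/24 eth0:2
```

Only the state and priority differ between the nodes; the virtual_router_id, password, and VIP of each instance stay identical on both sides.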

(3) High-availability failover test

[root@lb01 keepalived]# systemctl stop keepalived    # stop keepalived on lb01
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"    # lb01 shows no VIP
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"    # lb02 shows both VIPs
    inet 192.168.56.30/24 scope global secondary eth0:2
    inet 192.168.56.20/24 scope global secondary eth0:1
[root@lb01 keepalived]# systemctl start keepalived        # restart keepalived on lb01
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"    # the VIP 192.168.56.20 has moved back
    inet 192.168.56.20/24 scope global secondary eth0:1

The same test in the other direction: stop keepalived on lb02 and check the VIPs

[root@lb02 keepalived]# systemctl stop keepalived
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
    inet 192.168.56.20/24 scope global secondary eth0:1
    inet 192.168.56.30/24 scope global secondary eth0:2
[root@lb02 keepalived]# systemctl start keepalived
[root@lb02 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
    inet 192.168.56.30/24 scope global secondary eth0:2
[root@lb01 keepalived]# ip addr |egrep "192.168.56.20|192.168.56.30"
    inet 192.168.56.20/24 scope global secondary eth0:1
  • V. Configuring Keepalived for Nginx Load Balancing

1. Environment:

Hostname  IP             Role
lb01      192.168.56.12  Nginx + Keepalived (MASTER)
lb02      192.168.56.13  Nginx + Keepalived (BACKUP)
web01     192.168.56.11  web01 service --> Nginx
web02     192.168.0.130  web02 service --> Nginx

2. Configure web01 and web02

[root@web01 vhosts]# cat www.abc.org.conf 
server {
    listen 80;
    server_name 192.168.56.11;
    root /vhosts/html/www;
    index index.html index.htm index.php;
}
[root@web02 vhosts]# cat www.abc.org.conf 
server {
    listen 8080;
    server_name 192.168.0.130;
    root /vhosts/html/www;
    index index.html index.htm index.php;
}

Test the home pages of web01 and web02; they differ so the two servers can be told apart
[root@localhost vhosts]# curl 192.168.56.11
welcome to 192.168.56.11
[root@localhost vhosts]# curl 192.168.0.130:8080
welcome to use 192.168.0.130

3. Configure Nginx load balancing on lb01 and lb02

[root@lb01 keepalived]# cat /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    include /etc/nginx/conf.d/*.conf;
    upstream web_server_pool {
        server 192.168.56.11:80 weight=1;
        server 192.168.0.130:8080 weight=1;
    }
    server {
    listen 80;
    server_name 192.168.56.20;     # server_name must be set to the VIP address
    location / {
        proxy_pass http://web_server_pool;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
    }

}
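With both upstream servers at weight=1, requests should simply alternate between the two backends. The sketch below only simulates that expectation; it is not how Nginx implements its balancer internally, and `pick_backend` is a hypothetical name.

```shell
#!/bin/bash
# With two equal-weight backends, round-robin reduces to
# "request number modulo 2". $1 is the 0-based request number.
pick_backend() {
    case $(( $1 % 2 )) in
        0) echo "192.168.56.11:80" ;;
        1) echo "192.168.0.130:8080" ;;
    esac
}

for i in 0 1 2 3; do
    echo "request $i -> $(pick_backend "$i")"
done
```

This matches what the access test should show: consecutive requests to the VIP return the web01 page and the web02 page in turn.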

4. Configure Keepalived on lb01 and lb02

[root@lb01 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
    123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb01
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 55
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 dev eth0 label eth0:1
    }
}

[root@lb02 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
        123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb02
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 dev eth0 label eth0:1
    }
}

Note how the Keepalived configs on lb01 and lb02 differ: router_id, state, and priority.

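5. Access test

Visit http://192.168.56.20 directly. Refreshing the page returns the two different pages in turn, which shows that Nginx load balancing is working.

Next, stop keepalived on lb01 and check whether the site stays reachable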

[root@lb01 keepalived]# systemctl stop keepalived
[root@lb01 keepalived]# ip addr |grep "192.168.56.20"
[root@lb02 keepalived]# ip addr |grep "192.168.56.20"  # with keepalived stopped on lb01, the VIP is now on lb02
    inet 192.168.56.20/24 scope global secondary eth0:1

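Visiting http://192.168.56.20 again still returns the pages, which shows that Keepalived high availability is working.

6. Handling Nginx health checking

  The steps so far give a working Nginx reverse proxy with load balancing, plus Keepalived high availability. By default, however, Keepalived only takes over when the peer machine goes down or the peer's Keepalived service stops. In practice, Nginx on one load balancer can die while Keepalived keeps running, and then users hitting the VIP 192.168.56.20 find no service behind it. To see this, stop Nginx on lb01 and check access again.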

1) First run an access test; everything works
[root@localhost vhosts]# curl 192.168.56.20
welcome to 192.168.56.110
[root@localhost vhosts]# curl 192.168.56.20
welcome to use 192.168.0.130

2) Stop nginx on lb01; the VIP is still on lb01
[root@lb01 keepalived]# systemctl stop nginx
[root@lb01 keepalived]# ip addr |grep "192.168.56.20"
    inet 192.168.56.20/24 scope global secondary eth0:1

3) Test access again; the connection is refused
[root@localhost vhosts]# curl 192.168.56.20
curl: (7) Failed connect to 192.168.56.20:80; Connection refused

So how can the IP move to the backup node when the business service itself goes down? This is where a Keepalived check script comes in. Start with a script like this:

#!/bin/bash
# Timestamp for the log entry
d=$(date +%Y%m%d_%H:%M:%S)
# Count running nginx processes
counter=$(ps -C nginx --no-heading | wc -l)
if [ "${counter}" -eq 0 ]; then
    # nginx is down: try to restart it once
    systemctl start nginx.service
    sleep 2
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "${counter}" -eq 0 ]; then
        # Restart failed: stop keepalived so the VIP fails over
        echo "$d nginx was down. keepalived will stop." >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi

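When the script finds zero nginx processes, it restarts nginx and counts the processes again. If the count is still zero, it stops the keepalived service, which triggers the high-availability failover. For this experiment, to make the effect visible, use the simpler script below, which stops keepalived as soon as it finds zero nginx processes:

The script must exist on both lb01 and lb02, at /etc/keepalived/check_nginx.sh
[root@lb01 keepalived]# cat check_nginx.sh
#!/bin/bash
d=`date --date today +%Y%m%d_%H:%M:%S`
counter=$(ps -C nginx --no-heading|wc -l)
if [ $counter -eq 0 ]; then
    echo "$d nginx was down.keepalived will stop." >> /var/log/check_ng.log
    systemctl stop keepalived
fi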
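The restart-then-failover logic of the first check script can be isolated into a pure function, which makes the three possible outcomes easy to reason about and to test without touching nginx. `decide_action` and its output labels are hypothetical names chosen for this sketch.

```shell
#!/bin/bash
# Decide what the check should do, given the nginx process count before
# and after one restart attempt. Prints: ok | restarted | failover
decide_action() {
    local before=$1 after=$2
    if [ "$before" -gt 0 ]; then
        echo "ok"           # nginx was running, nothing to do
    elif [ "$after" -gt 0 ]; then
        echo "restarted"    # the restart brought nginx back
    else
        echo "failover"     # still down: stop keepalived so the VIP moves
    fi
}

decide_action 3 3   # prints ok
decide_action 0 2   # prints restarted
decide_action 0 0   # prints failover
```

The simplified experiment script corresponds to skipping the restart attempt, so a count of zero leads straight to failover.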

Then edit keepalived.conf on lb01 and lb02 to add the script block:

[root@lb01 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
    123456@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id lb01
}

vrrp_script chk_nginx {  # Define a VRRP script that checks the nginx process. Mind the space before "{": without it the script will not run!
    script "/etc/keepalived/check_nginx.sh"  # Script to run; it stops Keepalived when Nginx has a problem
    interval 2        # Check every 2 seconds
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 55
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.20/24 
    }
    track_script {
        chk_nginx    # Enable the chk_nginx script for VRRP instance VI_1
    }
}
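The `weight 2` setting in the vrrp_script block deserves a note. As I understand the keepalived.conf man page, a positive weight is added to the instance priority while the script succeeds, and a negative weight is subtracted while it fails. The sketch below models that rule; `effective_priority` is a hypothetical helper, and the weight of -60 is an illustrative value that does not appear in this post.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_OK
#   SCRIPT_OK is 1 if the tracked script exited 0, otherwise 0.
# Positive weight: added while the script succeeds.
# Negative weight: subtracted (added as a negative) while the script fails.
effective_priority() {
    local base=$1 weight=$2 ok=$3
    if [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $(( base + weight ))
    elif [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

effective_priority 150 2 0     # lb01, weight 2, check failing: prints 150
effective_priority 100 2 1     # lb02, weight 2, check passing: prints 102
effective_priority 150 -60 0   # lb01, hypothetical weight -60, check failing: prints 90
```

Under this model, weight 2 alone would never push lb01 (150) below lb02 (102), which is consistent with the approach taken here of having the script stop keepalived outright. A sufficiently large negative weight would be an alternative way to hand over the VIP while leaving keepalived running.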

The test procedure and results follow:

(1) On lb01, check the keepalived VIP and processes, and the nginx port
[root@lb01 keepalived]# !ip
ip addr |grep "192.168.56.20"
    inet 192.168.56.20/24 scope global secondary eth0
[root@lb01 keepalived]# netstat -tulnp |grep nginx
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 7624/nginx: master
[root@lb01 keepalived]# ps -ef |grep keepalived
root 7633 1 0 23:49 ? 00:00:00 /usr/sbin/keepalived -D
root 7634 7633 0 23:49 ? 00:00:00 /usr/sbin/keepalived -D
root 7635 7633 0 23:49 ? 00:00:00 /usr/sbin/keepalived -D

(2) Simulate an Nginx failure by stopping Nginx, then recheck the items from (1)
[root@lb01 keepalived]# systemctl stop nginx
[root@lb01 keepalived]# !nets
netstat -tulnp |grep nginx
[root@lb01 keepalived]# !ip
ip addr |grep "192.168.56.20"
[root@lb01 keepalived]# ps -ef |grep keepalived
root 7881 5009 0 23:51 pts/1 00:00:00 grep --color=auto keepalived

(3) On lb02, check whether the VIP is present and verify that web access works
[root@lb02 keepalived]# !ip
ip addr |grep "192.168.56.20"
    inet 192.168.56.20/24 scope global secondary eth0
[root@localhost vhosts]# curl 192.168.56.20
welcome to use 192.168.0.130
[root@localhost vhosts]# curl 192.168.56.20
welcome to 192.168.56.110

With the check script in place, Nginx + Keepalived now fails over when Nginx itself goes down, not only when the whole node does.

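7. Writing a script to detect Keepalived split-brain

  To guard against split-brain in the high-availability pair, a monitoring script can also run on the backup server: if the master node answers ping and the backup node holds the VIP, raise an alert.

 (1) Write and run the monitoring script on lb02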

[root@lb02 keepalived]# cat check_split_brain.sh
#!/bin/bash
lb01_vip="192.168.56.20"
lb01_ip="192.168.56.12"
while true
do
    # Is the master host reachable?
    ping -c 2 -W 3 "$lb01_ip" &>/dev/null
    # Master reachable AND the VIP is on this backup node: warn
    if [ $? -eq 0 -a "$(ip addr | grep -c "$lb01_vip")" -eq 1 ]
    then
        echo "ha is split brain.warning."
    else
        echo "ha is ok."
    fi
    sleep 3
done
[root@lb02 keepalived]# sh check_split_brain.sh 
ha is ok.
ha is ok.
ha is ok.

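In normal operation the master is alive and the VIP 192.168.56.20 sits on the master, so no alert is raised and the script prints: ha is ok

(2) Simulate split-brain: stop Keepalived on the master lb01 and watch the script output on lb02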

[root@lb01 keepalived]# systemctl stop keepalived
[root@lb02 keepalived]# sh check_split_brain.sh 
ha is ok.
ha is split brain.warning.
ha is split brain.warning.

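The output shows the script raising the split-brain warning, so it can be plugged into a Zabbix monitoring service to alert on split-brain. Note that the check cannot tell a true split-brain from a normal failover in which lb01 is still reachable, as in this simulation; it only reports that lb02 holds the VIP while lb01 answers ping.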
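The decision the monitoring script makes can be written as a small three-way classifier, which also separates the "master unreachable" case from the suspicious one. `classify` and its labels are hypothetical names for this sketch; the inputs would come from the `ping` exit status and the VIP count on the backup.

```shell
#!/bin/bash
# classify PING_OK VIP_COUNT
#   PING_OK:   1 if the master answered ping, otherwise 0
#   VIP_COUNT: how many times the VIP appears in the local `ip addr` output
classify() {
    local ping_ok=$1 vip_count=$2
    if [ "$ping_ok" -eq 1 ] && [ "$vip_count" -ge 1 ]; then
        echo "suspect"      # master reachable, yet this backup holds the VIP
    elif [ "$vip_count" -ge 1 ]; then
        echo "failover"     # master unreachable, backup took over
    else
        echo "ok"           # backup is idle
    fi
}

classify 1 0   # prints ok
classify 1 1   # prints suspect
classify 0 1   # prints failover
```

Only the "suspect" state would be worth an alert, and even then it may be a legitimate failover, so an operator still has to check whether lb01 holds the VIP as well.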

 

posted @ 2018-07-21 11:45  烟雨浮华