Configuring Active Health Checks in Nginx


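When nginx is used as a reverse proxy, health checking and failover for the backend server nodes are important.

When I first started using nginx, I handled failover with the following configuration, which is fairly crude: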

upstream portals {
    server 172.16.68.134:8082 max_fails=1 fail_timeout=5;
    server 172.16.68.135:8082 max_fails=1 fail_timeout=5;
    server 172.16.68.136:8082 max_fails=1 fail_timeout=5;
    server 172.16.68.137:8082 max_fails=1 fail_timeout=5;
}

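In practice, if a node fails to respond, nginx stops forwarding requests to it for 5 s. Once that 5 s window expires, the next request whose turn falls on that node under round-robin is still sent to it; when that request fails, the node is skipped for another 5 s. This falls short of the desired behavior, which is that no requests reach the node until its service has recovered.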
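The behavior described above can be pictured with a deliberately simplified model. The function name `passive_decision` and its arguments are hypothetical, it assumes `max_fails=1`, and it only captures the time window, not nginx's full peer selection logic:

```shell
# passive_decision LAST_FAIL NOW FAIL_TIMEOUT   (all values in seconds)
# Simplified model for max_fails=1: after a failure at LAST_FAIL the node
# is skipped while the fail_timeout window is still open; once the window
# has passed, a real client request is sent to the node again as a probe.
passive_decision() {
    local last_fail=$1 now=$2 fail_timeout=$3
    if [ $((now - last_fail)) -le "$fail_timeout" ]; then
        echo skip
    else
        echo try
    fi
}
```

For example, `passive_decision 100 103 5` prints `skip`, while `passive_decision 100 106 5` prints `try`: the node is retried with live traffic even though nothing has confirmed that it recovered.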

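I then searched online for nginx health-check modules and components, and found Taobao's nginx_upstream_check_module.

Installation is fairly simple: just compile nginx with this module added.

GitHub repository: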

https://github.com/yaoweibin/nginx_upstream_check_module

Download link:

https://www.sumaott.com/download/%E5%B7%A5%E5%85%B7/nginx_upstream_check_module-0.3.0.tar.gz

The nginx and pcre build directories both default to /home/soft. Upload the downloaded nginx_upstream_check_module-0.3.0.tar.gz to /home/soft and extract it:

tar -zxvf nginx_upstream_check_module-0.3.0.tar.gz

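Recompile:

# Enter the nginx source directory
cd /home/soft/nginx-1.10.1
# Apply the patch (the module ships several check_*.patch files;
# use the one that matches your nginx version)
patch -p0 < ../nginx_upstream_check_module-0.3.0/check_1.11.1+.patch
# Keep the configure arguments identical to production and add only this one module
./configure --prefix=/usr/local/nginx --with-pcre=/home/soft/pcre-8.36/ --with-http_stub_status_module --with-http_ssl_module --add-module=/home/soft/nginx_upstream_check_module-0.3.0/
# Build (do not run "make install"; the binary is copied by hand below)
make
# Back up the production nginx binary
cd /usr/local/nginx/sbin
mv nginx nginx.old.20181016
# Copy the newly built binary into the production directory
cp /home/soft/nginx-1.10.1/objs/nginx /usr/local/nginx/sbin
# Check the nginx version and verify that the configuration still parses
./nginx -V
./nginx -t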
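To confirm that the new binary really contains the module, the configure arguments printed by `nginx -V` can be searched. This is a small sketch; the helper name `has_check_module` is made up for illustration:

```shell
# has_check_module: read `nginx -V` output on stdin and succeed (exit 0)
# if the configure arguments contain an --add-module entry that points at
# nginx_upstream_check_module.
has_check_module() {
    grep -q -e '--add-module=[^ ]*nginx_upstream_check_module'
}

# Usage (nginx -V writes to stderr, hence the 2>&1):
#   /usr/local/nginx/sbin/nginx -V 2>&1 | has_check_module && echo "module compiled in"
```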

The configuration used in nginx is:

    upstream portals {
        server 192.166.62.137:8080;
        server 192.166.66.85:8080;
        server 192.166.62.231:8080;
        server 192.166.66.88:8080;
        check interval=5000 rise=2 fall=5 timeout=1000 type=http;
        check_http_send "HEAD / HTTP/1.0\r\n\r\n";
        check_http_expect_alive http_2xx http_3xx;
    }
    server {
        listen 8080;
        charset utf-8;
        location /status {
            check_status;
            access_log   off;
            #allow 192.166.62.25;
            #deny all;
        }
        location / {
            # must match the name of the upstream block above
            proxy_pass http://portals;
            index  index.html;
        }
    }

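The check runs every 5 s (interval=5000). A node is marked down after 5 consecutive failures (fall=5) and marked up again after 2 consecutive successes (rise=2). Each probe times out after 1 s (timeout=1000) and uses the HTTP protocol (type=http): nginx sends a HEAD request, and a 2xx or 3xx response status (for example 200 or 302) means the service is running normally.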
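The rise/fall counting can be sketched as a tiny state machine. This only illustrates the idea and is not the module's actual code; `simulate_checks` is a hypothetical name, and the initial state of "down" assumes the module's `default_down` option is left at what I believe is its default of true:

```shell
# simulate_checks RISE FALL RESULT...
# Each RESULT is "ok" or "fail" (one probe per interval). Prints the node
# state after every probe, one state per line.
simulate_checks() {
    local rise=$1 fall=$2
    shift 2
    local state=down ok_run=0 fail_run=0 result
    for result in "$@"; do
        if [ "$result" = ok ]; then
            ok_run=$((ok_run + 1)); fail_run=0
            # "rise" consecutive successes bring a down node back up
            if [ "$state" = down ] && [ "$ok_run" -ge "$rise" ]; then state=up; fi
        else
            fail_run=$((fail_run + 1)); ok_run=0
            # "fall" consecutive failures take an up node down
            if [ "$state" = up ] && [ "$fail_run" -ge "$fall" ]; then state=down; fi
        fi
        echo "$state"
    done
}
```

Running `simulate_checks 2 5 ok ok fail fail fail fail fail ok` shows the node coming up on the second success and going down on the fifth consecutive failure. With interval=5000 that corresponds to roughly 25 s before a failed node is removed and roughly 10 s before a recovered node is used again.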

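You can uncomment the allow/deny lines so that only a fixed IP can view the status page; all other IPs are denied access to this location.

After making the changes, reload nginx for the configuration to take effect.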

sbin/nginx -s reload

Observe the effect of the active health checks:

[image: screenshot of the status page]
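The status page can also be read from the command line. As far as I recall from the module's README, the `check_status` location accepts a `format` query parameter (html, csv or json); the CSV column order assumed below (index, upstream, name, status, rise count, fall count, type, port) is likewise from memory and should be checked against real output. The helper name `list_down_servers` is hypothetical:

```shell
# list_down_servers: read the status page in CSV form on stdin and print
# the address of every backend whose status column is "down".
# Assumed CSV columns: index,upstream,name,status,rise,fall,type,port
list_down_servers() {
    awk -F, '$4 == "down" { print $3 }'
}

# Usage against the /status location configured above (not run here):
#   curl -s 'http://192.166.62.104:8080/status?format=csv' | list_down_servers
```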

Run an ab concurrency test from one server:

ab -n 20000 -c 10 "http://192.166.62.104:8080/PortalServer-App/index.html"

Check the nginx access log on the .104 host (192.166.62.104):

tail -f logs/access.log

192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.62.231:8080""0.002"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.66.88:8080""0.002"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.66.88:8080""0.002"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 567 "-" "ApacheBench/2.3" "-" "192.166.62.137:8080""0.001"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.62.231:8080""0.001"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 567 "-" "ApacheBench/2.3" "-" "192.166.62.137:8080""0.000"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.66.88:8080""0.002"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.62.231:8080""0.001"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 567 "-" "ApacheBench/2.3" "-" "192.166.62.137:8080""0.001"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.62.231:8080""0.001"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.66.88:8080""0.002"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.66.88:8080""0.002"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 567 "-" "ApacheBench/2.3" "-" "192.166.62.137:8080""0.001"
192.166.62.100 - - [16/Oct/2018:13:46:44 +0800] "GET /PortalServer-App/index.html HTTP/1.0" 200 541 "-" "ApacheBench/2.3" "-" "192.166.62.231:8080""0.001"

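As shown, the backends that the status page reports as healthy receive the load, which is the intended effect of active health checking.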
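Rather than reading the log by eye, the requests per backend can be tallied. This sketch assumes the log format shown above, where the upstream address is the second-to-last double-quoted field; `count_by_upstream` is a hypothetical helper name:

```shell
# count_by_upstream: read access-log lines on stdin and print
# "<count> <upstream address>" per backend, busiest first.
# Splitting on double quotes makes the upstream address the fourth field
# from the end, whether or not a space separates the last two quoted fields.
count_by_upstream() {
    awk -F'"' '{ count[$(NF-3)]++ }
               END { for (u in count) print count[u], u }' | sort -rn
}

# Usage:
#   count_by_upstream < logs/access.log
```

A backend that the health check has marked down should be absent from this tally for as long as it stays down.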

posted @ 2019-10-31 16:34  老车更换新引擎