Monitoring and Alerting on Response Time and HTTP Status Codes for Requests Between the LB Layer and the Real Servers
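To monitor the access quality of each service, the LB-layer Nginx logs are used to monitor and alert on the response time (upstream_response_time) and the HTTP status code (upstream_status) of requests from the LB layer to the Real Servers. The steps are recorded below: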
Basic information:
Load balancing: Nginx + Keepalived
Load-balanced domain: bs7001.kevin-inc.com (there are many load-balanced domains; this one is used as the example)
Log: bs7001.kevin-inc.com-access.log

1) Set the log_format on the LB-layer Nginx (see: http://www.cnblogs.com/kevingrace/p/5893499.html)
[root@inner-lb01 ~]# cat /data/nginx/conf/nginx.conf
......
######
## set access log format
######
log_format main '$remote_addr $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'$http_user_agent $http_x_forwarded_for $request_time $upstream_response_time $upstream_addr $upstream_status';
#######
.....

2) Monitoring and alerting scripts
Log path
[root@inner-lb01 ~]# ll /data/nginx/logs/bs7001.kevin-inc.com-access.log
-rw-r--r-- 1 root root 0 Dec 13 17:00 /data/nginx/logs/bs7001.kevin-inc.com-access.log

sendEmail installation and configuration (for installation see: http://www.cnblogs.com/kevingrace/p/5961861.html). The script below can be reused as-is.
[root@inner-lb01 ~]# cat /opt/sendemail.sh
#!/bin/bash
# Filename: SendEmail.sh
# Notes: uses sendEmail
#
# Log file for this script
LOGFILE="/tmp/Email.log"
:> "$LOGFILE"
exec 1>"$LOGFILE"
exec 2>&1

SMTP_server='smtp.kevin.com'
username='notice@kevin.com'
password='notice@123'
from_email_address='notice@kevin.com'
to_email_address="$1"
message_subject_utf8="$2"
message_body_utf8="$3"

# Convert the subject to GB2312 so that a subject containing Chinese is not garbled in the received mail.
message_subject_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_subject_utf8
EOF
`
[ $? -eq 0 ] && message_subject="$message_subject_gb2312" || message_subject="$message_subject_utf8"

# Convert the body to GB2312 so that the received mail body is not garbled.
message_body_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_body_utf8
EOF
`
[ $? -eq 0 ] && message_body="$message_body_gb2312" || message_body="$message_body_utf8"

# Send the mail
sendEmail='/usr/local/bin/sendEmail'
set -x
$sendEmail -s "$SMTP_server" -xu "$username" -xp "$password" -f "$from_email_address" -t "$to_email_address" -u "$message_subject" -m "$message_body" -o message-content-type=text -o message-charset=gb2312

[root@inner-lb01 ~]# cd /opt/lb_log_monit.sh/
[root@inner-lb01 lb_log_monit.sh]# ll
total 12
-rwxr-xr-x 1 root root 1180 Feb  1 13:03 bs7001_request_status_monit.sh
-rwxr-xr-x 1 root root  821 Feb  1 11:20 bs7001_request_time_monit_request.sh
-rwxr-xr-x 1 root root  559 Feb  1 13:01 bs7001_request_time_monit.sh

Response-time monitoring and alerting script (it extracts columns 3 and 10 plus the last three columns of the log file, so in the check file column 3 is upstream_response_time and column 5 is upstream_status)
[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit.sh
#!/bin/bash
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log | awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log
for i in `awk '{print $3}' /root/lb_log_check/bs7001.kevin-inc.com-check.log`
do
    # Convert seconds to whole milliseconds so the values can be compared as integers
    a=$(printf "%f" `echo ${i}*1000|bc` | awk -F"." '{print $1}')
    b=$(printf "%f" `echo 1*1000|bc` | awk -F"." '{print $1}')
    if [ "$a" -ge "$b" ];then
        # Print every checked line whose response time reached the 1-second threshold
        grep -F -- "$i" /root/lb_log_check/bs7001.kevin-inc.com-check.log
    else
        echo "it is ok" > /dev/null 2>&1
    fi
done

[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit_request.sh
#!/bin/bash
/bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit.sh > /root/lb_log_check/bs7001.kevin-inc.com_request_time.log
NUM=`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log | wc -l`
if [ $NUM != 0 ];then
    /bin/bash /opt/sendemail.sh wangshibo@kevin.com "Response time of requests from the LB layer to bs7001.kevin-inc.com" "Response time has exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
    /bin/bash /opt/sendemail.sh linan@kevin.com "Response time of requests from the LB layer to bs7001.kevin-inc.com" "Response time has exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
else
    echo "Responses to requests from the LB layer to bs7001.kevin-inc.com are normal"
fi

[root@inner-lb01 lb_log_monit.sh]# ll /root/lb_log_check/
total 152
-rw-r--r-- 1 root root 147766 Feb  1 15:00 bs7001.kevin-inc.com-check.log
-rw-r--r-- 1 root root    216 Feb  1 15:00 bs7001.kevin-inc.com_request_time.log

HTTP status code monitoring and alerting script (alerts on status codes 500, 502, 503 and 504)
[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_status_monit.sh
#!/bin/bash
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log | awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log
for i in `awk '{print $5}' /root/lb_log_check/bs7001.kevin-inc.com-check.log | sort | uniq`
do
    case "${i}" in
    500|502|503|504)
        # Match on the status column only, so a response time such as 0.500 is not mistaken for a 500
        /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code returned for requests from the LB layer to bs7001.kevin-inc.com" "HTTP status code: ${i}\nDetails:\n$(awk -v code="${i}" '$5 == code' /root/lb_log_check/bs7001.kevin-inc.com-check.log)"
        ;;
    *)
        echo "it is ok"
        ;;
    esac
done

3) Schedule the monitoring with crontab
[root@inner-lb01 lb_log_monit.sh]# crontab -l
# Monitor the response time and HTTP status codes of requests from the LB to the backend servers for each service
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit_request.sh > /dev/null 2>&1
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_status_monit.sh > /dev/null 2>&1

Extract columns 3 and 10 plus the last three columns from the log file
[root@inner-lb01 lb_log_monit.sh]# /usr/bin/tail -10 /data/nginx/logs/bs7001.kevin-inc.com-access.log | awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}'
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:06:02 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.006 192.168.1.21:7001 200
[01/Feb/2018:15:07:12 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.22:7001 200
[01/Feb/2018:15:07:51 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.21:7001 200
[01/Feb/2018:15:07:57 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.007 192.168.1.22:7001 200
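The two checks above read the same three trailing fields, so they can also be done in a single awk pass. The following is only a sketch under stated assumptions, not the scripts actually deployed: the function name scan_upstream and the SLOW/STATUS output labels are hypothetical, and it assumes each request was served by exactly one upstream, so that the last three whitespace-separated fields are $upstream_response_time, $upstream_addr and $upstream_status as in the log_format shown earlier.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts).
# Reads access-log lines on stdin and prints one line per offending request.
# Assumption: the last three whitespace-separated fields are
# $upstream_response_time, $upstream_addr and $upstream_status.
scan_upstream() {
    local threshold="${1:-1}"   # seconds; 1 matches the threshold used above
    awk -v limit="$threshold" '
    NF >= 10 {
        t = $(NF-2); addr = $(NF-1); code = $NF
        # Nginx logs "-" when no upstream was contacted; skip non-numeric times.
        slow = (t ~ /^[0-9]+(\.[0-9]+)?$/ && t + 0 >= limit + 0)
        # Alert on the same status codes as the status script: 500, 502, 503, 504.
        bad = (code ~ /^50[0234]$/)
        if (slow) print "SLOW", $3, addr, t, code
        if (bad)  print "STATUS", $3, addr, t, code
    }'
}

# Example invocation (commented out so that sourcing this file has no side effects):
# /usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log | scan_upstream 1
```

Counting the fields from the end of the line, as the original scripts do, keeps the extraction stable even though $http_user_agent is logged unquoted and may contain spaces. The output of such a function could be passed to /opt/sendemail.sh in the same way as the request_time log above.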