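Monitoring Nginx with Prometheus
A Prometheus metrics library for Nginx
This setup uses the nginx-lua-prometheus library, which collects Nginx's internal metrics and exposes them for Prometheus to scrape.
Installation
The library needs Lua support in Nginx. Compiling Lua support into stock Nginx looked tedious, so I switched straight to OpenResty.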
wget -O /etc/yum.repos.d/openresty.repo https://openresty.org/package/centos/openresty.repo
yum check-update
yum install -y openresty
# Done. This felt simpler than building Nginx, and the key point is that Lua is supported out of the box.
Configuration
Add the following to the http block of nginx.conf:
# lua
lua_shared_dict prometheus_metrics 10M;
# Note this path: it is a search pattern for Lua modules, and prometheus.lua must sit in this directory.
# The trailing ";;" keeps the default search path.
lua_package_path "/usr/local/nginx/conf/?.lua;;";
init_by_lua '
  prometheus = require("prometheus").init("prometheus_metrics")
  metric_requests = prometheus:counter(
    "nginx_http_requests_total", "Number of HTTP requests", {"host", "status"})
  metric_latency = prometheus:histogram(
    "nginx_http_request_duration_seconds", "HTTP request latency", {"host"})
  metric_connections = prometheus:gauge(
    "nginx_http_connections", "Number of HTTP connections", {"state"})
';
log_by_lua '
  metric_requests:inc(1, {ngx.var.server_name, ngx.var.status})
  metric_latency:observe(tonumber(ngx.var.request_time), {ngx.var.server_name})
';
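What metric_latency:observe(...) turns into on the Prometheus side may be easier to see in a small sketch. The Python below is not the library's code. It is a minimal illustration of how a Prometheus histogram is usually modeled: cumulative "le" buckets plus a running sum. The bucket bounds and the request times are hypothetical values picked for the example, not the library's defaults.

```python
import bisect
from dataclasses import dataclass, field

# Hypothetical bucket upper bounds in seconds (not the library's defaults).
BUCKETS = [0.1, 0.5, 1.0]


@dataclass
class Histogram:
    bounds: list
    counts: list = field(default_factory=list)  # one slot per bound, plus +Inf
    total: float = 0.0

    def __post_init__(self):
        self.counts = [0] * (len(self.bounds) + 1)

    def observe(self, value):
        # Pick the first bucket whose upper bound is >= value;
        # values above every bound land in the last (+Inf) slot.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value

    def cumulative(self):
        # Buckets are exposed cumulatively: each "le" bucket also
        # counts every observation that fell into a smaller bucket.
        out, running = {}, 0
        for bound, n in zip(self.bounds + [float("inf")], self.counts):
            running += n
            out[bound] = running
        return out


hist = Histogram(BUCKETS)
for request_time in [0.02, 0.3, 0.3, 2.5]:  # made-up request durations
    hist.observe(request_time)

for bound, n in hist.cumulative().items():
    print(f'le="{bound}" {n}')
```

The cumulative layout is what lets Prometheus estimate quantiles later from the bucket counters alone, without storing individual request times.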
/usr/local/nginx/conf/prometheus.lua
This file is too long to reproduce here. Download it from GitHub and save it at the path above.
https://github.com/knyar/nginx-lua-prometheus/blob/master/prometheus.lua
Add a new server configuration file to expose the metrics:
server {
  listen 9145;
  server_name localhost;
  location /metrics {
    content_by_lua '
      metric_connections:set(ngx.var.connections_active, {"active"})
      metric_connections:set(ngx.var.connections_reading, {"reading"})
      metric_connections:set(ngx.var.connections_waiting, {"waiting"})
      metric_connections:set(ngx.var.connections_writing, {"writing"})
      prometheus:collect()
    ';
  }
}
Reload Nginx, and the metrics are ready to be fetched:
[root@iZ1rp1vunvZ vhost]# curl http://127.0.0.1:9145/metrics
# HELP nginx_http_connections Number of HTTP connections
# TYPE nginx_http_connections gauge
nginx_http_connections{state="active"} 879
nginx_http_connections{state="reading"} 0
nginx_http_connections{state="waiting"} 851
nginx_http_connections{state="writing"} 25
......
# HELP nginx_http_requests_total Number of HTTP requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total{host="",status="302"} 39
nginx_http_requests_total{host="",status="400"} 48
nginx_http_requests_total{host="",status="404"} 4
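To sanity-check the output without Prometheus in the loop, the text format can be parsed by hand. The following is a minimal sketch, assuming only the simple line shape shown above (metric name, optional labels in braces, then a value). It does not handle escaped quotes in label values or trailing timestamps, and the function names are my own.

```python
import re
from collections import defaultdict

# Sample lines copied from the curl output above.
SAMPLE = '''\
# HELP nginx_http_requests_total Number of HTTP requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total{host="",status="302"} 39
nginx_http_requests_total{host="",status="400"} 48
nginx_http_requests_total{host="",status="404"} 4
'''

LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def requests_by_status_class(text):
    """Sum nginx_http_requests_total per status class, e.g. '4xx'."""
    totals = defaultdict(float)
    for name, labels, value in parse_metrics(text):
        status = labels.get("status", "")
        if name == "nginx_http_requests_total" and status:
            totals[status[0] + "xx"] += value
    return dict(totals)


print(requests_by_status_class(SAMPLE))
```

For real use, the body of `curl http://127.0.0.1:9145/metrics` would take the place of SAMPLE.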
Add the Nginx job under scrape_configs in the Prometheus configuration file:
  - job_name: 'Nginx'
    static_configs:
      - targets: ['192.168.201.179:9145','192.168.201.180:9145']
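Since the job sets neither scheme nor metrics_path, Prometheus falls back to its defaults, http and /metrics, so each target is scraped at http://<target>/metrics. Before reloading Prometheus it can be worth confirming that both hosts answer from the Prometheus machine. Here is a rough sketch of such a check. The helper names and the timeout are hypothetical, and only the URL printing runs by default.

```python
import urllib.request

# Targets copied from the scrape config above.
TARGETS = ["192.168.201.179:9145", "192.168.201.180:9145"]


def scrape_url(target, scheme="http", metrics_path="/metrics"):
    """Build the URL Prometheus would scrape with its default scheme and path."""
    return f"{scheme}://{target}{metrics_path}"


def check_target(target, timeout=5):
    """Return True if the target answers its scrape URL with HTTP 200."""
    try:
        with urllib.request.urlopen(scrape_url(target), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection errors, timeouts, and HTTP errors
        return False


for target in TARGETS:
    print(scrape_url(target))
```

Calling check_target("192.168.201.179:9145") from the Prometheus host then tells you whether port 9145 is reachable through any firewall in between.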
Alerting rules
Too many 4xx responses
  - alert: NginxHighHttp4xxErrorRate
    # Aggregate by instance so that {{ $labels.instance }} in the summary is not empty.
    expr: sum by (instance) (rate(nginx_http_requests_total{status=~"^4.."}[1m])) / sum by (instance) (rate(nginx_http_requests_total[1m])) * 100 > 5
    for: 5m
    labels:
      severity: error
    annotations:
      summary: "Nginx high HTTP 4xx error rate (instance {{ $labels.instance }})"
      description: "Too many HTTP requests with status 4xx (> 5%)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
Too many 5xx responses
  - alert: NginxHighHttp5xxErrorRate
    # Aggregate by instance so that {{ $labels.instance }} in the summary is not empty.
    expr: sum by (instance) (rate(nginx_http_requests_total{status=~"^5.."}[1m])) / sum by (instance) (rate(nginx_http_requests_total[1m])) * 100 > 5
    for: 5m
    labels:
      severity: error
    annotations:
      summary: "Nginx high HTTP 5xx error rate (instance {{ $labels.instance }})"
      description: "Too many HTTP requests with status 5xx (> 5%)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
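The arithmetic behind both expressions is the same: per-second rate of the matching status codes, divided by the per-second rate of all requests, times 100, compared against 5. A worked example with made-up numbers may help when tuning the threshold. The counter values below are hypothetical, and the rate calculation is a simplification: the real rate() also handles counter resets and extrapolates to the window edges.

```python
WINDOW_SECONDS = 60  # matches the [1m] range in the rule expressions

# Hypothetical counter values per status code, read 60 seconds apart.
before = {"200": 10_000, "302": 39, "404": 4, "502": 1}
after = {"200": 10_540, "302": 45, "404": 52, "502": 7}


def per_second_rate(start, end, window=WINDOW_SECONDS):
    """Approximate rate(): counter increase divided by the window length."""
    return {status: (end[status] - start[status]) / window for status in end}


def error_percent(rates, status_class):
    """Share of requests whose status starts with status_class, in percent."""
    total = sum(rates.values())
    matching = sum(r for status, r in rates.items() if status.startswith(status_class))
    return matching / total * 100 if total else 0.0


rates = per_second_rate(before, after)
for status_class in ("4", "5"):
    pct = error_percent(rates, status_class)
    state = "met" if pct > 5 else "not met"
    print(f"{status_class}xx: {pct:.2f}% -> condition {state}")
```

With these numbers the 4xx share is 8% and the 5xx share is 1%, so only the 4xx condition is met. Because of for: 5m, the alert fires only once the condition has held continuously for five minutes.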