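Monitoring Nginx with Prometheus

A Prometheus metrics library for Nginx

This setup uses the nginx-lua-prometheus library, which collects metrics inside nginx and exposes them for Prometheus to scrape.

Installation

The library requires Lua support in Nginx. Compiling Lua support into Nginx looked like a lot of work, so I switched to OpenResty instead.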

wget -O /etc/yum.repos.d/openresty.repo  https://openresty.org/package/centos/openresty.repo
yum check-update
yum install -y openresty
# Done. It feels simpler than plain Nginx, and above all it supports Lua out of the box.

Configuration

Add the following to the http block of nginx.conf

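# Lua: shared memory zone that stores the metric values
lua_shared_dict prometheus_metrics 10M;
# Module search path: "?" is replaced by the module name, so
# require("prometheus") loads /usr/local/nginx/conf/prometheus.lua.
# The trailing ";;" keeps the default search path.
lua_package_path "/usr/local/nginx/conf/?.lua;;";
# Note: newer releases of nginx-lua-prometheus document this initialization
# under init_worker_by_lua_block; check the README of the version you download.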
init_by_lua '
  prometheus = require("prometheus").init("prometheus_metrics")
  metric_requests = prometheus:counter(
    "nginx_http_requests_total", "Number of HTTP requests", {"host", "status"})
  metric_latency = prometheus:histogram(
    "nginx_http_request_duration_seconds", "HTTP request latency", {"host"})
  metric_connections = prometheus:gauge(
    "nginx_http_connections", "Number of HTTP connections", {"state"})
';
log_by_lua '
  metric_requests:inc(1, {ngx.var.server_name, ngx.var.status})
  metric_latency:observe(tonumber(ngx.var.request_time), {ngx.var.server_name})
';
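
The metric_latency histogram above does not store individual request times; each observation is counted into cumulative buckets labelled le (less than or equal). The following Python sketch illustrates that bookkeeping only. The bucket bounds and sample durations are hypothetical, chosen for illustration, and are not the defaults used by nginx-lua-prometheus.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in seconds (NOT the library defaults).
BUCKETS = [0.1, 0.5, 1.0]


def observe_all(durations, buckets=BUCKETS):
    """Return (cumulative bucket counts, observation count, sum of values)."""
    per_bucket = [0] * (len(buckets) + 1)  # the last slot is the +Inf bucket
    for d in durations:
        # bisect_left returns the index of the first bound >= d,
        # i.e. the smallest bucket whose upper bound still covers d.
        per_bucket[bisect_left(buckets, d)] += 1

    cumulative = {}
    running = 0
    for bound, n in zip(buckets + [float("inf")], per_bucket):
        running += n  # each bucket also includes all smaller buckets
        cumulative[bound] = running
    return cumulative, len(durations), sum(durations)


# Four hypothetical request durations in seconds.
counts, total, total_seconds = observe_all([0.05, 0.3, 0.3, 2.0])
```

Because the buckets are cumulative, the +Inf bucket always equals the total number of observations, which is what Prometheus relies on when it estimates quantiles from a histogram.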

The file /usr/local/nginx/conf/prometheus.lua is too long to reproduce here; download it from GitHub and save it at that path.

https://github.com/knyar/nginx-lua-prometheus/blob/master/prometheus.lua

Add a new server configuration file that exposes the metrics

server {
  listen 9145;
  server_name localhost;
  location /metrics {
    content_by_lua '
      metric_connections:set(ngx.var.connections_active, {"active"})
      metric_connections:set(ngx.var.connections_reading, {"reading"})
      metric_connections:set(ngx.var.connections_waiting, {"waiting"})
      metric_connections:set(ngx.var.connections_writing, {"writing"})
      prometheus:collect()
    ';
  }
}

Reload Nginx, and the metrics can then be fetched

[root@iZ1rp1vunvZ vhost]# curl http://127.0.0.1:9145/metrics
# HELP nginx_http_connections Number of HTTP connections
# TYPE nginx_http_connections gauge
nginx_http_connections{state="active"} 879
nginx_http_connections{state="reading"} 0
nginx_http_connections{state="waiting"} 851
nginx_http_connections{state="writing"} 25
......
# HELP nginx_http_requests_total Number of HTTP requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total{host="",status="302"} 39
nginx_http_requests_total{host="",status="400"} 48
nginx_http_requests_total{host="",status="404"} 4
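
To sanity-check the endpoint from a script instead of reading the curl output by eye, the text format can be parsed into (name, labels, value) tuples. This is a minimal sketch, not a full parser: it assumes simple label values without escaped quotes and samples without timestamps, and the function name parse_metrics is made up for this example. The sample text is a shortened copy of the output above.

```python
import re

# Shortened copy of the /metrics output shown above.
SAMPLE = '''\
# HELP nginx_http_connections Number of HTTP connections
# TYPE nginx_http_connections gauge
nginx_http_connections{state="active"} 879
nginx_http_connections{state="waiting"} 851
# TYPE nginx_http_requests_total counter
nginx_http_requests_total{host="",status="302"} 39
nginx_http_requests_total{host="",status="404"} 4
'''

# metric_name, optional {labels}, then a single value token.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified pattern cannot handle
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


samples = parse_metrics(SAMPLE)
```

In practice the text would come from the HTTP response of the /metrics endpoint rather than from a constant, and a maintained client library is a better choice than a hand-written regular expression for anything beyond a quick check.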

Scrape job for Nginx in the Prometheus configuration file

  - job_name: 'Nginx'
    static_configs:
      - targets: ['192.168.201.179:9145','192.168.201.180:9145']

Alerting rules

Too many 4xx responses

  - alert: NginxHighHttp4xxErrorRate
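    # Aggregate per instance so that {{ $labels.instance }} in the annotations is populated
    expr: sum by (instance) (rate(nginx_http_requests_total{status=~"^4.."}[1m])) / sum by (instance) (rate(nginx_http_requests_total[1m])) * 100 > 5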
    for: 5m
    labels:
      severity: error
    annotations:
      summary: "Nginx high HTTP 4xx error rate (instance {{ $labels.instance }})"
      description: "Too many HTTP requests with status 4xx (> 5%)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

Too many 5xx responses

  - alert: NginxHighHttp5xxErrorRate
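    # Aggregate per instance so that {{ $labels.instance }} in the annotations is populated
    expr: sum by (instance) (rate(nginx_http_requests_total{status=~"^5.."}[1m])) / sum by (instance) (rate(nginx_http_requests_total[1m])) * 100 > 5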
    for: 5m
    labels:
      severity: error
    annotations:
      summary: "Nginx high HTTP 5xx error rate (instance {{ $labels.instance }})"
      description: "Too many HTTP requests with status 5xx (> 5%)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
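
The arithmetic behind both expressions can be checked by hand: take the counter values at two scrapes, compute the increase per status code, and divide the matching increase by the total. Since both sides of the division cover the same time window, the per-second scaling of rate() cancels out. The Python sketch below uses hypothetical counter values and a made-up function name, and it ignores counter resets, which rate() handles in Prometheus itself.

```python
def error_rate_percent(before, after, status_prefix):
    """Percentage of requests whose status starts with status_prefix.

    before/after map status code -> counter value at two successive scrapes.
    """
    def increase(status):
        return after.get(status, 0) - before.get(status, 0)

    statuses = set(before) | set(after)
    total = sum(increase(s) for s in statuses)
    matching = sum(increase(s) for s in statuses if s.startswith(status_prefix))
    if total == 0:
        return None  # no traffic in the window, so there is nothing to compare
    return matching * 100 / total


# Hypothetical counter values for one instance at two scrapes.
before = {"200": 1000, "404": 10, "500": 5}
after = {"200": 1180, "404": 25, "500": 10}

rate_4xx = error_rate_percent(before, after, "4")
rate_5xx = error_rate_percent(before, after, "5")
```

With these numbers, 200 requests arrived in the window, 15 of them 4xx and 5 of them 5xx, so only the 4xx ratio is above the 5% threshold. The alert would still have to stay above it for the full 5m given in the for clause before firing.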
posted @ 2020-05-13 09:33  海口-熟练工