Enabling Prometheus metrics in Spring Boot to capture HTTP request throughput, latency, and more

I. Related documentation

https://mvnrepository.com/artifact/io.micrometer/micrometer-registry-prometheus

https://github.com/micrometer-metrics/micrometer

https://micrometer.io/docs

https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#actuator.metrics.supported.spring-mvc

https://yunlzheng.gitbook.io/prometheus-book/

 

II. Exposing Prometheus metrics from a Spring Boot project

 

a. Add the dependencies to pom.xml

 

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

 

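b. Register a MeterRegistryCustomizer in the application class

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.web.servlet.ServletComponentScan;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
@ServletComponentScan
public class OneApplication {

    public static void main(String[] args) {
        SpringApplication.run(OneApplication.class, args);
    }

    // Important: adds an "application" tag to every meter.
    // The value comes from spring.application.name instead of a hard-coded string.
    @Bean
    MeterRegistryCustomizer<MeterRegistry> configurer(
            @Value("${spring.application.name}") String applicationName) {
        return (registry) -> registry.config().commonTags("application", applicationName);
    }
}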

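c. Add the following to the configuration

# Expose all actuator endpoints, including /actuator/prometheus
management.endpoints.web.exposure.include=*
# No quotes here: in a .properties file they would become part of the tag value
management.metrics.tags.application=one
management.metrics.web.server.request.metric-name=http.server.requests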

 

III. After starting the Spring Boot project, view the exposed metrics

http://localhost:8080/actuator/prometheus

 

IV. Add a scrape target in Prometheus

cd \data\prometheu && vim prometheus.yml

Add the scrape target:

# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]


  # Newly added scrape target for the Spring Boot application
  - job_name: "one"
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ["localhost:8080"]

 

V. View the collected data in Prometheus

 

http://localhost:9090/

Search for the name of a scraped metric, for example http_server_requests_seconds_count.

 

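VI. Optionally, install Grafana to view the data

1. Install Grafana and log in.

2. In Grafana, add Prometheus as a data source.

3. Add dashboards. Importing an existing template is the easier route.

     Download templates from https://grafana.com/grafana/dashboards/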

 

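VII. Some special requirements

1. Customizing the metric name

2. Attaching a custom tag to every request

3. Renaming the default tags

4. Some requests return status 200 but are business-level failures, for example because of invalid request parameters. How can these be instrumented?

5. Configuring Grafana panels that show throughput, error distribution, and latency

6. Getting the number of users currently online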

 

 

Notes:

Adding a custom tag: give each endpoint a team identifier

import java.util.List;

import io.micrometer.core.annotation.Timed;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

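// Address and Person are the application's own domain classes (not shown here).
@RestController
@Timed // class-level: times every request handler in this controller
public class MyController {

    @GetMapping("/api/addresses")
    public List<Address> listAddress() {
        return ...
    }

    @GetMapping("/api/people")
    @Timed(extraTags = { "team", "test" }) // adds the tag team="test" to this endpoint's timer
    @Timed(value = "all.people", longTask = true) // additionally records a long task timer named all.people
    public List<Person> listPeople() {
        return ...
    }

}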

Remember to also add the following to the configuration:

management.metrics.tags.team=""

Sample of the resulting metric data:

# HELP http_server_requests_seconds  
# TYPE http_server_requests_seconds summary
# http_server_requests_seconds_count is the number of requests, here 3
http_server_requests_seconds_count{application="hello",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/prometheus/post/{id}",} 3.0
# http_server_requests_seconds_sum is the total response time of those 3 requests, here 3.021 s
http_server_requests_seconds_sum{application="hello",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/prometheus/post/{id}",} 3.0210991
http_server_requests_seconds_count{application="hello",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 110.0
http_server_requests_seconds_sum{application="hello",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 3.8388842
# HELP gives a description of the metric
# HELP http_server_requests_seconds_max  
# TYPE gives the metric type
# TYPE http_server_requests_seconds_max gauge
# http_server_requests_seconds_max is the longest response time observed in the recent time window (Micrometer's max decays over time), here 2.00 s
http_server_requests_seconds_max{application="hello",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/prometheus/post/{id}",} 2.0021618
http_server_requests_seconds_max{application="hello",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 0.0345458
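
The average latency of an endpoint can be derived from the two summary series by dividing _sum by _count. The following is a minimal sketch in plain Java using the sample values above. The class and method names are hypothetical and only illustrate the arithmetic; they are not part of Micrometer or Prometheus.

```java
/** Hypothetical helper: derives the mean latency from a summary's _sum and _count. */
final class LatencyMath {

    /** Returns the mean latency in seconds, or 0 when no requests were recorded. */
    static double meanSeconds(double sumSeconds, double count) {
        return count == 0 ? 0.0 : sumSeconds / count;
    }

    public static void main(String[] args) {
        // Values taken from the POST /prometheus/post/{id} sample above.
        double mean = meanSeconds(3.0210991, 3.0);
        System.out.println("mean latency in seconds: " + mean);
    }
}
```

For the sample above this gives roughly 1.007 s per request, which is well below the 2.00 s maximum and shows why the max series is worth watching alongside the average.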

Prometheus has four metric types: Counter, Gauge, Histogram, and Summary.

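Routine data a site administrator needs

1. Total number of users currently online: line chart of user count over time
2. Throughput: number of requests handled at each point in time, as a line chart
3. Endpoint response time: response time per endpoint, as a line chart with time on the x-axis
4. Error distribution: how requests are distributed across status codes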
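
In Prometheus, throughput is usually read with a PromQL expression such as rate(http_server_requests_seconds_count[1m]), and the error distribution by grouping the same expression, for example sum by (status) (rate(http_server_requests_seconds_count[1m])). The sketch below shows, in plain Java, the basic idea behind such a rate: the increase of a counter divided by the elapsed time. It is a simplification under stated assumptions (the real rate() also extrapolates to the window edges), and the class name and sample numbers are made up for illustration.

```java
/** Hypothetical helper: simplified per-second rate between two samples of a counter. */
final class CounterRate {

    /**
     * Returns the average per-second increase between two samples.
     * A drop in value is treated as a counter reset (for example a process restart),
     * so the newer value alone counts as the increase.
     */
    static double perSecond(double olderValue, double newerValue, double elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        double increase = newerValue >= olderValue ? newerValue - olderValue : newerValue;
        return increase / elapsedSeconds;
    }

    public static void main(String[] args) {
        // Made-up samples: the request counter went from 110 to 170 within 60 seconds.
        System.out.println(perSecond(110.0, 170.0, 60.0) + " requests per second");
    }
}
```

The reset handling matters because counters such as http_server_requests_seconds_count start again from zero whenever the application restarts.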

A few simple charts (figures omitted)

VIII. Set the expected minimum to 5 s and the maximum to 10 s, and count responses in the 0 ms to 0.1 s, 0.1 s to 0.5 s, and 0.5 s to 1.5 s ranges

package com.example.one.config;

import io.micrometer.core.instrument.Meter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.distribution.DistributionStatisticConfig;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class MicrometerConfig {

    @Bean
    MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
        // Timer values are configured in nanoseconds.
        double minExpected = Duration.ofSeconds(5).toNanos();
        double maxExpected = Duration.ofSeconds(10).toNanos();
        return registry -> registry.config().meterFilter(new MeterFilter() {
            @Override
            public DistributionStatisticConfig configure(Meter.Id id, DistributionStatisticConfig config) {
                // Leave everything except timers named http* or hystrix* unchanged.
                if (id.getType() != Meter.Type.TIMER || !id.getName().matches("^(http|hystrix).*")) {
                    return config;
                }
                return DistributionStatisticConfig.builder()
                        .percentilesHistogram(true)
                        // Explicit bucket boundaries: 0.1 s, 0.5 s, 1 s, 1.5 s, 3 s, 5 s.
                        .serviceLevelObjectives(
                                Duration.ofMillis(100).toNanos(),
                                Duration.ofMillis(500).toNanos(),
                                Duration.ofMillis(1000).toNanos(),
                                Duration.ofMillis(1500).toNanos(),
                                Duration.ofSeconds(3).toNanos(),
                                Duration.ofSeconds(5).toNanos())
                        // Auto-generated percentile buckets are limited to the 5 s to 10 s range.
                        .minimumExpectedValue(minExpected)
                        .maximumExpectedValue(maxExpected)
                        .build()
                        .merge(config);
            }
        });
    }
}

Resulting data:

http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="0.1",} 0.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="0.5",} 0.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="1.0",} 0.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="1.5",} 1.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="3.0",} 2.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="5.0",} 2.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="5.726623061",} 2.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="7.158278826",} 2.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="8.589934591",} 2.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="10.0",} 2.0
http_server_requests_seconds_bucket{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",le="+Inf",} 2.0
http_server_requests_seconds_count{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",} 2.0
http_server_requests_seconds_sum{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",} 3.126248
# HELP http_server_requests_seconds_max  
# TYPE http_server_requests_seconds_max gauge
http_server_requests_seconds_max{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 0.5549248
http_server_requests_seconds_max{application="one",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/user/detail/{id}",} 2.0041966

The data above comes from two requests: one that took roughly 1 s and one that took roughly 2 s.
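
Note that the _bucket series are cumulative: each le bucket counts every request at or below that bound. To get the number of responses inside a range such as 0.5 s to 1.5 s, subtract the lower bucket from the upper one (here 1.0 - 0.0 = 1). The sketch below does this subtraction in plain Java for the first six buckets of the output above; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: converts cumulative histogram buckets into per-range counts. */
final class BucketRanges {

    /**
     * @param upperBounds ascending "le" bounds in seconds
     * @param cumulative  cumulative counts, one per bound
     * @return range label mapped to the number of observations inside that range
     */
    static Map<String, Long> perRange(double[] upperBounds, long[] cumulative) {
        if (upperBounds.length != cumulative.length) {
            throw new IllegalArgumentException("bounds and counts must have the same length");
        }
        Map<String, Long> result = new LinkedHashMap<>();
        double lower = 0.0;
        long previous = 0;
        for (int i = 0; i < upperBounds.length; i++) {
            // Observations in (lower, upper] are the difference of two cumulative counts.
            result.put("(" + lower + ", " + upperBounds[i] + "]", cumulative[i] - previous);
            lower = upperBounds[i];
            previous = cumulative[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Bounds and counts copied from the /user/detail/{id} buckets above.
        double[] le = {0.1, 0.5, 1.0, 1.5, 3.0, 5.0};
        long[] counts = {0, 0, 0, 1, 2, 2};
        perRange(le, counts).forEach((range, n) -> System.out.println(range + " -> " + n));
    }
}
```

With the sample data, one request falls into the 1.0 s to 1.5 s range and one into the 1.5 s to 3.0 s range, which matches the two requests described above.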

 

 

 

 

 

Posted 2022-07-06 17:52 by 许伟强