Prometheus Installation Guide
Welcome to Prometheus! Prometheus is a monitoring platform that collects metrics from monitored targets by scraping HTTP metrics endpoints on those targets. This guide shows how to install and configure Prometheus and how to monitor a first resource with it. You'll download, install and run Prometheus. You'll also meet the concept of an exporter, a tool that exposes time series data about hosts and services. Our first exporter will be Prometheus itself, which provides a wide variety of metrics about its own memory usage, garbage collection, and more.
Installation reference: https://prometheus.io/docs/introduction/first_steps/
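The Prometheus Ecosystem
The following diagram illustrates the architecture of Prometheus and its ecosystem components:
Prometheus Server: the Prometheus server, which collects and stores data and provides an API for external queries.
Exporter: similar to a traditional agent on the monitored host, with one difference: it does not push monitoring data to the server. Instead it waits for the server to collect the data at regular intervals, the so-called active (pull-based) monitoring.
Pushgateway: an intermediary between the metrics source and the server, used when the server cannot reach the source directly over the network; the source pushes its metrics to the gateway, and the server scrapes the gateway.
Alertmanager: the alerting component; alerting functionality is split out of the server and handled separately by Alertmanager.
Web UI: the Prometheus web interface, which can be used for simple visualization, running queries, and checking service status.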
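To make the pull model concrete, here is a minimal sketch of what an exporter does: it renders its current values as text and waits to be asked for them. The metric name demo_requests_total, the port 8000 and all helper names are hypothetical, chosen only for illustration; real exporters normally use an official client library rather than hand-written formatting.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counters; a real exporter would read these
# from the host or service it instruments.
COUNTERS = {"demo_requests_total": 0}


def render_metrics(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served, matching the path Prometheus scrapes by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever answering scrapes; nothing is ever pushed to the server."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() would start the endpoint. The exporter never contacts the Prometheus server on its own, which is the difference from a push-style agent described above.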
Downloading Prometheus
Download the latest release of Prometheus for your platform, then extract it:
tar xvfz prometheus-*.tar.gz
cd prometheus-*
The Prometheus server is a single binary called prometheus (or prometheus.exe on Microsoft Windows). We can run the binary and see help on its options by passing the --help flag.
$ ./prometheus --help
usage: prometheus [<flags>]
The Prometheus monitoring server
. . .
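Before starting Prometheus, let's configure it.
Configuring Prometheus
Prometheus configuration is written in YAML. The Prometheus download comes with a sample configuration in a file called prometheus.yml. Most of the comments in the example file have been removed here to make it more succinct.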
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  # - "first.rules"
  # - "second.rules"
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
There are three blocks of configuration in the example configuration file: global, rule_files, and scrape_configs.
The global block controls the Prometheus server's global configuration. We have two options present. The first, scrape_interval, controls how often Prometheus will scrape targets. You can override this for individual targets. In this case the global setting is to scrape every 15 seconds. The evaluation_interval option controls how often Prometheus will evaluate rules. Prometheus uses rules to create new time series and to generate alerts.
The rule_files block specifies the location of any rules we want the Prometheus server to load. For now we've got no rules.
The last block, scrape_configs, controls what resources Prometheus monitors. Since Prometheus also exposes data about itself as an HTTP endpoint, it can scrape and monitor its own health. In the default configuration there is a single job, called prometheus, which scrapes the time series data exposed by the Prometheus server. The job contains a single, statically configured target: localhost on port 9090. Prometheus expects metrics to be available on targets at the path /metrics, so this default job scrapes the URL http://localhost:9090/metrics. The time series data returned will detail the state and performance of the Prometheus server.
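The way a job's targets turn into scrape URLs, and the way a narrower setting overrides the global interval, can be sketched in a few lines. This is only an illustration of the rules described above, not how Prometheus is implemented: the config is written as a plain dict instead of being parsed from YAML, and the helper names scrape_urls and effective_interval are hypothetical. The scheme and metrics_path defaults (http and /metrics) and the job-level scrape_interval override are documented configuration options.

```python
# Mirrors the example prometheus.yml as a plain dict (no YAML library needed).
CONFIG = {
    "global": {"scrape_interval": "15s", "evaluation_interval": "15s"},
    "scrape_configs": [
        {"job_name": "prometheus",
         "static_configs": [{"targets": ["localhost:9090"]}]},
    ],
}


def scrape_urls(job):
    """Expand a job's static targets into the URLs that would be scraped."""
    scheme = job.get("scheme", "http")            # documented default
    path = job.get("metrics_path", "/metrics")    # documented default
    return [f"{scheme}://{target}{path}"
            for group in job.get("static_configs", [])
            for target in group["targets"]]


def effective_interval(config, job):
    """A job-level scrape_interval, if present, overrides the global one."""
    return job.get("scrape_interval", config["global"]["scrape_interval"])


for job in CONFIG["scrape_configs"]:
    print(job["job_name"], effective_interval(CONFIG, job), scrape_urls(job))
```

For the example configuration this yields the single URL http://localhost:9090/metrics with the global 15s interval, since the job sets no interval of its own.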
For a complete specification of configuration options, see the configuration documentation.
Starting Prometheus
To start Prometheus with our newly created configuration file, change to the directory containing the Prometheus binary and run:
./prometheus --config.file=prometheus.yml
Prometheus should start up. You should be able to browse to its status page at http://localhost:9090.
Give it about 30 seconds to collect data about itself from its own HTTP metrics endpoint.
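In a script, rather than guessing when the server is up, one option is to poll its readiness endpoint (/-/ready) until it answers. The sketch below assumes the default address localhost:9090; the function names, attempt count and delay are hypothetical. Readiness only means the server can answer requests, so the first scrapes still need a couple of scrape intervals before data appears.

```python
import time
import urllib.error
import urllib.request


def http_ready(url="http://localhost:9090/-/ready", timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status such as 503.
        return False


def wait_until_ready(probe=http_ready, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping between failures.

    Returns the attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

The probe and sleep parameters are injectable so the loop can be exercised without a running server.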
You can also verify that Prometheus is serving metrics about itself by navigating to its metrics endpoint: http://localhost:9090/metrics.
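The page at /metrics is plain text with one sample per line, which makes it easy to inspect programmatically. The following is a simplified sketch of a parser for that format; it does not handle escaped quotes inside label values, and the input values are made up for illustration rather than copied from a real server.

```python
import re

# Metric name, optional {labels}, then the value; a trailing timestamp is ignored.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            labels = dict(LABEL_RE.findall(raw_labels or ""))
            samples.append((name, labels, float(value)))
    return samples


# Illustrative input; the values are invented, not real server output.
EXAMPLE = """\
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 12
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
"""

for name, labels, value in parse_metrics(EXAMPLE):
    print(name, labels, value)
```

Each printed line corresponds to one time series: the same metric name, distinguished only by its labels.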
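Using the expression browser
Let's try looking at some data that Prometheus has collected about itself. To use Prometheus's built-in expression browser, navigate to http://localhost:9090/graph and choose the "Console" view. Enter the following expression into the console: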
promhttp_metric_handler_requests_total
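This should return a number of different time series (along with the latest value recorded for each), all with the metric name promhttp_metric_handler_requests_total, but with different labels. These labels designate different request statuses.
If we were only interested in requests that resulted in HTTP code 200, we could use this query to retrieve that information: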
promhttp_metric_handler_requests_total{code="200"}
To count the number of returned time series, you could write:
count(promhttp_metric_handler_requests_total)
For more about the expression language, see the expression language documentation.
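The same expressions can also be evaluated outside the browser through the server's HTTP API, whose instant-query endpoint is /api/v1/query. The sketch below assumes the default address localhost:9090; the helper names are hypothetical, and run_query needs a running server, so only the URL construction is exercised here.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(expr, base="http://localhost:9090"):
    """Build the URL for an instant query against the HTTP API."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_query(expr, base="http://localhost:9090"):
    """Execute the query; requires a running Prometheus server."""
    with urllib.request.urlopen(instant_query_url(expr, base), timeout=5) as resp:
        return json.load(resp)


def vector_values(response):
    """Flatten an instant-vector response into (labels, value) pairs."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response}")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1]))
            for item in response["data"]["result"]]


print(instant_query_url('promhttp_metric_handler_requests_total{code="200"}'))
```

With a server running, vector_values(run_query("count(promhttp_metric_handler_requests_total)")) would return a single pair holding the number of matching series.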
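Using the graphing interface
To graph expressions, navigate to http://localhost:9090/graph and use the "Graph" tab.
For example, enter the following expression to graph the per-second rate of HTTP requests returning status code 200 in the self-scraped Prometheus: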
rate(promhttp_metric_handler_requests_total{code="200"}[1m])
You can experiment with the graph range parameters and other settings.
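To build intuition for what a per-second rate over a [1m] window means, here is a deliberately simplified calculation over raw counter samples. It is not the actual rate() implementation, which also extrapolates to the edges of the window; the function name and the sample values are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase over (timestamp_seconds, counter_value) samples.

    Simplified on purpose: no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        return None                      # need at least two points in the window
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. a restart): count from zero.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical scrapes 15 seconds apart; the values are made up.
window = [(0, 100), (15, 130), (30, 160), (45, 190), (60, 220)]
print(simple_rate(window))  # 2.0
```

Here the counter grew by 120 over 60 seconds, so the rate is 2 requests per second. Widening the range in the query smooths the graph, because each point averages over more samples.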