A Little Kubernetes Basics Every Day: Prometheus Basics

Prometheus basics

1. Introduction

1. An open-source systems monitoring and alerting toolkit
Prometheus is an open-source systems monitoring and alerting toolkit

Official site: https://prometheus.io/docs/introduction/overview/

2. Prometheus architecture diagram

[Figure: Prometheus architecture diagram]

Put simply, the Prometheus architecture mainly consists of: Prometheus server, exporters, Pushgateway, Alertmanager, and Grafana.

The Prometheus server itself mainly consists of:
  1. Retrieval: scrapes metrics data from the active targets;
  2. TSDB: stores the collected samples;
  3. HTTP server: the module that accepts queries.

3. Main features

1. A multi-dimensional data model built on time series, each identified by a metric name and key/value pairs;
a multi-dimensional data model with time series data identified by metric name and key/value pairs

2. PromQL is a flexible query language;
PromQL, a flexible query language to leverage this dimensionality

3. It can run on local storage alone, with no dependency on distributed storage;
no reliance on distributed storage; single server nodes are autonomous

4. Time series are collected with a pull model over HTTP;
time series collection happens via a pull model over HTTP

5. Pushed time series can be collected through the Pushgateway;
pushing time series is supported via an intermediary gateway

6. Targets are found through service discovery or static configuration;
targets are discovered via service discovery or static configuration

7. Multiple graphing and dashboarding options;
multiple modes of graphing and dashboarding support
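To make the pull model from feature 4 concrete, here is a minimal sketch using only the Python standard library. It is not how Prometheus itself is implemented: the toy target, the metric name `demo_requests_total`, and the helper names are all assumptions made up for illustration. The target exposes plain-text samples on `/metrics`, and the `scrape` function pulls them over HTTP.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric exposed by a toy target, written in the text exposition format.
METRICS_TEXT = (
    "# TYPE demo_requests_total counter\n"
    'demo_requests_total{method="GET"} 3\n'
)


class MetricsHandler(BaseHTTPRequestHandler):
    """A toy target: answers GET /metrics with plain-text samples."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = METRICS_TEXT.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def scrape(url):
    """Pull one snapshot of metrics from a target over HTTP."""
    # An empty ProxyHandler makes the request go straight to the target.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def demo_scrape():
    """Start the toy target on a free local port, scrape it once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return scrape(f"http://127.0.0.1:{port}/metrics")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo_scrape())
```

The point of the sketch is the direction of the request: the target only exposes its current values, and the collector decides when to fetch them.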

4. Basic concepts

4.1 Data model
# Everything is stored as time series; samples in the same stream share the same metric name and the same set of labels
Prometheus fundamentally stores all data as time series: streams of timestamped values belonging to the same metric and the same set of labeled dimensions. 

# So: metric name + labels identify a time series, and each sample in it is a timestamp + value

# Notation syntax
<metric name>{<label name>=<label value>, ...}
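As a rough illustration of the notation above, the following sketch renders a metric name and a label set into that form. The function name `format_series` is hypothetical, and the sketch does not escape special characters in label values, which a real implementation would need to do.

```python
def format_series(metric_name, labels):
    """Render a series identifier as <metric name>{<label name>="<label value>",...}."""
    if not labels:
        return metric_name
    # Sort the labels so the same label set always renders the same way.
    pairs = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    return f"{metric_name}{{{pairs}}}"


if __name__ == "__main__":
    print(format_series("api_http_requests_total",
                        {"method": "POST", "handler": "/messages"}))
    # prints: api_http_requests_total{handler="/messages",method="POST"}
```

Two label sets that differ in any value produce different identifiers, which is what makes them separate time series.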
4.2 Metric types
# Counter
# A cumulative value that can only increase or be reset to zero; commonly used to count total requests served, tasks completed, and so on.
A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart. For example, you can use a counter to represent the number of requests served, tasks completed, or errors.


# Gauge
# Can go up and down; commonly used for temperature, memory, or CPU usage.
A gauge is a metric that represents a single numerical value that can arbitrarily go up and down.


# Histogram
# Samples observations and counts them in configurable buckets.
A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.


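# Summary
# Similar to a histogram; it also provides a total count and a sum of observations, and calculates quantiles over a sliding time window.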
Similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
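The behaviour of the first three metric types can be sketched with a few toy classes. These are illustrative stand-ins written for this post, not the API of any Prometheus client library; the class names are assumptions. The summary type is left out because its sliding-window quantiles need more machinery than fits in a short example.

```python
class ToyCounter:
    """Cumulative value: it only goes up (a process restart would reset it to zero)."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class ToyGauge:
    """A single value that can move up and down freely."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class ToyHistogram:
    """Counts observations in cumulative buckets and tracks their count and sum."""

    def __init__(self, upper_bounds):
        # A final +Inf bucket catches every observation.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        self.count += 1
        self.sum += value
        # Cumulative buckets: the observation counts in every bucket
        # whose upper bound is greater than or equal to the value.
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.bucket_counts[i] += 1


if __name__ == "__main__":
    latency = ToyHistogram([0.1, 0.5, 1.0])
    for seconds in (0.05, 0.3, 2.0):
        latency.observe(seconds)
    print(list(zip(latency.upper_bounds, latency.bucket_counts)))
```

Because the buckets are cumulative, the last bucket always equals the total number of observations.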
4.3 Jobs and instances
# Instance
# In Prometheus, any endpoint that can be scraped is called an instance.
# A collection of instances serving the same purpose is called a job.
In Prometheus terms, an endpoint you can scrape is called an instance, usually corresponding to a single process. A collection of instances with the same purpose, a process replicated for scalability or reliability for example, is called a job.
For example, an API server job with four replicated instances:

job: api-server
instance 1: 1.2.3.4:5670
instance 2: 1.2.3.4:5671
instance 3: 5.6.7.8:5670
instance 4: 5.6.7.8:5671
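When Prometheus scrapes a target, it attaches `job` and `instance` labels to the scraped series. The sketch below only mimics that labelling for the example above; the function name `target_labels` is hypothetical.

```python
def target_labels(job_name, targets):
    """Build the job/instance label pair for every target of one job."""
    return [{"job": job_name, "instance": target} for target in targets]


if __name__ == "__main__":
    api_server = target_labels(
        "api-server",
        ["1.2.3.4:5670", "1.2.3.4:5671", "5.6.7.8:5670", "5.6.7.8:5671"],
    )
    for labels in api_server:
        print(labels)
```

Each of the four instances ends up with the same `job` value and its own `instance` value, so one query can cover the whole job or single out one endpoint.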

5. Prometheus configuration file

For the configuration file, see the official documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
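As a rough idea of what a minimal configuration looks like, the sketch below renders one as YAML text for a statically configured job. The helper name `render_config`, the 15s interval, and the choice of fields are assumptions for illustration; `scrape_interval`, `scrape_configs`, `job_name`, `static_configs`, and `targets` are the field names used by the official configuration format, and the documentation linked above remains the authority.

```python
def render_config(job_name, targets, scrape_interval="15s"):
    """Render a minimal prometheus.yml with one statically configured job."""
    lines = [
        "global:",
        f"  scrape_interval: {scrape_interval}",
        "scrape_configs:",
        f"  - job_name: '{job_name}'",
        "    static_configs:",
        "      - targets:",
    ]
    # One list entry per host:port target.
    lines += [f"          - '{target}'" for target in targets]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_config("api-server", ["1.2.3.4:5670", "1.2.3.4:5671"]))
```

This covers static configuration only; service discovery uses other sections of the same file.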
posted @ 2023-02-18 23:23  woshinidaye