|NO.Z.00062|——————————|^^ Deployment ^^|——|Hadoop & Real-Time Data Warehouse.V02|——|Visualization.v02|Monitoring: Prometheus Deployment.V2|
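1. Prometheus deployment: install the Go environment
### --- Download and extract the release package
~~~ # Prometheus is written in Go, so this guide installs a Go environment first. Go is cross-platform and supports Windows, Linux, and macOS.
~~~ # Note: Go is only required to build Prometheus from source; the prebuilt release tarball used in section 2 runs without it.
~~~ # Windows: go1.8.3.windows-amd64.msi (78MB)
~~~ # Linux  : go1.8.3.linux-amd64.tar.gz (86MB)
~~~ # Mac    : go1.8.3.darwin-amd64.tar.gz (85MB)
~~~ # Source : go1.8.3.src.tar.gz (15MB)
~~~ # Download the Go release package: https://storage.googleapis.com/golang/go1.8.3.linux-amd64.tar.gz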
[root@hadoop00 software]# wget -c https://storage.googleapis.com/golang/go1.8.3.linux-amd64.tar.gz
~~~ # Extract the Go release package
[root@hadoop00 software]# tar -zxvf go1.8.3.linux-amd64.tar.gz -C ../servers/
### --- Configure the Go environment variables
~~~ # Add the Go environment variables to /etc/profile
[root@hadoop00 ~]# vim /etc/profile
## GO_HOME
export GO_HOME=/opt/yanqi/servers/go
export GOROOT=$GO_HOME
export PATH=$GOROOT/bin:$PATH
~~~ # Apply the environment variables to the current shell
[root@hadoop00 ~]# source /etc/profile
### --- Verify that the Go environment is active
~~~ # Check the installed Go version
[root@hadoop00 ~]# go version
go version go1.8.3 linux/amd64
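Because the profile prepends to PATH, it is worth confirming that the resulting value contains no empty entry: a leading colon, a trailing colon, or `::` makes the shell search the current directory as well. The following is a minimal sketch; the function name `path_has_empty_entry` is hypothetical and not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when a PATH-style string has an empty
# entry, which the shell interprets as the current directory.
path_has_empty_entry() {
  case "$1" in
    :*|*:|*::*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the PATH of the current shell after `source /etc/profile`.
if path_has_empty_entry "$PATH"; then
  echo "warning: PATH contains an empty entry (current directory is searched)"
else
  echo "PATH has no empty entries"
fi
```

If the warning appears, look for an `export PATH=` line in /etc/profile that starts or ends with a colon.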
2. Prometheus deployment
### --- Download the Prometheus release package
~~~ # Download the Prometheus release package
[root@hadoop00 software]# wget -c https://github.com/prometheus/prometheus/releases/download/v2.22.1/prometheus-2.22.1.linux-amd64.tar.gz
### --- Extract the Prometheus release package
~~~ # Extract the Prometheus release package
[root@hadoop00 software]# tar -zxvf prometheus-2.22.1.linux-amd64.tar.gz -C ../servers/
~~~ # Rename the extracted directory
[root@hadoop00 ~]# cd /opt/yanqi/servers/
[root@hadoop00 servers]# mv prometheus-2.22.1.linux-amd64/ prometheus
~~~ # Set the directory ownership
[root@hadoop00 ~]# chown -R root:root /opt/yanqi/servers/prometheus/
### --- Edit the Prometheus configuration file
~~~ # Open the Prometheus configuration file
[root@hadoop00 ~]# vim /opt/yanqi/servers/prometheus/prometheus.yml
~~~ # Lines 21~45: set the following parameters
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['hadoop00:9090']
  - job_name: 'bigdata-hadoop00'
    static_configs:
      - targets: ['hadoop00:9100']
  - job_name: 'bigdata-hadoop01'
    static_configs:
      - targets: ['hadoop01:9100']
  - job_name: 'bigdata-hadoop02'
    static_configs:
      - targets: ['hadoop02:9100']
  - job_name: 'bigdata-hadoop03'
    static_configs:
      - targets: ['hadoop03:9100']
  - job_name: 'bigdata-pushgateway'
    static_configs:
      - targets: ['hadoop00:9091']
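Before starting the server, the edited file can be checked with `promtool`, which ships in the same release tarball as the `prometheus` binary. This is a minimal sketch, assuming the install path used in this guide; the wrapper function `check_prom_config` and the `PROM_HOME` variable are hypothetical names.

```shell
#!/usr/bin/env bash
# Assumed install location, taken from the paths used in this guide.
PROM_HOME="${PROM_HOME:-/opt/yanqi/servers/prometheus}"

# Hypothetical wrapper: validate a Prometheus config file with promtool.
# Returns 2 if promtool or the config file is missing, otherwise promtool's
# own exit status (0 when the file is valid).
check_prom_config() {
  local promtool="$1" config="$2"
  if [ ! -x "$promtool" ]; then
    echo "promtool not found at $promtool" >&2
    return 2
  fi
  if [ ! -f "$config" ]; then
    echo "config not found at $config" >&2
    return 2
  fi
  "$promtool" check config "$config"
}

# Usage (run on the Prometheus host):
#   check_prom_config "$PROM_HOME/promtool" "$PROM_HOME/prometheus.yml"
```

YAML is indentation-sensitive, so a check like this catches a mis-indented `static_configs` entry before the server refuses to start.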
### --- Start the Prometheus service
~~~ # Start the Prometheus service. Synchronize the system clock first: ntpdate -u ntp.api.bz
[root@hadoop00 ~]# /opt/yanqi/servers/prometheus/prometheus --config.file=/opt/yanqi/servers/prometheus/prometheus.yml
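The command above runs Prometheus in the foreground, so it stops when the terminal closes. One possible alternative is to start it in the background with `nohup` and poll the `/-/ready` endpoint until the server reports ready. This is a sketch only: the log file location, the explicit `--storage.tsdb.path` value, and the helper names `start_prometheus` and `wait_for_ready` are assumptions, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Assumed install location, taken from the paths used in this guide.
PROM_HOME="${PROM_HOME:-/opt/yanqi/servers/prometheus}"

# Hypothetical helper: poll a URL until it answers with an HTTP success status.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 1).
wait_for_ready() {
  local url="$1" tries="${2:-30}" delay="${3:-1}" i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical helper: start Prometheus in the background, logging to a file.
# The storage path is set explicitly so the data directory does not depend on
# the directory the command is launched from.
start_prometheus() {
  nohup "$PROM_HOME/prometheus" \
    --config.file="$PROM_HOME/prometheus.yml" \
    --storage.tsdb.path="$PROM_HOME/data" \
    > "$PROM_HOME/prometheus.log" 2>&1 &
}

# Usage (run on hadoop00):
#   start_prometheus
#   wait_for_ready "http://hadoop00:9090/-/ready" 30 1 && echo "Prometheus is ready"
```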
3. Access Prometheus through the web UI
### --- Access Prometheus through the web UI
~~~ # 1. Open the web UI: http://hadoop00:9090/graph
~~~ # 2. View the monitored hosts (Status > Targets, http://hadoop00:9090/targets)
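The target information shown in the web UI is also available over the HTTP API, which helps on a host without a browser. A minimal sketch, assuming `curl` is installed; the helper name `prom_api_url` is hypothetical, while `/api/v1/targets` and `/api/v1/query` are standard Prometheus API paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Prometheus HTTP API URL from host, port, and path.
prom_api_url() {
  local host="$1" port="$2" path="$3"
  printf 'http://%s:%s/api/v1/%s\n' "$host" "$port" "$path"
}

# Usage (run on any host that can reach hadoop00):
#   # List scrape targets and their health:
#   curl -s "$(prom_api_url hadoop00 9090 targets)"
#   # Query the built-in `up` metric (1 = last scrape succeeded, 0 = it failed):
#   curl -s -G "$(prom_api_url hadoop00 9090 query)" --data-urlencode 'query=up'
```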


Appendix 1: Error handling 1
### --- Symptom
[root@hadoop00 ~]# /opt/yanqi/servers/prometheus/prometheus --config.file=/opt/yanqi/servers/prometheus/prometheus.yml
~~~ # Error output:
level=warn ts=2021-11-30T08:50:49.794Z caller=scrape.go:1091 component="scrape manager" scrape_pool=bigdata-pushgateway target=http://hadoop00:9091/metrics msg="Appending scrape report failed" err="out of bounds"
level=warn ts=2021-11-30T08:50:50.466Z caller=scrape.go:1091 component="scrape manager" scrape_pool=bigdata-hadoop03 target=http://hadoop03:9100/metrics msg="Appending scrape report failed" err="out of bounds"
level=warn ts=2021-11-30T08:50:51.664Z caller=scrape.go:1091 component="scrape manager" scrape_pool=bigdata-hadoop01 target=http://hadoop01:9100/metrics msg="Appending scrape report failed" err="out of bounds"
level=warn ts=2021-11-30T08:50:53.331Z caller=scrape.go:1091 component="scrape manager" scrape_pool=bigdata-hadoop02 target=http://hadoop02:9100/metrics msg="Appending scrape report failed" err="out of bounds"
level=warn ts=2021-11-30T08:52:54.730Z caller=scrape.go:1091 component="scrape manager" scrape_pool=prometheus target=http://hadoop00:9090/metrics msg="Appending scrape report failed" err="out of bounds"
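### --- Error analysis
~~~ # The error is caused by a jump in the system clock and is related to how Prometheus stores samples; neither Prometheus nor Grafana can retrieve data afterwards.
~~~ # Working hypothesis: after a vMotion migration the OS clock jumped ahead, and samples were stored starting from the jumped time, so Grafana showed no data at the actual current time. Once the OS clock was corrected, new sample timestamps conflicted with the last timestamp Prometheus had stored, and ingestion failed with "Error on ingesting samples that are too old or are too far into the future".
~~~ # Reported fix: clear the prometheus2.0.0.data.metrics files, then restart Prometheus; data collection returns to normal.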
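### --- Solution
~~~ # Delete the data directory under prometheus (this discards all stored samples), then restart the Prometheus service
~~~ # Without --storage.tsdb.path, Prometheus writes to data/ under the directory it was started from; adjust the path if it was launched elsewhere
[root@hadoop00 ~]# rm -rf /opt/yanqi/servers/prometheus/data/
[root@hadoop00 ~]# /opt/yanqi/servers/prometheus/prometheus --config.file=/opt/yanqi/servers/prometheus/prometheus.yml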
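Deleting the data directory discards every stored sample. If the history might still be wanted, a less destructive variant is to stop Prometheus and move the directory aside under a timestamped name; Prometheus creates a fresh data directory on the next start. A minimal sketch; the helper name `archive_data_dir` and the `.bak.<timestamp>` suffix are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename a data directory to <dir>.bak.<timestamp>
# and print the new name. Stop Prometheus before calling it.
archive_data_dir() {
  local dir="${1%/}"   # strip one trailing slash, if present
  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir" >&2
    return 1
  fi
  local backup
  backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
  mv "$dir" "$backup" && echo "$backup"
}

# Usage (run on hadoop00 after stopping Prometheus):
#   archive_data_dir /opt/yanqi/servers/prometheus/data
```

The archived directory can be removed later once it is clear the old samples are no longer needed.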