Prometheus Deployment and Basic Monitoring

1: Environment

HOSTNAME IP Config
prometheus 10.0.0.13 1C1G
node_exporter 10.0.0.14 1C1G

2: Versions

OS: CentOS 7.9
Prometheus: 2.33.3
Alertmanager: 0.23.0
node_exporter: 1.3.1
# Package downloads: https://prometheus.io/download/

3: Deploy the Prometheus server

1: Download the packages
wget https://github.com/prometheus/prometheus/releases/download/v2.33.3/prometheus-2.33.3.linux-amd64.tar.gz
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz

[root@prometheus ~]# ls
alertmanager-0.23.0.linux-amd64.tar.gz  node_exporter-1.3.1.linux-amd64.tar.gz  prometheus-2.33.3.linux-amd64.tar.gz
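
Before unpacking, it can be worth checking each tarball against its published checksum. The following is a minimal sketch, assuming the release page also offers a checksum list in the usual `<sha256>  <filename>` format (for example a `sha256sums.txt` asset); `verify_sha256` is a hypothetical helper name, not part of any Prometheus tooling.

```shell
# Hypothetical helper: verify one tarball against a checksum list.
# $1 = path to the tarball
# $2 = path to a checksum list in "<sha256>  <filename>" format (assumed)
verify_sha256() {
    vs_dir=$(dirname "$1")
    vs_name=$(basename "$1")
    # Resolve the checksum list to an absolute path before changing directory.
    vs_sums=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    # Select only the line for this file and let sha256sum compare the hashes.
    ( cd "$vs_dir" && grep -F "  $vs_name" "$vs_sums" | sha256sum -c - )
}
```

Usage would look like `verify_sha256 prometheus-2.33.3.linux-amd64.tar.gz sha256sums.txt`; the function returns non-zero when the hash differs or the file is not listed.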

2: Install Prometheus
Create the prometheus user
[root@prometheus ~]# groupadd prometheus
[root@prometheus ~]# useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus

Extract the package
[root@prometheus ~]# tar xf prometheus-2.33.3.linux-amd64.tar.gz
[root@prometheus ~]# mv prometheus-2.33.3.linux-amd64 prometheus
[root@prometheus ~]# mv prometheus /usr/local/

Create the systemd unit file
[root@prometheus ~]# cat << eof > /usr/lib/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --storage.tsdb.retention.time=15d --log.level=info
Restart=on-failure
[Install]
WantedBy=multi-user.target
eof

Install node_exporter
Install node_exporter on both the Prometheus node and the second node.
[root@prometheus ~]# tar xf node_exporter-1.3.1.linux-amd64.tar.gz
[root@prometheus ~]# mv node_exporter-1.3.1.linux-amd64 node_exporter
[root@prometheus ~]# mv node_exporter /usr/local/
[root@prometheus ~]# scp -r /usr/local/node_exporter root@10.0.0.14:/usr/local/
The authenticity of host '10.0.0.14 (10.0.0.14)' can't be established.
ECDSA key fingerprint is SHA256:2ysKwIOq1nrOh1CXWJyKU/DupX/wPD1PQeJeNHYQaC8.
ECDSA key fingerprint is MD5:7c:ae:c3:90:f6:6b:06:b1:46:1c:9d:81:7d:cc:2b:9d.
Are you sure you want to continue connecting (yes/no)? yes
root@10.0.0.14's password: 
LICENSE                                                                                                 100%   11KB   5.2MB/s   00:00    
NOTICE                                                                                                  100%  463   578.1KB/s   00:00    
node_exporter                                                                                           100%   17MB  63.7MB/s   00:00
[root@prometheus ~]# chown -R prometheus:prometheus /usr/local/node_exporter
# Note: node_exporter also runs as the prometheus user, so that user must be created on every node.
[root@node_exporter ~]# groupadd prometheus
[root@node_exporter ~]# useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus
[root@node_exporter ~]# chown -R prometheus:prometheus /usr/local/node_exporter

Create the node_exporter systemd unit file (on both nodes)
[root@prometheus ~]# cat << eof > /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_export
Documentation=https://github.com/prometheus/node_exporter
After=network.target
 
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
eof

Start the node_exporter service
[root@node_exporter ~]# systemctl enable node_exporter.service
Created symlink from /etc/systemd/system/multi-user.target.wants/node_exporter.service to /usr/lib/systemd/system/node_exporter.service.
[root@node_exporter ~]# systemctl start node_exporter.service
[root@node_exporter ~]# systemctl status node_exporter.service
● node_exporter.service - node_export
   Loaded: loaded (/usr/lib/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-12-28 15:04:17 EST; 7s ago
     Docs: https://github.com/prometheus/node_exporter
 Main PID: 12437 (node_exporter)
   CGroup: /system.slice/node_exporter.service
           └─12437 /usr/local/node_exporter/node_exporter
......
[root@node_exporter ~]# ss -tnl | grep 9100
LISTEN     0      128       [::]:9100                  [::]:*
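
A listening port only shows that the process is up; fetching the metrics endpoint confirms that it actually serves data. The sketch below assumes the plain-text exposition format (`<name> <value>` per line, comments starting with `#`); `metric_value` is a hypothetical helper name, and it only handles metrics without labels.

```shell
# Hypothetical helper: print the value of one label-free metric.
# $1 = exact metric name; exposition text is read from stdin.
metric_value() {
    # HELP/TYPE comment lines have "#" as their first field, so they never match.
    awk -v name="$1" '$1 == name { print $2; exit }'
}
```

For example, `curl -s http://10.0.0.14:9100/metrics | metric_value node_load1` should print the 1-minute load average reported by the second node, provided the loadavg collector is enabled.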

3: Configure Prometheus and add scrape targets
[root@prometheus ~]# cat /usr/local/prometheus/prometheus.yml
---
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
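      - targets: ["localhost:9090", "10.0.0.13:9100"] # also scrapes the local node_exporter
  # Newly added job that scrapes the other node
  - job_name: "node_exporter"
    # Overrides the global scrape interval (15s in the stock prometheus.yml) with 5s.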
    scrape_interval: 5s
    static_configs:
      - targets: ["10.0.0.14:9100"]
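
A syntax error in this file prevents Prometheus from starting, so it helps to validate it first with `promtool check config`, which ships in the same tarball as the server. A minimal sketch; `validate_config` is a hypothetical wrapper, and the paths in the usage line assume the install location used in this post.

```shell
# Hypothetical wrapper around "promtool check config".
# $1 = path to the promtool binary, $2 = path to prometheus.yml
validate_config() {
    if [ ! -x "$1" ]; then
        echo "promtool not found or not executable: $1" >&2
        return 2
    fi
    # promtool exits non-zero when the configuration is invalid.
    "$1" check config "$2"
}
```

Usage: `validate_config /usr/local/prometheus/promtool /usr/local/prometheus/prometheus.yml`. If Prometheus is already running, it has to be restarted, or sent SIGHUP, before an edited configuration takes effect.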
      
Start the Prometheus service
[root@prometheus ~]# systemctl enable prometheus.service 
Created symlink from /etc/systemd/system/multi-user.target.wants/prometheus.service to /usr/lib/systemd/system/prometheus.service.
[root@prometheus ~]# systemctl start prometheus.service 
[root@prometheus ~]# systemctl status prometheus.service
● prometheus.service - prometheus
   Loaded: loaded (/usr/lib/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-12-27 23:14:21 EST; 5s ago
 Main PID: 1839 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─1839 /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prom...
[root@prometheus ~]# ss -lnt | grep 9090
LISTEN     0      128       [::]:9090                  [::]:*  

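Note: check directory ownership before starting the services; otherwise startup can fail with a permission error such as this one from Alertmanager:
Feb 11 16:08:41 localhost alertmanager: level=error ts=2019-02-11T08:08:41.419390133Z caller=main.go:179 msg="Unable to create data directory" err="mkdir data/: permission denied"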
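
One way to catch this before starting a service is to compare the owner of each directory with the user the unit runs as. This is a rough sketch, assuming GNU `stat` (as on CentOS); `owned_by` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the path is owned by the given user.
# $1 = expected user name, $2 = file or directory to check
owned_by() {
    # "stat -c %U" prints the owner's user name (GNU coreutils).
    [ "$(stat -c '%U' "$2" 2>/dev/null)" = "$1" ]
}
```

For example, `owned_by prometheus /var/lib/prometheus || echo "fix ownership first"` prints the warning when the data directory belongs to another user.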

Open the Prometheus web UI to see the targets we defined: http://10.0.0.13:9090/targets
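
The same target status can be read from the HTTP API at `/api/v1/targets`, which is handy on a host without a browser. A minimal sketch, assuming the response is compact JSON in which each healthy target carries `"health":"up"`; `count_up_targets` is a hypothetical helper name, and a JSON-aware tool would be more robust than `grep`.

```shell
# Hypothetical helper: count targets reported as healthy.
# Reads the /api/v1/targets JSON response from stdin.
count_up_targets() {
    # grep -o prints each match on its own line, so wc -l counts the matches.
    grep -o '"health":"up"' | wc -l | tr -d ' '
}
```

Usage: `curl -s http://10.0.0.13:9090/api/v1/targets | count_up_targets`. With the configuration in this post, three targets are defined, so a lower number would suggest that at least one of them is down.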

Posted 2022-02-13 23:05 by Layzer