Prometheus + Grafana + MySQL Monitoring Deployment

I. Installing Prometheus from the official tarball and configuring startup
Prometheus download page: https://prometheus.io/download/
Exporter and integration list: http://www.coderdocument.com/docs/prometheus/v2.14/instrumenting/exporters_and_integrations.html

1. Lab environment

IP            Role                    OS
172.16.11.7   Prometheus server       CentOS 7
172.16.11.8   node_exporter client    CentOS 7
2. Download Prometheus

[root@prometheus ~]# cd /usr/local/
[root@prometheus local]# wget https://github.com/prometheus/prometheus/releases/download/v2.25.0/prometheus-2.25.0.linux-amd64.tar.gz
[root@prometheus local]# tar xf prometheus-2.25.0.linux-amd64.tar.gz
[root@prometheus local]# mv prometheus-2.25.0.linux-amd64/ prometheus
Check the version:

[root@prometheus prometheus]# ./prometheus --version

View the built-in help:

[root@prometheus prometheus]# ./prometheus --help
3. prometheus.yml configuration explained
cat /usr/local/prometheus/prometheus.yml

# my global config
global:
  scrape_interval: 15s     # Scrape targets every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a `job` label to every sample scraped from this job;
  # an `instance` label carrying the target's host:port is added as well.
  - job_name: 'prometheus'

    # Override the global setting: scrape this job every 5 seconds.
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

4. Start the service

# Start in the foreground with the given config file
[root@prometheus prometheus]# ./prometheus --config.file=prometheus.yml

# Specify the configuration file
--config.file="prometheus.yml"

# Listen address and port (the port can be changed)
--web.listen-address="0.0.0.0:9090"

# Maximum number of simultaneous connections
--web.max-connections=512

# Directory for TSDB storage (default: data/ under the current directory)
--storage.tsdb.path="data/"

# How long Prometheus retains data (default: 15 days; deprecated in favour of --storage.tsdb.retention.time)
--storage.tsdb.retention=15d

# Allow hot reload of the config without a restart: curl -X POST 172.16.11.7:9090/-/reload
--web.enable-lifecycle

# Path to a config file that enables TLS or authentication
--web.config.file=""


For the full list of startup options, run ./prometheus --help

5. Browse to http://172.16.11.7:9090

6. Inspect the exposed metrics
Visit http://172.16.11.7:9090/metrics
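The /metrics endpoint returns plain-text exposition format. As an illustration (the sample below is made up, not real server output), the distinct metric names on such a page can be pulled out with standard shell tools:

```shell
# Save a small, made-up sample of exposition-format output.
cat <<'EOF' > /tmp/sample_metrics.txt
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 312.4
EOF

# Drop comment lines, strip labels and values, keep unique metric names.
grep -v '^#' /tmp/sample_metrics.txt | sed 's/{.*//' | awk '{print $1}' | sort -u
```

Against a live server the same pipeline works on `curl -s http://172.16.11.7:9090/metrics`.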

7. Run Prometheus as a systemd service
Change into the systemd unit directory:

cd /usr/lib/systemd/system

Create the unit file: vim prometheus.service

[Unit]
Description=https://prometheus.io

[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.listen-address=:9090

[Install]
WantedBy=multi-user.target
Reload systemd so it picks up the new unit file:

systemctl daemon-reload

Start the service:

systemctl start prometheus
II. Client side: monitoring the Linux host and related services
Performed on 172.16.11.8
1. Install node_exporter
node_exporter is the standard exporter for Linux host metrics.

cd /usr/local/
wget https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
tar xf node_exporter-1.1.2.linux-amd64.tar.gz
mv node_exporter-1.1.2.linux-amd64/ node_exporter
2. Start node_exporter and add it as a service
(1) Start directly:

cd /usr/local/node_exporter && ./node_exporter &
# it listens on port 9100 once started
(2) Start as a systemd service:

vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target

[Service]
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
Option (2), the systemd service, is used here:

systemctl daemon-reload
systemctl start node_exporter


III. Add scrape targets in the server configuration
Performed on 172.16.11.7

cd /usr/local/prometheus
vim prometheus.yml
# my global config
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['172.16.11.7:9090']
  - job_name: 'linux'
    static_configs:
      - targets: ['172.16.11.8:9100']
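Static targets can also carry extra labels, which becomes handy once more hosts are added. A sketch (the label names below are illustrative, not from the original config):

```yaml
scrape_configs:
  - job_name: 'linux'
    static_configs:
      - targets: ['172.16.11.8:9100']
        labels:
          env: 'lab'      # illustrative label
          region: 'dc1'   # illustrative label
```

Labels declared here are attached to every sample scraped from those targets.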

Restart Prometheus:

[root@prometheus ~]# systemctl restart prometheus.service

After the restart, refresh the web UI to see the new target.


IV. Monitoring MySQL (mysqld_exporter)
Performed on 172.16.11.8
1. Download and configure

cd /usr/local
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz
tar xf mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
mv mysqld_exporter-0.12.1.linux-amd64 mysqld_exporter
cd /usr/local/mysqld_exporter && vim .my.cnf
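The contents of .my.cnf are not shown above; mysqld_exporter expects standard MySQL client-config syntax. A minimal sketch, where user name, password and host are placeholders and the account must already exist in MySQL:

```ini
[client]
user=exporter
password=CHANGE_ME
host=127.0.0.1
port=3306
```

A matching account is typically granted at least PROCESS, REPLICATION CLIENT and SELECT privileges (exact grants depend on the collectors enabled).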

2. Start mysqld_exporter

cd /usr/local/mysqld_exporter
./mysqld_exporter --config.my-cnf="/usr/local/mysqld_exporter/.my.cnf" &

Once started it listens on port 9104.

3. Add the scrape target and restart
Performed on the Prometheus server, 172.16.11.7

cd /usr/local/prometheus
vim prometheus.yml

  - job_name: 'mysql'
    static_configs:
      - targets: ['172.16.11.8:9104']

Restart Prometheus:

systemctl restart prometheus.service


V. Monitoring other system services on the node
Performed on 172.16.11.8
To monitor systemd-managed services, the systemd collector must be enabled and told which units to collect:
--collector.systemd.unit-whitelist=".+"                            collect every unit matching the regex
--collector.systemd.unit-whitelist="(docker|sshd|nginx).service"   whitelist only these units

# Monitor the docker, nginx and sshd services on the client
vi /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=node_exporter

[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter/node_exporter --collector.systemd --collector.systemd.unit-whitelist="(docker|sshd|nginx).service"

[Install]
WantedBy=multi-user.target

Restart node_exporter:

systemctl daemon-reload
systemctl restart node_exporter
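Once the systemd collector is running, each whitelisted unit is exported through the node_systemd_unit_state metric. For example, the following PromQL expression returns 1 while sshd is active (metric and label names here follow node_exporter's documented conventions):

```
node_systemd_unit_state{name="sshd.service", state="active"}
```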
VI. Displaying Prometheus data in Grafana
1. Download and install Grafana
Performed on 172.16.11.7

wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/rpm/grafana-7.4.3-1.x86_64.rpm
# Alternatively, a newer enterprise build:
# wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.2-1.x86_64.rpm
yum install -y initscripts fontconfig
yum install -y grafana-7.4.3-1.x86_64.rpm
systemctl start grafana-server.service
systemctl status grafana-server.service
After startup, browse to ip:3000.
The initial username and password are both admin.

If the MySQL dashboard reports Error 1146: Table 'my2.status' doesn't exist, import the following SQL to create the my2 schema, its tables and the collection events:

create database IF NOT EXISTS my2;
use my2;
CREATE TABLE IF NOT EXISTS status (
  VARIABLE_NAME varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  VARIABLE_VALUE varchar(1024) CHARACTER SET utf8 DEFAULT NULL,
  TIMEST timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;
 
CREATE TABLE IF NOT EXISTS current (
  VARIABLE_NAME varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  VARIABLE_VALUE varchar(1024) CHARACTER SET utf8 DEFAULT NULL
) ENGINE=InnoDB;
 
ALTER TABLE status
 ADD unique KEY idx01 (VARIABLE_NAME,timest);
-- delete from my2.status where VARIABLE_NAME like 'PROCESSES_HOSTS.%';
-- update my2.status set variable_value=0, timest=timest where VARIABLE_NAME like '%-d' and variable_value<0;
ALTER TABLE current
 ADD unique KEY idx02 (VARIABLE_NAME);
 
DROP PROCEDURE IF EXISTS collect_stats;
DELIMITER // ;
CREATE PROCEDURE collect_stats()
BEGIN
DECLARE a datetime;
DECLARE v varchar(10);
set sql_log_bin = 0;
set a=now();
select substr(version(),1,3) into v;
 
if v='5.7' OR v='8.0' then
  insert into my2.status(variable_name,variable_value,timest)
   select upper(variable_name),variable_value, a
     from performance_schema.global_status
    where variable_value REGEXP '^-*[[:digit:]]+(\.[[:digit:]]+)?$'
      and variable_name not like 'Performance_schema_%'
      and variable_name not like 'SSL_%';
  insert into my2.status(variable_name,variable_value,timest)
   SELECT 'replication_worker_time', coalesce(max(PROCESSLIST_TIME), 0.1), a
     FROM performance_schema.threads
    WHERE (NAME = 'thread/sql/slave_worker'
            AND (PROCESSLIST_STATE IS NULL
                  OR PROCESSLIST_STATE != 'Waiting for an event from Coordinator'))
       OR NAME = 'thread/sql/slave_sql';
--  *** Comment the following 4 lines with 8.0  ***
 else
  insert into my2.status(variable_name,variable_value,timest)
   select variable_name,variable_value,a
     from information_schema.global_status;
end if;
insert into my2.status(variable_name,variable_value,timest)
 select concat('PROCESSES.',user),count(*),a
   from information_schema.processlist
  group by user;
insert into my2.status(variable_name,variable_value,timest)
 select concat('PROCESSES_HOSTS.',SUBSTRING_INDEX(host,':',1)),count(*),a
   from information_schema.processlist
  group by concat('PROCESSES_HOSTS.',SUBSTRING_INDEX(host,':',1));
insert into my2.status(variable_name,variable_value,timest)
 select concat('PROCESSES_COMMAND.',command),count(*),a
   from information_schema.processlist
  group by concat('PROCESSES_COMMAND.',command);
insert into my2.status(variable_name,variable_value,timest)
 select substr(concat('PROCESSES_STATE.',state),1,64),count(*),a
   from information_schema.processlist
  group by substr(concat('PROCESSES_STATE.',state),1,64);
if v='5.6' OR v='5.7' OR v='8.0' OR v='10.' then
  insert into my2.status(variable_name,variable_value,timest)
   SELECT 'SUM_TIMER_WAIT', sum(sum_timer_wait*1.0), a
     FROM performance_schema.events_statements_summary_global_by_event_name;
end if;
 
-- Delta values
if v='5.7' OR v='8.0' then
  insert into my2.status(variable_name,variable_value,timest)
   select concat(upper(s.variable_name),'-d'), greatest(s.variable_value-c.variable_value,0), a
     from performance_schema.global_status s, my2.current c
    where s.variable_name=c.variable_name;
  insert into my2.status(variable_name,variable_value,timest)
   SELECT concat('COM_',upper(substr(s.EVENT_NAME,15,58)), '-d'), greatest(s.COUNT_STAR-c.variable_value,0), a
     FROM performance_schema.events_statements_summary_global_by_event_name s, my2.current c
    WHERE s.EVENT_NAME LIKE 'statement/sql/%'
      AND s.EVENT_NAME = c.variable_name;
  insert into my2.status(variable_name,variable_value,timest)
   SELECT 'SUM_TIMER_WAIT-d', sum(sum_timer_wait*1.0)-c.variable_value, a
     FROM performance_schema.events_statements_summary_global_by_event_name, my2.current c
    WHERE c.variable_name='SUM_TIMER_WAIT';
  insert into my2.status(variable_name, variable_value, timest)
   select 'replication_connection_status',if(SERVICE_STATE='ON', 1, 0),a
     from performance_schema.replication_connection_status;
  insert into my2.status(variable_name, variable_value, timest)
   select 'replication_applier_status',if(SERVICE_STATE='ON', 1, 0),a
     from performance_schema.replication_applier_status;
  delete from my2.current;
  insert into my2.current(variable_name,variable_value)
   select upper(variable_name),variable_value+0
     from performance_schema.global_status
    where variable_value REGEXP '^-*[[:digit:]]+(\.[[:digit:]]+)?$'
      and variable_name not like 'Performance_schema_%'
      and variable_name not like 'SSL_%';
  insert into my2.current(variable_name,variable_value)
   SELECT substr(EVENT_NAME,1,40), COUNT_STAR
     FROM performance_schema.events_statements_summary_global_by_event_name
    WHERE EVENT_NAME LIKE 'statement/sql/%';
  insert into my2.current(variable_name,variable_value)
   SELECT 'SUM_TIMER_WAIT', sum(sum_timer_wait*1.0)
     FROM performance_schema.events_statements_summary_global_by_event_name;
 
  insert into my2.current(variable_name,variable_value)
   select concat('PROCESSES_COMMAND.',command),count(*)
     from information_schema.processlist
    group by concat('PROCESSES_COMMAND.',command);
  insert into my2.current(variable_name,variable_value)
   select upper(variable_name),variable_value
     from performance_schema.global_variables
    where variable_name in ('max_connections', 'innodb_buffer_pool_size', 'query_cache_size',
                            'innodb_log_buffer_size', 'key_buffer_size', 'table_open_cache');
 else
  insert into my2.status(variable_name,variable_value,timest)
   select concat(upper(s.variable_name),'-d'), greatest(s.variable_value-c.variable_value,0), a
     from information_schema.global_status s, my2.current c
    where s.variable_name=c.variable_name;
  delete from my2.current;
  insert into my2.current(variable_name,variable_value)
   select upper(variable_name),variable_value+0
     from information_schema.global_status
    where variable_value REGEXP '^-*[[:digit:]]+(\.[[:digit:]]+)?$'
      and variable_name not like 'Performance_schema_%'
      and variable_name not like 'SSL_%';
  insert into my2.current(variable_name,variable_value)
   select upper(variable_name),variable_value
     from information_schema.global_variables
    where variable_name in ('max_connections', 'innodb_buffer_pool_size', 'query_cache_size',
                            'innodb_log_buffer_size', 'key_buffer_size', 'table_open_cache');
end if;
 
set sql_log_bin = 1;
END //
DELIMITER ; //
 
-- Collect daily statistics on space usage and delete old statistics (older than 62 days, 1 year for DB size)
DROP PROCEDURE IF EXISTS collect_daily_stats;
DELIMITER // ;
CREATE PROCEDURE collect_daily_stats()
BEGIN
DECLARE a datetime;
set sql_log_bin = 0;
set a=now();
insert into my2.status(variable_name,variable_value,timest)
 select concat('SIZEDB.',table_schema), sum(data_length+index_length), a
   from information_schema.tables group by table_schema;
insert into my2.status(variable_name,variable_value,timest)
 select 'SIZEDB.TOTAL', sum(data_length+index_length), a
   from information_schema.tables;
delete from my2.status where timest < date_sub(now(), INTERVAL 62 DAY) and variable_name <>'SIZEDB.TOTAL';
delete from my2.status where timest < date_sub(now(), INTERVAL 365 DAY);
set sql_log_bin = 1;
END //
DELIMITER ; //
 
-- The event scheduler must also be activated in the my.cnf (event_scheduler=1)
set global event_scheduler=1;
 
set sql_log_bin = 0;
DROP EVENT IF EXISTS collect_stats;
CREATE EVENT collect_stats
    ON SCHEDULE EVERY 10 Minute
    DO call collect_stats();
DROP EVENT IF EXISTS collect_daily_stats;
CREATE EVENT collect_daily_stats
    ON SCHEDULE EVERY 1 DAY
    DO call collect_daily_stats();
set sql_log_bin = 1;
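After the script has been imported, it can be sanity-checked from the mysql client. The queries below only read back what the events above write; calling the procedure by hand avoids waiting for the 10-minute event:

```sql
SHOW VARIABLES LIKE 'event_scheduler';   -- expect ON
CALL my2.collect_stats();                -- populate immediately instead of waiting
SELECT variable_name, variable_value, timest
  FROM my2.status
 ORDER BY timest DESC
 LIMIT 5;
```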

  

 


2. Add a Prometheus data source
Configuration -> Data Sources -> Add data source -> Prometheus
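Instead of clicking through the UI, the data source can also be provisioned from a file. A sketch of /etc/grafana/provisioning/datasources/prometheus.yml (the file name is arbitrary; this assumes Grafana's standard provisioning directory):

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://172.16.11.7:9090
    isDefault: true
```

Grafana picks this file up on the next restart of grafana-server.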

3. Create a dashboard for basic Linux metrics
Create -> Import

4. Import dashboard template 8919

5. Select the data source

Click Import
6. View the dashboard
Dashboards -> Manage


VII. Displaying MySQL data
1. Set up the data source

2. Import a ready-made dashboard and pick the MySQL data source created above
Ready-made dashboards can be downloaded from the official site.
Click through to the MySQL data source; the first dashboard is used here.

 

mysqld_exporter can also be installed as a systemd unit with extra collectors enabled (the paths here assume an install under /data; the heredoc is quoted so the backslashes survive):

cat <<'EOF' >> /usr/lib/systemd/system/mysqld-exporter.service
[Unit]
Description=mysqld_exporter

[Service]
User=root
ExecStart=/data/mysqld_exporter/mysqld_exporter --config.my-cnf /data/mysqld_exporter/my.cnf --web.listen-address=0.0.0.0:9104 \
  --collect.slave_status \
  --collect.binlog_size \
  --collect.info_schema.processlist \
  --collect.info_schema.innodb_metrics \
  --collect.info_schema.innodb_tablespaces \
  --collect.info_schema.clientstats \
  --collect.engine_innodb_status \
  --collect.perf_schema.file_events \
  --collect.perf_schema.tableiowaits \
  --collect.perf_schema.indexiowaits \
  --collect.perf_schema.tablelocks \
  --collect.perf_schema.eventswaits \
  --collect.perf_schema.replication_group_member_stats
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

  


The number 7991 is the dashboard ID copied from the official site.
Paste it and click Load.

Select the MySQL data source.


VIII. Monitoring Redis (redis_exporter)
1. Install redis_exporter
Performed on 172.16.11.8
cd /usr/local
wget https://github.com/oliver006/redis_exporter/releases/download/v0.15.0/redis_exporter-v0.15.0.linux-amd64.tar.gz
tar -xvf redis_exporter-v0.15.0.linux-amd64.tar.gz

2. Start redis_exporter
Performed on 172.16.11.8. redis_exporter listens on port 9121 by default.
cd /usr/local
./redis_exporter -redis.addr redis://172.16.11.8:6379 &

3. Add the Redis target to the Prometheus configuration and restart
Performed on 172.16.11.7

vim /usr/local/prometheus/prometheus.yml

  - job_name: 'Redis'
    static_configs:
      - targets: ['172.16.11.8:9121']
systemctl restart prometheus
————————————————
Copyright notice: this is an original article by CSDN blogger "兴乐安宁", licensed under CC 4.0 BY-SA; include the original link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_42324463/article/details/128006734
