Prometheus + Grafana + MySQL Monitoring Deployment
I. Prometheus binary installation and startup configuration
Prometheus download page: https://prometheus.io/download/
Exporters and integrations list: http://www.coderdocument.com/docs/prometheus/v2.14/instrumenting/exporters_and_integrations.html
1. Lab environment
IP            Role                   OS
172.16.11.7   Prometheus server      CentOS 7
172.16.11.8   node_exporter client   CentOS 7
2. Download Prometheus
[root@prometheus ~]# cd /usr/local/
[root@prometheus local]# wget https://github.com/prometheus/prometheus/releases/download/v2.25.0/prometheus-2.25.0.linux-amd64.tar.gz
[root@prometheus local]# tar xf prometheus-2.25.0.linux-amd64.tar.gz
[root@prometheus local]# mv prometheus-2.25.0.linux-amd64/ prometheus
Check the version:
[root@prometheus prometheus]# ./prometheus --version
View the help text:
[root@prometheus prometheus]# ./prometheus --help
3. prometheus.yml explained
cat /usr/local/prometheus/prometheus.yml
# my global config
global:
  # By default, scrape targets every 15 seconds.
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  # Evaluate rules every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a `job` label to every sample scraped from this job;
  # an `instance` label with the target's host:port is added as well.
  - job_name: 'prometheus'
    # Override the global setting: scrape this job every 5s.
    scrape_interval: 5s
    static_configs:
    - targets: ['localhost:9090']
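Before starting the server, the configuration can be validated with promtool, which ships in the same tarball (paths assume the layout used above):

```shell
cd /usr/local/prometheus
# exits non-zero and prints the offending line if the YAML is invalid
./promtool check config prometheus.yml
```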
4. Start the service
# Start in the foreground
[root@prometheus prometheus]# ./prometheus --config.file=prometheus.yml

Commonly used startup options:
# Configuration file to load
--config.file="prometheus.yml"
# Address and port to listen on (default shown; the port can be changed)
--web.listen-address="0.0.0.0:9090"
# Maximum number of simultaneous connections
--web.max-connections=512
# Directory for TSDB storage (default: data/ under the current directory)
--storage.tsdb.path="data/"
# How long Prometheus retains data (default: 15 days)
--storage.tsdb.retention=15d
# Enable hot reload over HTTP, no restart needed: curl -X POST 172.16.11.7:9090/-/reload
--web.enable-lifecycle
# Path to a config file that can enable TLS or authentication
--web.config.file=""
For more startup options, run ./prometheus --help.
5. Browse to http://172.16.11.7:9090
6. View the exposed metrics
Browse to http://172.16.11.7:9090/metrics
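The same endpoint can be checked from the shell; for example, Prometheus exposes its own build information as a metric:

```shell
# should print one prometheus_build_info line with version/goversion labels
curl -s http://172.16.11.7:9090/metrics | grep '^prometheus_build_info'
```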
7. Run Prometheus as a systemd service
Change into the systemd unit directory:
cd /usr/lib/systemd/system
Create the unit file: vim prometheus.service
[Unit]
Description=https://prometheus.io
[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.listen-address=:9090
[Install]
WantedBy=multi-user.target
Reload systemd so it picks up the new unit file:
systemctl daemon-reload
Start the service:
systemctl start prometheus
II. Client side: monitoring a Linux host and its services
On 172.16.11.8:
1. Install node_exporter
node_exporter is the usual exporter for Linux host metrics.
cd /usr/local/
wget https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
tar xf node_exporter-1.1.2.linux-amd64.tar.gz
mv node_exporter-1.1.2.linux-amd64/ node_exporter
2. Start node_exporter and add it as a service
(1) Start directly
cd /usr/local/node_exporter && ./node_exporter &
# After starting, it listens on port 9100
(2) Start as a systemd service
vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target
[Service]
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
Here we choose option (2), running it as a service:
systemctl daemon-reload
systemctl start node_exporter
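Either way, it is worth verifying the exporter is up before wiring it into Prometheus:

```shell
# the port should be listening...
ss -lntp | grep 9100
# ...and node metrics should be served
curl -s http://172.16.11.8:9100/metrics | grep '^node_cpu'
```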
III. Add scrape targets to the server configuration
On 172.16.11.7:
cd /usr/local/prometheus
vim prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['172.16.11.7:9090']
  - job_name: 'linux'
    static_configs:
    - targets: ['172.16.11.8:9100']
Restart Prometheus:
[root@prometheus ~]# systemctl restart prometheus.service
After the restart, refresh the targets page to confirm the new job appears.
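Once the linux job is up, a few example queries to try in the Prometheus expression browser (metric names as exposed by node_exporter v1.1.2):

```promql
# 1 for every target Prometheus can scrape, 0 otherwise
up
# per-core CPU usage over the last 5 minutes
rate(node_cpu_seconds_total{mode!="idle"}[5m])
# fraction of memory still available
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes
```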
IV. Monitoring MySQL (mysqld_exporter)
On 172.16.11.8:
1. Download and configure
cd /usr/local
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz
tar xf mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
mv mysqld_exporter-0.12.1.linux-amd64 mysqld_exporter
cd /usr/local/mysqld_exporter && vim .my.cnf
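The contents of .my.cnf are not shown above; a minimal sketch follows, assuming a dedicated scrape account named exporter (user name and password are placeholders — the privileges are the minimal set listed in the mysqld_exporter README):

```shell
# create the scrape account first
mysql -uroot -p -e "CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'ChangeMe_123';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';"

# minimal .my.cnf telling the exporter how to connect
cat > /usr/local/mysqld_exporter/.my.cnf <<'EOF'
[client]
user=exporter
password=ChangeMe_123
EOF
```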
2. Start mysqld_exporter
cd /usr/local/mysqld_exporter
./mysqld_exporter --config.my-cnf="/usr/local/mysqld_exporter/.my.cnf" &
After starting, it listens on port 9104.
3. Add the scrape target to the server configuration and restart
On the Prometheus server, 172.16.11.7:
cd /usr/local/prometheus
vim prometheus.yml
  - job_name: 'mysql'
    static_configs:
    - targets: ['172.16.11.8:9104']

Restart Prometheus:
systemctl restart prometheus.service
V. Monitoring other system services on a node
On 172.16.11.8:
To monitor a node's systemd services, pass a unit whitelist to node_exporter:
--collector.systemd.unit-whitelist=".+"                            match systemd units by regex
--collector.systemd.unit-whitelist="(docker|sshd|nginx).service"   whitelist: collect only these units
# Monitor the docker, nginx and sshd services on the client
vi /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=https://prometheus.io
[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter/node_exporter --collector.systemd --collector.systemd.unit-whitelist="(docker|sshd|nginx).service"
[Install]
WantedBy=multi-user.target
Restart node_exporter:
systemctl daemon-reload
systemctl restart node_exporter
VI. Displaying Prometheus data in Grafana
1. Quick download and install of Grafana
On 172.16.11.7:
wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/rpm/grafana-7.4.3-1.x86_64.rpm
# newer builds are also available upstream, e.g. https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.2-1.x86_64.rpm
yum install -y initscripts fontconfig
yum install -y grafana-7.4.3-1.x86_64.rpm
systemctl start grafana-server.service
systemctl status grafana-server.service
After starting, browse to ip:3000.
The initial username and password are both admin.
If mysqld_exporter reports Error 1146: Table 'my2.status' doesn't exist,
import the following SQL:
create database IF NOT EXISTS my2;
use my2;

CREATE TABLE IF NOT EXISTS status (
  VARIABLE_NAME varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  VARIABLE_VALUE varchar(1024) CHARACTER SET utf8 DEFAULT NULL,
  TIMEST timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS current (
  VARIABLE_NAME varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  VARIABLE_VALUE varchar(1024) CHARACTER SET utf8 DEFAULT NULL
) ENGINE=InnoDB;

ALTER TABLE status ADD unique KEY idx01 (VARIABLE_NAME, timest);
-- delete from my2.status where VARIABLE_NAME like 'PROCESSES_HOSTS.%';
-- update my2.status set variable_value=0, timest=timest where VARIABLE_NAME like '%-d' and variable_value<0;
ALTER TABLE current ADD unique KEY idx02 (VARIABLE_NAME);

DROP PROCEDURE IF EXISTS collect_stats;
DELIMITER //
CREATE PROCEDURE collect_stats()
BEGIN
  DECLARE a datetime;
  DECLARE v varchar(10);
  set sql_log_bin = 0;
  set a = now();
  select substr(version(),1,3) into v;
  if v = '5.7' OR v = '8.0' then
    insert into my2.status(variable_name, variable_value, timest)
      select upper(variable_name), variable_value, a
      from performance_schema.global_status
      where variable_value REGEXP '^-*[[:digit:]]+(\.[[:digit:]]+)?$'
        and variable_name not like 'Performance_schema_%'
        and variable_name not like 'SSL_%';
    insert into my2.status(variable_name, variable_value, timest)
      select 'replication_worker_time', coalesce(max(PROCESSLIST_TIME), 0.1), a
      from performance_schema.threads
      where (NAME = 'thread/sql/slave_worker'
             and (PROCESSLIST_STATE IS NULL
                  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator'))
         or NAME = 'thread/sql/slave_sql';
  -- *** Comment the following 4 lines with 8.0 ***
  else
    insert into my2.status(variable_name, variable_value, timest)
      select variable_name, variable_value, a from information_schema.global_status;
  end if;
  insert into my2.status(variable_name, variable_value, timest)
    select concat('PROCESSES.', user), count(*), a
    from information_schema.processlist group by user;
  insert into my2.status(variable_name, variable_value, timest)
    select concat('PROCESSES_HOSTS.', SUBSTRING_INDEX(host, ':', 1)), count(*), a
    from information_schema.processlist
    group by concat('PROCESSES_HOSTS.', SUBSTRING_INDEX(host, ':', 1));
  insert into my2.status(variable_name, variable_value, timest)
    select concat('PROCESSES_COMMAND.', command), count(*), a
    from information_schema.processlist
    group by concat('PROCESSES_COMMAND.', command);
  insert into my2.status(variable_name, variable_value, timest)
    select substr(concat('PROCESSES_STATE.', state), 1, 64), count(*), a
    from information_schema.processlist
    group by substr(concat('PROCESSES_STATE.', state), 1, 64);
  if v = '5.6' OR v = '5.7' OR v = '8.0' OR v = '10.' then
    insert into my2.status(variable_name, variable_value, timest)
      select 'SUM_TIMER_WAIT', sum(sum_timer_wait*1.0), a
      from performance_schema.events_statements_summary_global_by_event_name;
  end if;
  -- Delta values
  if v = '5.7' OR v = '8.0' then
    insert into my2.status(variable_name, variable_value, timest)
      select concat(upper(s.variable_name), '-d'), greatest(s.variable_value - c.variable_value, 0), a
      from performance_schema.global_status s, my2.current c
      where s.variable_name = c.variable_name;
    insert into my2.status(variable_name, variable_value, timest)
      select concat('COM_', upper(substr(s.EVENT_NAME,15,58)), '-d'), greatest(s.COUNT_STAR - c.variable_value, 0), a
      from performance_schema.events_statements_summary_global_by_event_name s, my2.current c
      where s.EVENT_NAME LIKE 'statement/sql/%' and s.EVENT_NAME = c.variable_name;
    insert into my2.status(variable_name, variable_value, timest)
      select 'SUM_TIMER_WAIT-d', sum(sum_timer_wait*1.0) - c.variable_value, a
      from performance_schema.events_statements_summary_global_by_event_name, my2.current c
      where c.variable_name = 'SUM_TIMER_WAIT';
    insert into my2.status(variable_name, variable_value, timest)
      select 'replication_connection_status', if(SERVICE_STATE='ON', 1, 0), a
      from performance_schema.replication_connection_status;
    insert into my2.status(variable_name, variable_value, timest)
      select 'replication_applier_status', if(SERVICE_STATE='ON', 1, 0), a
      from performance_schema.replication_applier_status;
    delete from my2.current;
    insert into my2.current(variable_name, variable_value)
      select upper(variable_name), variable_value+0
      from performance_schema.global_status
      where variable_value REGEXP '^-*[[:digit:]]+(\.[[:digit:]]+)?$'
        and variable_name not like 'Performance_schema_%'
        and variable_name not like 'SSL_%';
    insert into my2.current(variable_name, variable_value)
      select substr(EVENT_NAME,1,40), COUNT_STAR
      from performance_schema.events_statements_summary_global_by_event_name
      where EVENT_NAME LIKE 'statement/sql/%';
    insert into my2.current(variable_name, variable_value)
      select 'SUM_TIMER_WAIT', sum(sum_timer_wait*1.0)
      from performance_schema.events_statements_summary_global_by_event_name;
    insert into my2.current(variable_name, variable_value)
      select concat('PROCESSES_COMMAND.', command), count(*)
      from information_schema.processlist
      group by concat('PROCESSES_COMMAND.', command);
    insert into my2.current(variable_name, variable_value)
      select upper(variable_name), variable_value
      from performance_schema.global_variables
      where variable_name in ('max_connections','innodb_buffer_pool_size','query_cache_size','innodb_log_buffer_size','key_buffer_size','table_open_cache');
  else
    insert into my2.status(variable_name, variable_value, timest)
      select concat(upper(s.variable_name), '-d'), greatest(s.variable_value - c.variable_value, 0), a
      from information_schema.global_status s, my2.current c
      where s.variable_name = c.variable_name;
    delete from my2.current;
    insert into my2.current(variable_name, variable_value)
      select upper(variable_name), variable_value+0
      from information_schema.global_status
      where variable_value REGEXP '^-*[[:digit:]]+(\.[[:digit:]]+)?$'
        and variable_name not like 'Performance_schema_%'
        and variable_name not like 'SSL_%';
    insert into my2.current(variable_name, variable_value)
      select upper(variable_name), variable_value
      from information_schema.global_variables
      where variable_name in ('max_connections','innodb_buffer_pool_size','query_cache_size','innodb_log_buffer_size','key_buffer_size','table_open_cache');
  end if;
  set sql_log_bin = 1;
END //
DELIMITER ;

-- Collect daily statistics on space usage and delete old statistics (older than 62 days, 1 year for DB size)
DROP PROCEDURE IF EXISTS collect_daily_stats;
DELIMITER //
CREATE PROCEDURE collect_daily_stats()
BEGIN
  DECLARE a datetime;
  set sql_log_bin = 0;
  set a = now();
  insert into my2.status(variable_name, variable_value, timest)
    select concat('SIZEDB.', table_schema), sum(data_length+index_length), a
    from information_schema.tables group by table_schema;
  insert into my2.status(variable_name, variable_value, timest)
    select 'SIZEDB.TOTAL', sum(data_length+index_length), a
    from information_schema.tables;
  delete from my2.status where timest < date_sub(now(), INTERVAL 62 DAY) and variable_name <> 'SIZEDB.TOTAL';
  delete from my2.status where timest < date_sub(now(), INTERVAL 365 DAY);
  set sql_log_bin = 1;
END //
DELIMITER ;

-- The event scheduler must also be activated in the my.cnf (event_scheduler=1)
set global event_scheduler=1;
set sql_log_bin = 0;
DROP EVENT IF EXISTS collect_stats;
CREATE EVENT collect_stats ON SCHEDULE EVERY 10 MINUTE DO call collect_stats();
DROP EVENT IF EXISTS collect_daily_stats;
CREATE EVENT collect_daily_stats ON SCHEDULE EVERY 1 DAY DO call collect_daily_stats();
set sql_log_bin = 1;
2. Add the Prometheus data source
Configuration -> Data Sources -> Add data source -> Prometheus
3. Create a dashboard for basic Linux metrics
Create -> Import
4. Import template 8919
5. Select the data source
Click Import
6. View the dashboard
Dashboards -> Manage
VII. Displaying MySQL data
1. Set up the data source
2. Import a ready-made dashboard and select the MySQL data source you just created
Ready-made dashboards can be downloaded from the official site.
Open the list of MySQL dashboards for the data source.
Here I pick the first one.
An example systemd unit for mysqld_exporter with additional collectors enabled (this block uses paths under /data/mysqld_exporter):
cat <<EOF >> /usr/lib/systemd/system/mysqld-exporter.service
[Unit]
Description=mysqld_exporter
[Service]
User=root
ExecStart=/data/mysqld_exporter/mysqld_exporter --config.my-cnf /data/mysqld_exporter/my.cnf \
  --web.listen-address=0.0.0.0:9104 \
  --collect.slave_status \
  --collect.binlog_size \
  --collect.info_schema.processlist \
  --collect.info_schema.clientstats \
  --collect.info_schema.innodb_metrics \
  --collect.info_schema.innodb_tablespaces \
  --collect.engine_innodb_status \
  --collect.perf_schema.file_events \
  --collect.perf_schema.replication_group_member_stats \
  --collect.perf_schema.tableiowaits \
  --collect.perf_schema.indexiowaits \
  --collect.perf_schema.tablelocks \
  --collect.perf_schema.eventswaits
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
The number 7991 is the dashboard ID copied from the official site above.
Paste it and click Load.
Select the MySQL data source.
VIII. Monitoring Redis (redis_exporter)
1. Install redis_exporter
On 172.16.11.8:
cd /usr/local
wget https://github.com/oliver006/redis_exporter/releases/download/v0.15.0/redis_exporter-v0.15.0.linux-amd64.tar.gz
tar -xvf redis_exporter-v0.15.0.linux-amd64.tar.gz
2. Start redis_exporter
On 172.16.11.8:
redis_exporter listens on port 9121 by default.
cd /usr/local
./redis_exporter -redis.addr redis://172.16.11.8:6379 &
3. Add the Redis target to the Prometheus config and restart
On 172.16.11.7:
vim /usr/local/prometheus/prometheus.yml

  - job_name: 'Redis'
    static_configs:
    - targets: ['172.16.11.8:9121']
systemctl restart prometheus
————————————————
Copyright notice: this is an original article by the CSDN blogger 「兴乐安宁」, licensed under CC 4.0 BY-SA; please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_42324463/article/details/128006734