mysql_exporter的安装和告警、面板设置
MysQL数据库安装(略)
mysql_exporter安装
一、创建用于监视数据库的用户exporter
mysql -u root -p
mysql> set global validate_password_policy=LOW; # 降低MySQL8 密码规则策略,或者按规则设置密码
mysql> show variables like "%validate%";
mysql> CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'Prometheus' WITH MAX_USER_CONNECTIONS 5;
mysql> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
说明:使用max_user_connections参数来限制exporter用户最大连接数,避免监控引起数据库过载,需要注意的是该参数并不是MySQL/Mariadb每个版本都支持
二、安装mysqld_exporter
下载
tar xvf mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
ln -s mysqld_exporter-0.12.1.linux-amd64/ mysqld_exporter
cd /usr/local/mysqld_exporter
# cat > .my.cnf <<EOF
[client]
user=exporter
password=Prometheus
EOF
使用systemd方式启动
# cat >/usr/lib/systemd/system/mysqld_exporter.service <<EOF
[Unit]
Description=Prometheus
[Service]
Environment=DATA_SOURCE_NAME=exporter:Prometheus@(localhost:3306)/
ExecStart=/usr/local/mysqld_exporter/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter/.my.cnf --web.listen-address=:9104
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# systemctl enable mysqld_exporter
# systemctl start mysqld_exporter
mysqld_exporter默认运行端口是:9104
可以通过http://数据库服务器:9104访问MySQLD Exporter暴露的服务
可以通过/metrics查看mysql_up指标判断当前MySQLD Exporter是否正常连接到了MySQL实例
http://xxxxxxxxxx:9104/ 注意不是https
三、在prometheus.yaml中添加mysqld_exporter的配置
- job_name: 'mysqld_exporter'
scheme: http
static_configs:
- targets: ['192.168.40.200:9104']
labels:
app: mysqld_exporter
role: mysqld_exporter
重新加载配置prometheus配置
# curl -X POST http://xxxxxxxxx:9090/-/reload
四、编写告警规则
../prometheus/etc/
本文将规则分开写:第一个文件用于记录规则,第二个是告警规则。
#1.记录规则
cat > rules/mysql_record_rule.yml <<'EOF'
groups:
- name: mysql_rules
rules:
- record: mysql:status
expr: mysql_up{instance=~".*9104"}
- record: mysql:uptime
expr: mysql_global_status_uptime{job=~"mysqld_exporter.*"}
- record: mysql:mysql_threads_connected
expr: mysql_global_status_threads_connected{job=~"mysqld_exporter.*"}
- record: mysql:mysql_threads_running
expr: mysql_global_status_threads_running{job=~"mysqld_exporter.*"}
- record: mysql:mysql_aborted_connects
expr: increase(mysql_global_status_aborted_connects{job=~"mysqld_exporter.*"}[2m])
- record: mysql:mysql_slow_queries
expr: increase(mysql_global_status_slow_queries{job=~"mysqld_exporter.*"}[2m])
- record: mysql:mysql_table_locks
expr: increase(mysql_global_status_table_locks_waited{job=~"mysqld_exporter.*"}[2m])
- record: mysql:mysql_qps
expr: rate(mysql_global_status_queries{job=~"mysqld_exporter.*"}[2m])
EOF
# 2.告警规则
cat > rules/mysql_alert_rule.yml<<'EOF'
groups:
- name: mysql_alerts
rules:
- alert: MySQL_Down_Alert
expr: mysql:status==0
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 数据库 出现异常!
description: 主机 {{ $labels.nodename }} 上的 {{ $labels.job }} 可能存在异常,请检查!
- alert: MySQL_uptime_Alert
expr: mysql:uptime<1
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 数据库 出现异常!
description: 主机 {{ $labels.nodename }} 数据库状态异常,请检查!
- alert: MySQL_threads_connected_Alert
expr: mysql:mysql_threads_connected > 100
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 上的数据库指标 threads_connected 超出阈值!
description: 主机 {{ $labels.nodename }} 上的数据库指标 threads_connected 超出阈值,当前值为{{humanize $value}},请检查!
- alert: MySQL_threads_running_Alert
expr: mysql:mysql_threads_running > 200
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 上的数据库指标 threads_running 超出阈值!
description: 主机 {{ $labels.nodename }} 上的数据库指标 threads_connected 超出阈值,当前值为{{humanize $value}},请检查!
- alert: MySQL_aborted_connects_Alert
expr: mysql:mysql_aborted_connects > 10
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 上的数据库指标 aborted_connects 超出阈值!
description: 主机 {{ $labels.nodename }} 上的数据库指标 aborted_connects 超出阈值,当前值为{{humanize $value}},请检查!
- alert: MySQL_slow_queries_Alert
expr: mysql:mysql_slow_queries > 1
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 上的数据库指标 slow_queries 超出阈值!
description: 主机 {{ $labels.nodename }} 上的数据库指标 slow_queries 超出阈值,当前值为{{humanize $value}},请检查!
- alert: MySQL_table_locks_Alert
expr: mysql:mysql_table_locks > 1
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 上的数据库指标 table_locks 超出阈值!
description: 主机 {{ $labels.nodename }} 上的数据库指标 table_locks 超出阈值,当前值为{{humanize $value}},请检查!
- alert: MySQL_qps_Alert
expr: mysql:mysql_qps > 500
for: 1m
labels:
metric_type: db_monitor
resource: db
severity: critical
annotations:
summary: 主机 {{ $labels.nodename }} 上的数据库指标 qps 超出阈值!
description: 主机 {{ $labels.nodename }} 上的数据库指标 qps 超出阈值,当前值为{{humanize $value}},请检查!
EOF
Prometheus中引入相关的rule_files文件
mysql_alert_rule.yml
mysql_record_rule.yml
支持通配符
重新加载配置prometheus配置
# curl -X POST http://xxxxxxxxx:9090/-/reload
排错1:
多个rule_file都要以- 开头
排错2:
curl -X POST http://prometheus.zq.com/-/reload
failed to reload config: one or more errors occurred while applying the new configuration (--config.file="/data/etc/prometheus.yml")
需要对规则进行检查
# promtool check rules /data/etc/rules/mysql_record_rule.yml
告警邮件效果
五.自定义Dashboard展示MySQL相关指标:
根据grafana官网大牛分享的7362面板修改,https://grafana.com/grafana/dashboards/7362