一个MYSQL监控与CONNECTION_CONTROL引起的问题

适用范围

mysql 任何版本

问题概述

客户反应我们监控工具占用了太多连接,已出现连接数告警,影响生产环境了,总共最大连接数为1000,我们监控软件达到800多!
如下:

但是比较疑惑的是该监控已在上周停掉了服务,并且该帐号也已手动drop 掉。 为什么还能访问到DB呢!

问题原因

通过对该问题分析主要是由以下两个原因影起:
1.监控软件原因,针对以上问题和我们研发确认,除了mysql_exporter 会采集mysql信息外,proxy 上的slowmon_collector 服务也会时不时的去采集DB信息。只不过该DB我早已取消纳管了,但为什么还是会去采集DB信息。这需要研发确认才能回复!
2.CONNECTION_CONTROL BUG, 该帐号已被删除,所以不能正常连接到DB,而 CONNECTION_CONTROL 会记录当该帐号连接失败达到 connection_control_failed_connections_threshold 指定次数时就会延迟创建连接。而延迟创建连接这段时间,是会占用连接数的。

为了证实这个问题,我特意在mysql 8.0.30上还原了该问题。

#1.插件准备
#安装 CONNECTION_CONTROL 插件
mysql>INSTALL PLUGIN CONNECTION_CONTROL   SONAME 'connection_control.so';
mysql> INSTALL PLUGIN CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS   SONAME 'connection_control.so';
  
#检查CONNECTION_CONTROL 插件是否正常
mysql> SELECT PLUGIN_NAME, PLUGIN_STATUS
    ->        FROM INFORMATION_SCHEMA.PLUGINS
    ->        WHERE PLUGIN_NAME LIKE 'connection%';
+------------------------------------------+---------------+
| PLUGIN_NAME                              | PLUGIN_STATUS |
+------------------------------------------+---------------+
| CONNECTION_CONTROL                       | ACTIVE        |
| CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS | ACTIVE        |
+------------------------------------------+---------------+

#检查CONNECTION_CONTROL 参数
mysql>  show variables connection_control%';
+-------------------------------------------------+------------+
| Variable_name                                   | Value      |
+-------------------------------------------------+------------+
| connection_control_failed_connections_threshold | 3          |
| connection_control_max_connection_delay         | 2147483647 |
| connection_control_min_connection_delay         | 1000       |
+-------------------------------------------------+------------+
3 rows in set (0.02 sec)

#2 用户准备
#创建测试用户
mysql> 
mysql> create user test@'%' identified by 'test' ;
Query OK, 0 rows affected (1.29 sec)

#给用户授权
mysql> grant all privileges on *.* to 'test'@'%' ;
Query OK, 0 rows affected (5.39 sec)


#确定测试用户可以访问DB
[root@s2ahuzcloud01 run]# mysql -u test -p'test' -h 192.168.1.51 -P 3306
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 13
Server version: 8.0.30-commercial MySQL Enterprise Server - Commercial

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 
mysql

#3.开启测试
#删掉用户
mysql> drop user test@'%' ;
Query OK, 0 rows affected (0.71 sec)

#刷新权限
mysql> FLUSH PRIVILEGES ;
Query OK, 0 rows affected (0.12 sec)

#设置最大连接数
mysql> set global max_connections=5 ;
Query OK, 0 rows affected (0.01 sec)
mysql> show variables like '%max_connections%' ;
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| max_connections        | 5     |
| mysqlx_max_connections | 100   |
+------------------------+-------+
2 rows in set (0.02 sec)


#4 创建并发连接
#同时开四个窗口连接

[root@s2ahuzcloud01 ~]# while  true ;  do   mysql -u test -p'test' -h 192.168.1.51 -P 3306;    done
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'test'@'192.168.1.51' (using password: YES)
mysql: [Warning] Using a password on the command line interface can be insecure.


#在次检查用户是否存
mysql> 
mysql> select user ,host from mysql.user where user ='test' ;
Empty set (0.01 sec)

#查看连接数
mysql> show processlist ;
+----+-----------------+--------------------+------+---------+------+--------------------------------------+------------------+
| Id | User            | Host               | db   | Command | Time | State                                | Info             |
+----+-----------------+--------------------+------+---------+------+--------------------------------------+------------------+
|  6 | event_scheduler | localhost          | NULL | Daemon  | 5852 | Waiting on empty queue               | NULL             |
|  9 | root            | localhost          | NULL | Query   |    0 | init                                 | show processlist |
| 47 | test            | 192.168.1.51:37622 | NULL | Connect |   12 | Waiting in connection_control plugin | NULL             |
| 48 | test            | 192.168.1.51:37624 | NULL | Connect |    9 | Waiting in connection_control plugin | NULL             |
| 49 | test            | 192.168.1.51:37626 | NULL | Connect |    7 | Waiting in connection_control plugin | NULL             |
| 50 | test            | 192.168.1.51:37628 | NULL | Connect |    5 | Waiting in connection_control plugin | NULL             |
+----+-----------------+--------------------+------+---------+------+--------------------------------------+------------------+
6 rows in set (0.01 sec)

##使用正常的业务帐号却恩为连接数过多无法登录 
[root@s2ahuzcloud01 run]# mysql -u root -p'root' --socket=/u01/mysql/data/uat1/run/mysql3306.sock
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1040 (HY000): Too many connections

补充
CONNECTION_CONTROL 插件下,任何不存在用户连接DB,都会占用DB的连接数

#多窗口并发访问, username 随机
while  true ;  do   mysql -u test123 -p'test' -h 192.168.1.51 -P 3306;    done

mysql> show processlist ;
+----+-----------------+--------------------+------+---------+------+--------------------------------------+------------------+
| Id | User            | Host               | db   | Command | Time | State                                | Info             |
+----+-----------------+--------------------+------+---------+------+--------------------------------------+------------------+
|  6 | event_scheduler | localhost          | NULL | Daemon  |  200 | Waiting on empty queue               | NULL             |
|  9 | root            | localhost          | NULL | Query   |    0 | init                                 | show processlist |
| 51 | test123         | 192.168.1.51:41792 | NULL | Connect |   16 | Waiting in connection_control plugin | NULL             |
| 52 | test123         | 192.168.1.51:41794 | NULL | Connect |   16 | Waiting in connection_control plugin | NULL             |
| 53 | test123         | 192.168.1.51:41796 | NULL | Connect |   11 | Waiting in connection_control plugin | NULL             |
| 54 | test123         | 192.168.1.51:41798 | NULL | Connect |    8 | Waiting in connection_control plugin | NULL             |
+----+-----------------+--------------------+------+---------+------+--------------------------------------+------------------+
6 rows in set (0.00 sec)

解决方案

通过以上问题分析,即已明确了问题的原因,就可以通过以下方式避免该问题发生:

  1. 更新监控程序,确保监控程去不会去访问未纳管的DB
  2. 在必须使用CONNECTION_CONTROL 插件的场景
    a. 将connection_control_min_connection_delay设置为足够小。这样就会很快释放掉连接。
    b. 将connection_control_failed_connections_threshold 参数设置到足够大 或 设置为0 不限制,这也将失去connection_control插件的作用。
set global connection_control_failed_connections_threshold =0;

3.在有可以使用CONNECTION_CONTROL 插件的场景,可以卸载掉该插件。

mysql> UNINSTALL PLUGIN CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS;
mysql> UNINSTALL PLUGIN CONNECTION_CONTROL;

当然做好监控告警,将问题消除在潜在阶段,仍然是有效的手段,如果不是我们提前检查,可能就对生产产生了事故。 在对帐号有操作更新时,建议定期重启。

参考文档

https://bugs.mysql.com/bug.php?id=111253

posted @ 2023-06-06 09:38  www.cqdba.cn  阅读(206)  评论(0编辑  收藏  举报