mysqlfailover高可用与proxysql读写分离配置
proxysql官方推荐两种高可用方案:
1、MHA+proxysql
2、mysqlrpladmin+proxysql
MySQLfailover工具包含在mysqlrpladmin工具中,所以两者可以相互替换。
mysqlfailover+proxysql做读写分离和高可用。现在生产环境中也是这一套。
MySQLfailover和proxysql都是非常灵活的两个中间件,可以相互配合使用。
如果是链式复制,也就是master-->slave-->slave,那么也可以用MySQLfailover+proxysql进行读写分离和高可用。
如果业务量增大,可以考虑PXC或MariaDB的Galera集群。
mysql官方的集群方案:mysql cluster、MGR,两者的业务限制比较严格,只适合于对主键严重依赖且没有大事物的环境中。
如果只需要执行查询的数据库,那么可以在生产环境中用percona的tokudb引擎。
前提:
mysqlfailover需要GTID复制。
proxysql需要:yum install -y perl-DBD-MySQL perl-DBI perl-Time-HiRes perl-IO-Socket-SSL
添加好程序账号(failover),故障转移账号(failover),proxysql心跳检测账号(monitor),这里测试图方便所以程序账号和故障转移账号用一个了。
规划:
master:192.168.0.106
slave:192.168.0.109
slave:192.168.0.112
proxysql:192.168.0.111
部署好mysql服务。在proxysql机器上安装好mysql。
两台从机都连接到192.168.0.106上。
一、配置proxysql:
主要配置有4张表:mysql_servers,mysql_users,mysql_query_rules,mysql_replication_hostgroups
1:mysql_servers,这张表用来添加后端mysql实例的表。
insert into mysql_servers(hostgroup_id,hostname,port,weight,comment) values('1','192.168.0.106','3306',1,'Write Group'); 这里设置1位写组
insert into mysql_servers(hostgroup_id,hostname,port,weight,comment) values('2','192.168.0.109','3306',1,'Read Group'); 这里设置2为读组
insert into mysql_servers(hostgroup_id,hostname,port,weight,comment) values('2','192.168.0.112','3306',1,'Read Group');
2:mysql_users,这张表用来设置程序账号的,需要提前在两台mysql上添加。
insert into mysql_users(username,password,active,default_hostgroup,transaction_persistent) values('failover','123456',1,1,1); 这里图方便,MySQLfailover和proxysql用一个账号了,生产环境中不能这么做。
3:设置proxysql心跳检测账号:
set mysql-monitor_username='monitor'; set mysql-monitor_password='123456';
4:mysql_query_rules,路由表,用来配置分发的,也就是读写分离规则。
INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT.*FOR UPDATE$',1,1);
INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT',2,1);
5:mysql_replication_hostgroups,用来配置高可用的,
mysql_replication_hostgroups中的每一行代表一对writer_hostgroup和reader_hostgroup。ProxySQL将监视read_only
指定主机组中所有服务器的值,并根据值read_only
将服务器分配给编写器或读取器主机组。字段注释可用于存储任意数据。
这些字段具有以下语义:
writer_hostgroup
- 默认情况下将发送所有流量的主机组,read_only=0
MySQL中的节点将分配给此主机组。reader_hostgroup
- 应该将读取流量的主机组发送到,应该定义查询规则或单独的只读用户将流量路由到此主机组,这些节点read_only=1
将分配给该主机组。comment
- 可用于用户定义的任何目的的文本字段。可以是群集存储内容的描述,添加或禁用主机组的提醒,或某些检查器脚本处理的JSON。
以上是官方文档说明,意思是说用read_only变量来检测主机变化,从而实现高可用。表有3个字段,设置前面设置好的读写组的ID,插入到这张表里。
insert into mysql_replication_hostgroups (writer_hostgroup,reader_hostgroup,comment)values('1','2','高可用');
最后,要加载到runtime模块中使其生效,然后再保存到硬盘中。
mysql -uadmin -padmin -h 127.0.0.1 -P 6032 -e "load mysql servers to runtime;load mysql users to runtime;load mysql variables to runtime;LOAD MYSQL QUERY RULES TO RUNTIME;save mysql servers to disk;save mysql users to disk;save mysql variables to disk;SAVE MYSQL QUERY RULES TO DISK;"
二、MySQLfailover
具体的MySQLfailover工具可以在:
https://www.cnblogs.com/magmell/p/9570743.html
https://www.cnblogs.com/magmell/p/9257935.html 找到安装方法。
运行:
mysqlfailover --master=failover:123456@'192.168.0.106':3306 --discover-slaves-login=failover:123456 --force --daemon=start --candidates=failover:123456@'192.168.0.109':3306 --interval=5 -vvv --exec-after=/usr/local/sbin/test.sh
如果你是链式复制:M-->S1-->S2,则需要指定--slave从机,因为MySQLfailover本身只能检测到一主双从的架构,这里也不会影响到切换。
如果你是其他的复制:M-->S1-->S2/S3,这里也需要指定--slave参数,以便在故障转移时可以定位到新主机上。
如果你是以上两种复制方式,则也需要指定--candidates=参数,以便在故障转移时可以转移到指定的机器上。
其中有一个参数要说明一下:--exec-after
这个参数是在故障转移之后让MySQLfailover自动执行的脚本,必须要放在/usr/local/sbin/目录下才能生效。这里的主要作用就是配合proxysql中间件进行读写切换,通过设置read_only参数使proxysql中间件自己判断是写机还是读机。
这个参数有4个传入参数,分别是:
$1:旧master的IP
$2:旧master的端口
$3:新master的IP
$4:新master的端口
我的脚本:
#!/bin/bash
ip="192.168.0.112" #因为我在命令行中指明了候选主机,那么这里就应该填写另一台slave了,要设置只读的slave机
port=3306
new_master_ip=$3 #接收到的第三个参数,下面有说明
new_master_port=$4 #接收到的第四个参数
mysql -ufailover -p123456 -h$new_master_ip -P$new_master_port -e "set global read_only=0;" #这里设置新master为读写模式
mysql -ufailover -p123456 -h$ip -P$port -e "set global read_only=1;" #设置slave为只读模式,和proxysql要配合使用进行读写分离。
这里说明:
由于proxysql读写分离的机制是检测服务器的read_only参数的,而且故障转移会自动将只读设置变成读写模式,所以脚本里要重新去另一台slave设置一下read_only参数才可以和proxysql相配合使用。
192.168.0.112就是另一台slave机。
三、启动服务,测试读写分离。
mysql -ufailover -p123456 -h127.0.0.1 -P6033 用程序账号连接。
mysql> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from test;
+------+
| id |
+------+
| 111 |
| 111 |
| 3 |
| 3 |
| 3 |
| 3 |
+------+
6 rows in set (0.00 sec)
mysql> insert into test values(3);
Query OK, 1 row affected (0.01 sec)
mysql> insert into test values(3);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test values(3);
Query OK, 1 row affected (0.01 sec)
mysql> select * from test;
+------+
| id |
+------+
| 111 |
| 111 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
+------+
9 rows in set (0.00 sec)
mysql> \q
查看读写分离是否成功:mysql -uadmin -padmin -h 127.0.0.1 -P 6032
mysql> select * from stats_mysql_query_digest;
+-----------+--------------------+----------+--------------------+----------------------------------+------------+------------+------------+----------+----------+----------+
| hostgroup | schemaname | username | digest | digest_text | count_star | first_seen | last_seen | sum_time | min_time | max_time |
+-----------+--------------------+----------+--------------------+----------------------------------+------------+------------+------------+----------+----------+----------+
| 1 | test | failover | 0x6FA973F1E0331C0E | insert into test values(?) | 3 | 1536640645 | 1536640646 | 13013 | 1391 | 9671 |
| 2 | test | failover | 0x38DF1D37B3136F42 | select * from test | 2 | 1536640640 | 1536640649 | 1625 | 582 | 1043 |
| 1 | test | failover | 0x99531AEFF718C501 | show tables | 1 | 1536640630 | 1536640630 | 368 | 368 | 368 |
| 1 | test | failover | 0x02033E45904D3DF0 | show databases | 1 | 1536640630 | 1536640630 | 930 | 930 | 930 |
| 1 | information_schema | failover | 0x620B328FE9D6D71A | SELECT DATABASE() | 1 | 1536640630 | 1536640630 | 750 | 750 | 750 |
| 1 | information_schema | failover | 0x226CD90D52A2BA0B | select @@version_comment limit ? | 1 | 1536640627 | 1536640627 | 0 | 0 | 0 |
+-----------+--------------------+----------+--------------------+----------------------------------+------------+------------+------------+----------+----------+----------+
6 rows in set (0.01 sec)
观察hostgroup字段,可以看到insert语句发送到了写组上,select语句发送到了读组上。
现在来测试高可用:
KILL掉master的mysql进程:
[root@node1 ~]# ps -ef|grep mysql
root 17224 1 0 11:09 pts/0 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/usr/local/mysql/data --pid-file=/usr/local/mysql/data/mysql.pid
mysql 18126 17224 0 11:09 pts/0 00:00:05 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/usr/local/mysql/data/mysqld.log --open-files-limit=28192 --pid-file=/usr/local/mysql/data/mysql.pid --socket=/usr/local/mysql/mysql.sock --port=3306
root 18358 1509 0 13:24 pts/0 00:00:00 grep mysql
[root@node1 ~]# kill -9 17224
[root@node1 ~]# kill -9 18126
[root@node1 ~]# ps -ef|grep mysql
root 18360 1509 0 13:25 pts/0 00:00:00 grep mysql
MySQLfailover输出:
Q-quit R-refresh H-health G-GTID Lists U-UUIDs
Failed to reconnect to the master after 3 attemps.
Failover starting in 'auto' mode...
# Checking eligibility of slave 192.168.0.109:3306 for candidate.
# GTID_MODE=ON ... Ok
# Replication user exists ... Ok
# Candidate slave 192.168.0.109:3306 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
WARNING: IP lookup by name failed for 44,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.109,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.112,reason: Unknown host
# Missing transactions found on 192.168.0.112:3306. SELECT gtid_subset() = 0
# LOCK STRING: FLUSH TABLES WITH READ LOCK
# Read only is ON for 192.168.0.112:3306.
# Connecting candidate to 192.168.0.112:3306 as a temporary slave to retrieve unprocessed GTIDs.
# Change master command for 192.168.0.109:3306
# CHANGE MASTER TO MASTER_HOST = '192.168.0.112', MASTER_USER = 'backup', MASTER_PASSWORD = '123456', MASTER_PORT = 3306, MASTER_AUTO_POSITION=1
# Read only is OFF for 192.168.0.112:3306.
# UNLOCK STRING: UNLOCK TABLES
# Waiting for candidate to catch up to slave 192.168.0.112:3306.
# Slave 192.168.0.109:3306:
# QUERY = SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('c142ca67-b898-11e8-86e8-000c29367e64:1', 300)
# Return Code = 3
# Slave 192.168.0.109:3306:
# QUERY = SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('c777e02f-b898-11e8-86a0-000c29c6f346:1-4', 300)
# Return Code = 0
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
WARNING: IP lookup by name failed for 44,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.109,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.112,reason: Unknown host
# Executing stop on slave 192.168.0.109:3306 WARN - slave is not configured with this master
# Executing stop on slave 192.168.0.109:3306 Ok
WARNING: IP lookup by address failed for 192.168.0.106,reason: Unknown host
# Executing stop on slave 192.168.0.112:3306 WARN - slave is not configured with this master
# Executing stop on slave 192.168.0.112:3306 Ok
WARNING: IP lookup by name failed for 44,reason: Unknown host
WARNING: IP lookup by address failed for 192.168.0.109,reason: Unknown host
# Switching slaves to new master.
# Change master command for 192.168.0.112:3306
# CHANGE MASTER TO MASTER_HOST = '192.168.0.109', MASTER_USER = 'backup', MASTER_PASSWORD = '123456', MASTER_PORT = 3306, MASTER_AUTO_POSITION=1
# Disconnecting new master as slave.
# Execute on 192.168.0.109:3306: RESET SLAVE ALL
# Starting slaves.
# Performing START on all slaves.
# Executing start on slave 192.168.0.112:3306 Ok
# Spawning external script.
# SCRIPT EXECUTED: /usr/local/sbin/test.sh 192.168.0.106 3306 192.168.0.109 3306 #这里可以看到在执行脚本了,传入了四个参数
# Script completed Ok. #执行成功
# Checking slaves for errors.
# 192.168.0.112:3306 status: Ok
# Failover complete.
# Discovering slaves for master at 192.168.0.109:3306
Failover console will restart in 5 seconds.
# Attempting to contact 192.168.0.109 ... Success
# Attempting to contact 192.168.0.112 ... Success
MySQL Replication Failover Utility
Failover Mode = auto Next Interval = Sat Sep 15 18:11:45 2018
Master Information
------------------
Binary Log File Position Binlog_Do_DB Binlog_Ignore_DB
mysql-bin.000001 657
GTID Executed Set
b5c5054c-b898-11e8-8670-000c299e1daf:1 [...]
# Attempting to contact 192.168.0.109 ... Success
# Attempting to contact 192.168.0.112 ... Success
Replication Health Status
+----------------+-------+---------+--------+------------+---------+-------------+-------------------+-----------------+------------+-------------+--------------+------------------+---------------+-----------+----------------+------------+---------------+
| host | port | role | state | gtid_mode | health | version | master_log_file | master_log_pos | IO_Thread | SQL_Thread | Secs_Behind | Remaining_Delay | IO_Error_Num | IO_Error | SQL_Error_Num | SQL_Error | Trans_Behind |
+----------------+-------+---------+--------+------------+---------+-------------+-------------------+-----------------+------------+-------------+--------------+------------------+---------------+-----------+----------------+------------+---------------+
| 192.168.0.109 | 3306 | MASTER | UP | ON | OK | 5.7.22-log | mysql-bin.000001 | 657 | | | | | | | | | |
| 192.168.0.112 | 3306 | SLAVE | UP | ON | OK | 5.7.22-log | mysql-bin.000001 | 657 | Yes | Yes | 0 | No | 0 | | 0 | | 0 |
+----------------+-------+---------+--------+------------+---------+-------------+-------------------+-----------------+------------+-------------+--------------+------------------+---------------+-----------+----------------+------------+---------------+
已经成功转移到192.168.0.109上并提升为master。
查看proxysql是否成功转移:
mysql> select * from mysql_servers;
+--------------+----------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+-------------+
| hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+----------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+-------------+
| 1 | 192.168.0.109 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | Write Group |
| 2 | 192.168.0.112 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | Read Group |
| 2 | 192.168.0.109 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | Read Group |
+--------------+----------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+-------------+
3 rows in set (0.00 sec)
可以看到旧master和旧slave的hostgroup_id 已经改变。来测试语句负载:
[root@nodetest ~]# mysql -ufailover -p123456 -h127.0.0.1 -P6033
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.5.30 (ProxySQL)
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from test;
+------+
| id |
+------+
| 111 |
| 111 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 45 |
| 695 |
| 6676 |
| 6676 |
+------+
14 rows in set (0.00 sec)
mysql> select @@hostname;
+------------+
| @@hostname |
+------------+
| node2 |
+------------+
1 row in set (0.01 sec)
mysql> insert into test values(43536);
Query OK, 1 row affected (0.01 sec)
mysql> select * from test;
+-------+
| id |
+-------+
| 111 |
| 111 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 45 |
| 695 |
| 6676 |
| 6676 |
| 43536 |
+-------+
15 rows in set (0.00 sec)
mysql> \q
Bye
查看proxysql:
[root@nodetest ~]# mysql -uadmin -padmin -h127.0.0.1 -P6032
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.5.30 (ProxySQL Admin Module)
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select * from stats_mysql_query_digest_reset;
+-----------+--------------------+----------+--------------------+----------------------------------+------------+------------+------------+----------+----------+----------+
| hostgroup | schemaname | username | digest | digest_text | count_star | first_seen | last_seen | sum_time | min_time | max_time |
+-----------+--------------------+----------+--------------------+----------------------------------+------------+------------+------------+----------+----------+----------+
| 1 | test | failover | 0x6FA973F1E0331C0E | insert into test values(?) | 1 | 1536649818 | 1536649818 | 11699 | 11699 | 11699 |
| 2 | test | failover | 0x82A12D4C4E7B0A28 | select @@hostname | 1 | 1536649811 | 1536649811 | 495 | 495 | 495 |
| 2 | test | failover | 0x38DF1D37B3136F42 | select * from test | 2 | 1536649806 | 1536649820 | 1253 | 547 | 706 |
| 1 | test | failover | 0x02033E45904D3DF0 | show databases | 1 | 1536649802 | 1536649802 | 674 | 674 | 674 |
| 2 | information_schema | failover | 0x620B328FE9D6D71A | SELECT DATABASE() | 1 | 1536649802 | 1536649802 | 880 | 880 | 880 |
| 1 | test | failover | 0x99531AEFF718C501 | show tables | 1 | 1536649802 | 1536649802 | 590 | 590 | 590 |
| 1 | information_schema | failover | 0x226CD90D52A2BA0B | select @@version_comment limit ? | 1 | 1536649800 | 1536649800 | 0 | 0 | 0 |
+-----------+--------------------+----------+--------------------+----------------------------------+------------+------------+------------+----------+----------+----------+
7 rows in set (0.00 sec)
现在的读写组都是旧slave机,所以可以看到插入和查询语句都已经负载到了192.168.0.109这台机器上,证明高可用成功。
注意:
You should also note that, in MySQL 5.6.7 and later, enforcing foreign key relationships between tables in different databases causes multithreaded slaves to use sequential rather than parallel mode, which can have a negative impact on performance. (Bug #14092635)
意思是说如果不同库之间开启外键的话,会对并行复制的性能有很大影响,因此,在任何情况下,都不要使用外键。
未经允许,谢绝转载