rpl_semi_sync_master_wait_no_slave 参数研究实验
最近在研究MySQL,刚学到半同步。
半同步的配置中,关于这两个参数:
rpl_semi_sync_master_wait_no_slave
rpl_semi_sync_master_wait_for_slave_count
发现很不好搞懂,请教了一些老师,也做了一些资料搜索,每个人给我的答案都不同:
中文网络上一些典型的说法,这些都是错误的!!!:
rpl_semi_sync_master_wait_for_slave_count:
控制slave应答的数量,默认是1,表示master接收到几个slave应答后才commit。
rpl_semi_sync_master_wait_no_slave :
1.需要等待几个slave节点的ACK,否则一直waiting。
2.当一个事务被提交,但是Master没有Slave连接,这时M不可能收到任何确认信息,但M会在时间限制范围内继续等待。如果没有Slave链接,会切换到异步复制。是否允许master每个事务提交后都要等待slave的接收确认信号。默认为on,每一个事务都会等待。如果为off,则slave追赶上后,也不会开启半同步复制模式,需要手工开启。
好奇心驱使我通过实验来验证大家的说法,可惜的是——都是错的。通过官方文档,看的云里雾里的,只好自己探索。
探索内容
半同步参数
- rpl_semi_sync_master_wait_no_slave(重点)
- rpl_semi_sync_master_wait_for_slave_count
环境信息
role |
ip |
port |
hostname |
etc |
master |
192.168.188.101 |
4306 |
mysqlvm1 |
提示符为mysql> |
|
|
|
|
|
slave |
192.168.188.201 |
4306 |
mysqlvm1-1 |
提示符为mysql4306> |
|
|
5306 |
|
提示符为mysql5306> |
|
|
6306 |
|
提示符为mysql6306> |
|
|
7306 |
|
提示符为mysql7306> |
MySQL版本
5.7.26
前置条件
已配置好主从复制。
配置增强半同步
1.加载lib,所有主从节点都要配置。
主库&从库:
install plugin rpl_semi_sync_master soname 'semisync_master.so'; install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
2.查看,确保所有节点都成功加载。
mysql> show plugins; | rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL | | rpl_semi_sync_slave | ACTIVE | REPLICATION | semisync_slave.so | GPL |
3.启用半同步
1.先启用从库上的参数,最后启用主库的参数。
从库:
set global rpl_semi_sync_slave_enabled = 1; # 1:启用,0:禁止
主库:
set global rpl_semi_sync_master_enabled = 1; # 1:启用,0:禁止 set global rpl_semi_sync_master_timeout = 60000; # 60秒,时间长些便于实验
2.从库重启io_thread
stop slave io_thread;
start slave io_thread;
查看主库参数
mysql> show global variables like "%sync%"; show global status like "%sync%"; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | binlog_group_commit_sync_delay | 100 | | binlog_group_commit_sync_no_delay_count | 10 | | innodb_flush_sync | ON | | innodb_sync_array_size | 1 | | innodb_sync_spin_loops | 30 | | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 60000 | | rpl_semi_sync_master_trace_level | 32 | | rpl_semi_sync_master_wait_for_slave_count | 3 | | rpl_semi_sync_master_wait_no_slave | ON | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_semi_sync_slave_enabled | ON | | rpl_semi_sync_slave_trace_level | 32 | | sync_binlog | 1 | | sync_frm | ON | | sync_master_info | 10000 | | sync_relay_log | 10000 | | sync_relay_log_info | 10000 | +-------------------------------------------+------------+ 18 rows in set (0.00 sec) +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Innodb_data_fsyncs | 664 | | Innodb_data_pending_fsyncs | 0 | | Innodb_os_log_fsyncs | 413 | | Innodb_os_log_pending_fsyncs | 0 | | Rpl_semi_sync_master_clients | 4 | | Rpl_semi_sync_master_net_avg_wait_time | 0 | | Rpl_semi_sync_master_net_wait_time | 0 | | Rpl_semi_sync_master_net_waits | 199 | | Rpl_semi_sync_master_no_times | 21 | | Rpl_semi_sync_master_no_tx | 48 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_master_timefunc_failures | 0 | | Rpl_semi_sync_master_tx_avg_wait_time | 3280008 | | Rpl_semi_sync_master_tx_wait_time | 72160195 | | Rpl_semi_sync_master_tx_waits | 22 | | Rpl_semi_sync_master_wait_pos_backtraverse | 0 | | Rpl_semi_sync_master_wait_sessions | 0 | | Rpl_semi_sync_master_yes_tx | 20 | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
============================我是分割线=====================================
实验1:
场景:
rpl_semi_sync_master_wait_no_slave=ON
rpl_semi_sync_master_wait_for_slave_count=3
slave存活1,其他slave停止,会发生什么?
步骤:
1.只保留一个slave,停止其他3个slave
mysql5306> stop slave; mysql6306> stop slave; mysql7306> stop slave;
2.立即查询master
为减少信息干扰,只截取了需要关注的数据
mysql> show global variables like "%sync%"; show global status like "%sync%"; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 60000 | | rpl_semi_sync_master_wait_for_slave_count | 3 | | rpl_semi_sync_master_wait_no_slave | ON | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_semi_sync_slave_enabled | ON | +-------------------------------------------+------------+ 18 rows in set (0.00 sec) +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 4 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
3.等待一分钟,再查询master
mysql> show global variables like "%sync%"; show global status like "%sync%"; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 60000 | | rpl_semi_sync_master_wait_for_slave_count | 3 | | rpl_semi_sync_master_wait_no_slave | ON | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_semi_sync_slave_enabled | ON | +-------------------------------------------+------------+ 18 rows in set (0.00 sec) +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 1 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
4.另外开启一个master会话,开始一个事务并提交
mysql> insert into new values(4); [挂起……]
5.查看master
发现无变化
mysql> show global variables like "%sync%"; show global status like "%sync%"; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 60000 | | rpl_semi_sync_master_wait_for_slave_count | 3 | | rpl_semi_sync_master_wait_no_slave | ON | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_semi_sync_slave_enabled | ON | +-------------------------------------------+------------+ 18 rows in set (0.00 sec) +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 1 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
6.查看存活的slave
发现事务已经被应用
mysql> select * from new; +------+ | name | +------+ | 2 | +------+ 1 rows in set (0.00 sec)
7.等待master事务超时后完成。
这里master提交到收到结果,用了1分钟0.01秒,该事务因slave的ack应答不够,发生了等待。
mysql> insert into new values(4); Query OK, 1 row affected (1 min 0.01 sec)
8.查看master
可以发现,事务超时后,master已经转为异步复制(Rpl_semi_sync_master_status=OFF)
mysql> show global variables like "%sync%"; show global status like "%sync%"; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 60000 | | rpl_semi_sync_master_wait_for_slave_count | 3 | | rpl_semi_sync_master_wait_no_slave | ON | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_semi_sync_slave_enabled | ON | +-------------------------------------------+------------+ 18 rows in set (0.00 sec) +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 1 | | Rpl_semi_sync_master_status | OFF | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
9.依次启动已经停止的slave,每启动一个,便立即查看一次master状态
可以看到,启动slave的动作会立即被master接收到,并且master会自动切换回半同步模式(Rpl_semi_sync_master_status=ON)
mysql5306> start slave; mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 2 | | Rpl_semi_sync_master_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec) mysql6306> start slave; mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 3 | | Rpl_semi_sync_master_status | ON | +--------------------------------------------+----------+ 19 rows in set (0.00 sec) mysql7306> start slave; mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 4 | | Rpl_semi_sync_master_status | ON | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
============================我是分割线=====================================
实验2:
场景:
rpl_semi_sync_master_wait_no_slave=OFF
rpl_semi_sync_master_wait_for_slave_count=3
slave存活1,其他slave停止,会发生什么?
步骤:
1.继续实验1的环境,首先设置参数并查看master状态
为减少信息干扰,只截取了需要关注的数据
mysql> set global rpl_semi_sync_master_wait_no_slave=OFF; Query OK, 0 rows affected (0.00 sec) mysql> show global variables like "%sync%"; show global status like "%sync%"; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 60000 | | rpl_semi_sync_master_wait_for_slave_count | 3 | | rpl_semi_sync_master_wait_no_slave | OFF | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_semi_sync_slave_enabled | ON | +-------------------------------------------+------------+ 18 rows in set (0.00 sec) +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 4 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
2.只保留一个slave,依次停止其他3个slave
mysql5306> stop slave; mysql6306> stop slave; mysql7306> stop slave;
3.等待一分钟,查看master状态。(在等待期间可以通过反复查看master状态,来观察master数据的变化)
master已经转为异步模式复制。
mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 1 | | Rpl_semi_sync_master_status | OFF | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
4.使用另一个master的会话,开始一个事务并提交。 这里master提交到收到结果,该事务没有发生等待。
mysql> insert into new values(5); Query OK, 1 row affected (0.02 sec)
5.立即查看存活的slave
事务已经被应用。
mysql> select * from new; +------+ | name | +------+ | 4 | | 5 | +------+ 2 rows in set (0.00 sec)
6.依次启动其他slave,每启动一个便立即查看master状态
可以看到,启动slave的动作会立即被master接收到,并且master会自动切换回半同步模式(Rpl_semi_sync_master_status=ON)
mysql5306> start slave; mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 2 | | Rpl_semi_sync_master_status | OFF | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec) mysql6306> start slave; mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 3 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec) mysql7306> start slave; mysql> show global status like "%sync%"; +--------------------------------------------+----------+ | Variable_name | Value | +--------------------------------------------+----------+ | Rpl_semi_sync_master_clients | 4 | | Rpl_semi_sync_master_status | ON | | Rpl_semi_sync_slave_status | OFF | +--------------------------------------------+----------+ 19 rows in set (0.00 sec)
============================我是分割线=====================================
结论:
使用直白简短好理解的语言:
【rpl_semi_sync_master_wait_for_slave_count】 ,
1.master提交后所需的应答数量!如果slave clients数量大于等于这个值,那么master会一路畅行无阻;如果低于这个值,master可能会在事务提交阶段发生一次超时等待,当等待超过参数(rpl_semi_sync_master_timeout)设定时,master就转为异步模式(原理见下一个参数)。
2.master将这个参数值作为标杆,用来和【Rpl_semi_sync_master_clients】参数做比较。
【rpl_semi_sync_master_wait_no_slave】
1.为OFF时,只要master发现【Rpl_semi_sync_master_clients】小于【rpl_semi_sync_master_wait_for_slave_count】,则master立即转为异步模式。
2.为ON时,空闲时间(无事务提交)里,即使master发现【Rpl_semi_sync_master_clients】小于【rpl_semi_sync_master_wait_for_slave_count】,也不会做任何调整。只要保证在事务超时之前,master收到大于等于【rpl_semi_sync_master_wait_for_slave_count】值的ACK应答数量,master就一直保持在半同步模式;如果在事务提交阶段(master等待ACK)超时,master才会转为异步模式。
无论【rpl_semi_sync_master_wait_no_slave】为ON还是OFF,当slave上线到【rpl_semi_sync_master_wait_for_slave_count】值时,master都会自动由异步模式转为半同步模式。
粗略看来,设置为OFF时,master不会因为slave的离线而造成事务等待,这似乎是一个更合适的选择,但是为什么5.7中默认参数为ON呢?目前我还不得而知。
另外,本次实验建立在空闲场景(或者说极微小负载场景)下,在高并发的场景下,这个参数又会导致怎样的结果,这块也尚需后续探索。
============================我是分割线=====================================
官方文档关于参数rpl_semi_sync_master_wait_no_slave的解释:
Controls whether the master waits for the timeout period configured by rpl_semi_sync_master_timeout to expire, even if the slave count drops to less than the number of slaves configured by pl_semi_sync_master_wait_for_slave_count during the timeout period.
When the value of rpl_semi_sync_master_wait_no_slave is ON (the default), it is permissible for the slave count to drop to less than rpl_semi_sync_master_wait_for_slave_count during the timeout period. As long as enough slaves acknowledge the transaction before the timeout period expires, semisynchronous replication continues.
When the value of rpl_semi_sync_master_wait_no_slave is OFF, if the slave count drops to less than the number configured in rpl_semi_sync_master_wait_for_slave_count at any time during the timeout period configured by rpl_semi_sync_master_timeout, the master reverts to normal replication.
This variable is available only if the master-side semisynchronous replication plugin is installed.
官方文档关于参数rpl_semi_sync_master_wait_for_slave_count的解释:
The number of slave acknowledgments the master must receive per transaction before proceeding. By default rpl_semi_sync_master_wait_for_slave_count is 1, meaning that semisynchronous replication proceeds after receiving a single slave acknowledgment. Performance is best for small values of this variable.
For example, if rpl_semi_sync_master_wait_for_slave_count is 2, then 2 slaves must acknowledge receipt of the transaction before the timeout period configured by rpl_semi_sync_master_timeout for semisynchronous replication to proceed. If less slaves acknowledge receipt of the transaction during the timeout period, the master reverts to normal replication.
============================我是结束线=====================================
EOF