Statement based replication writes the queries that modify data in the Binary Log to replicate them on the slave or to use it as a PITR recovery. Here we will see what is the behavior of the MySQL when it needs to log “not usual” queries like Events, Functions, Stored Procedures, Local Variables, etc. We’ll learn what problems can we have and how to avoid them.
基于语句的复制把修改数据的语句写到二进制日志中,然后复制到从服务器,或者用来做PITR(Point In Time Recovery)还原。在这里我们来看看MYSQL日志是怎么记录“不常见”的查询,如事件,函数,存储过程,局部变量等等。我们将得知我们会遇到的问题以及如何解决他们。
TRIGGERS 触发器
When a statement activates a Trigger only the original query is logged not the subsequent triggered statements. If you want to maintain the consistency of your data is necessary to define the same Triggers in Master and Slave servers.
当一个查询触发了一个触发器,记录的只有本查询语句,触发器里的语句不会被记录。如果你想要维护数据的一致性,那么就必须保证主从服务器的触发器定义是一样的。
Example:
mysql> create trigger Copy_data AFTER INSERT on t FOR EACH ROW INSERT INTO t_copy VALUE(NEW.i); mysql> insert into t VALUES(1),(2),(3);
Binary Log:
#111213 23:16:21 server id 1 end_log_pos 269 Query thread_id=3 exec_time=0 error_code=0 use test/*!*/; SET TIMESTAMP=1323814581/*!*/; insert into t VALUES(1),(2),(3)
This behavior can be a problem to take in account or in some other cases help us in our infrastructure. Is possible to define different Triggers in your replication servers if you need different actions for the same statements.
这个特性可能是一个要考虑的问题,或在别的情况下在基础架构方面帮助我们。如果你需要相同的语句在不同的复制服务器之间产生不同的操作,定义不同的触发器就就可以做到。
FUNCTIONS 函数
Calls to functions are logged directly on the Binary Log. Therefore if you don’t have all functions created on all your servers you will break your replication and the SQL process will stop with an error.
调用函数的语句直接记录到二进制日志中。所以如果你没有在所有的服务器上创建全部是函数,你将会中断复制,SQL线程会停止并报出一个错误。
Example:
mysql> CREATE FUNCTION this_year() RETURNS INT DETERMINISTIC RETURN YEAR(CURDATE()); mysql> insert into t VALUES(this_year());
Binary Log:
#111213 23:25:46 server id 1 end_log_pos 676 Query thread_id=5 exec_time=0 error_code=0 SET TIMESTAMP=1323815146/*!*/; insert into t VALUES(this_year())
If you forget to create the function on your slave the replication will be broken. After executing a SHOW SLAVE STATUS you will see a message like the following one:
如果你忘记在从服务器创建函数了,复制将会中断。执行SHOW SLAVE STATUS你将看到如下信息:
Last_Error: Error 'FUNCTION test.this_year does not exist' on query. Default database: 'test'. Query: 'insert into t VALUES(this_year())'
STORED PROCEDURES 存储过程
The behaviour of stored procedures and functions are completely different. If our stored procedure write data to our tables the queries inside the procedure get logged and not the call to the procedure itself. So in this case you don’t need to replicate all your stored procedures in your slaves servers.
存储过程和函数的特性完全不同。如果存储过程写数据到表里,存储过程里面的SQL语句被记录到日志,而不是记录CALL语句本身。所以在这个情况下你不需要复制存储过程到从服务器。
Example:
mysql> create procedure this_pyear() BEGIN INSERT INTO t VALUES(YEAR(CURDATE())); END;// mysql> CALL this_pyear();
Binary Log:
#111213 23:33:41 server id 1 end_log_pos 2055 Query thread_id=6 exec_time=0 error_code=0 SET TIMESTAMP=1323815621/*!*/; INSERT INTO t VALUES(YEAR(CURDATE()))
We’re going to do it a little bit more difficult: 我们对它做加强一点难度
Our procedure modifies a table that has a associated ON INSERT trigger and we use the previous function to insert the present year. Following the previous rules that we explained before is easy to imagine. The query inside the procedure gets logged with the explicit call to the Function but not the statement that the trigger executed.
我们的存储过程修改一个表,这个表有一个INSERT触发器,我们用前面的函数去插入当前年份。根据前面的分析的规则,很容易想象到:存储过程里面的显式调用语句被记录到日志,触发器里执行的没有被记录。
We have a trigger that calls a the procedure. In this case neither the trigger, the CALL or the queries inside the procedure that modifies data are logged. So in this case you need to manually create the trigger and procedure on the slave servers.
我们有一个触发器调用一个存储过程,在这个情况下,不管是触发器还是存储过程里面的查询语句都被记录了。在这个情况下你需要在从服务器手动创建触发器和存储过程。
EVENTS 计划任务
When we create an Event on the Master server it gets replicated to the slave with the DISABLE ON SLAVE option. Thanks to that the Event is not executed N times for every slave we have in our infrastructure and we won’t duplicate data. The statements inside the Event are logged and not the Event itself.
在主服务器上创建一个计划任务时,复制到从服务器后就带上了DISABLE ON SLAVE参数。多亏了这个,计划任务不会在从服务器上N次执行,也不会产生重复数据。记录到日志的是计划任务里的语句而不是计划任务本身
Example:
mysql> create event year ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 1 HOUR DO INSERT INTO t VALUES(YEAR(CURDATE()));
Binary Logs:
First the event gets logged to the Binary Log: 开始任务被记录到二进制日志
CREATE DEFINER=`msandbox`@`localhost` event year ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 1 HOUR DO INSERT INTO t VALUES(YEAR(CURDATE()))
After one hour the event gets executed. The insert is logged and then re-executed on the slave server so is not necessary to have the events enabled on all servers:
一个小时候任务执行了,INSERT记录在日志,然后在从服务器重新执行。所以不用在所有的服务器上开启计划任务。
#111214 0:51:54 server id 1 end_log_pos 1593 Query thread_id=18 exec_time=0 error_code=0 SET TIMESTAMP=1323820314/*!*/; INSERT INTO t VALUES(YEAR(CURDATE()))
If later we need to promote a Slave to Master we will need to do some steps to enable all the events replicated from the master because as we saw they’re disabled by default. Here we can see the previous event after it gets replicated on the slave:
如果后来我们需要提升一个从服务器变成主服务器,我们需要启用所有的从主服务器复制过来的计划任务,因为它们默认都是关闭的。在这里我们可以看到前面复制到从服务器后的任务
CREATE DEFINER=`msandbox`@`localhost` EVENT `year` ON SCHEDULE AT '2011-12-14 22:03:06' ON COMPLETION NOT PRESERVE DISABLE ON SLAVE DO INSERT INTO t VALUES(YEAR(CURDATE()));
These are the steps we need to do to enable on the promoted slave. 我们需要在提升上来的从服务器上执行下面的操作,去打开计划任务
- We should disable the event manager on Slave with SET GLOBAL event_scheduler = OFF; 在salve上用SET GLOBAL event_scheduler = OFF命令关闭任务管理器
- ENABLE every event with “ALTER EVENT event_name ENABLE” 用ALTER EVENT event_name ENABLE命令打开每个任务
- Enable again the event_scheduler. 重新打开任务管理器
To demote a server we need to follow the same previous steps but in this case we alter every event adding DISABLE ON SLAVE. ALTER EVENT even_name DISABLE ON SLAVE.
降级服务器的步骤一样,但是这时需要对每个任务DISABLE ON SLAVE. 在SLAVE上ALTER EVENT even_name DISABLE .
LOCAL VARIABLES 局部变量
Local Variables are logged as an extra events before the statement itself. So all used variables are replicated on the slave.
局部变量在语句自身前面当做一个额外的项目被记录到日志中。索引用户变量都复制到从服务器。
Example:
mysql> SELECT YEAR(CURDATE()) INTO @this_year; mysql> insert into t VALUES(@this_year)
Binary Log:
#111213 23:58:11 server id 1 end_log_pos 457 User_var SET @`this_year`:=2011/*!*/; # at 457 #111213 23:58:11 server id 1 end_log_pos 552 Query thread_id=13 exec_time=0 error_code=0 SET TIMESTAMP=1323817091/*!*/; insert into t VALUES(@this_year)
Is important to add a note here. There is a bug related with this described on https://bugs.launchpad.net/percona-server/+bug/860910. In a master-master setup a master can show a wrong ‘SHOW SLAVE STATUS’ output when using SET user-variables and then using it to perform writes. The issue is fixed only in Percona server 5.5.17-22.1.
有一个重要的点需要注意。有一个和这有关的BUG,BUG描述在https://bugs.launchpad.net/percona-server/+bug/860910。在双MASTER下,当使用用户变量执行写入时,MASTER会报出错误的‘SHOW SLAVE STATUS’输出。这个问题仅在Percona server 5.5.17-22.1被修复。
AUTO INCREMENTAL VALUES 自增值
In order to have the same auto incremental values on master and slaves the actual used auto incremental value is logged as an extra event just before the statement. MySQL uses the INSERT_ID variable to store that value.
为了在主从端得到一样的自增值,实际使用的自增值在语句前当做一个额外的项目被记录到日志。MYSQL用INSERT_ID变量存储这个值。
Binary Log:
#111214 0:03:04 server id 1 end_log_pos 811 Intvar SET INSERT_ID=1/*!*/; # at 811 #111214 0:03:04 server id 1 end_log_pos 905 Query thread_id=14 exec_time=0 error_code=0 SET TIMESTAMP=1323817384/*!*/; INSERT INTO y (value) VALUES(1)
If we use “LAST_INSERT_ID” on our master the value of that function is logged as a variable on the binary log so the slave can use the same value.
如果我们在MASTER上使用LAST_INSERT_ID,这个函数的值作为一个变量被记录到日志。从而使SLAVE得到相同的值
Binary Log:
#111214 0:04:50 server id 1 end_log_pos 1245 Intvar SET LAST_INSERT_ID=2/*!*/; #111214 0:04:50 server id 1 end_log_pos 1273 Intvar SET INSERT_ID=3/*!*/; #111214 0:04:50 server id 1 end_log_pos 1382 Query thread_id=15 exec_time=0 error_code=0 SET TIMESTAMP=1323817490/*!*/; INSERT INTO y (value) VALUES(LAST_INSERT_ID())
That was for a insert with only one value. What is the behavior if we insert multiple values and we have gaps?
这是只写入单个值,当我们写入多个值并且值不连续时会怎么样呢?
Example: Imagine the table has this values on the auto incremental column: 1,2,3,5 设想这个表在自增字段上有了1,2,3,5
Then we execute the following: 然后我们执行下面的操作
insert into y (i,value) VALUES(4,1),(NULL,1),(NULL,1);
Binary Log:
The value 4 is inserted with the INSERT statement and the INSERT_ID is the next auto incremental value:
4写入到了INSERT语句,INSERT_ID是第二个自增值
#111214 21:54:06 server id 1 end_log_pos 203 Intvar SET INSERT_ID=6/*!*/; #111214 21:54:06 server id 1 end_log_pos 319 Query thread_id=12 exec_time=0 SET TIMESTAMP=1323896046/*!*/; insert into y (i,value) VALUES(4,1),(NULL,1),(NULL,1)
Conclusion 结论
As we can see there are a lot of things to take in account when we are working with statement based replication. Knowing in advances how MySQL works internally can help us to improve our replication availability and data.
如我们看到的那样,基于语句的复制有很多需要关注的地方。提前知道MYSQL的内部操作方式能够帮助我们提高复制的可用性和数据的可靠性。