GTID Replication in MySQL
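MySQL 5.6 introduced global transaction identifiers (GTIDs), which give every transaction a unique identifier. A GTID has two parts: the UUID of the server where the transaction originated, and a sequence number, a 64-bit nonzero value assigned in the order in which transactions commit on that server. It is written as server_uuid:sequence_number.
When GTIDs are enabled, a transaction keeps the same GTID no matter how many times it is replicated.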
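To make the two-part structure concrete, here is a minimal Python sketch that splits a GTID string into its server UUID and sequence number. The names Gtid and parse_gtid are hypothetical helpers written for this article; they are not part of MySQL or of any connector library.

```python
import uuid
from typing import NamedTuple


class Gtid(NamedTuple):
    server_uuid: uuid.UUID  # identifies the server where the transaction originated
    sequence: int           # position in that server's commit order, starting at 1


def parse_gtid(text: str) -> Gtid:
    """Split 'server_uuid:sequence' into its two parts (hypothetical helper)."""
    uuid_part, _, seq_part = text.strip().rpartition(":")
    sequence = int(seq_part)
    if sequence < 1:
        raise ValueError(f"sequence number must be nonzero and positive: {sequence}")
    # uuid.UUID raises ValueError if the first part is not a valid UUID.
    return Gtid(uuid.UUID(uuid_part), sequence)


gtid = parse_gtid("4687e05d-f37f-11e8-8fc7-fa336351fc00:7")
print(gtid.server_uuid, gtid.sequence)
```

Keeping the two parts separate makes it easy to ask which server a transaction came from, which is useful later when a slave's executed set contains more than one UUID.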
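Note: GTIDs are written to the binary log and are assigned only to transactions that are written to the binary log. In other words, with binary logging disabled, a server does not assign GTIDs to its transactions. This holds for the master and the slave alike. So a slave that is meant to be used for failover must have binary logging enabled; without it, a MySQL 5.6 slave does not record the transactions' GTIDs. (From MySQL 5.7, a slave without binary logging stores the GTIDs of replicated transactions in the mysql.gtid_executed table instead, but a slave that may be promoted still needs its binary log.)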
Configuring GTID replication
First, clear the existing file-name-and-position replication state on the slave:
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)
mysql> reset slave all;
Query OK, 0 rows affected (0.02 sec)
mysql> show slave status\G
Empty set (0.00 sec)
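Enable GTID on both the master and the slave, then set up GTID replication. The two servers were already replicating as master and slave, so their data is consistent and nothing needs to be copied.
Setting up from scratch would require both synchronizing the data and creating a replication account. Here the servers were already master and slave, so the replication account already exists.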
# Apply the following settings on both master and slave
log-bin=
log_slave_updates
gtid-mode=on
enforce-gtid-consistency
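log-bin=: with file-name-and-position replication the standby server's binary log was not enabled. GTID replication needs the binary log enabled; the reason is given later.
log_slave_updates: with file-name-and-position replication this option is used for cascading replication. In MySQL 5.6 it must be enabled together with log-bin before gtid-mode=on is accepted; MySQL 5.7 no longer requires it.
gtid-mode=on: enables GTID mode.
enforce-gtid-consistency: allows only statements that can be logged safely with GTIDs. A statement that would violate GTID consistency fails with an error.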
After changing the settings, restart both servers.
Then run the following on the slave:
mysql> change master to master_host="10.0.102.214", master_port=3306, master_user="repl", master_password="123456", master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
# master_auto_position makes the slave negotiate with the master, when it connects, which transactions should be sent.
mysql> show slave status\G    # compared with the earlier replication, the output now includes GTID information
*************************** 1. row ***************************
...........
Master_UUID: 4687e05d-f37f-11e8-8fc7-fa336351fc00    # the master's UUID
Retrieved_Gtid_Set: 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-2
Executed_Gtid_Set: 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-2
Retrieved_Gtid_Set: the set of GTIDs received from the master and stored in the relay log.
Executed_Gtid_Set: the set of GTIDs that have been executed on the slave and written to the slave's binary log.
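The difference between the two sets tells you which transactions have arrived on the slave but have not been applied yet. The following Python sketch expands GTID sets written in the uuid:first-last notation and subtracts one from the other. parse_gtid_set is a hypothetical helper, and the sample values are made up for illustration; expanding every interval is only reasonable for small sets like these.

```python
def parse_gtid_set(text: str) -> set:
    """Expand a GTID set such as 'uuid:1-3:5, uuid2:1' into (uuid, sequence) pairs."""
    gtids = set()
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # UUIDs contain hyphens but no colons, so ':' separates the UUID
        # from one or more intervals.
        server_uuid, *intervals = entry.split(":")
        for interval in intervals:
            first, _, last = interval.partition("-")
            for seq in range(int(first), int(last or first) + 1):
                gtids.add((server_uuid, seq))
    return gtids


# Illustrative values, not taken from the servers in this article.
retrieved = parse_gtid_set("4687e05d-f37f-11e8-8fc7-fa336351fc00:1-5")
executed = parse_gtid_set("4687e05d-f37f-11e8-8fc7-fa336351fc00:1-3")

pending = sorted(retrieved - executed)  # received but not yet applied
print(pending)
```

On a healthy, caught-up slave the difference is empty, as in the status output above where both sets are 1-2.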
Viewing the binary log on the slave
mysql> show binlog events;    # without an IN clause, the first binary log is shown
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name         | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| test2-bin.000001 |   4 | Format_desc    |         3 |         123 | Server ver: 5.7.22-log, Binlog ver: 4                             |
| test2-bin.000001 | 123 | Previous_gtids |         3 |         154 |                                                                   |
| test2-bin.000001 | 154 | Gtid           |         5 |         219 | SET @@SESSION.GTID_NEXT= '4687e05d-f37f-11e8-8fc7-fa336351fc00:1' |
| test2-bin.000001 | 219 | Query          |         5 |         348 | use `mytest`; create table tb2(id int auto_increment primary key) |
| test2-bin.000001 | 348 | Gtid           |         5 |         413 | SET @@SESSION.GTID_NEXT= '4687e05d-f37f-11e8-8fc7-fa336351fc00:2' |
| test2-bin.000001 | 413 | Query          |         5 |         476 | BEGIN                                                             |
| test2-bin.000001 | 476 | Table_map      |         5 |         524 | table_id: 108 (mytest.tb2)                                        |
| test2-bin.000001 | 524 | Write_rows     |         5 |         564 | table_id: 108 flags: STMT_END_F                                   |
| test2-bin.000001 | 564 | Xid            |         5 |         595 | COMMIT /* xid=9 */                                                |
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
9 rows in set (0.01 sec)
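# The binary log events show that the GTIDs listed in Executed_Gtid_Set have been executed on the slave.
How does GTID replication find the starting point in the binary log?
With file-name-and-position replication we specify the binary log file and position by hand, so how does GTID replication find where to start? The listing above contains a Previous_gtids event. It records the set of GTIDs contained in all earlier binary log files. The GTIDs the slave has already executed are compared with each file's Previous_gtids to choose the binary log file, and then with the GTIDs inside that file to find the position. Because this GTID replication has only just started, Previous_gtids is empty. Force the master to rotate its binary log and look at the events: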
mysql> flush logs;    # forces a binary log rotation and an explicit flush to disk
Query OK, 0 rows affected (0.00 sec)
mysql> show binlog events in "test3-bin.000006";    # two transactions were executed earlier, so Previous_gtids is 1-2
+------------------+------+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name         | Pos  | Event_type     | Server_id | End_log_pos | Info                                                              |
+------------------+------+----------------+-----------+-------------+-------------------------------------------------------------------+
| test3-bin.000006 |    4 | Format_desc    |         5 |         123 | Server ver: 5.7.22-log, Binlog ver: 4                             |
| test3-bin.000006 |  123 | Previous_gtids |         5 |         194 | 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-2                          |
| test3-bin.000006 |  194 | Gtid           |         5 |         259 | SET @@SESSION.GTID_NEXT= '4687e05d-f37f-11e8-8fc7-fa336351fc00:3' |
| test3-bin.000006 |  259 | Query          |         5 |         333 | BEGIN                                                             |
| test3-bin.000006 |  333 | Table_map      |         5 |         381 | table_id: 109 (mytest.tb1)                                        |
| test3-bin.000006 |  381 | Write_rows     |         5 |         421 | table_id: 109 flags: STMT_END_F                                   |
| test3-bin.000006 |  421 | Xid            |         5 |         452 | COMMIT /* xid=40 */                                               |
| test3-bin.000006 |  452 | Gtid           |         5 |         517 | SET @@SESSION.GTID_NEXT= '4687e05d-f37f-11e8-8fc7-fa336351fc00:4' |
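The file lookup based on Previous_gtids can be sketched in a few lines of Python. This is a simplified model, not MySQL's actual code: it assumes a single server UUID, so a GTID set is reduced to a set of sequence numbers. The function name find_start_file and the file name test3-bin.000005 are hypothetical.

```python
def find_start_file(binlogs, slave_executed):
    """Return the newest binary log whose Previous_gtids the slave already has.

    binlogs: list of (file_name, previous_gtids), ordered oldest to newest,
    where previous_gtids is the set of sequence numbers logged before that
    file began.
    """
    for file_name, previous_gtids in reversed(binlogs):
        # If everything logged before this file is already on the slave,
        # nothing the slave still needs can be in an older file.
        if previous_gtids <= slave_executed:
            return file_name
    raise LookupError("transactions the slave needs are no longer in the binary logs")


# Hypothetical layout modelled on the listing above: the older file starts
# empty, and test3-bin.000006 starts after transactions 1-2.
binlogs = [
    ("test3-bin.000005", set()),
    ("test3-bin.000006", {1, 2}),
]

print(find_start_file(binlogs, {1}))     # slave still needs transaction 2
print(find_start_file(binlogs, {1, 2}))  # slave has everything before 000006
```

Once the file is chosen, the GTID events inside it are compared with the slave's executed set to skip the transactions the slave already has, which is the second step the text describes.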
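As noted, a GTID is made up of the server UUID plus the transaction's sequence number. The server UUID is stored in the auto.cnf file in the directory that datadir points to: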
mysql> show variables like "datadir";
+---------------+--------------+
| Variable_name | Value        |
+---------------+--------------+
| datadir       | /data/mysql/ |
+---------------+--------------+
1 row in set (0.00 sec)
mysql> system cat /data/mysql/auto.cnf;    # the server's UUID
[auto]
server-uuid=4687e05d-f37f-11e8-8fc7-fa336351fc00
We have now built a simple GTID replication setup. So how does GTID replication actually work?
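1. When the master changes data, it generates a GTID ahead of the transaction and records both in its binary log.
2. The I/O thread on the slave writes the received binary log events into the local relay log.
3. The SQL thread reads the GTID from the relay log and checks whether the slave's own binary log already contains it. (The check is made against the local binary log, which is why the slave needs binary logging enabled.)
4. If the GTID is already recorded, the transaction has been executed and the slave skips it.
5. If it is not recorded, the slave executes the transaction from the relay log and records it in its binary log.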
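The skip-or-apply decision made by the SQL thread can be modelled with a small Python sketch. It is a toy model under stated assumptions: the relay log is a list of (gtid, statement) pairs, the slave's executed GTIDs are a set of strings, and appending to a list stands in for executing the transaction and writing it to the binary log. The name apply_relay_log and the placeholder GTIDs are hypothetical.

```python
def apply_relay_log(relay_log, executed, binlog):
    """Apply each (gtid, statement) pair unless its GTID was already executed."""
    for gtid, statement in relay_log:
        if gtid in executed:
            continue  # already applied on this slave, so skip it
        binlog.append((gtid, statement))  # stand-in for executing and logging
        executed.add(gtid)


# Placeholder GTIDs and statements, for illustration only.
executed = {"uuid-a:1", "uuid-a:2"}
binlog = []
relay_log = [("uuid-a:2", "INSERT ..."), ("uuid-a:3", "INSERT ...")]

apply_relay_log(relay_log, executed, binlog)
print(binlog)
```

Because the decision depends only on the GTID, receiving the same transaction twice is harmless: the second copy is skipped.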
Check the current binary log position and GTID set on the master and on the slave:
## On the master
mysql> show master status;
+------------------+----------+--------------+------------------+------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+------------------+----------+--------------+------------------+------------------------------------------+
| test3-bin.000006 |     1226 |              |                  | 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-6 |
+------------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.00 sec)
## On the slave
mysql> show master status;
+------------------+----------+--------------+------------------+------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+------------------+----------+--------------+------------------+------------------------------------------+
| test2-bin.000002 |      194 |              |                  | 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-6 |
+------------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.00 sec)
# The log file names differ and the positions differ, but the GTID sets are identical.
Test by inserting a row on the master:
mysql> insert into tb1 select null;    # inserts the next auto-increment primary key value
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
# In the binary log, the GTID event is written before the BEGIN that starts the transaction
| test3-bin.000006 | 1226 | Gtid           |         5 |        1291 | SET @@SESSION.GTID_NEXT= '4687e05d-f37f-11e8-8fc7-fa336351fc00:7' |
| test3-bin.000006 | 1291 | Query          |         5 |        1365 | BEGIN                                                             |
| test3-bin.000006 | 1365 | Table_map      |         5 |        1413 | table_id: 109 (mytest.tb1)                                        |
| test3-bin.000006 | 1413 | Write_rows     |         5 |        1453 | table_id: 109 flags: STMT_END_F                                   |
| test3-bin.000006 | 1453 | Xid            |         5 |        1484 | COMMIT /* xid=67 */                                               |
mysql> show variables like "binlog_format";    # the log format is ROW
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.00 sec)
Viewing the binary log on the slave
# flush logs was run earlier, so the events are in a new file
mysql> show binlog events in "test2-bin.000002";
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name         | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| test2-bin.000002 |   4 | Format_desc    |         3 |         123 | Server ver: 5.7.22-log, Binlog ver: 4                             |
| test2-bin.000002 | 123 | Previous_gtids |         3 |         194 | 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-6                          |
| test2-bin.000002 | 194 | Gtid           |         5 |         259 | SET @@SESSION.GTID_NEXT= '4687e05d-f37f-11e8-8fc7-fa336351fc00:7' |
| test2-bin.000002 | 259 | Query          |         5 |         322 | BEGIN                                                             |
| test2-bin.000002 | 322 | Table_map      |         5 |         370 | table_id: 110 (mytest.tb1)                                        |
| test2-bin.000002 | 370 | Write_rows     |         5 |         410 | table_id: 110 flags: STMT_END_F                                   |
| test2-bin.000002 | 410 | Xid            |         5 |         441 | COMMIT /* xid=40 */                                               |
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
7 rows in set (0.00 sec)
Recovering from a replication failure with GTID
# Both master and slave have this table, with identical data
mysql> desc tb2;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    | auto_increment |
+-------+---------+------+-----+---------+----------------+
1 row in set (0.00 sec)
# Insert a row on the slave
mysql> insert into tb2 select 4;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
# Insert the same row on the master
mysql> insert into tb2 select 4;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.102.214
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: test3-bin.000007
Read_Master_Log_Pos: 452
Relay_Log_File: test2-relay-bin.000007
Relay_Log_Pos: 407
Relay_Master_Log_File: test3-bin.000007
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4687e05d-f37f-11e8-8fc7-fa336351fc00:8' at master log test3-bin.000007, end_log_pos 421. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 194
Relay_Log_Space: 959
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4687e05d-f37f-11e8-8fc7-fa336351fc00:8' at master log test3-bin.000007, end_log_pos 421. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 5
Master_UUID: 4687e05d-f37f-11e8-8fc7-fa336351fc00
Master_Info_File: /data/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 181203 10:01:08
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-8
Executed_Gtid_Set: 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-7, e2bd1bae-f5cb-11e8-9c8c-fa1dae125200:1
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
# Explanation of the error
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4687e05d-f37f-11e8-8fc7-fa336351fc00:8' at master log test3-bin.000007, end_log_pos 421. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Retrieved_Gtid_Set: 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-8    # GTIDs received by the slave; transaction 8 has not been executed
Executed_Gtid_Set: 4687e05d-f37f-11e8-8fc7-fa336351fc00:1-7,    # GTIDs executed by the slave; note that it has also executed the GTID below, which comes from the insert made directly on the slave
e2bd1bae-f5cb-11e8-9c8c-fa1dae125200:1
We know the cause is a duplicate value (error 1062), so it is enough to skip the transaction with this GTID:
mysql> select @@gtid_next;    # the GTID for the next transaction; the default is automatic assignment
+-------------+
| @@gtid_next |
+-------------+
| AUTOMATIC   |
+-------------+
1 row in set (0.00 sec)
mysql> set gtid_next="4687e05d-f37f-11e8-8fc7-fa336351fc00:8";    # set gtid_next to the GTID of the transaction to skip
Query OK, 0 rows affected (0.00 sec)
mysql> begin;    # run an empty transaction
Query OK, 0 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.01 sec)
mysql> set gtid_next="AUTOMATIC";    # restore gtid_next to AUTOMATIC
Query OK, 0 rows affected (0.00 sec)
mysql> start slave sql_thread;    # start the SQL thread
Query OK, 0 rows affected (0.02 sec)
mysql> show slave status\G    # replication is back to normal
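If you need to repeat this procedure, it helps to generate the statements instead of typing them. The following Python sketch builds the same sequence used above for a given GTID and checks the GTID's format first. skip_statements is a hypothetical helper; it only produces strings and does not connect to a server, so running them is left to whichever client you use.

```python
import re

# A single GTID: a UUID, a colon, and a nonzero sequence number.
GTID_PATTERN = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*$"
)


def skip_statements(gtid: str) -> list:
    """Return the statements that commit an empty transaction under `gtid`."""
    if not GTID_PATTERN.match(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        f"SET GTID_NEXT = '{gtid}'",    # claim the GTID of the failing transaction
        "BEGIN",                        # empty transaction: nothing is changed
        "COMMIT",
        "SET GTID_NEXT = 'AUTOMATIC'",  # go back to automatic assignment
        "START SLAVE SQL_THREAD",       # resume applying the relay log
    ]


for statement in skip_statements("4687e05d-f37f-11e8-8fc7-fa336351fc00:8"):
    print(statement + ";")
```

Validating the input matters here because the GTID is interpolated into a statement; rejecting anything that is not a single well-formed GTID avoids both typos and injection.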