MySQL主从复制中的问题(Coordinator stopped beacause there were errors in the workers......)
1、show slave status \G 显示如下报错信息:
Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction ...
2、根据提示信息定位报错位置
情况一:"Delete_rows"
select * from performance_schema.replication_applier_status_by_worker \G
Last_SQL_Error: Could not execute Delete_rows event on table xxx ,Can't find record in xxx
原因:在master上删除一条记录,而slave上找不到。
解决方法: 由于master要删除一条记录,而slave上找不到故报错,这种情况主上都将其删除了,那么从机可以直接跳过。
stop slave; set global sql_slave_skip_counter=1; start slave;
如上命令若报错:ERROR 1858 (HY000): sql_slave_skip_counter can not be set when the server is running with @@GLOBAL.GTID_MODE = ON. Instead, for each transaction that you want to skip, generate an empty transaction with the same GTID as the transaction
或者可以换用如下命令:
STOP SLAVE;
SET @@SESSION.GTID_NEXT= 'f396f867-d755-11xxx85-005xxxxxb5a:264261655' --在session里设置gtid_next,即跳过这个GTID
BEGIN; COMMIT; --设置空事物
SET SESSION GTID_NEXT = AUTOMATIC; -- 恢复GTID
START SLAVE;xxxx
情况二:"Duplicate "
Last_SQL_Error: Could not execute Write_rows event on table xxx; Duplicate entry 'xxx' for key 'PRIMARY',
原因:在slave已经有该记录,又在master上插入了同一条记录
解决方法:在从库上删除该记录,或者跳过该记录。然后在master上和slave上再分别确认一下。
情况三:"Update_rows" (还未碰到 待验证)
Last_SQL_Error: Could not execute Update_rows event on table xxx; Can't find record in 'xxx',
参考原因:在master上更新一条记录,而slave上找不到,丢失了数据。
参考方法:在master上,用mysqlbinlog 分析下出错的binlog日志在干什么。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
/usr/local/mysql/bin/mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS mysql-bin.000010 | grep -A '10' 794 #120302 12:08:36 server id 22 end_log_pos 794 Update_rows: table id 33 flags: STMT_END_F ### UPDATE hcy.t1 ### WHERE ### @1=2 /* INT meta=0 nullable=0 is_null=0 */ ### @2='bbc' /* STRING(4) meta=65028 nullable=1 is_null=0 */ ### SET ### @1=2 /* INT meta=0 nullable=0 is_null=0 */ ### @2='BTV' /* STRING(4) meta=65028 nullable=1 is_null=0 */ # at 794 #120302 12:08:36 server id 22 end_log_pos 821 Xid = 60 COMMIT/*!*/; DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; |
在slave上,查找下更新后的那条记录,应该是不存在的。
mysql> select * from t1 where id=2;
Empty set (0.00 sec)
然后再到master查看
1
2
3
4
5
6
7
|
mysql> select * from t1 where id=2; +----+------+ | id | name | +----+------+ | 2 | BTV | +----+------+ 1 row in set (0.00 sec) |
把丢失的数据在slave上填补,然后跳过报错即可。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
|
mysql> insert into t1 values (2,'BTV'); Query OK, 1 row affected (0.00 sec) mysql> select * from t1 where id=2; +----+------+ | id | name | +----+------+ | 2 | BTV | +----+------+ 1 row in set (0.00 sec) mysql> stop slave ;set global sql_slave_skip_counter=1;start slave; Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) mysql> show slave status\G; ……
Slave_IO_Running: Yes
Slave_SQL_Running: Yes …… |