校验主从数据并修复
1.先安装pt工具
[root@bogon ~]# yum -y install percona-toolkit
2.主库数据:
root@localhost:mysql3316.sock 16:18:03 [zhangshuo]>select * from zs; +----+--------------+ | id | name | +----+--------------+ | 1 | zhangshuo | | 2 | jie | | 3 | zhangsanfeng | +----+--------------+ 3 rows in set (0.00 sec)
3.从库数据:
root@localhost:mysql3317.sock 16:20:48 [zhangshuo]>select * from zs; +----+-----------------+ | id | name | +----+-----------------+ | 1 | zhangshuo | | 2 | jie | | 3 | zhangsanfeng | | 4 | 从入门进阶 | | 5 | 到删库跑路 | +----+-----------------+ 5 rows in set (0.00 sec)
4.很明显数据不一致,我们使用工具来检测:
首先要在从库配置文件中定义:
report_host=slave_ip
report_port=slave_port
root@localhost:mysql3316.sock 08:51:35 [percona]>grant all privileges on *.* to 'repl'@'%' identified by'repl'; root@localhost:mysql3316.sock 08:53:05 [percona]>flush privileges;
[root@bogon ~]# pt-table-checksum -S /tmp/mysql3316.sock -P 3316 --user=repl --password=repl --host=192.168.1.113 --databases=zhangshuo --recursion-method='hosts' --no-check-binlog-format --create-replicate-table --replicate=zhangshuo.checksums
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 07-30T09:02:36 0 1 3 1 0 0.055 zhangshuo.zs
TS #检测完成时间。 ERRORS #检查时候发生错误的数量。 DIFFS #0表示一致,1表示不一致。 ROWS #表的行数。 CHUNKS #被划分到表中的块的数目。 SKIPPED #过多错误而跳过块的数目。 TIME #执行时间 TABLE #检查的表名
参数解析:
-S /tmp/mysql3316.sock #指定sock
-P 3316 #指定端口
--user=repl #指定用户
--password=repl #指定密码
--host=192.168.1.113 #指定检测主机
--databases=zhangshuo #指定检测库
--recursion-method='hosts' #发现从库方式
--no-check-binlog-forma #如果是row格式会报错,需跳过。
--create-replicate-table --replicate=zhangshuo.checksums #指定检测结果输出表
5.通过DIFFS=1,确定主从数据不一致。因指定了参数--create-replicate-table --replicate=zhangshuo.checksums。进slave查看检测信息:
root@localhost:mysql3317.sock 10:08:28 [zhangshuo]>select * from checksums\G *************************** 1. row *************************** db: zhangshuo tbl: zs chunk: 1 chunk_time: 0.000992 chunk_index: NULL lower_boundary: NULL upper_boundary: NULL this_crc: 855a644a this_cnt: 5 #从库 master_crc: abfa5704 master_cnt: 3 #主库 发现从库比主库多了两条数据 ts: 2016-07-30 09:54:04 1 row in set (0.00 sec)
6.使用pt-table-sync来修复:
(1)可以使用--print参数将修复语句打印出来。 #修复语句也可以使用source来执行。
[root@bogon ~]# pt-table-sync --replicate=zhangshuo.checksums --socket=/tmp/mysql3316.sock --port=3316 h=192.168.1.113,u=repl,p=repl --socket=/tmp/mysql3317.sock,--port=3317,h=192.168.1.113,u=repl,p=repl --print DELETE FROM `zhangshuo`.`zs` WHERE `id`='4' LIMIT 1 /*percona-toolkit src_db:zhangshuo src_tbl:zs src_dsn:P=3316,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl dst_db:zhangshuo dst_tbl:zs dst_dsn:P=3317,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl lock:1 transaction:1 changing_src:zhangshuo.checksums replicate:zhangshuo.checksums bidirectional:0 pid:8659 user:root host:bogon*/; DELETE FROM `zhangshuo`.`zs` WHERE `id`='5' LIMIT 1 /*percona-toolkit src_db:zhangshuo src_tbl:zs src_dsn:P=3316,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl dst_db:zhangshuo dst_tbl:zs dst_dsn:P=3317,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl lock:1 transaction:1 changing_src:zhangshuo.checksums replicate:zhangshuo.checksums bidirectional:0 pid:8659 user:root host:bogon*/;
(2)直接修复:
[root@bogon ~]# pt-table-sync --replicate=zhangshuo.checksums --socket=/tmp/mysql3316.sock --port=3316 h=192.168.1.113,u=repl,p=repl --socket=/tmp/mysql3317.sock,--port=3317,h=192.168.1.113,u=repl,p=repl --execute
7.进入从库查看发现多出的两条数据被删除了:
root@localhost:mysql3317.sock 11:00:01 [zhangshuo]>select * from zs; +----+--------------+ | id | name | +----+--------------+ | 1 | zhangshuo | | 2 | jie | | 3 | zhangsanfeng | +----+--------------+ 3 rows in set (0.00 sec)
8.重新校验确定中从数据同步:(DIFFS列为0)
[root@bogon ~]# pt-table-checksum -S /tmp/mysql3316.sock -P 3316 --user=repl --password=repl --host=192.168.1.113 --databases=zhangshuo --recursion-method='hosts' --no-check-binlog-format --create-replicate-table --replicate=zhangshuo.checksums TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 07-30T11:14:55 0 0 3 1 0 0.010 zhangshuo.zs
9.分析pt-table-sync是怎么工作的:
(1)在从库更改一条数据:
root@localhost:mysql3317.sock 11:30:09 [zhangshuo]>select * from zs; +----+--------------+ | id | name | +----+--------------+ | 1 | zhangshuo | | 2 | jie | | 3 | zhangsanfeng | +----+--------------+ 3 rows in set (0.00 sec root@localhost:mysql3317.sock 11:32:40 [zhangshuo]>update zs set name='zhangjie' where id = 1; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 root@localhost:mysql3317.sock 11:33:01 [zhangshuo]>select * from zs; +----+--------------+ | id | name | +----+--------------+ | 1 | zhangjie | | 2 | jie | | 3 | zhangsanfeng | +----+--------------+ 3 rows in set (0.00 sec)
(2)主库使用pt-table-checksum校验,并发现主从数据不一致(DIFFS为1):
[root@bogon ~]# pt-table-checksum -S /tmp/mysql3316.sock -P 3316 --user=repl --password=repl --host=192.168.1.113 --databases=zhangshuo --recursion-method='hosts' --no-check-binlog-format --create-replicate-table --replicate=zhangshuo.checksums TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 07-30T11:33:55 0 1 3 1 0 0.008 zhangshuo.zs
(3)使用pt-table-sync修复,因为需要解析binlog,所以在数据修复之前flush logs方便实验:
root@localhost:mysql3316.sock 11:29:10 [zhangshuo]>flush logs; [root@bogon ~]# pt-table-sync --replicate=zhangshuo.checksums --socket=/tmp/mysql3316.sock --port=3316 h=192.168.1.113,u=repl,p=repl --socket=/tmp/mysql3317.sock,--port=3317,h=192.168.1.113,u=repl,p=repl --execute
(4)解析binlog查看pt-table-sync怎样修复数据:
[root@bogon ~]# nohup /usr/local/mysql/bin/mysqlbinlog -v --base64-output=decode-rows /data/mysql/mysql3316/logs/mysql-bin.000008 nohup: 忽略输入并把输出追加到"nohup.out"
BEGIN /*!*/; # at 2345 #160730 11:34:42 server id 1283316 end_log_pos 2836 CRC32 0x66f458b4 Query thread_id=63 exec_time=0 error_code=0 SET TIMESTAMP=1469849682/*!*/; REPLACE INTO `zhangshuo`.`zs`(`id`, `name`) VALUES ('1', 'zhangshuo') /*percona-toolkit src_db:zhangshuo src_tbl:zs src_dsn:P=3316,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl dst_db:zhangshuo dst_tbl:zs dst_dsn:P=3317,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl lock:1 transaction:1 changing_src:zhangshuo.checksums replicate:zhangshuo.checksums bidirectional:0 pid:8948 user:root host:bogon*/ /*!*/; # at 2836 #160730 11:34:42 server id 1283316 end_log_pos 2867 CRC32 0x7f9ef3d5 Xid = 1668 COMMIT/*!*/; DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;