MySQL数据一致性校验pt-table-checksum
pt-table-sync与pt-table-checksum 数据一致性校验,percona公司出品
配置percona的yum源。
yum install epel-release
yum install libev qpress
yum install perl perl-devel perl-Time-HiRes perl-DBI perl-DBD-MySQL
yum install percona-toolkit
yum install Percona-Server-shared-51.x86_64 mysql.x86_64 mysql-libs.x86_64
Create database pt CHARACTER SET utf8;
grant update,insert,delete,select,process,super,replication slave on *.* to 'checksum'@'192.168.0.%' identified by 'uYOkwTUP2qkc';
grant all on pt.* to 'checksum'@'192.168.0.%' identified by 'uYOkwTUP2qkc';
grant update,insert,delete,select,process,super,replication slave on *.* to 'checksum'@'localhost' identified by 'a90r9f2A';
grant all on pt.* to 'checksum'@'localhost' identified by 'a90r9f2A';
flush privileges;
use pt;
create table dsns(id int(11) not null auto_increment,parent_id int(11) default null,dsn varchar(255) not null,primary key(id));
插入slave信息。
INSERT INTO dsns (dsn) values('h=192.168.0.14,u=checksum,p=uYOkwTUP2qkc,P=3916');
INSERT INTO dsns (dsn) values('h=192.168.0.14,u=checksum,p=uYOkwTUP2qkc,P=3917');
CREATE TABLE IF NOT EXISTS checksums (
db char(64) NOT NULL,
tbl char(64) NOT NULL,
chunk int NOT NULL,
chunk_time float NULL,
chunk_index varchar(200) NULL,
lower_boundary text NULL,
upper_boundary text NULL,
this_crc char(40) NOT NULL,
this_cnt int NOT NULL,
master_crc char(40) NULL,
master_cnt int NULL,
ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (db, tbl, chunk),
INDEX ts_db_tbl (ts, db, tbl)
) ENGINE=InnoDB;
#-h -u -p -P -S -d 连接信息
#--nocheck-replication-filters 检测中忽略mysql 配置参数binlog_ignore_db等。
#--nocheck-binlog-format 不检测日志格式,row会报错。
#--replicate 指定checksum 输出结果存储的db和表, 如test.checksum
# --chunk-size, 指定块行数,可以用单位k(小写),M,G,分别表示千,百万,十亿。
# --chunk-size-limit=n 表示跳过n
# --ignore-databases/tables/column 跳出指定元素的过滤
# --lock-wait-timeout innodb锁的超时设定, 默认为1
# --max-load 设置最大并发连接数
# --replicate-check-only 只输出数据不一致的信息。
# --help 有这个就行了, 以及其他的详见文档。
只需在某一台服务器上安装percona-toolkit,这台服务能够同时访问主库和从库就行了。具体用法见后面实例。
全库校验:(指定一个slave列表,master上运行)
[root@test11 ~]# pt-table-checksum --host=localhost --user=checksum --password=mysql --socket=/usr/local/mysql/3306/mysql.sock \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=1000 --chunk-size-limit=3.0 \
--recursion-method=dsn=h=192.168.1.161,D=pt,t=dsns #这里指定master上的dsns的slave列表。
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
05-24T19:11:17 0 0 0 1 0 0.025 mysql.column_stats
05-24T19:11:17 0 0 0 1 0 0.017 mysql.columns_priv
05-24T19:11:17 0 0 2 1 0 0.009 mysql.db
05-24T19:11:17 0 0 0 1 0 0.009 mysql.event
。。。。。。。
从上输出可以看出,全库,不包含:performance_schema、information_schema、sys系统库。但是包含mysql库哦。
忽略mysql库,全库校验(指定一个slave列表,master上运行)
[root@test11 ~]# pt-table-checksum --ignore-databases=mysql --host=localhost --user=checksum --password=a90r9f2A \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 \
--recursion-method=dsn=h=192.168.1.161,D=pt,t=dsns
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
05-24T19:19:40 0 0 1 1 0 0.262 pt.dsns
05-24T19:19:40 0 1 2 1 0 0.008 test.t
忽略表:--ignore-tables=pt.dsns
pt-table-checksum --ignore-databases=mysql --host=10.10.10.184 --user=checksum --password=a90r9f2A --port=3308 \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 \
pt-table-checksum --ignore-databases=mysql --host=localhost --user=root --password=mysql --socket=/usr/local/mysql/3306/mysql.sock \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 \
指定库校验:
[root@test11 ~]# pt-table-checksum --databases=jz_ums --host=10.10.10.102 --user=checksum --password=mysql --socket=/usr/local/mysql/3306/mysql.sock \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=1000 --chunk-size-limit=3.0 \
pt-table-checksum --databases=jz_ums --host=10.10.10.102 --user=checksum --password='a90r9f2A' --port=3306\
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters --max-lag=5m --max-load=Threads_running=50 --chunk-size=10000
slave上查看:
MariaDB [pt]> select * from checksums;
+------+-----+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
| db | tbl | chunk | chunk_time | chunk_index | lower_boundary | upper_boundary | this_crc | this_cnt | master_crc | master_cnt | ts |
+------+-----+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
| test | t | 1 | 0.000767 | NULL | NULL | NULL | 1415828e | 3 | 246be59 | 2 | 2016-05-24 19:01:34 |
+------+-----+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
1 row in set (0.00 sec)
过滤出不一致表信息(slave上查看:)。
select db, tbl, sum(this_cnt) as total_rows, count(*) as chunks from checksums
where ( master_cnt <> this_cnt OR master_crc <> this_crc OR isnull(master_crc) <> isnull(this_crc) )
group by db, tbl;
生成修复SQL:(slave上执行)
方式1:
先master的IP地址,在slave的IP地址
# pt-table-sync --replicate=test.checksums --charset=utf8 h=127.0.0.1,u=root,p=123456,P=3306 h=192.168.1.106,u=checksum,p=a90r9f2A,P=3306 --print
# pt-table-sync --replicate=test.checksums --charset=utf8 h=127.0.0.1,u=root,p=123456,P=3306 h=192.168.1.106,u=checksum,p=a90r9f2A,P=3306 --execute
方式2:
通过--sync-to-master参数指定一个slave的IP地址就可以了
# pt-table-sync --replicate=test.checksums --databases=dbtest --tables=tb --charset=utf8 --sync-to-master h=192.168.1.106,u=root,p=123456,P=3306 --print
# pt-table-sync --replicate=test.checksums --databases=dbtest --tables=tb --charset=utf8 --sync-to-master h=192.168.1.106,u=root,p=123456,P=3306 --execute
参数的意义
--replicate= :指定通过pt-table-checksum得到的表,这2个工具差不多都会一直用。
--databases= : 指定执行同步的数据库,多个用逗号隔开。
--tables= :指定执行同步的表,多个用逗号隔开。
--sync-to-master :指定一个DSN,即从的IP,他会通过show processlist或show slave status 去自动的找主。
h=127.0.0.1 :服务器地址,命令里有2个ip,第一次出现的是M的地址,第2次是Slave的地址。
u=root :帐号。
p=123456 :密码。
--ask-pass : 交互式输入密码。
--print :打印,但不执行命令。
--execute :执行命令。
线上常用:
pt-table-sync --replicate=pt.checksums --charset=utf8 --sync-to-master h=10.10.10.245,u=checksum,p=a90r9f2A,P=3308 --print
slave上执行:
[root@test12 backup]# pt-table-sync --print --sync-to-master h=localhost,S=/usr/local/mysql/3307/mysql.sock,u=checksum,p='mysql' --databases=test --tables=t
# A software update is available:
# * The current version for Percona::Toolkit is 2.2.14.
DELETE FROM `test`.`t` WHERE `id`='100' LIMIT 1 /*percona-toolkit src_db:test src_tbl:t src_dsn:P=3306,S=/usr/local/mysql/3307/mysql.sock,h=192.168.1.161,p=...,u=checksum dst_db:test
dst_tbl:t dst_dsn:S=/usr/local/mysql/3307/mysql.sock,h=localhost,p=...,u=checksum lock:1 transaction:1 changing_src:1 replicate:0 bidirectional:0 pid:3537 user:root host:test12*/;
[root@test12 backup]# pt-table-sync --print --sync-to-master h=localhost,S=/usr/local/mysql/3307/mysql.sock,u=checksum,p='mysql' --replicate pt.checksums;
DELETE FROM `mysql`.`gtid_slave_pos` WHERE `domain_id`='0' AND `sub_id`='122' LIMIT 1 /*percona-toolkit src_db:mysql src_tbl:gtid_slave_pos src_dsn:P=3306,S=/usr/local/mysql/3307/mysql.sock,h=192.168.1.161,p=...,u=checksum dst_db:mysql dst_tbl:gtid_slave_pos dst_dsn:S=/usr/local/mysql/3307/mysql.sock,h=localhost,p=...,u=checksum lock:1 transaction:1 changing_src:pt.checksums replicate:pt.checksums bidirectional:0 pid:3565 user:root host:test12*/;
DELETE FROM `mysql`.`gtid_slave_pos` WHERE `domain_id`='0' AND `sub_id`='123' LIMIT 1 /*percona-toolkit src_db:mysql src_tbl:gtid_slave_pos src_dsn:P=3306,S=/usr/local/mysql/3307/mysql.sock,h=192.168.1.161,p=...,u=checksum dst_db:mysql dst_tbl:gtid_slave_pos dst_dsn:S=/usr/local/mysql/3307/mysql.sock,h=localhost,p=...,u=checksum lock:1 transaction:1 changing_src:pt.checksums replicate:pt.checksums bidirectional:0 pid:3565 user:root host:test12*/;
Access denied for user 'checksum'@'192.168.%' to database 'mysql' [for Statement "LOCK TABLES `mysql`.`proxies_priv` WRITE"] at line 6097 while doing mysql.proxies_priv on localhost
DELETE FROM `test`.`t` WHERE `id`='100' LIMIT 1 /*percona-toolkit src_db:test src_tbl:t src_dsn:P=3306,S=/usr/local/mysql/3307/mysql.sock,h=192.168.1.161,p=...,u=checksum dst_db:test dst_tbl:t dst_dsn:S=/usr/local/mysql/3307/mysql.sock,h=localhost,p=...,u=checksum lock:1 transaction:1 changing_src:pt.checksums replicate:pt.checksums bidirectional:0 pid:3565 user:root host:test12*/;
pt-table-sync --print --sync-to-master h=localhost,S=/usr/local/mysql_3308/mysql.sock,u=checksum,p='a90r9f2A' --replicate pt.checksums;
注意一个问题,如果为
master1 <--------互为主从--------->master2
| |
slave1 slave2
我们在检查pt-table-checksum的时候指定的只有master1,pt.dnsn也指定的只有slave1和master2,
如:/usr/bin/pt-table-checksum --ignore-databases=mysql --host=$HOST --user=$USER --password=$PASSWD --port=$PORT --replicate=pt.checksums \
--no-check-binlog-format --no-check-replication-filters --max-lag=10m \
--max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 \
--recursion-method=dsn=h=10.10.10.181,D=pt,t=dsns >>$LOG 2>&1
checksum之后,在slave1和master2是获取不到数据不一致的。
pt工具自动识别了这个架构,所以,在slave2上可以获取不一致的问题。
186上这样:
pt-table-sync --sync-to-master --print h=10.10.10.244,P=3308,u=checksum,p='a90r9f2A'
修复:
1.执行命令:
[root@test13 ~]# pt-table-sync --execute --sync-to-master h=192.168.1.162,P=3307,u=checksum,p='mysql' --databases=test --tables=t
[root@test13 ~]#
从上查看:
MariaDB [test]> select * from t;
+----+
| id |
+----+
| 1 |
| 2 |
+----+
2 rows in set (0.00 sec)
2.手动执行:
登录slave上执行;
DELETE FROM `test`.`t` WHERE `id`='100' LIMIT 1;
常见问题:
问题1:install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
pt-table-checksum --databases=test --host=localhost --user=checksum --password=mysql --socket=/usr/local/mysql/3306/mysql.sock \
> --replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
> --max-lag=5m --max-load=Threads_running=50 --chunk-size=1000 --chunk-size-limit=3.0 --replicate-check-only
05-24T17:03:14 install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 24) line 3.
at /usr/bin/pt-table-checksum line 1569
检查
[root@test12 backup]# perl -e'use DBD::mysql'
Can't load '/usr/lib64/perl5/auto/DBD/mysql/mysql.so' for module DBD::mysql: libmysqlclient.so.16: cannot open shared object file: No such file or directory at /usr/lib64/perl5/DynaLoader.pm line 200.
at -e line 1
Compilation failed in require at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
yum install Percona-Server-shared-51.x86_64 mysql.x86_64 mysql-libs.x86_64
问题2:Cannot connect to h=192.168.1.162,p=...,u=checksum
[root@test11 ~]# pt-table-checksum --databases=test --host=localhost --user=checksum --password=mysql --socket=/usr/local/mysql/3306/mysql.sock --replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters --max-lag=5m --max-load=Threads_running=50 --chunk-size=1000 --chunk-size-limit=3.0 --replicate-check-only
Cannot connect to h=192.168.1.162,p=...,u=checksum
Diffs cannot be detected because no slaves were found. Please read the --recursion-method documentation for information.
解决办法:
在Master机的pt库加入
create table dsns(id int(11) not null auto_increment,parent_id int(11) default null,dsn varchar(255) not null,primary key(id));
– 写入从库信息
INSERT INTO dsns (parent_id,dsn) values(1,'h=192.168.1.162,u=checksum,p=mysql,P=3307');
– 如果有多个从库,就插入多条记录.
INSERT INTO dsns (parent_id,dsn) values(1,'h=192.168.1.162,u=checksum,p=mysql,P=3308');
执行pt-table-checksum命令时多加下面的参数即可。
--recursion-method=dsn=h=192.168.1.162,D=pt,t=dsns
问题3:Access denied for user 'checksum'@'192.168.1.161'
[root@test11 ~]# pt-table-checksum --databases=test --host=localhost --user=checksum --password=mysql --socket=/usr/local/mysql/3306/mysql.sock --replicate=pt.checksums \
--no-check-binlog-format --no-check-replication-filters --max-lag=5m --max-load=Threads_running=50 --chunk-size=1000 --chunk-size-limit=3.0 \
--recursion-method=dsn=h=192.168.1.162,D=pt,t=dsn
05-24T18:49:53 DBI connect('pt;host=192.168.1.162;mysql_socket=/usr/local/mysql/3306/mysql.sock;mysql_read_default_group=client','checksum',...) failed: Access denied for user 'checksum'@'192.168.1.161' (using password: YES) at /usr/bin/pt-table-checksum line 1569
问题4:索引问题。
索引问题
06-07T21:19:31 Cannot checksum table db.table: There is no good index and the table is oversized. at ./pt-table-checksum line 3899.
因为该工具是根据主键索引或者唯一索引来分块进行校验的,默认是1000行为一块.
问题5:块过大问题
05-26T09:52:29 Skipping chunk 1 of jz_ums.ods_device_visitlog because it is oversized. The current chunk size limit is 20000 rows (chunk size=10000 * chunk size limit=2.0),
but MySQL estimates that there are 20238 rows in the chunk.
当前配置:总消耗时长:2016-05-26 09:52:20 ——2016-05-26 09:53:50也就是,1分30秒。
--max-lag=10m --max-load=Threads_running=50 --chunk-size=10000
chunk size limit默认2,根据提示,chunk size limit is 20000 rows
10000(chunk size) * 2.0(chunk size limit)=20000行。
也就是一个块大小,不能超过20000行了。
那么添加参数:--chunk-size-limit=3.0,增加到3w行跳过。
当把--chunk-size=10000 换成--chunk-size=5k的测试。
2016-05-26 10:43:36
2016-05-26 10:45:11
共消耗:1分35。
当把--chunk-size=10000 换成--chunk-size=1M的测试(也就是说300W数据行以下的都不跳过(没有主键或是唯一索引的)。
2016-05-26 10:52:02
2016-05-26 10:54:08
2分06秒。
===============================================
05-25T17:59:47 Cannot checksum table jz_ums.yy_device_uuidlog: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:45:49 Cannot checksum table jz_ums.t_click_20160406: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:31:21 Cannot checksum table jz_ums.ods_device_account: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:31:22 Cannot checksum table jz_ums.ods_device_addapps: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:31:22 Cannot checksum table jz_ums.ods_device_allapps: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:31:23 Cannot checksum table jz_ums.ods_device_station: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:31:24 Cannot checksum table jz_ums.ods_device_update: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
05-25T16:31:24 Cannot checksum table jz_ums.ods_device_useapps: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6417.
jz_ums.yy_device_uuidlog 339260
jz_ums.t_click_20160406 21454
jz_ums.ods_device_account 58704
jz_ums.ods_device_addapps 122447
jz_ums.ods_device_allapps 788005
jz_ums.ods_device_station 1766887
jz_ums.ods_device_update 4365647
jz_ums.ods_device_useapps 44604
解决方案:
MariaDB [test]> insert into tab select * from tab;
Query OK, 131072 rows affected (0.76 sec)
Records: 131072 Duplicates: 0 Warnings: 0
MariaDB [test]>
MariaDB [test]> show create table tab;
+-------+---------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------+
| tab | CREATE TABLE `tab` (
`id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+---------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
MariaDB [test]> alter table tab add column i int not null primary key;
ERROR 1062 (23000): Duplicate entry '0' for key 'PRIMARY'
这是由于没有自增长,所以这里添加主键,相当于给把表rename,新建一个添加了主键的tab表,然后插入现在的tab表从被rename的表中。
那么第二次insert的时候,默认主键也是0 ,那么主键冲突。
MariaDB [test]> alter table tab add column i int not null auto_increment;
ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key
这样也不行。要这样;
MariaDB [test]> alter table tab add column i int not null primary key auto_increment;
Query OK, 0 rows affected (1.30 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [test]> select * from tab limit 10;
+------+----+
| id | i |
+------+----+
| 1 | 6 |
| 2 | 8 |
| 3 | 10 |
这里需要注意2个问题,第一,主键还是放第一个位置为好,所以:alter table tab add column i int not null primary key auto_increment first;
第二个问题:ID的自增长步长大小,上面暴露出步长为4,偏移量为2,。
MariaDB [test]> set auto_increment_offset=1;
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> set auto_increment_increment=1;
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> alter table tab add column i int not null primary key auto_increment first;
Query OK, 0 rows affected (0.28 sec)
Records: 0 Duplicates: 0 Warnings: 0
================================原理
pt-table-checksum日志解析:
mysqlbinlog /log/binlog/3306/bin.000002 >/tmp/111.sql
# at 6811
#160524 19:01:34 server id 161 end_log_pos 6849 GTID 0-161-27
/*!100001 SET @@session.gtid_seq_no=27*//*!*/;
BEGIN
/*!*/;
# at 6849
#160524 19:01:34 server id 161 end_log_pos 7241 Query thread_id=44 exec_time=0 error_code=0
use `test`/*!*/;
SET TIMESTAMP=1464087694/*!*/;
REPLACE INTO `pt`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'test', 't', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `id`, CONCAT(ISNULL(`id`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `test`.`t` /*checksum table*/
/*!*/;
# at 7241
#160524 19:01:34 server id 161 end_log_pos 7268 Xid = 422
COMMIT/*!*/;
# at 7268
#160524 19:01:34 server id 161 end_log_pos 7306 GTID 0-161-28
/*!100001 SET @@session.gtid_seq_no=28*//*!*/;
BEGIN
/*!*/;
# at 7306
#160524 19:01:34 server id 161 end_log_pos 7515 Query thread_id=44 exec_time=0 error_code=0
SET TIMESTAMP=1464087694/*!*/;
UPDATE `pt`.`checksums` SET chunk_time = '0.000767', master_crc = '246be59', master_cnt = '2' WHERE db = 'test' AND tbl = 't' AND chunk = '1'
/*!*/;
# at 7515
#160524 19:01:34 server id 161 end_log_pos 7542 Xid = 425
COMMIT/*!*/;
--------------------------------------上面这个只有一个字段,就是id-----------------------------
这个是线上校验,提取出来的SQL:
REPLACE INTO `pt`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'jz_user_message_17', 't_user_message', '1', NULL, NULL, NULL, COUNT(*) AS cnt,
COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `id`, `type`, `code`, `model_id`, `bussiness_id`, `comment_id`, `send_user_id`, `member_id`, `title`, `content`, `data`, `is_read`,
`state`, `start_index`, `create_time` + 0, `update_time` + 0, `create_by`, `update_by`, CONCAT(ISNULL(`type`), ISNULL(`code`), ISNULL(`model_id`), ISNULL(`bussiness_id`), ISNULL(`comment_id`),
ISNULL(`send_user_id`), ISNULL(`member_id`), ISNULL(`title`), ISNULL(`content`), ISNULL(`data`), ISNULL(`is_read`), ISNULL(`state`), ISNULL(`start_index`), ISNULL(`create_time`),
ISNULL(`update_time`), ISNULL(`create_by`), ISNULL(`update_by`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `jz_user_message_17`.`t_user_message` /*checksum table*/
UPDATE `pt`.`checksums` SET chunk_time = '0.000767', master_crc = '246be59', master_cnt = '2' WHERE db = 'test' AND tbl = 't' AND chunk = '1'
/*!*/;
这两行,pt-table-checksum是在主从上执行相同的SQL,而计算出来的值,对值进行对比。CRC32,BIT_XOR,CONV进行处理。
上可以看出,checksum把所有字段连接起来,crc32计算校验值。然后bit_xor处理,最后conv转换。
coalesce():返回参数中的第一个非空表达式(从左向右);
大胆的想法,使用相同的方法,写工具,完成任意表单,库的数据校验对比。
精简效果:
select CONCAT_WS('#', `aigl_id`, aigl_vipid, aigl_vipname,concat(isNULL(`aigl_vipid`), ISNULL(`aigl_vipname`))) AS s
from accig01_list_copy FORCE INDEX(`PRIMARY`) WHERE ((`aigl_id` >= 1)) AND ((`aigl_id` <= 325206))
完整校验mysql数据一致性SQL
REPLACE INTO `pt`.`checksums` (db, tbl, chunk, chunk_start,chunk_end,this_cnt,this_crc)
SELECT 'sltonline','accig01_list','1', 1, 325206,COUNT(*) AS cnt,
COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `aigl_id`, aigl_vipid, aigl_vipname,concat(isNULL(`aigl_vipid`), ISNULL(`aigl_vipname`))) AS UNSIGNED )), 10, 16)), 0) AS crc
from accig01_list_copy FORCE INDEX(`PRIMARY`) WHERE ((`aigl_id` >= 1)) AND ((`aigl_id` <= 325206))
concat是不包含主键的。
=======================================
有数据不一致的表。如何找出来》?
07-21T02:29:54 0 1 127763 15 0 15.913 jz_yaya.t_information
07-21T02:30:03 0 1 155 1 0 0.021 jz_yaya.t_information_soword
07-21T02:30:06 0 10 745685 77 0 3.184 jz_yaya.t_integral_user
07-21T02:30:35 0 5 515328 54 0 10.528 jz_yaya.t_interact_history
07-21T02:30:45 0 1 613064 64 0 4.706 jz_yaya.t_member_detail
07-21T02:36:53 0 1 212826 24 0 2.763 jz_yaya.t_user_praise_reward
数据行是一致的。
pt-table-checksum --ignore-databases=mysql --host=10.10.10.181 --user=checksum --password=xxx --port=3307 \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 \
--recursion-method=dsn=h=10.10.10.181,D=pt,t=dsns
nohup pt-table-sync --print --sync-to-master h=10.10.10.181,P=3308,u=checksum,p='a90r9f2A' --replicate=pt.checksums --charset=utf8;
对比两个复制的是否不一致。
单独校验方法:
=======================================
pt-table-checksum --databases=ydy --host=192.168.0.15 --user=checksum --password='xxx' --port=3916 \
--replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters \
--max-lag=5m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 \
--recursion-method=dsn=h=192.168.0.15,D=pt,t=dsns
#直接指定一个slave
# /usr/bin/pt-table-checksum --ignore-databases=mysql --host=$HOST --user=$USER --password=$PASSWD --port=$PORT --replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters --max-lag=10m --max-load=Threads_running=50 --chunk-size=10000 -h$HOST -u$USER -p$PASSWD >>$LOG 2>&1
#指定多个slave列表
/usr/bin/pt-table-checksum --ignore-databases=mysql --host=$HOST --user=$USER --password=$PASSWD --port=$PORT --replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters --max-lag=10m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 --recursion-method=dsn=h=$HOST,D=pt,t=dsns >>$LOG 2>&1
提示这样的错误:
*******************************************************************
Using the default of SSL_verify_mode of SSL_VERIFY_NONE for client
is deprecated! Please set SSL_verify_mode to SSL_VERIFY_PEER
together with SSL_ca_file|SSL_ca_path for verification.
If you really don't want to verify the certificate and keep the
connection open to Man-In-The-Middle attacks please set
SSL_verify_mode explicitly to SSL_VERIFY_NONE in your application.
*******************************************************************
at /bin/pt-table-checksum line 331.
脚本:
cat table_checksum.sh
#!/bin/bash
#Description:Consistency check
#date:2015-5-25
#author:lzj
LOG='/data0/log/checksum.log'
#空格隔开
IP=192.168.0.15
PORT=3306
USER=checksum
PASSWD='xxxxx'
echo >>$LOG
DATE=`date +"%Y-%m-%d %T"`
echo "--------$DATE--host=$HOST --port=$PORT ------------------------">>$LOG
/usr/bin/pt-table-checksum --databases=ydy --host=$IP --user=$USER --password=$PASSWD --port=$PORT --replicate=pt.checksums --no-check-binlog-format --no-check-replication-filters --max-lag=10m --max-load=Threads_running=50 --chunk-size=10000 --chunk-size-limit=3.0 --recursion-method=dsn=h=$IP,D=pt,t=dsns >>$LOG 2>&1
echo >>$LOG
echo "====================================================================================================">>$LOG
自动运行,每周6晚上进行校验一次。
0 1 * * 6 /bin/sh /root/script/table_checksum.sh >/dev/null 2>&1 &
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 阿里最新开源QwQ-32B,效果媲美deepseek-r1满血版,部署成本又又又降低了!
· 单线程的Redis速度为什么快?
· SQL Server 2025 AI相关能力初探
· AI编程工具终极对决:字节Trae VS Cursor,谁才是开发者新宠?
· 展开说说关于C#中ORM框架的用法!