修复分区表中错位的行
2023-04-03 17:07 abce 阅读(144) 评论(0) 编辑 收藏 举报在MySQL中,分区表的数据被分到不同的表空间中,但仍然被视为单个表。在处理大量数据集的某些情况下,分区可能是一种有用的方法。分区表可以加快删除大型数据集的速度,但如果处理不当,可能会将数据错误地放置在表中。
在这将分享如何检查和修复这样一个表中的数据,同时尽量减少对表的破坏。
分区表:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | mysql> show create table salariesG *************************** 1. row *************************** Table : salaries Create Table : CREATE TABLE `salaries` ( `emp_no` int (11) NOT NULL , `salary` int (11) NOT NULL , `from_date` date NOT NULL , `to_date` date NOT NULL , PRIMARY KEY (`emp_no`,`from_date`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 /*!50500 PARTITION BY RANGE COLUMNS(from_date) (PARTITION p01 VALUES LESS THAN ('1985-12-31') ENGINE = InnoDB, PARTITION p02 VALUES LESS THAN ('1986-12-31') ENGINE = InnoDB, PARTITION p03 VALUES LESS THAN ('1987-12-31') ENGINE = InnoDB, PARTITION p04 VALUES LESS THAN ('1988-12-31') ENGINE = InnoDB, PARTITION p05 VALUES LESS THAN ('1989-12-31') ENGINE = InnoDB, PARTITION p06 VALUES LESS THAN ('1990-12-31') ENGINE = InnoDB, PARTITION p07 VALUES LESS THAN ('1991-12-31') ENGINE = InnoDB, PARTITION p08 VALUES LESS THAN ('1992-12-31') ENGINE = InnoDB, PARTITION p09 VALUES LESS THAN ('1993-12-31') ENGINE = InnoDB, PARTITION p10 VALUES LESS THAN ('1994-12-31') ENGINE = InnoDB, PARTITION p11 VALUES LESS THAN ('1995-12-31') ENGINE = InnoDB, PARTITION p12 VALUES LESS THAN ('1996-12-31') ENGINE = InnoDB, PARTITION p13 VALUES LESS THAN ('1997-12-31') ENGINE = InnoDB, PARTITION p14 VALUES LESS THAN ('1998-12-31') ENGINE = InnoDB, PARTITION p15 VALUES LESS THAN ('1999-12-31') ENGINE = InnoDB, PARTITION p16 VALUES LESS THAN ('2000-12-31') ENGINE = InnoDB, PARTITION p17 VALUES LESS THAN ('2001-12-31') ENGINE = InnoDB, PARTITION p18 VALUES LESS THAN ('2002-12-31') ENGINE = InnoDB, PARTITION p19 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */ |
查询任意一个数据集,MySQL可以快速地返回数据,没有任何错误:
1 2 3 4 5 6 7 | mysql> SELECT * FROM salaries where emp_no=10001 and from_date= '1999-06-23' ; + --------+--------+------------+------------+ | emp_no | salary | from_date | to_date | + --------+--------+------------+------------+ | 10001 | 84917 | 1999-06-23 | 2000-06-22 | + --------+--------+------------+------------+ 1 row in set (0.00 sec) |
但是如果数据发生了分区错位,就不能返回任何值了:
1 | mysql> SELECT * FROM salaries where emp_no=10001 and from_date= '1999-06-23' ; |
乍看数据像是已经被删除了,即使使用explain查看:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | mysql> EXPLAIN SELECT * FROM salaries where emp_no=10001 and from_date= '1999-06-23' G *************************** 1. row *************************** id: 1 select_type: SIMPLE table : NULL partitions: NULL type: NULL possible_keys: NULL key : NULL key_len: NULL ref: NULL rows : NULL filtered: NULL Extra: no matching row in const table 1 row in set , 1 warning (0.00 sec) |
但是数据还在表中,只是在错误的分区中:
1 2 3 4 5 6 7 8 | mysql> SELECT * FROM salaries PARTITION(p08) LIMIT 1; + --------+--------+------------+------------+ | emp_no | salary | from_date | to_date | + --------+--------+------------+------------+ | 10001 | 84917 | 1999-06-23 | 2000-06-22 | + --------+--------+------------+------------+ 1 row in set (0.00 sec) Empty set (0.00 sec) |
如何检测是否有错位的行呢?
可以借助工具check,来检测:(大表需要的时间会很长)
1 2 3 4 5 6 7 8 9 | mysql> CHECK TABLE salaries; + --------------------+-------+----------+---------------------------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | + --------------------+-------+----------+---------------------------------------------------------------------------------------+ | employees.salaries | check | error | Found a misplaced row in part 7 should be in part 14: from_date:1999-06-23 | | employees.salaries | check | error | Partition p08 returned error | | employees.salaries | check | error | Table upgrade required. Please do "REPAIR TABLE `salaries`" or dump/reload to fix it! | + --------------------+-------+----------+---------------------------------------------------------------------------------------+ 3 rows in set (0.83 sec) |
行是如何错位的?
上面的例子抛出了一个问题:MySQL怎么允许这种情况发生的,如何修复它?对于前者的答案是MySQL允许用户用一个单独的表交换一个分区;默认情况下,它在插入数据到表之前会检查每一行。
1 2 | mysql> ALTER TABLE salaries EXCHANGE PARTITION p15 WITH TABLE salaries_p15; Query OK, 0 rows affected (0.04 sec) |
这里,salaries_p15是一个空表,结构和salaries表相同,只是没有做分区而已。
1 2 3 4 5 6 7 8 9 10 11 | mysql> show create table salaries_p15G *************************** 1. row *************************** Table : salaries_p15 Create Table : CREATE TABLE `salaries_p15` ( `emp_no` int (11) NOT NULL , `salary` int (11) NOT NULL , `from_date` date NOT NULL , `to_date` date NOT NULL , PRIMARY KEY (`emp_no`,`from_date`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) |
交换分区之后,分区p15就是空的,而salaries_p15包含了分区P15的数据。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | mysql> SELECT count (*) from salaries_p15; + ----------+ | count (*) | + ----------+ | 260926 | + ----------+ 1 row in set (0.02 sec) mysql> SELECT count (*) from salaries PARTITION(p15); + ----------+ | count (*) | + ----------+ | 0 | + ----------+ 1 row in set (0.06 sec) |
MySQL为了加快分区交换的速度,可以不做数据检查:
1 2 | mysql> ALTER TABLE salaries EXCHANGE PARTITION p08 WITH TABLE salaries_p15 WITHOUT VALIDATION; Query OK, 0 rows affected (0.04 sec) |
这里分区p8可以和salaries_p15快速地交换,从而就导致了记录的错位!在生产环境下,建议谨慎使用without validation选项。
1 2 3 4 5 6 7 | mysql> SELECT * FROM salaries where emp_no=10001 and from_date= '1999-06-23' ; + --------+--------+------------+------------+ | emp_no | salary | from_date | to_date | + --------+--------+------------+------------+ | 10001 | 84917 | 1999-06-23 | 2000-06-22 | + --------+--------+------------+------------+ 1 row in set (0.00 sec) |
如何修复?
有两种修复方法:
1.使用repair table命令
2.执行一次空的alter交换操作
repair操作会将错位的记录安排到正确的分区,但是需要在表上加上system lock。
1 2 3 4 5 6 7 8 | mysql> repair table salaries; + --------------------+--------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | + --------------------+--------+----------+-----------------------------+ | employees.salaries | repair | warning | Moved 260926 misplaced rows | | employees.salaries | repair | status | OK | + --------------------+--------+----------+-----------------------------+ 2 rows in set (5.30 sec) |
第二种方法也要在表上加锁,直到运行结束:
1 | ALTER employees.salaries ENGINE=Innodb; |
此外,可以借助pt-online工具,在不加锁的情况下重建表。比如:
1 | pt-online- schema -change --execute --alter "ENGINE=InnoDb" h=localhost,D=employees,t=salaries |
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· .NET10 - 预览版1新功能体验(一)
2021-04-03 pg_repack安装和使用
2018-04-03 Innotop的安装和使用