W
e
l
c
o
m
e
: )

8.mysql表分区

MySQL表分区

	表分区是将⼀个表的数据按照⼀定的规则⽔平划分为不同的逻辑块,并分别进⾏物理存储,这个规则就叫做分区函数,可以有不同的分区规则

	5.7可以通过show plugins语句查看当前MySQL是否⽀持表分区功能

	MySQL8.0移除了show plugins⾥对partition的显示,但社区版本的表分区功能是默认开启的

MySQL分区介绍

	当表中含有主键或者唯一键时,则每个被用作分区函数的字段必须是表中唯一键和主键的全部或者一部分,否则就无法创建分区表

⽐如下⾯的表由于唯⼀键和主键没有相同的字段,所以⽆法创建表分区
mysql> CREATE TABLE tnp ( id INT NOT NULLAUTO_INCREMENT,
ref BIGINT NOT NULL,
name VARCHAR(255),
PRIMARY KEY pk (id),
UNIQUE KEY uk (ref) ) PARTITION BY RANGE (id)
( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN(11));
ERROR 1503 (HY000): A UNIQUE INDEX must include all columns in the tables partitioning function

mysql> CREATE TABLE tnp (id INT NOT NULLAUTO_INCREMENT,
ref BIGINT NOT NULL,
name VARCHAR(255),
PRIMARY KEY pk (id),
UNIQUE KEY uk (ref) ) PARTITION BY RANGE (ref)
( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN(11));
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

上述例⼦中删除唯⼀键,确保主键中的字段包含分区函数中的所有字段,创建成功

mysql> CREATE TABLE tnp (
id INT NOT NULL AUTO_INCREMENT,
ref BIGINT NOT NULL,
name VARCHAR(255),
PRIMARY KEY pk (id))
PARTITION BY RANGE (id)
( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN (11));

或者将主键扩展为包含ref字段

mysql> CREATE TABLE tnp (id INT NOT NULL ,
ref BIGINT NOT NULL,
name VARCHAR(255),
PRIMARY KEY pk (id,ref),
UNIQUE KEY uk (ref) )
PARTITION BY RANGE (ref)
( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN (11));

● 表分区的主要优势在于:
○ 可以允许在⼀个表⾥存储更多的数据,突破磁盘限制或者⽂件系统限制
○ 对于从表⾥将过期或历史的数据移除在表分区很容易实现,只要将对应的分区移除即可
○ 对某些查询和修改语句来说,可以⾃动将数据范围缩⼩到⼀个或⼏个表分区上,优化语句执⾏效率。⽽且可以通过显示指定表分区来执⾏语句,⽐如SELECT * FROM t PARTITION (p0,p1) WHERE c < 5
● 表分区类型分为:
○ RANGE表分区:
■ RANGE表分区:范围表分区,按照⼀定的范围值来确定每个分区包含的数据,分区函数使⽤的字段必须只能是整数类型
■ 分区的定义范围必须是连续的,且不能有重叠部分,通过使⽤VALUES LESS THAN来定义分区范围,表分区的范围定义是从⼩到⼤定义的
CREATE TABLE employees ( id INT NOT NULL,
fname VARCHAR(30), lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL )
PARTITION BY RANGE (store_id)
( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16), PARTITION p3 VALUES LESS THAN (21) );

Store_id<6的数据被放在p0分区⾥,6<=store_id<10之间的数据被放在p1分区⾥,以此类推

当新插⼊的数据为(72, ‘Mitchell’, ‘Wilson’, ‘1998-06-25’, NULL, 13) 时,则新数据被插⼊到p2 分区⾥

但当插⼊的数据的store_id为21时,由于没有分区去容纳此数据,所以会报错,我们需要修改⼀下表的定义

mysql> insert into employees values(4,'d','d',now(),now(),1,21);
ERROR 1526 (HY000): Table has no partition for value 21

CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL, store_id INT NOT NULL )
PARTITION BY RANGE (store_id)
( PARTITION p0 VALUES LESS THAN (6),
PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16),
PARTITION p3 VALUES LESS THAN MAXVALUE
);

MAXVALUE关键词的作⽤是表示可能的最⼤值,所以任何store_id>=16的数据都会被写⼊到p3分区⾥

对timestamp字段类型可以使⽤的表达式⽬前仅有unix_timestamp,其他的表达式都不允许

CREATE TABLE quarterly_report_status ( report_id INT NOT NULL, report_status VARCHAR(20) NOT
NULL, report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE
CURRENT_TIMESTAMP )
PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) )
( PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
PARTITION p9 VALUES LESS THAN (MAXVALUE) );
○ LIST表分区:
■ LIST表分区:列表表分区,按照⼀个⼀个确定的值来确定每个分区包含的数据
■ 通过PARTITION BY LIST(expr)分区函数表达式必须返回整数,取值范围通过VALUES IN (value_list)定义
CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT, store_id INT )
PARTITION BY LIST(store_id)
( PARTITION pNorth VALUES IN (3,5,6,9,17),
PARTITION pEast VALUES IN (1,2,10,11,19,20),
PARTITION pWest VALUES IN (4,12,13,14,18),
PARTITION pCentral VALUES IN (7,8,15,16) );

error

ERROR 1697 (HY000): VALUES value for partition 'pNorth' must have type INT

对List表分区来说,没有MAXVALUE特殊值,所有的可能取值都需要再VALUES IN中包含,如果有定义的取值则会报错

同样,当有主键或者唯⼀键存在的情况下,分区函数字段需要包含在主键或唯⼀键中

mysql> CREATE TABLE employees (
id INT NOT NULL primary key,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id int )
PARTITION BY LIST(store_id)
( PARTITION pNorth VALUES IN (1,3),
PARTITION pEast VALUES IN (2,4));
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function
■ 对range和list表分区来说,分区函数可以包含多个字段
■ 分区多字段函数(column partition)所涉及的字段类型可以包括
● TINYINT, SMALLINT, MEDIUMINT, INT (INTEGER), and BIGINT.
● DATE and DATETIME.
● CHAR, VARCHAR, BINARY, and VARBINARY
■ 其他的字段类型都不⽀持
■ 范围多字段分区函数与普通的范围分区函数的区别在于:
● a) 字段类型多样化
b) 范围多字段分区函数不⽀持表达式,只能⽤字段名
c) 范围多字段分区函数⽀持⼀个或多个字段
■ 每个column_list⾥的字段和value_list⾥的数值必须⼀⼀对应,数据类型也要⼀致
■ 对范围多字段分区来说,有时⼀⾏数据的分区列表的第⼀个元素等于VALUES LESS THAN的值列表的第⼀个元素是会被插⼊到相应的分区
mysql> CREATE TABLE rcx (a INT, b INT,
c CHAR(3), d INT)
PARTITION BY RANGE COLUMNS(a,b,c)
(PARTITION p0 VALUES LESS THAN (5,10,'ggg'),
PARTITION p1 VALUES LESS THAN (10,20,'mmm'),
PARTITION p2 VALUES LESS THAN (15,30,'sss'),
PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE));
■ 当然只要保证取值范围是增⻓的,表分区就能创建成功,但如果取值范围不是增⻓的,就会返回错误:
○ HASH表分区:哈希表分区,按照⼀个⾃定义的函数返回值来确定每个分区包含的数据
■ 哈希表分区,按照⼀个⾃定义的函数返回值来确定每个分区包含的数据,这个⾃定义函数也可以仅仅是⼀个字段名字
■ 通过PARTITION BY HASH (expr)⼦句来表达哈希表分区,其中的expr表达式必须返回⼀个整数,基于分区个数的取模(%)运算。根据余数插⼊到指定的分区
■ 对哈希表分区来说只需要定义分区的个数,其他的事情由内部完成
CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30),
lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT, store_id INT )
PARTITION BY HASH(store_id)
PARTITIONS 4;

如果没有写明PARTITIONS字段,则默认为1

表达式可以是整数类型字段,也可以是⼀个函数,⽐如

CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30),
lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT )
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;

⽐如:
CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
PARTITION BY HASH( YEAR(col3) )
PARTITIONS 4;
如果插⼊⼀条数据对应的col3为‘2005-09-15’时,则插⼊数据的分区计算⽅法为:
MOD(YEAR('2005-09-01'),4)
= MOD(2005,4)
= 1
○ KEY表分区 :key表分区,与哈希表分区类似,只是⽤MySQL⾃⼰的HASH函数来确定每个分区包含的数据
■ key表分区:与哈希表分区类似,只不过哈希表分区依赖于⾃定义的函数,⽽key表分区的哈希算法是依赖MySQL本身
■ CREATE TABLE ... PARTITION BY KEY ()创建key表分区,括号⾥⾯可以包含0个或者多个字段,所引⽤的字段必须是主键或者主键的⼀部分,如果括号⾥⾯没有字段,则代表使⽤主键
CREATE TABLE k1 ( id INT NOT NULL PRIMARY KEY, name VARCHAR(20) )
PARTITION BY KEY()
PARTITIONS 2;
■ 如果表中没有主键但有唯⼀键,则使⽤唯⼀键,但唯⼀键字段必须定义为not null,否则报错
ERROR 1488 (HY000): Field in list of fields for partition function not found in table

mysql子表分区

● ⼦表分区,是在表分区的基础上再创建表分区的概念,每个表分区下的⼦表分区个数必须⼀致,⽐如:

CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) )
SUBPARTITIONS 2
( PARTITION p0 VALUES LESS THAN (1990),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE );

● ⼦表分区必须是范围/列表分区+哈希/key⼦表分区的组合

CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) ) SUBPARTITION BY HASH( TO_DAYS(purchased) ) ( PARTITION p0 VALUES LESS THAN (1990)
( SUBPARTITION s0, SUBPARTITION s1 ),
PARTITION p1 VALUES LESS THAN (2000)
( SUBPARTITION s2, SUBPARTITION s3 ),
PARTITION p2 VALUES LESS THAN MAXVALUE
( SUBPARTITION s4, SUBPARTITION s5 ) );

mysql表分区对Null值处理

● 对范围表分区来说,如果插⼊的是NULL值,则将数据放到最⼩的分区表⾥
○ 对list表分区来说,⽀持NULL值的唯⼀情况就是某个分区的允许值中包含NULL

MySQL表分区管理

● 通过alter table命令可以执⾏增加,删除,重新定义,合并或者拆分表分区的管理动作

对范围表分区和列表表分区来说,删除⼀个表分区命令如下:
mysql> CREATE TABLE tr (id INT, name VARCHAR(50), purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
( PARTITION p0 VALUES LESS THAN (1990), PARTITION p1 VALUES LESS THAN (1995),
PARTITION p2 VALUES LESS THAN (2000), PARTITION p3 VALUES LESS THAN (2005),
PARTITION p4 VALUES LESS THAN (2010), PARTITION p5 VALUES LESS THAN (2015) );

mysql> ALTER TABLE tr DROP PARTITION p2;

● 删除表分区的动作不光会把分区删掉,也会把表分区⾥原来的数据给删除掉
● 在原分区上增加⼀个表分区可以通过alter table … add partition语句来完成

CREATE TABLE members ( id INT, fname VARCHAR(25), lname VARCHAR(25), dob
DATE ) PARTITION BY RANGE( YEAR(dob) )
( PARTITION p0 VALUES LESS THAN (1980),
PARTITION p1 VALUES LESS THAN (1990),
PARTITION p2 VALUES LESS THAN (2000) );
ALTER TABLE members ADD PARTITION (PARTITION p3 VALUES LESS THAN (2010));

● 但对范围表分区来说,增加的表分区必须在尾部增加,在头部或者在中间增加都会失败:

• mysql> ALTER TABLE members ADD PARTITION ( PARTITION n VALUES LESS THAN (1970));
• ERROR 1463 (HY000): VALUES LESS THAN value must be strictly increasing for each partition
• 为解决这个问题,可以使⽤REORGANIZE命令:
• ALTER TABLE members REORGANIZE PARTITION p0 INTO
• ( PARTITION n0 VALUES LESS THAN (1970), PARTITION n1 VALUES LESS THAN
(1980) );

● 对列表表分区来说,只要新增加的分区对应的值在之前的表分区中没有出现过,就可以通过alter table… add partition来增加
● 当然,也可以通过REORGANIZE命令将之前的多个分区合并成⼀个或⼏个分区,但要保持分区值⼀致:
● 对哈希表分区和KEY表分区的管理⼿段与范围和列表表分区完全不同,⽐如不能删除表分区,但可以通过ALTER TABLE ... COALESCE PARTITION语句合并表分区,其partition后⾯的数字代表缩减的个数,⽽不是缩减到的个数
● 对于哈希表分区和key表分区,如果是增加表分区,则可以使⽤add partition语句

ALTER TABLE clients ADD PARTITION PARTITIONS 6;

● 对分区表可以通过ALTER TABLE pt EXCHANGE PARTITION p WITH TABLE nt命令将⼀个分区或者是⼦分区的数据与普通的表的数据相互交换,其本身的表结构不会变化
● 交换的分区表和⽬标表必须结构完全相同,包括字段,类型,索引,存储引擎必须完全⼀样

CREATE TABLE e ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30) )
PARTITION BY RANGE (id) ( PARTITION p0 VALUES LESS THAN (50), PARTITION p1
VALUES LESS THAN (100), PARTITION p2 VALUES LESS THAN (150), PARTITION p3
VALUES LESS THAN (MAXVALUE) );

CREATE TABLE e2 LIKE e;

ALTER TABLE e2 REMOVE PARTITIONING;

ALTER TABLE e EXCHANGE PARTITION p0 WITH TABLEe2;

ALTER TABLE e EXCHANGE PARTITION p0 WITH TABLE e2; ##再次执⾏后数据交换回来

● 执⾏exchange命令时,⽬标表⾥不⼀定是空数据,如果有数据需要保证⾥⾯的数据符合表分区的条件,否则只能⽤WITHOUT VALIDATION来跳过验证环节
● 对⼦表分区也可以执⾏exchange命令
● 当需要去除分区碎⽚是,可以执⾏rebuild命令,相当于删除数据之后重新插⼊

ALTER TABLE t1 REBUILD PARTITION p0, p1;

也可以执⾏OPTIMIZE命令回收分区中未使⽤的空间和重新获取统计资料

ALTER TABLE t1 OPTIMIZE PARTITION p0, p1;

Analyzing partitions命令重新收集分区统计资料
ALTER TABLE t1 ANALYZE PARTITION p3;

Repairing partitions命令修复异常的分区
ALTER TABLE t1 REPAIR PARTITION p0,p1;

Checking partitions命令检查分区中数据或者索引数据是否损坏
ALTER TABLE t1 CHECK PARTITION p1;

● ALTER TABLE ... TRUNCATE PARTITION命令⽤来删除分区中的所有数据
● 获取表的分区信息有如下⼏种:

show create table e;

show table status like 'e';

通过information_schema.partitions系统表来查看分区表的具体信息
select * from information_schema.partitions where table_name='e';

MySQL表分区修剪

● • 表分区修剪是MySQL优化的⼀种,其核⼼就是只扫描需要的分区

CREATE TABLE t1 ( fname VARCHAR(50) NOT NULL, lname VARCHAR(50) NOT NULL,
region_code TINYINT UNSIGNED NOT NULL, dob DATE NOT NULL ) PARTITION BY
RANGE( region_code ) ( PARTITION p0 VALUES LESS THAN (64), PARTITION p1 VALUES LESS
THAN (128), PARTITION p2 VALUES LESS THAN (192), PARTITION p3 VALUES LESS THAN
MAXVALUE );

explain SELECT fname, lname, region_code, dob.
• -> FROM t1
• -> WHERE region_code > 125 AND region_code < 130;

● 此查询只需要扫描P1和P2两个分区就能得到结果,从⽽获得额外的性能提升
● 不光是select语句可以被使⽤表分区修剪,update和delete语句也可以使⽤
● 表分区选择和表分区修剪类似,只不过修剪是⾃动实现的,⽽表分区选择是现实的指定分区范围
● 表分区选择不仅⽀持select语句,也⽀持update,insert,delete等语句

CREATE TABLE employees ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, fname VARCHAR(25)
NOT NULL, lname VARCHAR(25) NOT NULL, store_id INT NOT NULL, department_id INT NOT NULL ) • PARTITION BY RANGE(id) ( PARTITION p0 VALUES LESS THAN (5),
PARTITION p1 VALUES LESS THAN (10), PARTITION p2 VALUES LESS THAN (15), PARTITION p3
VALUES LESS THAN MAXVALUE );

SELECT * FROM employees PARTITION (p1);

SELECT * FROM employees PARTITION (p0, p2);

SELECT e.id AS 'Employee ID', CONCAT(e.fname, ' ', e.lname) AS Name,
s.city AS City, d.name AS department
FROM employees AS e
JOIN stores PARTITION (p1) AS s ON e.store_id=s.id
JOIN departments PARTITION (p0) AS d ON e.department_id=d.id
ORDER BY e.lname;

DELETE FROM employees PARTITION (p0, p1) WHERE fname LIKE 'j%';

UPDATE employees PARTITION (p0) SET store_id = 2 WHERE fname = 'Jill';

MySQL表分区函数
函数名 说明
ABS()
DAYOFMONTH()
DATEDIFF()
HOUR()
MOD()
SECOND()
TO_SECOND()
YEAR()
CEILENG()
DAYOFWEEK()
EXTRACT()
MICROSECOND()
MONTH()
TIME_TO_SEC()
UNIX_TIMESTAMP()
DAY()
DAYOFYEAR()
FLOOR()
MINUTE()
QUARTER()
TO_DAYS()
WEEKDAY()
YEARWEEK()

posted @ 2024-12-16 17:01  水一RAR  阅读(9)  评论(0编辑  收藏  举报