表分区学习
1. 概述
1.1. 优点:
l 将表分区比一个表在单个磁盘或者文件系统存储能够存储更多数据
l 可以通过drop分区删除无用数据,也可以通过增加分区添加数据
l 查询可以通过分区裁剪进行优化,设置可以显示指定查找分区
l 对于聚合函数的全表查询,可以并行对分区查询,汇总得到结果
l 可以对分区指定不同存储磁盘实现更高的查询吞吐量
1.2. 分区类型:
l 范围分区
l 列表分区
l 哈希分区
l 键值分区
1.3. 时间范围分区应用
同时支持将date,datetime字段作为范围分区、列表分区的分区键:
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY RANGE( YEAR(joined) ) (
PARTITION p0 VALUES LESS THAN (1960),
PARTITION p1 VALUES LESS THAN (1970),
PARTITION p2 VALUES LESS THAN (1980),
PARTITION p3 VALUES LESS THAN (1990),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
对于mysql的分区,有一点是非常重要:无论什么类型的分区,分区在创建时都会逐一数字化,当一个新行插入分区表,通过匹配不同的值插入不同分区
分区表的分区名称区分大小写:
mysql> CREATE TABLE t2 (val INT)
-> PARTITION BY LIST(val)(
-> PARTITION mypart VALUES IN (1,3,5),
-> PARTITION MyPart VALUES IN (2,4,6)
-> );
ERROR 1488 (HY000): Duplicate partition name mypart
当指定分区数字时,分区数字只能是正整数,不能是零,也不能以0靠头,不能是类似的表达式,0.8E+01 or 6-2,decimal类型不能带有小数部分。
2. 分区类型详解
2.1. 范围分区:
2.1.1. 定义:分区包含数据在一个给定的范围内,范围应该是连续的且不重叠
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
PARTITION p0 VALUES LESS THAN (6),
PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16),
PARTITION p3 VALUES LESS THAN (21)
PARTITION p3 VALUES LESS THAN MAXVALUE
);
VALUES LESS THAN clause in the CREATE TABLE statement hat provides for all values greater than the highest value explicitly named,将超过最大值的记录都存到p3分区内。可以通过alter table语句增加新分区将p3分区内的数据做进一步分区。
2.1.2. 重点应用基于时间间隔的分区方案
l 在date、time、dateitme字段上利用函数做分区键
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATETIME NOT NULL
)
PARTITION BY RANGE( YEAR(joined) ) (
PARTITION p0 VALUES LESS THAN (1960),
PARTITION p1 VALUES LESS THAN (1970),
PARTITION p2 VALUES LESS THAN (1980),
PARTITION p3 VALUES LESS THAN (1990),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
l 基于时间戳的分区方案,其余对时间戳类型值的表达式不允许做分区key
CREATE TABLE quarterly_report_status (
report_id INT NOT NULL,
report_status VARCHAR(20) NOT NULL,
report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
PARTITION p9 VALUES LESS THAN (MAXVALUE)
);
l 范围Columns分区
CREATE TABLE members (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATETIME NOT NULL
)
PARTITION BY RANGE COLUMNS(joined) (
PARTITION p1 VALUES LESS THAN ('2015-01-01'),
PARTITION max VALUES LESS THAN MAXVALUE,
);
2.2. 列表分区
类似于范围分区,每个分区必须显示定义,每一个分区的定义和选择基于在列表中定义的值,而不是连续范围的值。分区表达式或者字段,最终返回的也是一个整数
注意:在支持事务的InnoDB中,如果使用insert语句同时插入多条记录,如果有一条失败,那么整个事务回滚,所有插入都失败。如果使用IGNORE关键字插入,那么能够插入的会成功,失败的不会插入,并且不报错;在不支持事务的MyISAM中,在错误数据之前插入的全都可以插入,错误数据之后插入的全都报错
mysql> CREATE TABLE h2 (
-> c1 INT,
-> c2 INT
-> )
-> PARTITION BY LIST(c1) (
-> PARTITION p0 VALUES IN (1, 4, 7),
-> PARTITION p1 VALUES IN (2, 5, 8)
-> );
Query OK, 0 rows affected (0.11 sec)
mysql> INSERT INTO h2 VALUES (3, 5);
ERROR 1525 (HY000): Table has no partition for value 3
mysql> TRUNCATE h2;
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM h2;
Empty set (0.00 sec)
mysql> INSERT IGNORE INTO h2 VALUES (2, 5), (6, 10), (7, 5), (3, 1), (1, 9);
Query OK, 3 rows affected (0.00 sec)
Records: 5 Duplicates: 2 Warnings: 0
mysql> SELECT * FROM h2;
+------+------+
| c1 | c2 |
+------+------+
| 7 | 5 |
| 1 | 9 |
| 2 | 5 |
+------+------+
3 rows in set (0.00 sec)
2.3. COLUMNS分区
该类型分区是范围分区和列表分区的变形,支持多个列作为分区键,这些列共同决定数据插入时匹配分区和检索时的分区裁剪。此外范围COLUMNS和列表COLUMNS分区支持非数字类型作为分区键。
支持类型如下:
l All integer types: TINYINT, SMALLINT, MEDIUMINT, INT (INTEGER), and BIGINT. (This is the
l same as with partitioning by RANGE and LIST.)
l Other numeric data types (such as DECIMAL or FLOAT) are not supported as partitioning columns.
l DATE and DATETIME.
l Columns using other data types relating to dates or times are not supported as partitioning columns.
l The following string types: CHAR, VARCHAR, BINARY, and VARBINARY.
l TEXT and BLOB columns are not supported as partitioning columns.
2.3.1. 范围COLUMNS分区:类似于范围分区,但分区键支持非数字类型,并且分区键可以由多个列组成。
分区键不支持表达式,只能是列名。指定多个列作为分区键,那么比较方式是将多个字段作为元组进行比较的,不是单独比较。
CREATE TABLE rc1 (
a INT,
b INT
)
PARTITION BY RANGE COLUMNS(a, b) (
PARTITION p0 VALUES LESS THAN (5, 12),
PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);
mysql> INSERT INTO rc1 VALUES (5,10), (5,11), (5,12);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> SELECT PARTITION_NAME,TABLE_ROWS
-> FROM INFORMATION_SCHEMA.PARTITIONS
-> WHERE TABLE_NAME = 'rc1';
+--------------+----------------+------------+
| TABLE_SCHEMA | PARTITION_NAME | TABLE_ROWS |
+--------------+----------------+------------+
| p | p0 | 2 |
| p | p1 | 1 |
+--------------+----------------+------------+
2 rows in set (0.00 sec)
mysql> SELECT (5,10) < (5,12), (5,11) < (5,12), (5,12) < (5,12);
+-----------------+-----------------+-----------------+
| (5,10) < (5,12) | (5,11) < (5,12) | (5,12) < (5,12) |
+-----------------+-----------------+-----------------+
| 1 | 1 | 0 |
+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)
作为分区键的所有列,必须严格递增,对于超出元组最大值的行,可以指定MAXVALUE作为新分区。
严格递增是按照元组的元素顺序比较,可以通过查询语句进行测试:
mysql> SELECT (5,10) < (5,12), (5,11) < (5,12), (5,12) < (5,12);
+-----------------+-----------------+-----------------+
| (5,10) < (5,12) | (5,11) < (5,12) | (5,12) < (5,12) |
+-----------------+-----------------+-----------------+
| 1 | 1 | 0 |
+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)
所以如下定义也是合法的
CREATE TABLE rc4 (
a INT,
b INT,
c INT
)
PARTITION BY RANGE COLUMNS(a,b,c) (
PARTITION p0 VALUES LESS THAN (0,25,50),
PARTITION p1 VALUES LESS THAN (10,20,100),
PARTITION p2 VALUES LESS THAN (10,30,50)
PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
);
如下定义是不合法的
mysql> CREATE TABLE rcf (
-> a INT,
-> b INT,
-> c INT
-> )
-> PARTITION BY RANGE COLUMNS(a,b,c) (
-> PARTITION p0 VALUES LESS THAN (0,25,50),
-> PARTITION p1 VALUES LESS THAN (20,20,100),
-> PARTITION p2 VALUES LESS THAN (10,30,50),
-> PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
-> );
ERROR 1493 (HY000): VALUES LESS THAN value must be strictly increasing for each partition
2.3.2. 列表COLUMNS分区:类似于列表分区,但分区键支持非数字类型,并且分区键可以由多个列组成
根据以上需求,可以创建以字符串作为分区键的列表COLUMNS分区:
CREATE TABLE customers_1 (
first_name VARCHAR(25),
last_name VARCHAR(25),
street_1 VARCHAR(30),
street_2 VARCHAR(30),
city VARCHAR(15),
renewal DATE
)
PARTITION BY LIST COLUMNS(city) (
PARTITION pRegion_1 VALUES IN('Oskarshamn', 'Högsby', 'Mönsterås'),
PARTITION pRegion_2 VALUES IN('Vimmerby', 'Hultsfred', 'Västervik'),
PARTITION pRegion_3 VALUES IN('Nässjö', 'Eksjö', 'Vetlanda'),
PARTITION pRegion_4 VALUES IN('Uppvidinge', 'Alvesta', 'Växjo')
);
2.4. 分区表对于NULL值处理
MySQL允许NULL作为分区值,对于不同的分区类型对NULL的处理不一样
l 范围分区分区键的值为NULL的记录会被插入到最低的分区内
l 列表分区分区键的值为NULL的记录时,如果显示定了NULL为一个列表分区,则插入该分区,否则无法插入
l 哈希和键值分区会将NULL当做0来处理
3. 分区管理
5.6支持多种方式修改分区表,包括添加、删除、重定义、合并、拆分等操作
对于分区的添加删除比较简单,不给出具体说明和示例,参考语法图即可。
语法图:
ALTER [IGNORE] TABLE tbl_name
| ADD PARTITION (partition_definition)
| DROP PARTITION partition_names
| TRUNCATE PARTITION {partition_names | ALL}
| COALESCE PARTITION number
| REORGANIZE PARTITION partition_names INTO (partition_definitions)
| EXCHANGE PARTITION partition_name WITH TABLE tbl_name
| ANALYZE PARTITION {partition_names | ALL}
| CHECK PARTITION {partition_names | ALL}
| OPTIMIZE PARTITION {partition_names | ALL}
| REBUILD PARTITION {partition_names | ALL}
| REPAIR PARTITION {partition_names | ALL}
| REMOVE PARTITIONING
partition_definition:
PARTITION partition_name
[VALUES
{LESS THAN {(expr | value_list) | MAXVALUE}
|
IN (value_list)}]
[[STORAGE] ENGINE [=] engine_name]
[COMMENT [=] 'comment_text' ]
[DATA DIRECTORY [=] 'data_dir']
[INDEX DIRECTORY [=] 'index_dir']
[MAX_ROWS [=] max_number_of_rows]
[MIN_ROWS [=] min_number_of_rows]
[TABLESPACE [=] tablespace_name]
[NODEGROUP [=] node_group_id]
[(subpartition_definition [, subpartition_definition] ...)]
3.1. 将非分区表转换成分区表,该操作实质是删除原表并按照分区表方式重建,大数据量表慎用
CREATE TABLE trb3 (id INT, name VARCHAR(50), purchased DATE)
PARTITION BY RANGE( YEAR(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990),
PARTITION p1 VALUES LESS THAN (1995),
PARTITION p2 VALUES LESS THAN (2000),
PARTITION p3 VALUES LESS THAN (2005)
);
ALTER TABLE trb3 PARTITION BY KEY(id) PARTITIONS 2;
3.2. 列表、范围分区管理:
范围分区时,可以增加一个大于最大分区的分区,但不可以在已存在分区或者最小分区之前增加分区。
mysql> ALTER TABLE members
> ADD PARTITION (
> PARTITION n VALUES LESS THAN (1960));
ERROR 1463 (HY000): VALUES LESS THAN value must be strictly »
increasing for each partition
可以通过重组第一个分区为两个新的范围分区:
mysql> ALTER TABLE members
> ADD PARTITION (
> PARTITION n VALUES LESS THAN (1960));
ERROR 1463 (HY000): VALUES LESS THAN value must be strictly »
increasing for each partition
同样列表分区也不能增加一个包含已有分区值的分区
CREATE TABLE tt (
id INT,
data INT
)
PARTITION BY LIST(data) (
PARTITION p0 VALUES IN (5, 10, 15),
PARTITION p1 VALUES IN (6, 12, 18)
);
mysql> ALTER TABLE tt ADD PARTITION
> (PARTITION np VALUES IN (4, 8, 12));
ERROR 1465 (HY000): Multiple definition of same constant »
in list partitioning
合并相邻分区:
ALTER TABLE members REORGANIZE PARTITION s0,s1 INTO (
PARTITION p0 VALUES LESS THAN (1970)
);
将多个分区合并为两个分区:
ALTER TABLE members REORGANIZE PARTITION p0,p1,p2,p3 INTO (
PARTITION m0 VALUES LESS THAN (1980),
PARTITION m1 VALUES LESS THAN (2000)
);
ALTER TABLE tt ADD PARTITION (PARTITION np VALUES IN (4, 8));
ALTER TABLE tt REORGANIZE PARTITION p1,np INTO (
PARTITION p1 VALUES IN (6, 18),
PARTITION np VALUES in (4, 8, 12)
);
关于列表分区和范围分区维护的要点:
l 新增分区不能和已有范围或者值重叠
l 合并分区应该考虑能够覆盖原有分区的范围或者值
l 合并范围分区只可以合并相邻分区,不能跨区合并
l 不能通过reorganize parition 语句改变表的分区类型,也不能改变分区键
3.3. 哈希和键值分区管理
可以使用ALTER TABLE ... COALESCE PARTITION语句合并分区
例如,将哈希分区表clients原12个分区减少为4个分区
CREATE TABLE clients (
id INT,
fname VARCHAR(30),
lname VARCHAR(30),
signed DATE
)
PARTITION BY HASH( MONTH(signed) )
PARTITIONS 12;
mysql> ALTER TABLE clients COALESCE PARTITION 4;
Query OK, 0 rows affected (0.02 sec)
将线性键值分区表clients原12分区减少为4个
mysql> CREATE TABLE clients_lk (
-> id INT,
-> fname VARCHAR(30),
-> lname VARCHAR(30),
-> signed DATE
-> )
-> PARTITION BY LINEAR KEY(signed)
-> PARTITIONS 12;
Query OK, 0 rows affected (0.03 sec)
mysql> ALTER TABLE clients_lk COALESCE PARTITION 4;
Query OK, 0 rows affected (0.06 sec)
Records: 0 Duplicates: 0 Warnings: 0
对于哈希分区和键值分区,合并表意味着减少分区,增加分区可以使用ALTER TABLE xxx ADD PARTITION:
ALTER TABLE clients ADD PARTITION PARTITIONS 6;
交换分区:
可以使用ALTER TABLE pt EXCHANGE PARTITION p WITH TABLE nt语句进行分区交换,但需要保证如下条件成立:
l nt表不是pt表的分区
l nt表不是临时表
l pt表和nt表的结构一致
l nt表不包括任何外键约束,其他表也不能有外键引用nt表
l nt表数据不能落在p分区范围外
l 必须有alter,insert,create,drop的权限
需要注意的是:
l 执行分区交换不会出发任何触发器,因为他是DDL操作
l 自增列将被重置
l 当使用交换分区语句时,ignore关键字无效
示例:
创建分区表
CREATE TABLE e (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30)
)
PARTITION BY RANGE (id) (
PARTITION p0 VALUES LESS THAN (50),
PARTITION p1 VALUES LESS THAN (100),
PARTITION p2 VALUES LESS THAN (150),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
INSERT INTO e VALUES
(1669, "Jim", "Smith"),
(337, "Mary", "Jones"),
(16, "Frank", "White"),
(2005, "Linda", "Black");
创建表e2当做交换表
mysql> CREATE TABLE e2 LIKE e;
Query OK, 0 rows affected (1.34 sec)
mysql> ALTER TABLE e2 REMOVE PARTITIONING;
Query OK, 0 rows affected (0.90 sec)
Records: 0 Duplicates: 0 Warnings: 0
查询分区表信息:
mysql> SELECT PARTITION_NAME, TABLE_ROWS
-> FROM INFORMATION_SCHEMA.PARTITIONS
-> WHERE TABLE_NAME = 'e';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0 | 1 |
| p1 | 0 |
| p2 | 0 |
| p3 | 3 |
+----------------+------------+
执行分区交换:
mysql> ALTER TABLE e EXCHANGE PARTITION p0 WITH TABLE e2;
Query OK, 0 rows affected (0.28 sec)
查询分区信息,p0分区的数据变为0
mysql> SELECT PARTITION_NAME, TABLE_ROWS
-> FROM INFORMATION_SCHEMA.PARTITIONS
-> WHERE TABLE_NAME = 'e';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0 | 0 |
| p1 | 0 |
| p2 | 0 |
| p3 | 3 |
+----------------+------------+
4 rows in set (0.00 sec)
3.4. 重建分区:
实质类似于drop分区,然后将全部数据插入分区,目的是减少碎片。
示例:
ALTER TABLE t1 REBUILD PARTITION p0, p1;
3.5. 优化分区:
当分区内执行了大数据量delete操作,或者对有可变长度类型的行执行大量修改,可以使用该回收未使用空间和减少碎片
ALTER TABLE t1 OPTIMIZE PARTITION p0, p1;
和表优化操作一样,执行分区优化相当于在该分区上执行了CHECK PARTITION ,ANALYZE PARTITION , and REPAIR PARTITION
注意:一些存储引擎(包括InnoDB)并不支持分区优化,在5.6.9之后运行分区优化会重建整个表,一定要慎用,可以使用ALTER TABLE ...REBUILD PARTITION和ALTER TABLE ... ANALYZE PARTITION替代
示例:
ALTER TABLE t1 ANALYZE PARTITION p1;
ALTER TABLE trb3 CHECK PARTITION p1;
ALTER TABLE t1 REPAIR PARTITION p1;
3.6. 截断分区:
会删除整个分区的数据,由于是DDL操作,只产生少量日志,速度更快。
示例:
ALTER TABLE t1 TRUNCATE PARTITION p1;
3.7. 获取分区信息
l 使用SHOW CREATE TABLE语句查看创建分区表脚本
l 使用SHOW TABLE STATUS语句判断一个表是否是分区表,Create_options字段标注为partitioned
l 查询INFORMATION_SCHEMA.PARTITIONS获取分区信息
l 使用EXPLAIN PARTITIONS SELECT看查询语句是否使用分区裁剪
示例,查看分区裁剪信息
CREATE TABLE trb1 (id INT, name VARCHAR(50), purchased DATE)
PARTITION BY RANGE(id)
(
PARTITION p0 VALUES LESS THAN (3),
PARTITION p1 VALUES LESS THAN (7),
PARTITION p2 VALUES LESS THAN (9),
PARTITION p3 VALUES LESS THAN (11)
);
INSERT INTO trb1 VALUES
(1, 'desk organiser', '2003-10-15'),
(2, 'CD player', '1993-11-05'),
(3, 'TV set', '1996-03-10'),
(4, 'bookcase', '1982-01-10'),
(5, 'exercise bike', '2004-05-09'),
(6, 'sofa', '1987-06-05'),
(7, 'popcorn maker', '2001-11-22'),
(8, 'aquarium', '1992-08-04'),
(9, 'study desk', '1984-09-16'),
(10, 'lava lamp', '1998-12-25');
mysql> EXPLAIN PARTITIONS SELECT * FROM trb1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: trb1
partitions: p0,p1,p2,p3
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 10
Extra: Using filesort
mysql> EXPLAIN PARTITIONS SELECT * FROM trb1 WHERE id < 5\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: trb1
partitions: p0,p1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 10
Extra: Using where
4. 分区裁剪
4.1. 自动分区裁剪
分区裁剪的概念很简单,就是没有匹配的记录就不扫描该分区,可以比通过全表或全分区扫描话费更少的时间,这一切都是优化器后台执行的,开发人员不需要关心。在查询时,MyISAM分区表由于设计原因,所有分区文件描述将会被打开。所以,使用MyISAM分区表的时候,要考虑文件打卡数量的限制。InnoDB不存在这样的问题。
使用分区裁剪的情况,适用于SELECT,DELETE,UPDATGE语句
l partition_column = constant(在特定情况下,也可以是<,>,<=,>=,<>,between and,是否能够分区裁剪这取决于优化器的分析结果)
l partition_column IN ( constant1, constant2, ..., constantN)
在时间范围分区中,分区键是时间函数表达式也可以进行分区裁剪:
示例1:范围分区的分区裁剪
CREATE TABLE t2 (
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
region_code TINYINT UNSIGNED NOT NULL,
dob DATE NOT NULL
)
PARTITION BY RANGE( YEAR(dob) ) (
PARTITION d0 VALUES LESS THAN (1970),
PARTITION d1 VALUES LESS THAN (1975),
PARTITION d2 VALUES LESS THAN (1980),
PARTITION d3 VALUES LESS THAN (1985),
PARTITION d4 VALUES LESS THAN (1990),
PARTITION d5 VALUES LESS THAN (2000),
PARTITION d6 VALUES LESS THAN (2005),
PARTITION d7 VALUES LESS THAN MAXVALUE
);
以下查询均可以用到分区裁剪:
SELECT * FROM t2 WHERE dob = '1982-06-23';
UPDATE t2 SET region_code = 8 WHERE dob BETWEEN '1991-02-15' AND '1997-04-25';
DELETE FROM t2 WHERE dob >= '1984-06-21' AND dob <= '1999-06-21'
示例2:列表分区的分区裁剪
CREATE TABLE t3 (
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
region_code TINYINT UNSIGNED NOT NULL,
dob DATE NOT NULL
)
PARTITION BY LIST(region_code) (
PARTITION r0 VALUES IN (1, 3),
PARTITION r1 VALUES IN (2, 5, 8),
PARTITION r2 VALUES IN (4, 9),
PARTITION r3 VALUES IN (6, 7, 10)
);
以下查询均可以用到分区裁剪:
select * from t3 where region_code =3
select * from t3 where region_code in(1,2,3)
select * from te where region_code between 1 and 3
示例3:哈希或者键值分区的分区裁剪
CREATE TABLE t4 (
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
region_code TINYINT UNSIGNED NOT NULL,
dob DATE NOT NULL
)
PARTITION BY KEY(region_code)
PARTITIONS 8;
以下均查询可以用到分区裁剪:
select * from t4 where region_code=7
select * from t4 where region_code > 2 and region_code < 6
select * from t4 between 3 and 5
4.2. 显示的指定分区
显示指定分区适用于如下语句:
l SELECT
l DELETE
l INSERT
l REPLACE
l UPDATE
l LOAD DATA.
l LOAD XML.
语法:
PARTITION ( partition_names)
partition_names :
partition_name , ...
示例:
CREATE TABLE employees (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
fname VARCHAR(25) NOT NULL,
lname VARCHAR(25) NOT NULL,
store_id INT NOT NULL,
department_id INT NOT NULL
)
PARTITION BY RANGE(id) (
PARTITION p0 VALUES LESS THAN (5),
PARTITION p1 VALUES LESS THAN (10),
PARTITION p2 VALUES LESS THAN (15),
PARTITION p3 VALUES LESS THAN MAXVALUE
);
INSERT INTO employees VALUES
('', 'Bob', 'Taylor', 3, 2), ('', 'Frank', 'Williams', 1, 2),
('', 'Ellen', 'Johnson', 3, 4), ('', 'Jim', 'Smith', 2, 4),
('', 'Mary', 'Jones', 1, 1), ('', 'Linda', 'Black', 2, 3),
('', 'Ed', 'Jones', 2, 1), ('', 'June', 'Wilson', 3, 1),
('', 'Andy', 'Smith', 1, 3), ('', 'Lou', 'Waters', 2, 4),
('', 'Jill', 'Stone', 1, 4), ('', 'Roger', 'White', 3, 2),
('', 'Howard', 'Andrews', 1, 2), ('', 'Fred', 'Goldberg', 3, 3),
('', 'Barbara', 'Brown', 2, 3), ('', 'Alice', 'Rogers', 2, 2),
('', 'Mark', 'Morgan', 3, 3), ('', 'Karen', 'Cole', 3, 2);
mysql> SELECT * FROM employees PARTITION (p1);
+----+-------+--------+----------+---------------+
| id | fname | lname | store_id | department_id |
+----+-------+--------+----------+---------------+
| 5 | Mary | Jones | 1 | 1 |
| 6 | Linda | Black | 2 | 3 |
| 7 | Ed | Jones | 2 | 1 |
| 8 | June | Wilson | 3 | 1 |
| 9 | Andy | Smith | 1 | 3 |
+----+-------+--------+----------+---------------+
5 rows in set (0.00 sec)
注意:对于hash分区和线性分区表,没法在语句中指定分区名称的,MySQL会自动指定名称为p0,p1,p2,...,pN-1,N为定义的分区个数。如果还有自分区,那么自动指定子分区为pX(X为父分区编号),那么p1子分区的编号为p1sp1,p1sp2,...,p1spM-1,M为定义的子分区个数。
示例:
mysql> CREATE TABLE employees_sub (
-> id INT NOT NULL AUTO_INCREMENT,
-> fname VARCHAR(25) NOT NULL,
-> lname VARCHAR(25) NOT NULL,
-> store_id INT NOT NULL,
-> department_id INT NOT NULL,
-> PRIMARY KEY pk (id, lname)
-> )
-> PARTITION BY RANGE(id)
-> SUBPARTITION BY KEY (lname)
-> SUBPARTITIONS 2 (
-> PARTITION p0 VALUES LESS THAN (5),
-> PARTITION p1 VALUES LESS THAN (10),
-> PARTITION p2 VALUES LESS THAN (15),
-> PARTITION p3 VALUES LESS THAN MAXVALUE
-> );
Query OK, 0 rows affected (1.14 sec)
mysql> INSERT INTO employees_sub # re-use data in employees table
-> SELECT * FROM employees;
Query OK, 18 rows affected (0.09 sec)
Records: 18 Duplicates: 0 Warnings: 0
mysql> SELECT id, CONCAT(fname, ' ', lname) AS name
-> FROM employees_sub PARTITION (p2sp1);
+----+---------------+
| id | name |
+----+---------------+
| 10 | Lou Waters |
| 14 | Fred Goldberg |
+----+---------------+
2 rows in set (0.00 sec)
指定分区也可以用于关联:
mysql> SELECT
-> e.id AS 'Employee ID', CONCAT(e.fname, ' ', e.lname) AS Name,
-> s.city AS City, d.name AS department
-> FROM employees AS e
-> JOIN stores PARTITION (p1) AS s ON e.store_id=s.id
-> JOIN departments PARTITION (p0) AS d ON e.department_id=d.id
-> ORDER BY e.lname;
+-------------+---------------+-----------+------------+
| Employee ID | Name | City | department |
+-------------+---------------+-----------+------------+
| 14 | Fred Goldberg | Bellingen | Delivery |
| 5 | Mary Jones | Nambucca | Sales |
| 17 | Mark Morgan | Bellingen | Delivery |
| 9 | Andy Smith | Nambucca | Delivery |
| 8 | June Wilson | Bellingen | Sales |
+-------------+---------------+-----------+------------+
5 rows in set (0.00 sec)
5. 分区限制和特性
5.1. 分区的限制
l 自定义函数不可以用作分区表达式
l 用户变量不可以用于分区表达式
l 算数运算符/(除)不可以用于分区表达式
l 未操作运算符不可以用于分区表达式
l handler语句不支持分区表
l InnoDB的分区表不支持外键
l 不支持全文索引,空间类型列,ENUM类型
l 临时表、日志不能定义为分区表
l 自分区只能是哈希和键值分区
l 不支持DELAYED
l SQL mode的改变可能会引起用户自定义的的分区表崩溃或者丢失数据
l 唯一建、主键必须包含分区字段
5.2. 基于分区表的优化问题
l 创建和重建分区操作和文件系统有关,比如文件系统的类型、磁盘速度、交换分区大小等。在做这些操作时,确保large_files_support开启、open_files_limit设置较大
l MyISAM表需要打开更多的文件,所以如果分区表是MyISAM的,请设置合理的open_files_limit。
l 当对分区进行DDL操作时,会产生表锁,查询可以并发,但DML会阻塞到分区操作完成
l MyISAM分区表的性能要优于InnoDB
l 利用索引和分区裁剪能够大大提高查询效率,需要注意ICP(Index Condition Pushdown)不能应用于分区表
l 分区表能够加速LOAD DATA操作,每个分区需要130K的缓存用于此加速,分区表过多时会占用大量内存
l 5.6.7之前默认支持最大的分区数是1024,之后是8192,实际决定分区多少还需要考虑open_files_limit和系统本身支持也有关系。对于有几百个分区的表,往往不是一个很好的设计
l 不支持查询缓存,支持预存分区索引
5.3. 分区和锁
DML语句:
l update语句会在更新非分区字段时进行分区裁剪
l replace和insert会锁住插入行所在的分区,但如果是由自增列生成的分区则会锁住所有分区
l INSERT ... ON DUPLICATE KEY UPDATE如果非分区列更新将会分区裁剪
l INSERT...SELECT只锁那些需要读取的分区
DDL语句:
l CREATE VIEW将不会引起任何锁
l ALTER TABLE ... EXCHANGE PARTITION会进行分区裁剪
l ALTER TABLE ... TRUNCATE PARTITION会进行分区裁剪
l ALTER TABLE操作会产生表级别的metadata锁