公司线上在用partition,有一个表的分区字段错了,需要重建,结果发现没有办法像修改主键字段或者修改索引字段那样直接一条sql搞定。而是需要建临时表,有down time,所以去仔细看了文档,研究下partition的细节问题。

自己公司线上采取的时候,凌晨1点业务低峰期,执行:

建立临时表

CREATE TABLE tbname_TMP (
    SHARD_ID INT NOT NULL,
    ...

    xxx_DATE DATETIME NOT NULL,
    PRIMARY KEY (xxx_DATE,shard_id)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
PARTITION BY LIST(MONTH(xxx_DATE)) (
    PARTITION m1 VALUES IN (1),
    PARTITION m2 VALUES IN (2),
    PARTITION m3 VALUES IN (3),
    PARTITION m4 VALUES IN (4),
    PARTITION m5 VALUES IN (5),
    PARTITION m6 VALUES IN (6),
    PARTITION m7 VALUES IN (7),
    PARTITION m8 VALUES IN (8),
    PARTITION m9 VALUES IN (9),
    PARTITION m10 VALUES IN (10),
    PARTITION m11 VALUES IN (11),
    PARTITION m12 VALUES IN (12)
);

切换表名字,修改表结构

RENAME TABLE xxx TO xxx_DELETED, xxx_TMP TO xxx;

导入原始数据

insert into xxx select * from xxx_DELETEDxxx_DELETED;

OK,一切搞定,整个过程50分钟,MMM failover切换中后outline操作表结构变更以及数据导入,实际downtime不包括修改表结构分区字段的时间,只包括failover切换时间 为30秒

MySQL Partition,看的官方英文资料,翻译水平有限,有些不翻译成中文了,直接贴英文了。
1 list partition table
mysql> CREATE TABLE `eh` (
    ->   `id` int(11) NOT NULL,
    ->   `ENTITLEMENT_HIST_ID` bigint(20) NOT NULL,
    ->   `ENTITLEMENT_ID` bigint(20) NOT NULL,
    ->   `USER_ID` bigint(20) NOT NULL,
    ->   `DATE_CREATED` datetime NOT NULL,
    ->   `STATUS` smallint(6) NOT NULL,
    ->   `CREATED_BY` varchar(32) COLLATE utf8_bin DEFAULT NULL,
    ->   `MODIFIED_BY` varchar(32) COLLATE utf8_bin DEFAULT NULL,
    ->   `DATE_MODIFIED` datetime NOT NULL,
    ->   PRIMARY KEY (`DATE_MODIFIED`,`id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
    -> /*!50100 PARTITION BY LIST (MONTH(DATE_MODIFIED))
    -> (PARTITION m1 VALUES IN (1) ENGINE = InnoDB,
    ->  PARTITION m2 VALUES IN (2) ENGINE = InnoDB,
    ->  PARTITION m3 VALUES IN (3) ENGINE = InnoDB,
    ->  PARTITION m4 VALUES IN (4) ENGINE = InnoDB,
    ->  PARTITION m5 VALUES IN (5) ENGINE = InnoDB,
    ->  PARTITION m6 VALUES IN (6) ENGINE = InnoDB,
    ->  PARTITION m7 VALUES IN (7) ENGINE = InnoDB,
    ->  PARTITION m8 VALUES IN (8) ENGINE = InnoDB,
    ->  PARTITION m9 VALUES IN (9) ENGINE = InnoDB,
    ->  PARTITION m10 VALUES IN (10) ENGINE = InnoDB,
    ->  PARTITION m11 VALUES IN (11) ENGINE = InnoDB,
    ->  PARTITION m12 VALUES IN (12) ENGINE = InnoDB) */;
Query OK, 0 rows affected (0.10 sec)


2 rang partition table
mysql> CREATE TABLE rcx (
    ->     a INT,
    ->     b INT,
    ->     c CHAR(3),
    ->     d INT
    -> )
    -> PARTITION BY RANGE COLUMNS(a,d,c) (
    ->     PARTITION p0 VALUES LESS THAN (5,10,'ggg'),
    ->     PARTITION p1 VALUES LESS THAN (10,20,'mmmm'),
    ->     PARTITION p2 VALUES LESS THAN (15,30,'sss'),
    ->     PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
    -> );
Query OK, 0 rows affected (0.15 sec)

 

3 create range use less character
CREATE TABLE employees_by_lname (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE COLUMNS (lname)  (
    PARTITION p0 VALUES LESS THAN ('g'),
    PARTITION p1 VALUES LESS THAN ('m'),
    PARTITION p2 VALUES LESS THAN ('t'),
    PARTITION p3 VALUES LESS THAN (MAXVALUE)
);

alter table structure,add a new partition block
ALTER TABLE employees_by_lname PARTITION BY RANGE COLUMNS (lname)  (
    PARTITION p0 VALUES LESS THAN ('g'),
    PARTITION p1 VALUES LESS THAN ('m'),
    PARTITION p2 VALUES LESS THAN ('t'),
 PARTITION p3 VALUES LESS THAN ('u'),
    PARTITION p4 VALUES LESS THAN (MAXVALUE)
);


4 List columns partitioning
character column
CREATE TABLE customers_1 (
    first_name VARCHAR(25),
    last_name VARCHAR(25),
    street_1 VARCHAR(30),
    street_2 VARCHAR(30),
    city VARCHAR(15),
    renewal DATE
)
PARTITION BY LIST COLUMNS(city) (
    PARTITION pRegion_1 VALUES IN('Oskarshamn', 'H?gsby', 'M?nster?s'),
    PARTITION pRegion_2 VALUES IN('Vimmerby', 'Hultsfred', 'V?stervik'),
    PARTITION pRegion_3 VALUES IN('N?ssj?', 'Eksj?', 'Vetlanda'),
    PARTITION pRegion_4 VALUES IN('Uppvidinge', 'Alvesta', 'V?xjo')
);

date column
CREATE TABLE customers_2 (
    first_name VARCHAR(25),
    last_name VARCHAR(25),
    street_1 VARCHAR(30),
    street_2 VARCHAR(30),
    city VARCHAR(15),
    renewal DATE
)
PARTITION BY LIST COLUMNS(renewal) (
    PARTITION pWeek_1 VALUES IN('2010-02-01', '2010-02-02', '2010-02-03',
        '2010-02-04', '2010-02-05', '2010-02-06', '2010-02-07'),
    PARTITION pWeek_2 VALUES IN('2010-02-08', '2010-02-09', '2010-02-10',
        '2010-02-11', '2010-02-12', '2010-02-13', '2010-02-14'),
    PARTITION pWeek_3 VALUES IN('2010-02-15', '2010-02-16', '2010-02-17',
        '2010-02-18', '2010-02-19', '2010-02-20', '2010-02-21'),
    PARTITION pWeek_4 VALUES IN('2010-02-22', '2010-02-23', '2010-02-24',
        '2010-02-25', '2010-02-26', '2010-02-27', '2010-02-28')
);

5 HASH Partitioning
 int column,it can use digital function
CREATE TABLE employeesint (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(MOD(store_id,4))
PARTITIONS 4;

If you do not include a PARTITIONS clause, the number of partitions defaults to 1. as below:
CREATE TABLE employeestest (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(store_id);

 date colum
CREATE TABLE employees2 (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;

truncate all data rows:  alter table rcx truncate PARTITION;

 

6 LINEAR HASH Partitioning
CREATE TABLE employees_linear (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY LINEAR HASH( YEAR(hired) )
PARTITIONS 4;

Given an expression expr, the partition in which the record is stored when linear hashing is used is partition number N from among num partitions, where N is derived according to the following algorithm:
(1)  Find the next power of 2 greater than num. We call this value V; it can be calculated as:
     V = POWER(2, CEILING(LOG(2, num)))
     (Suppose that num is 13. Then LOG(2,13) is 3.7004397181411. CEILING(3.7004397181411) is 4, and V = POWER(2,4), which is 16.)
(2) Set N = F(column_list) & (V - 1).
(3)  While N >= num:
    Set V = CEIL(V / 2)
    Set N = N & (V - 1)


 [注释] & 在SQL里面的计算原理为:比如
 把十进制转化进制成二进制,就得到了 http://zh.wikipedia.org/wiki/%E4%BA%8C%E8%BF%9B%E5%88%B6


 首先按右对齐,例如变成0011和1000,按照每一位的数字来判断,如果两个都是1,则结果的相应位置就是1,否则就是0
 如果是1011和1000,结果就是1000
 如果是0110和1010,则结果就是0010
 但是3是0011,8 是1000,所以3&8结果就是0

CEILING(X) CEIL(X): 返回不小于X 的最小整数值。
LOG(X) LOG(B,X) :若用一个参数调用,这个函数就会返回X 的自然对数。
POWER(X,Y) : 返回X 的Y乘方的结果值。


数据分布在哪个片区的计算方法:
Suppose that the table t1, using linear hash partitioning and having 6 partitions, is created using this statement:

CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
    PARTITION BY LINEAR HASH( YEAR(col3) )
    PARTITIONS 6;

Now assume that you want to insert two records into t1 having the col3 column values '2003-04-14' and '1998-10-19'. The partition number for the first of these is determined as follows:
 V = POWER(2, CEILING( LOG(2,6) )) = 8
 N = YEAR('2003-04-14') & (8 - 1)
    = 2003 & 7
    = 3
 (3 >= 6 is FALSE: record stored in partition #3)

The number of the partition where the second record is stored is calculated as shown here:
 V = 8
 N = YEAR('1998-10-19') & (8-1)
   = 1998 & 7
   = 6

 (6 >= 6 is TRUE: additional step required)

 N = 6 & CEILING(8 / 2)
   = 6 & 3
   = 2

 (2 >= 6 is FALSE: record stored in partition #2)


The advantage in partitioning by linear hash is that the adding, dropping, merging, and splitting of partitions is made much faster, which can be beneficial when dealing with tables containing extremely large amounts (terabytes) of data. The disadvantage is that data is less likely to be evenly distributed between partitions as compared with the distribution obtained using regular hash partitioning.

疑问之一:MySQL 如何用一条SQL,不需要用临时表来删除分区字段?将分区表变成普通的表?

 

Copyright © 2024 冰天雪域
Powered by .NET 9.0 on Kubernetes