MYSQL之表分区----按日期分区

错误的按日期分区例子
最直观的方法,就是直接用年月日这种日期格式来进行常规的分区:
PLAIN TEXT
CODE:
  1.  
    mysql> create table rms (d date)
  2.  
    -> partition by range (d)
  3.  
    -> (partition p0 values less than ('1995-01-01'),
  4.  
    -> partition p1 VALUES LESS THAN ('2010-01-01'));


 
上面的例子中,就是直接用"Y-m-d"的格式来对一个table进行分区,可惜想当然往往不能奏效,会得到一个错误信息:
ERROR 1064 (42000): VALUES value must be of same type as partition function near '),
partition p1 VALUES LESS THAN ('2010-01-01'))' at line 3
上述分区方式没有成功,而且明显的不经济,老练的DBA会用整型数值来进行分区:
PLAIN TEXT
CODE:
  1.  
    mysql> CREATE TABLE part_date1
  2.  
    -> ( c1 int default NULL,
  3.  
    -> c2 varchar(30) default NULL,
  4.  
    -> c3 date default NULL) engine=myisam
  5.  
    -> partition by range (cast(date_format(c3,'%Y%m%d') as signed))
  6.  
    -> (PARTITION p0 VALUES LESS THAN (19950101),
  7.  
    -> PARTITION p1 VALUES LESS THAN (19960101) ,
  8.  
    -> PARTITION p2 VALUES LESS THAN (19970101) ,
  9.  
    -> PARTITION p3 VALUES LESS THAN (19980101) ,
  10.  
    -> PARTITION p4 VALUES LESS THAN (19990101) ,
  11.  
    -> PARTITION p5 VALUES LESS THAN (20000101) ,
  12.  
    -> PARTITION p6 VALUES LESS THAN (20010101) ,
  13.  
    -> PARTITION p7 VALUES LESS THAN (20020101) ,
  14.  
    -> PARTITION p8 VALUES LESS THAN (20030101) ,
  15.  
    -> PARTITION p9 VALUES LESS THAN (20040101) ,
  16.  
    -> PARTITION p10 VALUES LESS THAN (20100101),
  17.  
    -> PARTITION p11 VALUES LESS THAN MAXVALUE );


Query OK, 0 rows affected (0.01 sec)
 
搞定?接着往下分析
PLAIN TEXT
CODE:
  1.  
    mysql> explain partitions
  2.  
    -> select count(*) from part_date1 where
  3.  
    -> c3> '1995-01-01' and c3 <'1995-12-31'\G
  4.  
    *************************** 1. row ***************************
  5.  
    id: 1
  6.  
    select_type: SIMPLE
  7.  
    table: part_date1
  8.  
    partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11
  9.  
    type: ALL
  10.  
    possible_keys: NULL
  11.  
    key: NULL
  12.  
    key_len: NULL
  13.  
    ref: NULL
  14.  
    rows: 8100000
  15.  
    Extra: Using where
  16.  
    1 row in set (0.00 sec)


 
万恶的mysql居然对上面的sql使用全表扫描,而不是按照我们的日期分区分块查询。原文中解释到MYSQL的优化器并不认这种日期形式的分区,花了大量的篇幅来引诱俺走上歧路,过分。
正确的日期分区例子
mysql优化器支持以下两种内置的日期函数进行分区:
TO_DAYS()
YEAR()
看个例子:
PLAIN TEXT
CODE:
  1.  
    mysql> CREATE TABLE part_date3
  2.  
    -> ( c1 int default NULL,
  3.  
    -> c2 varchar(30) default NULL,
  4.  
    -> c3 date default NULL) engine=myisam
  5.  
    -> partition by range (to_days(c3))
  6.  
    -> (PARTITION p0 VALUES LESS THAN (to_days('1995-01-01')),
  7.  
    -> PARTITION p1 VALUES LESS THAN (to_days('1996-01-01')) ,
  8.  
    -> PARTITION p2 VALUES LESS THAN (to_days('1997-01-01')) ,
  9.  
    -> PARTITION p3 VALUES LESS THAN (to_days('1998-01-01')) ,
  10.  
    -> PARTITION p4 VALUES LESS THAN (to_days('1999-01-01')) ,
  11.  
    -> PARTITION p5 VALUES LESS THAN (to_days('2000-01-01')) ,
  12.  
    -> PARTITION p6 VALUES LESS THAN (to_days('2001-01-01')) ,
  13.  
    -> PARTITION p7 VALUES LESS THAN (to_days('2002-01-01')) ,
  14.  
    -> PARTITION p8 VALUES LESS THAN (to_days('2003-01-01')) ,
  15.  
    -> PARTITION p9 VALUES LESS THAN (to_days('2004-01-01')) ,
  16.  
    -> PARTITION p10 VALUES LESS THAN (to_days('2010-01-01')),
  17.  
    -> PARTITION p11 VALUES LESS THAN MAXVALUE );


Query OK, 0 rows affected (0.00 sec)
 
以to_days()函数分区成功,我们分析一下看看:
PLAIN TEXT
CODE:
  1.  
    mysql> explain partitions
  2.  
    -> select count(*) from part_date3 where
  3.  
    -> c3> date '1995-01-01' and c3 <date '1995-12-31'\G
  4.  
    *************************** 1. row ***************************
  5.  
    id: 1
  6.  
    select_type: SIMPLE
  7.  
    table: part_date3
  8.  
    partitions: p1
  9.  
    type: ALL
  10.  
    possible_keys: NULL
  11.  
    key: NULL
  12.  
    key_len: NULL
  13.  
    ref: NULL
  14.  
    rows: 808431
  15.  
    Extra: Using where
  16.  
    1 row in set (0.00 sec)


 
可以看到,mysql优化器这次不负众望,仅仅在p1分区进行查询。在这种情况下查询,真的能够带来提升查询效率么?下面分别对这次建立的part_date3和之前分区失败的part_date1做一个查询对比:
PLAIN TEXT
CODE:
  1.  
    mysql> select count(*) from part_date3 where
  2.  
    -> c3> date '1995-01-01' and c3 <date '1995-12-31';
  3.  
    +----------+
  4.  
    | count(*) |
  5.  
    +----------+
  6.  
    | 805114 |
  7.  
    +----------+
  8.  
    1 row in set (4.11 sec)
  9.  
     
  10.  
    mysql> select count(*) from part_date1 where
  11.  
    -> c3> date '1995-01-01' and c3 <date '1995-12-31';
  12.  
    +----------+
  13.  
    | count(*) |
  14.  
    +----------+
  15.  
    | 805114 |
  16.  
    +----------+
  17.  
    1 row in set (40.33 sec)


 
可以看到,分区正确的话query花费时间为4秒,而分区错误则花费时间40秒(相当于没有分区),效率有90%的提升!所以我们千万要正确的使用分区功能,分区后务必用explain验证,这样才能获得真正的性能提升。


注意:
在mysql5.1中建立分区表的语句中,只能包含下列函数:
  1.  
    ABS()
  2.  
    CEILING() and FLOOR() (在使用这2个函数的建立分区表的前提是使用函数的分区键是INT类型),例如
  3.  
    mysql> CREATE TABLE t (c FLOAT) PARTITION BY LIST( FLOOR(c) )( -> PARTITION p0 VALUES IN (1,3,5), -> PARTITION p1 VALUES IN (2,4,6) -> );; ERROR 1491 (HY000): The PARTITION function returns the wrong type mysql> CREATE TABLE t (c int) PARTITION BY LIST( FLOOR(c) )( -> PARTITION p0 VALUES IN (1,3,5), -> PARTITION p1 VALUES IN (2,4,6) -> ); Query OK, 0 rows affected (0.01 sec)
  4.  
    DAY()
  5.  
    DAYOFMONTH()
  6.  
    DAYOFWEEK()
  7.  
    DAYOFYEAR()
  8.  
    DATEDIFF()
  9.  
    EXTRACT()
  10.  
    HOUR()
  11.  
    MICROSECOND()
  12.  
    MINUTE()
  13.  
    MOD()
  14.  
    MONTH()
  15.  
    QUARTER()
  16.  
    SECOND()
  17.  
    TIME_TO_SEC()
  18.  
    TO_DAYS()
  19.  
    WEEKDAY()
  20.  
    YEAR()
  21.  
    YEARWEEK()