MySQL优化器 --- index_merge

背景

  对于关系数据库中的一张表,通常来说数据页面的总大小要比较某一个索引占用的页面要大的多(上面说的索引是不包涵主键索引的);

  更进一步我们可以推导出,如果我们通过读索引就能解决问题,那么它相比读数据页来说要廉价的多;整体上看数据库会尽可能的通过

  读索引就解决问题。

 

index_merge是什么

  为了说明index_merge是什么、这里还是从一个例子开始;假设数据库存在如下内容

create table person (id int not null auto_increment primary key,
                     name varchar(8) default null,
                     age tinyint default null,
                     key idx_person_name (name),
                     key idx_person_age (age)
);

  

  表中的数据如下

select * from person;
+----+-------+------+
| id | name  | age  |
+----+-------+------+
|  1 | tom   |   16 |
|  2 | jerry |   17 |
|  3 | neeky |    3 |
+----+-------+------+
3 rows in set (0.00 sec)

  

  下面的这条SQL语句事实上可以这样做,读取idx_person_name找到name='tom'的行id,读取idx_person_age找到age=17的行id;

  给这两个id的集合做一下交集;这样就找到了所有满足条件的行id,最后回表把对应的行给查询出来;如果MySQL这样做的话

  在索引页面数理远远小于数据页面数量的情况下是有节约成功的优势的

select name,age from person where name='tom' and age=17; 

  

  事实上MySQL会不会这样干呢?对于这个还是要看执行记录比较靠普;从下面的执行计划可以看出MySQL选择了只用

  idx_person_name这一个索引,从innodb中捞到数据后在server层过滤的方式来完成查询。明显没有用到index_merge

explain select name,age from person where name='tom' and age=17;                                            
+----+-------------+--------+------------+------+--------------------------------+-----------------+---------+-------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys                  | key             | key_len | ref   | rows | filtered | Extra       |
+----+-------------+--------+------------+------+--------------------------------+-----------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | person | NULL       | ref  | idx_person_name,idx_person_age | idx_person_name | 67      | const |    1 |    33.33 | Using where |
+----+-------------+--------+------------+------+--------------------------------+-----------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec

 

  从上面的分析我们可以知道用不用index_merge优化,不光是看可不可以用理加重要的是看代价是否合理;为了让MySQL知道索引页面的数量要远远小于

  数据页面的数量,我要在表中多插入一些数据(复制执行下面的语句)

insert into person(name,age) select name,age from person; -- 执行n次
select count(*) from person;
+----------+
| count(*) |
+----------+
|   393216 |
+----------+
1 row in set (0.05 sec)

 

  在数据量差不多40w的情况下我们再看一下优化器的选择

explain select name,age from person where name='tom' and age=17; 
+----+-------------+--------+------------+-------------+--------------------------------+--------------------------------+---------+------+-------+----------+---------------------------------------------------------------------------+
| id | select_type | table  | partitions | type        | possible_keys                  | key                            | key_len | ref  | rows  | filtered | Extra                                                                     |
+----+-------------+--------+------------+-------------+--------------------------------+--------------------------------+---------+------+-------+----------+---------------------------------------------------------------------------+
|  1 | SIMPLE      | person | NULL       | index_merge | idx_person_name,idx_person_age | idx_person_name,idx_person_age | 67,2    | NULL | 98237 |   100.00 | Using intersect(idx_person_name,idx_person_age); Using where; Using index |
+----+-------------+--------+------------+-------------+--------------------------------+--------------------------------+---------+------+-------+----------+---------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

 

 

用了Index_merge优化会比没有用index_merge优化快多少呢

  1、测试启用index_merge情况下40w行数据时查询的用时

select name,age from person where name='tom' and age=17;                                                    
Empty set (0.08 sec)

  2、关闭MySQL数据库对index_merge的优化

set @@global.optimizer_switch='index_merge=off,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_indexes=off,skip_scan=on';

--: 退出重新连接(这样刚才的设置就生效了)

  3、在没有index_merge的情况下发起查询

explain select name,age from person where name='tom' and age=17; 
+----+-------------+--------+------------+------+--------------------------------+-----------------+---------+-------+--------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys                  | key             | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+--------+------------+------+--------------------------------+-----------------+---------+-------+--------+----------+-------------+
|  1 | SIMPLE      | person | NULL       | ref  | idx_person_name,idx_person_age | idx_person_name | 67      | const | 196474 |    50.00 | Using where |
+----+-------------+--------+------------+------+--------------------------------+-----------------+---------+-------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

-- 从执行计划上可以看出index_merge关闭了

select name,age from person where name='tom' and age=17;                                            
Empty set (0.34 sec)

 

总结

  对比开启和关闭index_merge,在数据量为40w这个量级的表上,开启优化相比不开有4倍以上的优化成绩。由index_merge的原理可以知在数据理更大的

  情况下优化的效果会更加明显

 

我的个人站点

  www.sqlpy.com

 

posted on 2019-01-18 14:02  蒋乐兴的技术随笔  阅读(3124)  评论(0编辑  收藏  举报

导航