索引优化
一、单表优化
建表sql
create table if not exists article (
id int unsigned not null primary key auto_increment,
author_id int unsigned not null,
category_id int unsigned not null,
views int unsigned not null,
comments int unsigned not null,
title varbinary(255) not null,
content text not null
);
insert into article(author_id, category_id, views, comments, title, content) values
(1, 1, 1, 1, '1', '1'),
(2, 2, 2, 2, '2', '2'),
(1, 1, 3, 3, '3', '3');
示例
查询category_id为1且comments大于1的情况下,views最多的article_id。
在不使用索引的情况下,执行查询语句, 查看执行计划。
mysql> explain select id,author_id from article where category_id = 1 and comments > 1 order by views desc limit 1;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | article | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; Using filesort |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set (0.02 sec)
可以很清楚的看到查询的type是all,即全表扫描,此外,还出现了filesort这种文件排序,这是不能容忍的,因此必须建立索引来解决这种情况。
首先尝试为使用到的三个字段创建索引.
create index idx_article_ccv on article(category_id, comments, views);
# 重新执行explain查看执行计划
mysql> explain select id,author_id from article where category_id = 1 and comments > 1 order by views desc limit 1;
+----+-------------+---------+------------+-------+-----------------+-----------------+---------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+-----------------+-----------------+---------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | article | NULL | range | idx_article_ccv | idx_article_ccv | 8 | NULL | 1 | 100.00 | Using index condition; Using filesort |
+----+-------------+---------+------------+-------+-----------------+-----------------+---------+------+------+----------+---------------------------------------+
1 row in set (0.02 sec)
可以看到, 虽然使用到了索引, 但是因为使用到了>
,导致后面的排序索引失效, 最后还是使用到了文件排序, 这也是不太能够接受的, 所以需要继续重新修改索引。
# 尝试删除索引, 重新建立,这次只建立分类和点击量的索引关系。
drop index idx_article_ccv on article;
create index idx_article_ccv on article(category_id, views);
# 重新执行上面的操作,发现期待的一幕出现了,不但没有文件排序, 且索引的类型还是ref,效率不错, 如下所示。
mysql> explain select id,author_id from article where category_id = 1 and comments > 1 order by views desc limit 1;
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | article | NULL | ref | idx_article_ccv | idx_article_ccv | 4 | const | 2 | 33.33 | Using where |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------------+
1 row in set (0.04 sec)
二、双表优化
建表sql
create table if not exists class (
id int unsigned not null auto_increment,
card int unsigned not null,
primary key(id)
);
create table if not exists book (
bookid int unsigned not null auto_increment primary key,
card int unsigned not null
);
insert into class(card) values
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20));
insert into book(card) values
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20));
示例:
class表左连接book表, 根据card字段
# 首先尝试非索引下的连接
mysql> explain select * from class left join book on class.card = book.card;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | class | NULL | ALL | NULL | NULL | NULL | NULL | 22 | 100.00 | NULL |
| 1 | SIMPLE | book | NULL | ALL | NULL | NULL | NULL | NULL | 32 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set (0.04 sec)
结果显而易见, 两次全表连接,效果非常差, 现在需要为该次查询建立索引来进行优化.
首先尝试在book
表的关联字段上创建索引.
# 创建索引
create index idx_book_card on book(card);
# 重新查看sql执行计划
mysql> explain select * from class left join book on class.card = book.card;
+----+-------------+-------+------------+------+---------------+---------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------------+---------+----------------+------+----------+-------------+
| 1 | SIMPLE | class | NULL | ALL | NULL | NULL | NULL | NULL | 22 | 100.00 | NULL |
| 1 | SIMPLE | book | NULL | ref | idx_book_card | idx_book_card | 4 | emp.class.card | 1 | 100.00 | Using index |
+----+-------------+-------+------------+------+---------------+---------------+---------+----------------+------+----------+-------------+
2 rows in set (0.05 sec)
结果显示优化效果很明显, 被驱动表book
被索引的行直接变为1行, 直接少了一半多. 这说明将索引添加到book
中是正确的. 那如果现在将索引天添加到驱动表class
中呢?
drop index idx_book_card on book;
create index idx_class_card on class(card);
mysql> explain select * from class left join book on class.card = book.card;
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | class | NULL | index | NULL | idx_class_card | 4 | NULL | 22 | 100.00 | Using index |
| 1 | SIMPLE | book | NULL | ALL | NULL | NULL | NULL | NULL | 32 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+----------------------------------------------------+
2 rows in set (0.05 sec)
从上面的结果可以得出结论, 如果两表使用左连接的话, 那么索引最好加到被驱动表中, 因为驱动表无论如何都会是全表扫描的, 加到被驱动表可以完美的利用到索引的优化功能. 同理右外连接也一样是添加到被驱动表中.
三、三表优化
建表sql
# 在上面双表的基础上再添加上一个表。
create table if not exists phone (
id int unsigned not null primary key auto_increment,
card int unsigned not null
);
insert into phone(card) values
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20)),
(floor(1 + rand() * 20));
示例:
使用左连接连接class, book, phone表。
mysql> explain select * from class left join book on book.card = class.card left join phone on book.card = phone.card;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | class | NULL | ALL | NULL | NULL | NULL | NULL | 22 | 100.00 | NULL |
| 1 | SIMPLE | book | NULL | ALL | NULL | NULL | NULL | NULL | 32 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | phone | NULL | ALL | NULL | NULL | NULL | NULL | 30 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
3 rows in set (0.03 sec)
同样均使用到了全表连接,接下来在被驱动的两张表中使用索引。
create index idx_book_card on book(card);
create index idx_phone_card on class(phone);
mysql> explain select * from class left join book on book.card = class.card left join phone on book.card = phone.card;
+----+-------------+-------+------------+------+----------------+----------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+----------------+----------------+---------+----------------+------+----------+-------------+
| 1 | SIMPLE | class | NULL | ALL | NULL | NULL | NULL | NULL | 22 | 100.00 | NULL |
| 1 | SIMPLE | book | NULL | ref | idx_book_card | idx_book_card | 4 | emp.class.card | 1 | 100.00 | Using index |
| 1 | SIMPLE | phone | NULL | ref | idx_phone_card | idx_phone_card | 4 | emp.book.card | 2 | 100.00 | Using index |
+----+-------------+-------+------------+------+----------------+----------------+---------+----------------+------+----------+-------------+
3 rows in set (0.04 sec)
结果基本能够满足要求。
四、join优化小总结
- 尽量减少join连接内层循环的次数, 永远用“小表驱动大表的结果集”。
- 优先优化join连接的内层循环
- 保证join语句中被驱动表的join字段已经用上了索引。
- 当无法保证被驱动表的Join条件字段内存是否充足的前提下, 加大Join Buffer的设置。
五、索引优化的总结
这里仍然以一个例子来进行说明。
建表sql:
简要说明:创建你staffs表, 为name,age,pos三个字段创建一个联合索引。
create table if not exists staffs (
id int primary key auto_increment,
name varchar(24) not null default '' comment '姓名',
age int not null default 0 comment '年龄',
pos varchar(20) not null default '' comment '职位',
add_time timestamp not null default current_timestamp comment '入职时间'
)charset utf8mb4 comment '员工记录表';
desc staffs;
insert into staffs(name, age , pos, add_time) values
('z3', 22, 'manageer', now()),
('July', 23, 'dev', now()),
('2000', 23, 'dev', now());
select * from staffs;
alter table staffs add index idx_staffs_nameAgePos(name, age, pos);
优化
1 全值匹配
使用全值匹配能够完美的使用到索引,效率也非常好,此外联合索引中多个索引调换顺序不会影响索引的使用,在优化器那会做一层优化。
mysql> explain select * from staffs where name = 'July';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98 | const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set (0.04 sec)
mysql> explain select * from staffs where name = 'July' and age = 22;
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 102 | const,const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
1 row in set (0.04 sec)
# 调换顺序不会影响索引。
mysql> explain select * from staffs where age = 22 and name = 'Joly';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 102 | const,const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
1 row in set (0.03 sec)
2 最佳左前缀法则
如果索引了多列,要遵守最左前缀法则。指的是查询从联合索引的最左前列开始并且不跳过索引中的列。如果跳过了中间列, 则会导致索引失效。
# 正常使用了索引
mysql> explain select * from staffs where age = 22 and name = 'Joly' and pos = 'a';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 184 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
1 row in set (0.04 sec)
# 丢失索引
mysql> explain select * from staffs where age = 22 and pos = 'a';
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.05 sec)
上面第二种情况直接走了全表扫描, 因为mysql
是按照联合索引定义的顺序在索引树上排序的,现在丢失了第一个排序手段name
,mysql
就不知道从哪里开始找起,因此直接走全表。
下面是一种只用到了部分索引,导致后续索引失效的情况。
mysql> explain select * from staffs where name = 'zhang' and pos = 'a';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98 | const | 1 | 33.33 | Using index condition |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
1 row in set (0.03 sec)
从ref这个字段的结果可以看出来, 这里只用到一个索引,后面那个索引并没有起到作用。
3 不在索引列上做任何操作
不要在索引列上做任何操作(计算、函数、(自动、手动)类型转换),会导致索引失效而转向全表扫描。
例如下面两种本质上是相同的操作,但是第二种方式直接导致扫描全表。
# 第一种
mysql> explain select * from staffs where name = 'zhang';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98 | const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set (0.05 sec)
# 第二种
mysql> explain select * from staffs where concat(name, '') = 'zhang';
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.04 sec)
4 存储引擎不能使用索引中范围条件右边的列
索引中包含比较运算符, 那么除了当列,后面的索引列都会失效。比较下面两条sql,可以通过key_len这个参数看到第二条参数最后一个索引没有被使用到。
mysql> explain select * from staffs where name = 'zhang' and age = 11 and pos = '';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 184 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
1 row in set (0.04 sec)
mysql> explain select * from staffs where name = 'zhang' and age > 11 and pos = '';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 102 | NULL | 1 | 33.33 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.04 sec)
5 尽量使用覆盖索引
我们在业务中,尽量做到用到什么变量取什么变量,最好都使用覆盖索引(只访问索引的查询,索引列和查询列一致),减少select*
的查询语句,select * 这样的操作会多出很多无意义的回表操作,甚至会让原本走索引的查询语句放弃走索引,反而去走全表扫描。
它们的详细差别可以通过下面几个例子明显的感受出来。
mysql> explain select * from staffs where name in ('zhang', 'li') and age = 11 and pos = '';
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | idx_staffs_nameAgePos | NULL | NULL | NULL | 3 | 66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.06 sec)
mysql> explain select id,name,age from staffs where name in ('zhang', 'li') and age = 11 and pos = '';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 184 | NULL | 2 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.04 sec)
6 不等于操作(<>、!=)的理解误区
mysql在使用不等于(!=或者<>)的时候无法使用索引,进而会导致全表扫描。
网上大多的结论都是如此,可事实真是如此吗?就以上面的初始例子来看。
下面是初始数据
mysql> select * from staffs;
+----+------+-----+----------+---------------------+
| id | name | age | pos | add_time |
+----+------+-----+----------+---------------------+
| 1 | z3 | 22 | manageer | 2019-12-14 07:40:22 |
| 2 | July | 23 | dev | 2019-12-14 07:40:22 |
| 3 | 2000 | 23 | dev | 2019-12-14 07:40:22 |
+----+------+-----+----------+---------------------+
接下来我们执行如下操作。
mysql> explain select * from staffs where name != 'July';
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | idx_staffs_nameAgePos | NULL | NULL | NULL | 3 | 66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.03 sec)
一切看起来确实如同网上所总结的那样,!=
操作确实会导致走全表,放弃走索引。那接下来看下面这条sql
mysql> explain select id,name from staffs where name != 'July';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98 | NULL | 2 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.04 sec)
只是使用到了覆盖索引,但是我们也使用了!=
,但是mysql
还是走了索引。可能我们会觉得这是覆盖索引导致的。那么我们在原先表的3条数据的基础上,重新添加4条数据,然后重新执行第一条sql
。
insert into staffs(name, age, pos) values ('July', 23, 'dev');
insert into staffs(name, age, pos) values ('July', 23, 'dev');
insert into staffs(name, age, pos) values ('July', 23, 'dev');
insert into staffs(name, age, pos) values ('July', 23, 'dev');
mysql> select * from staffs;
+----+------+-----+----------+---------------------+
| id | name | age | pos | add_time |
+----+------+-----+----------+---------------------+
| 1 | z3 | 22 | manageer | 2019-12-14 07:40:22 |
| 2 | July | 23 | dev | 2019-12-14 07:40:22 |
| 3 | 2000 | 23 | dev | 2019-12-14 07:40:22 |
| 4 | July | 23 | dev | 2019-12-16 22:42:36 |
| 5 | July | 23 | dev | 2019-12-16 22:42:36 |
| 6 | July | 23 | dev | 2019-12-16 22:42:36 |
| 7 | July | 23 | dev | 2019-12-16 22:42:36 |
+----+------+-----+----------+---------------------+
mysql> explain select * from staffs where name != 'July';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98 | NULL | 2 | 100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.03 sec)
结果显而易见,网上总结的所谓!=
就会导致索引一定失效,是站不住脚的。
7 is null, is not null 的理解误区
is null,is not null 在where子句中就无法使用索引
.
先看下面两个例子.
mysql> explain select * from staffs where name is null;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
1 row in set (0.03 sec)
mysql> explain select * from staffs where name is not null;
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | idx_staffs_nameAgePos | NULL | NULL | NULL | 7 | 85.71 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.05 sec)
上面的看起来确实可以简单的得出无论是is null
还是is not null
都无法运用索引。但是这里需要注意到staffs表的name字段的约束条件,它是not null,这说明这个字段无法存储null值,因此当mysql优化器去判断name is null
的时候,直接判断为Impossible WHERE
,这样当然就无法走索引,而且这种写法也完全没有意义。
因此这里我们只要修改一下表的结构,把name这个字段的not null约束改成允许为空,再来执行上述命令,我们就能再次清楚的看出mysql走不走索引与where子句是否is null, is not null, !=, ...
等关键字没有直接的关系。
mysql> alter table staffs modify column name varchar(50) null;
Query OK, 0 rows affected (0.06 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> explain select * from staffs where name is null;
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203 | const | 1 | 100.00 | Using index condition |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
1 row in set (0.06 sec)
只需要一个简单的例子的校验,就能看出即使条件使用的是is null
仍然可以走索引。
8 like的优化
like以通配符开头('%abc...')可能会导致mysql索引失效进而变成全表扫描操作.
mysql> explain select * from staffs where name like '%zhang';
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | NULL | NULL | NULL | NULL | 7 | 14.29 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.04 sec)
所以有时候如果可以做到的话, 尽量在模糊匹配的时候把%写在末尾.
mysql> explain select * from staffs where name like 'zhang%';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203 | NULL | 1 | 100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.05 sec)
但是业务逻辑很少会出现开头字符串是固定的情况, 这时候就需要用到覆盖索引的理念来优化.
mysql> explain select name from staffs where name like '%zhang';
+----+-------------+--------+------------+-------+---------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+-----------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | staffs | NULL | index | NULL | idx_staffs_nameAgePos | 289 | NULL | 7 | 14.29 | Using where; Using index |
+----+-------------+--------+------------+-------+---------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.05 sec)
像上面的情况, 如果需要的字段正好是索引字段,那么最好不要使用select *
, 虽然优化后的结果type的类型是index, 也是扫描索引列全表, 但是索引文件一般都比全表数据小,所以这样还是能够优化,
9 字符串不加单引号索引失效
mysql> explain select * from staffs where name = '2000';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | staffs | NULL | ref | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203 | const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set (0.05 sec)
当不加引号, mysql
会做一个隐式装换, 然后做匹配.如下所示, 这样索引就会失效
mysql> explain select * from staffs where name = 2000;
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | idx_staffs_nameAgePos | NULL | NULL | NULL | 7 | 14.29 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.03 sec)
10 or的使用
or的使用和前面的null不等于这些判断条件一样, 并不是根据是否使用这些运算符导致不走索引, 最后还是mysql的优化器根据总的代价计算后,然后决定是否走索引.
就如下面的例子.
mysql> select * from staffs;
+----+------+-----+----------+---------------------+
| id | name | age | pos | add_time |
+----+------+-----+----------+---------------------+
| 1 | z3 | 22 | manageer | 2019-12-14 07:40:22 |
| 2 | July | 23 | dev | 2019-12-14 07:40:22 |
| 3 | 2000 | 23 | dev | 2019-12-14 07:40:22 |
| 8 | July | 23 | dev | 2019-12-16 22:58:45 |
| 9 | July | 23 | dev | 2019-12-16 22:58:45 |
| 10 | July | 23 | dev | 2019-12-16 22:58:45 |
| 11 | July | 23 | dev | 2019-12-16 22:58:45 |
+----+------+-----+----------+---------------------+
7 rows in set (0.04 sec)
mysql> explain select * from staffs where name = 'July' or name = 'z3';
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | staffs | NULL | ALL | idx_staffs_nameAgePos | NULL | NULL | NULL | 7 | 85.71 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.04 sec)
在这种情况下优化方式还是使用覆盖索引.
mysql> explain select name from staffs where name = 'July' or name = 'z3';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203 | NULL | 6 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.03 sec)
当像下面这样查询, 最后的执行方案又是走索引.
mysql> explain select * from staffs where name in ('2000', 'z3');
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | staffs | NULL | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203 | NULL | 2 | 100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.05 sec)
六、查询优化的一般性建议
- 对于单键索引,尽量选择针对当前query过滤性更好的索引
- 在选择组合索引的时候,当前Query中过滤性最好的字段在索引字段顺序中,位置越靠前越好。
- 在选择组合索引的时候,尽量选择可以能包含当前query中的where子句中更多字段的索引
- 尽可能通过分析统计信息和调整query的写法来达到选择合适索引的目的
- 使用join连表时,永远小表驱动大表。
- order by子句,尽量使用Index方式排序,避免使用FileSort方式排序
- 尽可能在索引列上完成排序操作,遵照索引建的最佳左前缀, order by中不能既有asc也有desc
- having只会在检索出所有记录之后才对结果集进行过滤。这个处理需要排序,总计等操作。如果能通过where子句限制记录的数目,那就能减少这方面的开销。
- 当orderby 或 group by无法使用索引列,可以选择增大max_length_for_sort_data参数的设置+增大sort_buffer_size参数的设置
- 除非确实需要去掉重复的行,否则尽量使用union all而不是union。因为union会自带distinct操作,代价很大
七、总结
其实看了这么多例子,是否走索引还是mysql优化器经过自己的优化算法计算,然后确认它认为的cost(代价)最少的执行方案,这与数据的分布,用到索引的列数,最后筛选的数据的比例,回表的次数等等都有关系,并不能单纯的通过某些条件就轻易下结论。
参考
1 MySQL中IS NULL、IS NOT NULL、!=不能用索引?胡扯!
2 官网