索引优化
十三、索引优化
1 单表索引优化案例
准备数据:
# 模拟,博客文章的数据库
CREATE TABLE IF NOT EXISTS article(
id INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
author_id INT(10) UNSIGNED NOT NULL,
category_id INT(10) UNSIGNED NOT NULL,
views INT(10) UNSIGNED NOT NULL,
comments INT(10) UNSIGNED NOT NULL,
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL
);
INSERT INTO article(author_id,category_id,views,comments,title,content)
VALUES
(1,1,1,1,'1','1'),
(2,2,2,2,'2','2'),
(1,1,3,3,'3','3');
需求为:查询category_id
为1且comments
大于1的情况下,views
最多的id
,查询语句如下:
SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
+----+-----------+
| id | author_id |
+----+-----------+
| 3 | 1 |
+----+-----------+
此时没有建立索引,尝试用explain分析:
EXPLAIN SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
很显然,type是ALL,即最坏的情况,没有用到索引。Extra里出现了Using filesort文件排序,也是最坏的情况,所以优化是必须的。
首先想到的是为每个查询的字段都建立索引:
CREATE INDEX idx_article_ccv ON article ( category_id, comments, views );
SHOW INDEX FROM article;
再次分析:
type变成了range,这是可以忍受的。但是extra里使用Using filesort仍是无法接受的。但是我们已经建立了索引,为什么没用呢?因为按照B+Tree索引的工作原理,先排序category_id,如果遇到相同的ategory_id则再排序comments,如果遇到相同的comments则再排序 views。当comments字段在联合索引里处于中间位置时,因为comments>1
条件是一个范围值(range),MySQL无法利用索引再对后面的views部分进行检索,即range类型查询字段后面的索引失效。
于是有了第二个索引建立思路,对于范围查询的字段不建索引,直接把它跳过去,为后面的字段建索引。首先把原来的索引删掉:
DROP INDEX [indexName] ON mytable;
重新建立索引:
CREATE INDEX idx_article_ccv ON article ( category_id, views );
再次分析:可以看到,type变为了ref,Extra中的Using filesort也消失了,结果非常理想:
2 双表索引优化案例
准备数据:
CREATE TABLE IF NOT EXISTS class(
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
card INT(10) UNSIGNED NOT NULL,
PRIMARY KEY(id)
);
CREATE TABLE IF NOT EXISTS book(
bookid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
card INT(10) UNSIGNED NOT NULL,
PRIMARY KEY(bookid)
);
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
需求为:实现两表的外连接,连接条件是 class.card = book.card,查询语句如下:
SELECT * FROM class LEFT JOIN book ON class.card = book.card;
用explain分析:
EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
可以看到没有添加索引时,type为All ,rows为表中数据总行数,说明class和book进行了全表检索。Extra中Using join buffer,表明连接过程中使用了join缓冲区。
由于是左连接,左边的数据肯定都要查询,所以右表才是我们的关键点,一定需要建立索引。即:左表连接右表,则需要拿着左表的数据去右表里面查,索引需要在右表中建立索引。
ALTER TABLE book ADD INDEX Y(card);
再次分析:
可以看到第二行的type变为了ref,rows也变成了优化比较明显,且Using join buffer也消失了。
如果只在左表建索引:
DROP INDEX Y ON book; # 删除右表索引
ALTER TABLE class ADD INDEX X(card); # 添加左表索引
再次分析:
效果是不如在右表建立索引好的,这也印证了之前的理论。
当然,为两个表都建索引也是可行的。但是如果只建一个索引,那么原则就是为相反的表建索引:
- 左外连接,为右表的字段建索引
- 右外连接,为左表的字段建索引
3 三表索引优化案例
准备数据(以及案例2中的class表和book表):
CREATE TABLE IF NOT EXISTS phone(
phoneid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
card INT(10) UNSIGNED NOT NULL,
PRIMARY KEY(phoneid)
)ENGINE=INNODB;
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
需求为:实现三表的外连接,查询语句如下:
SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card;
用explain分析:
可以看到没有添加索引时,type为All,rows 为表数据总行数,说明class、book和phone表都进行了全表检索Extra中Using join buffer,表明连接过程中使用了join缓冲区。
根据前面的理论,我们需要在右表建立索引,所以book要建立,phone也是右表,也要建立:
ALTER TABLE book ADD INDEX Y (card);
ALTER TABLE phone ADD INDEX Z (card);
再次分析:
后2行的type都是ref,且总rows优化很好。因此索引最好设置在需要经常查询的字段中。
得出Join语句优化的结论:
- 尽可能减少Join语句中的NestedLoop的循环总次数
- 永远用小结果集驱动大的结果集(在大结果集中建立索引,在小结果集中遍历全表)
- 优先优化NestedLoop的内层循环
- 保证Join语句中被驱动表上Join条件字段已经被索引
- 当无法保证被驱动表的Join条件字段被索引且内存资源充足的前提下,不要太吝惜JoinBuffer的设置
4 避免索引失效
避免索引失效:
- 尽量使用全值匹配
- 使用最佳左前缀法则:如果索引了多例,要遵守最左前缀法则。指的是查询从索引的最左前列开始并且不跳过索引中的列。
- 不在索引列上做任何操作(计算、函数、(自动or手动)类型转换),会导致索引失效而转向全表扫描
- 范围条件右边的索引会失效
- 尽量使用覆盖索引(只访问索引的查询(索引列和查询列一致)),减少使用
select *
- mysql在使用不等于(!=或者<>)的时候无法使用索引会导致全表扫描
- is null,is not null 也无法使用索引(早期版本不能走索引,后续版本应该优化过,可以走索引)
- like以通配符开头(
%abc…
)索引失效,会变成全表扫描操作- 解决
like '%str%'
索引失效的问题:使用覆盖索引
- 解决
- 字符串不加单引号索引失效
- 如果字符串忘记用单引号括起来,那么mysql会为我们进行隐式的类型转换,而进行了类型转换,索引就会失效
- 少用or,用or连接时会索引失效
现在回头再看看前面提到的何时需要建索引,会有更加清晰的认识。
本文来自博客园,作者:yyyz,转载请注明原文链接:https://www.cnblogs.com/yyyzyyyz/p/15853257.html