Loading

索引优化

十三、索引优化

1 单表索引优化案例

准备数据:

# 模拟,博客文章的数据库
CREATE TABLE IF NOT EXISTS article(
	id INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
	author_id INT(10) UNSIGNED NOT NULL,
	category_id INT(10) UNSIGNED NOT NULL,
	views INT(10) UNSIGNED NOT NULL,
	comments INT(10) UNSIGNED NOT NULL,
	title VARCHAR(255) NOT NULL,
	content TEXT NOT NULL
);

INSERT INTO article(author_id,category_id,views,comments,title,content)
VALUES
(1,1,1,1,'1','1'),
(2,2,2,2,'2','2'),
(1,1,3,3,'3','3');

需求为:查询category_id为1且comments大于1的情况下,views最多的id,查询语句如下:

SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
+----+-----------+
| id | author_id |
+----+-----------+
|  3 |         1 |
+----+-----------+

此时没有建立索引,尝试用explain分析:

EXPLAIN SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;

image

很显然,type是ALL,即最坏的情况,没有用到索引。Extra里出现了Using filesort文件排序,也是最坏的情况,所以优化是必须的。

首先想到的是为每个查询的字段都建立索引:

CREATE INDEX idx_article_ccv ON article ( category_id, comments, views );
SHOW INDEX FROM article;

再次分析:
image

type变成了range,这是可以忍受的。但是extra里使用Using filesort仍是无法接受的。但是我们已经建立了索引,为什么没用呢?因为按照B+Tree索引的工作原理,先排序category_id,如果遇到相同的ategory_id则再排序comments,如果遇到相同的comments则再排序 views。当comments字段在联合索引里处于中间位置时,因为comments>1条件是一个范围值(range),MySQL无法利用索引再对后面的views部分进行检索,即range类型查询字段后面的索引失效。

于是有了第二个索引建立思路,对于范围查询的字段不建索引,直接把它跳过去,为后面的字段建索引。首先把原来的索引删掉:

DROP INDEX [indexName] ON mytable;

重新建立索引:

CREATE INDEX idx_article_ccv ON article ( category_id, views );

再次分析:可以看到,type变为了ref,Extra中的Using filesort也消失了,结果非常理想:
image

2 双表索引优化案例

准备数据:

CREATE TABLE IF NOT EXISTS class(
	id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
	card INT(10) UNSIGNED NOT NULL,
	PRIMARY KEY(id)
);

CREATE TABLE IF NOT EXISTS book(
	bookid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
	card INT(10) UNSIGNED NOT NULL,
	PRIMARY KEY(bookid)
);

INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20)));

INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));

需求为:实现两表的外连接,连接条件是 class.card = book.card,查询语句如下:

SELECT * FROM class LEFT JOIN book ON class.card = book.card;

用explain分析:

EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;

image

可以看到没有添加索引时,type为All ,rows为表中数据总行数,说明class和book进行了全表检索。Extra中Using join buffer,表明连接过程中使用了join缓冲区。

由于是左连接,左边的数据肯定都要查询,所以右表才是我们的关键点,一定需要建立索引。即:左表连接右表,则需要拿着左表的数据去右表里面查,索引需要在右表中建立索引。

ALTER TABLE book ADD INDEX Y(card);

再次分析:
image

可以看到第二行的type变为了ref,rows也变成了优化比较明显,且Using join buffer也消失了。

如果只在左表建索引:

DROP INDEX Y ON book;  # 删除右表索引
ALTER TABLE class ADD INDEX X(card); # 添加左表索引

再次分析:
image

效果是不如在右表建立索引好的,这也印证了之前的理论。

当然,为两个表都建索引也是可行的。但是如果只建一个索引,那么原则就是为相反的表建索引:

  • 左外连接,为右表的字段建索引
  • 右外连接,为左表的字段建索引

3 三表索引优化案例

准备数据(以及案例2中的class表和book表):

CREATE TABLE IF NOT EXISTS phone(
	phoneid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
	card INT(10) UNSIGNED NOT NULL,
	PRIMARY KEY(phoneid)
)ENGINE=INNODB;

INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));

需求为:实现三表的外连接,查询语句如下:

SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card;

用explain分析:
image

可以看到没有添加索引时,type为All,rows 为表数据总行数,说明class、book和phone表都进行了全表检索Extra中Using join buffer,表明连接过程中使用了join缓冲区。

根据前面的理论,我们需要在右表建立索引,所以book要建立,phone也是右表,也要建立:

ALTER TABLE book ADD INDEX Y (card);
ALTER TABLE phone ADD INDEX Z (card);

再次分析:
image

后2行的type都是ref,且总rows优化很好。因此索引最好设置在需要经常查询的字段中。

得出Join语句优化的结论:

  • 尽可能减少Join语句中的NestedLoop的循环总次数
  • 永远用小结果集驱动大的结果集(在大结果集中建立索引,在小结果集中遍历全表)
  • 优先优化NestedLoop的内层循环
  • 保证Join语句中被驱动表上Join条件字段已经被索引
  • 当无法保证被驱动表的Join条件字段被索引且内存资源充足的前提下,不要太吝惜JoinBuffer的设置

4 避免索引失效

避免索引失效:

  • 尽量使用全值匹配
  • 使用最佳左前缀法则:如果索引了多例,要遵守最左前缀法则。指的是查询从索引的最左前列开始并且不跳过索引中的列。
  • 不在索引列上做任何操作(计算、函数、(自动or手动)类型转换),会导致索引失效而转向全表扫描
  • 范围条件右边的索引会失效
  • 尽量使用覆盖索引(只访问索引的查询(索引列和查询列一致)),减少使用select *
  • mysql在使用不等于(!=或者<>)的时候无法使用索引会导致全表扫描
  • is null,is not null 也无法使用索引(早期版本不能走索引,后续版本应该优化过,可以走索引)
  • like以通配符开头(%abc…)索引失效,会变成全表扫描操作
    • 解决like '%str%' 索引失效的问题:使用覆盖索引
  • 字符串不加单引号索引失效
    • 如果字符串忘记用单引号括起来,那么mysql会为我们进行隐式的类型转换,而进行了类型转换,索引就会失效
  • 少用or,用or连接时会索引失效

现在回头再看看前面提到的何时需要建索引,会有更加清晰的认识。

posted @ 2022-01-28 17:13  yyyz  阅读(30)  评论(0编辑  收藏  举报