索引及执行计划

索引的作用

提供了类似于书中目录的作用,目的是为了优化查询

索引的种类(算法)

B树索引
Hash索引
R树
Full text
GIS

Btree索引功能上的分类

辅助索引(S)怎么构建B树结构的?

(1). 索引是基于表中,列(索引键)的值生成的B树结构
(2). 首先提取此列所有的值,进行自动排序
(3). 将排好序的值,均匀的分布到索引树的叶子节点中(16K)
(4). 然后生成此索引键值所对应得后端数据页的指针
(5). 生成枝节点和根节点,根据数据量级和索引键长度,生成合适的索引树高度
id  name  age  gender
select  *  from  t1 where id=10;
问题: 基于索引键做where查询,对于id列是顺序IO,但是对于其他列的查询,可能是随机IO.

聚集索引

前提

(1)表中设置了主键,主键列就会自动被作为聚集索引.
(2)如果没有主键,会选择唯一键作为聚集索引.
(3)聚集索引必须在建表时才有意义,一般是表的无关列(ID)

辅助索引(S)怎么构建B树结构的?

(1) 在建表时,设置了主键列(ID)
(2) 在将来录入数据时,就会按照ID列的顺序存储到磁盘上.(我们又称之为聚集索引组织表)
(3) 将排好序的整行数据,生成叶子节点.可以理解为,磁盘的数据页就是叶子节点

聚集索引和辅助索引的区别

(1) 表中任何一个列都可以创建辅助索引,在你有需要的时候,只要名字不同即可
(2) 在一张表中,聚集索引只能有一个,一般是主键.
(3) 辅助索引,叶子节点只存储索引列的有序值+聚集索引列值.
(4) 聚集索引,叶子节点存储的时有序的整行数据.
(5) MySQL 的表数据存储是聚集索引组织表

辅助索引细分

1.普通的单列辅助索引
2.联合索引多个列作为索引条件,生成索引树,理论上设计的好的,可以减少大量的回表查询
3.唯一索引索引列的值都是唯一的.

关于索引树的高度受什么影响

1. 数据量级, 解决方法:分表,分库,分布式
2. 索引列值过长 , 解决方法:前缀索引
3. 数据类型:
变长长度字符串,使用了char,解决方案:变长字符串使用varchar
enum类型的使用enum ('山东','河北','黑龙江','吉林','辽宁','陕西'......)
                                         1      2      3

索引的基本管理

查询索引

desc city;
# 或者
show index from city\G;

单列普通辅助索引

alter table city add index idx_name(name);

多列的联合索引

alter table city add index idx_c_p(countrycode,population);

唯一索引

alter table city add unique index uidx_dis(district);

前缀索引

alter table city add index idx_dis(district(5));

删除索引

alter table city drop index idx_name;
alter table city drop index idx_c_p;
alter table city drop index idx_dis;

索引优化测试

未做优化之前测试 755s

[root@db01 ~]# mysqlslap --defaults-file=/etc/my.cnf \
> --concurrency=100 --iterations=1 --create-schema='test' \
> --query="select * from test.t100w where k2='MN89'" engine=innodb \
> --number-of-queries=2000 -uroot -p123 -verbose
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
	Running for engine rbose
	Average number of seconds to run all queries: 755.861 seconds
	Minimum number of seconds to run all queries: 755.861 seconds
	Maximum number of seconds to run all queries: 755.861 seconds
	Number of clients running queries: 100
	Average number of queries per client: 20

索引优化后测试 1.67s

[root@db01 ~]# mysqlslap --defaults-file=/etc/my.cnf --concurrency=100 --iterations=1 --create-schema='test' --query="select * from test.t100w where k2='MN89'" engine=innodb 
--number-of-queries=2000 -uroot -p123 -verbose
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
	Running for engine rbose
	Average number of seconds to run all queries: 1.678 seconds
	Minimum number of seconds to run all queries: 1.678 seconds
	Maximum number of seconds to run all queries: 1.678 seconds
	Number of clients running queries: 100
	Average number of queries per client: 20

执行计划分析

作用

将优化器选择后的执行计划截取出来.便于管理管判断语句得执行效率

获取执行

desc SQL语句
explain SQL 语句

desc select * from school.student where sname='zhang3';

重点关注的信息

table: city                              ---->查询操作的表    **
possible_keys: CountryCode,idx_co_po      ---->可能会走的索引  **
key: CountryCode   ---->真正走的索引    ***
type: all   ---->索引类型        *****
Extra: Using index condition              ---->额外信息        *****

type 查询的类型

全表扫描 : ALL
索引扫描 : index,range,ref,eq_ref,const(system),NULL

index: 全索引扫描
mysql> desc select countrycode  from city;

range: 索引范围扫描(> < >= <= , between and ,or,in,like )
mysql> desc select * from city where id>2000;
mysql> desc select  * from city where countrycode like 'CH%';

对于辅助索引来讲,!= 和not in等语句是不走索引的
对于主键索引列来讲,!= 和not in等语句是走range

===
mysql> desc select  * from city where countrycode='CHN' or countrycode='USA';
mysql> desc select  * from city where countrycode in ('CHN','USA');


一般改写为 union all 
desc 
select  * from city where countrycode='CHN' 
union all 
select  * from city where countrycode='USA';

ref: 辅助索引等值查询
desc 
select  * from city where countrycode='CHN' 
union all 
select  * from city where countrycode='USA';

eq_ref : 多表连接时,子表使用主键列或唯一列作为连接条件
A join B 
on a.x = B.y 

desc select b.name,a.name ,a.population  
from city as a 
join country as b 
on a.countrycode=b.code 
where a.population<100;

const(system) : 主键或者唯一键的等值查询
mysql> desc select * from  city where id=100;

其他字段解释 extra

1.当我们看到执行计划extra位置出现filesort,说明由文件排序出现
2.观察需要排序(ORDER BY,GROUP BY ,DISTINCT )的条件,有没有索引
3. 根据子句的执行顺序,去创建联合索引

explain(desc)使用场景（面试题）

题目意思:  我们公司业务慢,请你从数据库的角度分析原因
1.mysql出现性能问题,我总结有两种情况:
（1）应急性的慢：突然夯住
应急情况:数据库hang(卡了,资源耗尽)
处理过程:
1.show processlist;  获取到导致数据库hang的语句
2. explain 分析SQL的执行计划,有没有走索引,索引的类型情况
3. 建索引,改语句
（2）一段时间慢(持续性的):
(1)记录慢日志slowlog,分析slowlog
(2)explain 分析SQL的执行计划,有没有走索引,索引的类型情况
(3)建索引,改语句

key_len 索引覆盖长度

utf8mb4
varchar(10) : 没有not null(1)+4*10+2=43
char(10) : 没有not null(1)+4*10=41
int : 没有有not null(1)+4=5
用途:判断联合索引的覆盖长度,一般情况是越长越好

varchar(4) utf8mb4 4*4 + 3 是否为空占1个字节起止各占1个 char(4) 4*4 +1

1. 能存4个任意字符
2. 不管存储的时字符,数字,中文,都1个字符最大预留长度是4个字节
3. 对于中文,1个占4个字节
4. 对于数字和字母,1个实际占用大小是1个字节

select length() from test;

关注key_len应用的长度, 联合索引应用细节（命中key_length长度越多越好）

建的联合索引顺序是(k1,k2,k3,k4)

只要我们将来的查询,所有索引列都是<等值>查询条件下,无关排列顺序

原因: 优化器,自动做查询条件的排列

desc  select * from test where k1='aa' and k2='中国' and k3='aaaa' and k4='中国你好';
# 和一下索引key_length长度一样，都是key_length =56
desc select * from test where k2='中国' and k3='aaaa' and k4='中国你好' and k1='aa';

不连续部分条件，缺少key2

cda ----> acd ---> a -----> idx 走a索引 key_length =4
dba ----> abd ---> ab ----> idx(d,b,a) 走 ab索引 key_length =9

desc  select * from test where k1='aa' and k3='aaaa' and k4='中国你好';

在where查询中如果出现> < >= <= not in 不走索引（辅助索引）

EXPLAIN  SELECT * FROM teltab WHERE telnum  <> '110';
EXPLAIN  SELECT * FROM teltab WHERE telnum  NOT IN ('110','119');

mysql> select * from tab where telnum <> '1555555';
+------+------+---------+
| id  | name | telnum  |
+------+------+---------+
|    1 | a    | 1333333 |
+------+------+---------+
1 row in set (0.00 sec)
mysql> explain select * from tab where telnum <> '1555555';

单独的>,<,in 有可能走，也有可能不走，和结果集有关，尽量结合业务添加limit
or或in  尽量改成union
EXPLAIN  SELECT * FROM teltab WHERE telnum  IN ('110','119');
改写成：
EXPLAIN SELECT * FROM teltab WHERE telnum='110'
UNION ALL
SELECT * FROM teltab WHERE telnum='119'

like "%_" 百分号在最前面不走

EXPLAIN SELECT * FROM teltab WHERE telnum LIKE '31%'  走range索引扫描
EXPLAIN SELECT * FROM teltab WHERE telnum LIKE '%110'  不走索引
%linux%类的搜索需求，可以使用elasticsearch+mongodb 专门做搜索服务的数据库产品

索引应用规范

建立索引的原则

(1) 建表必须要有主键,一般是无关列,自增长
(2) 经常做为where条件列 order by group by join on, distinct 的条件
(3) 最好使用唯一值多的列作为联合索引前导列,其他的按照联合索引优化细节来做
(4) 列值长度较长的索引列,我们建议使用前缀索引.
(5) 降低索引条目,一方面不要创建没用索引,不常使用的索引清理,percona toolkit(xxxxx)
(6) 索引维护要避开业务繁忙期
(7) 小表不建索引

不走索引的情况

(1) 没有查询条件，或者查询条件没有建立索引
select * from city;
select * from city where 1=1;
(2) 查询结果集是原表中的大部分数据，应该是25％以上。
(3) 索引本身失效，统计数据不真实
面试题:同一个语句突然变慢?
统计信息过旧,导致的索引失效
(4) 查询条件使用函数在索引列上，或者对索引列进行运算，运算包括(+，-，*，/，! 等)
mysql> desc select * from city where id-99=1;
(5) 隐式转换导致索引失效.
(6) <> ，not in 不走索引（辅助索引）
(7) like "%aa" 百分号在最前面不走
(8) 联合索引

posted @ 2020-03-01 23:20 Crazymagic 阅读(548) 评论(0) 编辑收藏举报

刷新页面返回顶部

Crazymagic