一、索引原理
本质都是:通过不断地缩小想要获取数据的范围来筛选出最终想要的结果,同时把随机的事件变成顺序的事件,也就是说,有了这种索引机制,我们可以总是用同一种查找方式来锁定数据
二 磁盘IO与预读
当一次IO时,不光把当前磁盘地址的数据,而是把相邻的数据也都读取到内存缓冲区内,
三 索引的数据结构
1.索引字段要尽量的小
2.索引的最左匹配特性:
四 聚集索引与辅助索引
在数据库中,B+树的高度一般都在2~4层,这也就是说查找某一个键值的行记录时最多只需要2到4次IO,这倒不错。因为当前一般的机械硬盘每秒至少可以做100次IO,2~4次的IO意味着查询时间只需要0.02~0.04秒。
数据库中的B+树索引可以分为聚集索引(clustered index)和辅助索引(secondary index),
聚集索引与辅助索引相同的是:不管是聚集索引还是辅助索引,其内部都是B+树的形式,即高度是平衡的,叶子结点存放着所有的数据。
聚集索引与辅助索引不同的是:叶子结点存放的是否是一整行的信息
1、聚集索引
#InnoDB存储引擎表示索引组织表,即表中数据按照主键顺序存放。而聚集索引(clustered index)就是按照每张表的主键构造一棵B+树,同时叶子结点存放的即为整张表的行记录数据,也将聚集索引的叶子结点称为数据页。聚集索引的这个特性决定了索引组织表中数据也是索引的一部分。同B+树数据结构一样,每个数据页都通过一个双向链表来进行链接。 #如果未定义主键,MySQL取第一个唯一索引(unique)而且只含非空列(NOT NULL)作为主键,InnoDB使用它作为聚簇索引。 #如果没有这样的列,InnoDB就自己产生一个这样的ID值,它有六个字节,而且是隐藏的,使其作为聚簇索引。 #由于实际的数据页只能按照一棵B+树进行排序,因此每张表只能拥有一个聚集索引。在多少情况下,查询优化器倾向于采用聚集索引。因为聚集索引能够在B+树索引的叶子节点上直接找到数据。此外由于定义了数据的逻辑顺序,聚集索引能够特别快地访问针对范围值得查询。
聚集索引的好处之一:它对主键的排序查找和范围查找速度非常快,叶子节点的数据就是用户所要查询的数据。如用户需要查找一张表,查询最后的10位用户信息,由于B+树索引是双向链表,所以用户可以快速找到最后一个数据页,并取出10条记录
#参照第六小结测试索引的准备阶段来创建出表s1 mysql> desc s1; #最开始没有主键 +--------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+-------------+------+-----+---------+-------+ | id | int(11) | NO | | NULL | | | name | varchar(20) | YES | | NULL | | | gender | char(6) | YES | | NULL | | | email | varchar(50) | YES | | NULL | | +--------+-------------+------+-----+---------+-------+ rows in set (0.00 sec) mysql> explain select * from s1 order by id desc limit 10; #Using filesort,需要二次排序 +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+ | 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 2633472 | 100.00 | Using filesort | +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+ row in set, 1 warning (0.11 sec) mysql> alter table s1 add primary key(id); #添加主键 Query OK, 0 rows affected (13.37 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> explain select * from s1 order by id desc limit 10; #基于主键的聚集索引在创建完毕后就已经完成了排序,无需二次排序 +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------+ | 1 | SIMPLE | s1 | NULL | index | NULL | PRIMARY | 4 | NULL | 10 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------+ row in set, 1 warning (0.04 sec)
聚集索引的好处之二:范围查询(range query),即如果要查找主键某一范围内的数据,通过叶子节点的上层中间节点就可以得到页的范围,之后直接读取数据页即可
mysql> alter table s1 drop primary key; Query OK, 2699998 rows affected (24.23 sec) Records: 2699998 Duplicates: 0 Warnings: 0 mysql> desc s1; +--------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+-------------+------+-----+---------+-------+ | id | int(11) | NO | | NULL | | | name | varchar(20) | YES | | NULL | | | gender | char(6) | YES | | NULL | | | email | varchar(50) | YES | | NULL | | +--------+-------------+------+-----+---------+-------+ rows in set (0.12 sec) mysql> explain select * from s1 where id > 1 and id < 1000000; #没有聚集索引,预估需要检索的rows数如下 +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+ | 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 2690100 | 11.11 | Using where | +----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+ row in set, 1 warning (0.00 sec) mysql> alter table s1 add primary key(id); Query OK, 0 rows affected (16.25 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> explain select * from s1 where id > 1 and id < 1000000; #有聚集索引,预估需要检索的rows数如下 +----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+ | 1 | SIMPLE | s1 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 1343355 | 100.00 | Using where | +----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+ row in set, 1 warning (0.09 sec)
2、辅助索引
表中除了聚集索引外其他索引都是辅助索引(Secondary Index,也称为非聚集索引),与聚集索引的区别是:辅助索引的叶子节点不包含行记录的全部数据。
五 MySQL索引管理
一 功能
#1. 索引的功能就是加速查找 #2. mysql中的primary key,unique,联合唯一也都是索引,这些索引除了加速查找以外,还有约束的功能
二 MySQL常用的索引
索引分类: 普通索引INDEX:加速查找 唯一索引: -主键索引PRIMARY KEY:加速查找+约束(不为空、不能重复) -唯一索引UNIQUE:加速查找+约束(不能重复) 联合索引: -PRIMARY KEY(id,name):联合主键索引 -UNIQUE(id,name):联合唯一索引 -INDEX(id,name):联合普通索引
普通索引INDEX:加速查找 唯一索引: -主键索引PRIMARY KEY:加速查找+约束(不为空、不能重复) -唯一索引UNIQUE:加速查找+约束(不能重复) 联合索引: -PRIMARY KEY(id,name):联合主键索引 -UNIQUE(id,name):联合唯一索引 -INDEX(id,name):联合普通索引
举个例子来说,比如你在为某商场做一个会员卡的系统。 这个系统有一个会员表 有下列字段: 会员编号 INT 会员姓名 VARCHAR(10) 会员身份证号码 VARCHAR(18) 会员电话 VARCHAR(10) 会员住址 VARCHAR(50) 会员备注信息 TEXT 那么这个 会员编号,作为主键,使用 PRIMARY 会员姓名 如果要建索引的话,那么就是普通的 INDEX 会员身份证号码 如果要建索引的话,那么可以选择 UNIQUE (唯一的,不允许重复) #除此之外还有全文索引,即FULLTEXT 会员备注信息 , 如果需要建索引的话,可以选择全文搜索。 用于搜索很长一篇文章的时候,效果最好。 用在比较短的文本,如果就一两行字的,普通的 INDEX 也可以。 但其实对于全文搜索,我们并不会使用MySQL自带的该索引,而是会选择第三方软件如Sphinx,专门来做全文搜索。 #其他的如空间索引SPATIAL,了解即可,几乎不用 各个索引的应用场景
三 索引的两大类型hash与btree
#我们可以在创建上述索引的时候,为其指定索引类型,分两类 hash类型的索引:查询单条快,范围查询慢 btree类型的索引:b+树,层数越多,数据量指数级增长(我们就用它,因为innodb默认支持它) #不同的存储引擎支持的索引类型也不一样 InnoDB 支持事务,支持行级别锁定,支持 B-tree、Full-text 等索引,不支持 Hash 索引; MyISAM 不支持事务,支持表级别锁定,支持 B-tree、Full-text 等索引,不支持 Hash 索引; Memory 不支持事务,支持表级别锁定,支持 B-tree、Hash 等索引,不支持 Full-text 索引; NDB 支持事务,支持行级别锁定,支持 Hash 索引,不支持 B-tree、Full-text 等索引; Archive 不支持事务,支持表级别锁定,不支持 B-tree、Hash、Full-text 等索引;
四 创建/删除索引的语法
#方法一:创建表时 CREATE TABLE 表名 ( 字段名1 数据类型 [完整性约束条件…], 字段名2 数据类型 [完整性约束条件…], [UNIQUE | FULLTEXT | SPATIAL ] INDEX | KEY [索引名] (字段名[(长度)] [ASC |DESC]) ); #方法二:CREATE在已存在的表上创建索引 CREATE [UNIQUE | FULLTEXT | SPATIAL ] INDEX 索引名 ON 表名 (字段名[(长度)] [ASC |DESC]) ; #方法三:ALTER TABLE在已存在的表上创建索引 ALTER TABLE 表名 ADD [UNIQUE | FULLTEXT | SPATIAL ] INDEX 索引名 (字段名[(长度)] [ASC |DESC]) ; #删除索引:DROP INDEX 索引名 ON 表名字;
#方式一 create table t1( id int, name char, age int, sex enum('male','female'), unique key uni_id(id), index ix_name(name) #index没有key ); #方式二 create index ix_age on t1(age); #方式三 alter table t1 add index ix_sex(sex); #查看 mysql> show create table t1; | t1 | CREATE TABLE `t1` ( `id` int(11) DEFAULT NULL, `name` char(1) DEFAULT NULL, `age` int(11) DEFAULT NULL, `sex` enum('male','female') DEFAULT NULL, UNIQUE KEY `uni_id` (`id`), KEY `ix_name` (`name`), KEY `ix_age` (`age`), KEY `ix_sex` (`sex`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 示范
1 创建索引 - 在创建表时就创建 create table s1( id int, name char(6), age int, email varchar(30), index(id) ); - 在创建表后创建 create index name on s1(name);#添加普通索引 create unique index age on s1(age);#添加唯一索引 alter table s1 add primary key(id);#添加主键索引 create index name on s1(id,name);#添加联合普通索引 2 删除索引 drop index id on s1; drop index name on s1; alter table s1 drop primary key;#添加主键索引
建立索引:
六 测试索引
一 准备
#1. 准备表 create table s1( id int, name varchar(20), gender char(6), email varchar(50) ); #2. 创建存储过程,实现批量插入记录 delimiter $$ #声明存储过程的结束符号为$$ create procedure auto_insert1() BEGIN declare i int default 1; while(i<3000000)do insert into s1 values(i,'egon','male',concat('egon',i,'@oldboy')); set i=i+1; end while; END$$ #$$结束 delimiter ; #重新声明分号为结束符号 #3. 查看存储过程 show create procedure auto_insert1\G #4. 调用存储过程 call auto_insert1();
二 在没有索引的前提下测试查询速度
#无索引:mysql根本就不知道到底是否存在id等于333333333的记录,只能把数据表从头到尾扫描一遍,此时有多少个磁盘块就需要进行多少IO操作,所以查询速度很慢 mysql> select * from s1 where id=333333333; Empty set (0.33 sec)
三 在表中已经存在大量数据的前提下,为某个字段段建立索引,建立速度会很慢
mysql> select count(*) from s1 where id=1000; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.12 sec) mysql> select count(*) from s1 where id>1000; +----------+ | count(*) | +----------+ | 298999 | +----------+ 1 row in set (0.12 sec)
mysql> create index a on s1(id) -> ; Query OK, 0 rows affected (3.21 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select count(*) from s1 where id=1000; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where id>1000; +----------+ | count(*) | +----------+ | 298999 | +----------+ 1 row in set (0.12 sec) mysql> select count(*) from s1 where id>1000 and id < 2000; +----------+ | count(*) | +----------+ | 999 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where id>1000 and id < 300000; +----------+ | count(*) | +----------+ | 298999 | +----------+ 1 row in set (0.13 sec)
mysql> select count(*) from s1 where name='xxx'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where name='egon'; +----------+ | count(*) | +----------+ | 299999 | +----------+ 1 row in set (0.19 sec) mysql> select count(*) from s1 where name='egon' and age=123123123123123; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.45 sec) mysql> create index c on s1(age); Query OK, 0 rows affected (3.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select count(*) from s1 where name='egon' and age=123123123123123; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where name='egon' and age=10; +----------+ | count(*) | +----------+ | 299999 | +----------+ 1 row in set (0.35 sec) mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and id < 4000; +----------+ | count(*) | +----------+ | 999 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.47 sec) mysql> create index d on s1(email); Query OK, 0 rows affected (4.83 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) mysql> drop index a on s1; Query OK, 0 rows affected (0.10 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> drop index b on s1; Query OK, 0 rows affected (0.09 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> drop index c on s1; Query OK, 0 rows affected (0.09 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> desc s1; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | id | int(11) | NO | | NULL | | | name | char(20) | YES | | NULL | | | age | int(11) | YES | | NULL | | | email | varchar(30) | YES | MUL | NULL | | +-------+-------------+------+-----+---------+-------+ 4 rows in set (0.00 sec) mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)
select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx'; index(name,email,age,id) select count(*) from s1 where name='egon' and age> 10 and id=3000 and email='xxxx'; index(name,email,id,age) select count(*) from s1 where name like 'egon' and age= 10 and id=3000 and email='xxxx'; index(email,id,age,name) mysql> desc s1; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | id | int(11) | NO | | NULL | | | name | char(20) | YES | | NULL | | | age | int(11) | YES | | NULL | | | email | varchar(30) | YES | | NULL | | +-------+-------------+------+-----+---------+-------+ 4 rows in set (0.00 sec) mysql> create index xxx on s1(age,email,name,id); Query OK, 0 rows affected (6.89 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)
index(id,age,email,name) #条件中一定要出现id id id age id email id name email #不行 mysql> select count(*) from s1 where id=3000; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.11 sec) mysql> create index xxx on s1(id,name,age,email); Query OK, 0 rows affected (6.44 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select count(*) from s1 where id=3000; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where name='egon'; +----------+ | count(*) | +----------+ | 299999 | +----------+ 1 row in set (0.16 sec) mysql> select count(*) from s1 where email='egon3333@oldboy.com'; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.15 sec) mysql> select count(*) from s1 where id=1000 and email='egon3333@oldboy.com'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where email='egon3333@oldboy.com' and id=3000; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)
mysql> select count(*) from s1 where id=3000; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.11 sec) mysql> create index xxx on s1(id,name,age,email); Query OK, 0 rows affected (6.44 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> select count(*) from s1 where id=3000; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where name='egon'; +----------+ | count(*) | +----------+ | 299999 | +----------+ 1 row in set (0.16 sec) mysql> select count(*) from s1 where email='egon3333@oldboy.com'; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.15 sec) mysql> select count(*) from s1 where id=1000 and email='egon3333@oldboy.com'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from s1 where email='egon3333@oldboy.com' and id=3000; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)