大数据量分页优化
有如下表(innodb引擎), sql语句在笔记中,
给定日照市,查询子地区, 且查询子地区的功能非常频繁,
如何优化索引及语句?
+------+-----------+------+
| id | name | pid |
+------+-----------+------+
| .... | .... | .... |
| 1584 | 日照市 | 1476 |
| 1586 | 东港区 | 1584 |
| 1587 | 五莲县 | 1584 |
| 1588 | 莒县 | 1584 |
+------+-----------+------+
1: 不加任何索引,自身连接查询
mysql> explain select s.id,s.name from it_area as p inner join it_area as s on p.id=s.pid where p.name='日照市' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3263
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: s
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3263
Extra: Using where; Using join buffer
2 rows in set (0.00 sec)
2: 给name加索引
mysql> explain select s.id,s.name from it_area as p inner join it_area as s on p.id=s.pid where p.name='日照市' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p
type: ref
possible_keys: name
key: name
key_len: 93
ref: const
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: s
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3243
Extra: Using where; Using join buffer
2 rows in set (0.00 sec)
3: 在Pid上也加索引
mysql> explain select s.id,s.name from it_area as p inner join it_area as s on p.id=s.pid where p.name='日照市' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: p
type: ref
possible_keys: name
key: name
key_len: 93
ref: const
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: s
type: ref
possible_keys: pid
key: pid
key_len: 5
ref: big_data.p.id
rows: 4
Extra: Using where
2 rows in set (0.00 sec)
延迟关联
mysql> select * from it_area where name like '%东山%';
+------+-----------+------+
| id | name | pid |
+------+-----------+------+
| 757 | 东山区 | 751 |
| 1322 | 东山县 | 1314 |
| 2118 | 东山区 | 2116 |
| 3358 | 东山区 | 3350 |
+------+-----------+------+
4 rows in set (0.00 sec)
分析: 这句话用到了索引覆盖没有?
答: 没有,1 查询了所有列, 没有哪个索引,覆盖了所有列.
2 like %xx%”,左右都是模糊查询, name本身,都没用上索引
第2种做法:
select a.* from it_area as a inner join (select id from it_area where name like '%东山%') as t on a.id=t.id;
Show profiles; 查看效率:
| 18 | 0.00183800 | select * from it_area where name like '%东山%'
| 20 | 0.00169300 | select a.* from it_area as a inner join (select id from it_area where name like '%东山%') as t on a.id=t.id |
发现 第2种做法,虽然语句复杂,但速度却稍占优势.
第2种做法中, 内层查询,只沿着name索引层顺序走, name索引层包含了id值的.
所以,走完索引层之后,找到所有合适的id,
再通过join, 用id一次性查出所有列. 走完name列再取.
第1种做法: 沿着name的索引文件走, 走到满足的条件的索引,就取出其id,
并通过id去取数据, 边走边取.
通过id查找行的过程被延后了. --- 这种技巧,称为”延迟关联”.