【mysql优化】语句优化
1.int型子查询陷阱
如下两个表:
mysql> desc user; +----------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +----------+-------------+------+-----+---------+-------+ | id | int(11) | NO | PRI | NULL | | | name | varchar(22) | YES | | NULL | | | class_id | int(11) | YES | MUL | NULL | | +----------+-------------+------+-----+---------+-------+ 3 rows in set (0.17 sec) mysql> desc class; +------------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+-------------+------+-----+---------+----------------+ | class_id | int(11) | NO | PRI | NULL | auto_increment | | class_name | varchar(20) | YES | | 000 | | +------------+-------------+------+-----+---------+----------------+ 2 rows in set (0.15 sec)
2表的数据量:
mysql> select count(id) from user; +-----------+ | count(id) | +-----------+ | 1120013 | +-----------+ 1 row in set (2.50 sec) mysql> select count(class_id) from class; +-----------------+ | count(class_id) | +-----------------+ | 5 | +-----------------+ 1 row in set (0.10 sec)
要求查询班级名称是'3'的学生?
1.用in查询:
mysql> select id,name from user where class_id in (select class_id from class wh ere class_name='3') -> ; +-----+----------+ | id | name | +-----+----------+ | 3 | QLQ3 | | 9 | QLQ9 | | 10 | QLQ10 | | 12 | QLQ12 | | 13 | QLQ13 | | 102 | name2100 | | 103 | name3100 | | 104 | name4100 | | 5 | QLQ5 | +-----+----------+ 9 rows in set (0.17 sec)
花费了0.17秒,用explain分析语句:发现也是class表是全局扫描,user表用了class_id索引。
mysql> explain select id,name from user where class_id in (select class_id from class where class_name='3')\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: class partitions: NULL type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 Extra: Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: user partitions: NULL type: ref possible_keys: class_id key: class_id key_len: 5 ref: maven.class.class_id rows: 1 filtered: 100.00 Extra: NULL 2 rows in set, 1 warning (0.10 sec)
解释:我们理解的是先查询in里面的班级编号,然后利用班级编号去查询外面的id和name。两个表都是全表扫描。
2.改为连接查询:
mysql> SELECT -> u.id, -> u.name -> FROM USER AS u -> INNER JOIN (SELECT -> class_id -> FROM class -> WHERE class_name = '3') AS c -> ON c.class_id = u.class_id; +-----+----------+ | id | name | +-----+----------+ | 3 | QLQ3 | | 9 | QLQ9 | | 10 | QLQ10 | | 12 | QLQ12 | | 13 | QLQ13 | | 102 | name2100 | | 103 | name3100 | | 104 | name4100 | | 5 | QLQ5 | +-----+----------+ 9 rows in set (0.11 sec)
0.11秒
分析上面语句:发现也是class表是全局扫描,user表用了class_id索引。
mysql> EXPLAIN -> SELECT -> u.id, -> u.name -> FROM USER AS u -> INNER JOIN (SELECT -> class_id -> FROM class -> WHERE class_name = '3') AS c -> ON c.class_id = u.class_id \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: class partitions: NULL type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 Extra: Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u partitions: NULL type: ref possible_keys: class_id key: class_id key_len: 5 ref: maven.class.class_id rows: 1 filtered: 100.00 Extra: NULL 2 rows in set, 1 warning (0.10 sec)
总结:
也就是说这个mysql5.7的in查询与我们想的一样,就是先查询里面的东西,然后利用里面查到的值去外面查询,所以外面的查询可以用到索引。
网上总结的经验如下:
题: 在ecshop商城表中,查询6号栏目的商品, (注,6号是一个大栏目)
最直观的: mysql> select goods_id,cat_id,goods_name from goods where cat_id in (select cat_id from ecs_category where parent_id=6);
误区: 给我们的感觉是, 先查到内层的6号栏目的子栏目,如7,8,9,11
然后外层, cat_id in (7,8,9,11)
事实: 如下图, goods表全扫描, 并逐行与category表对照,看parent_id=6是否成立
原因: mysql的查询优化器,针对In型做优化,被改成了exists的执行效果.
当goods表越大时, 查询速度越慢.
改进: 用连接查询来代替子查询
explain select goods_id,g.cat_id,g.goods_name from goods as g inner join (select cat_id from ecs_category where parent_id=6) as t using(cat_id) \G
内层 select cat_id from ecs_category where parent_id=6 ; 用到Parent_id索引, 返回4行
网上总结的经验是 in 查询不是我们想的那样,而是逐行扫描外层的数据去匹配里面的查询,因此查询效率低下。改进方法是利用inner join改进。
网上总结:也就是in不走索引?待思考的问题。。。。。。。。。。。。
到底是我的实践是错的?还是因为mysql版本问题?
2 exists子查询 用法参考http://www.cnblogs.com/qlqwjy/p/8598091.html
exists只是返回一个ture或false的结果(这也是为什么子查询里是select 'x'的原因 当然也可以select任何东西) 也就是它只在乎括号里的数据能不能查找出来,是否存在这样的记录。
题: 查询有商品的栏目.
按上面的理解,我们用join来操作,如下:
mysql> select c.cat_id,cat_name from ecs_category as c inner join goods as g on c.cat_id=g.cat_id group by cat_name;
exists简单用法:
1. 查询id为5的数据: (数据存在)
SELECT * FROM class AS c1 WHERE EXISTS(SELECT class_id FROM class AS c2 WHERE c1.class_id = 5);
如果exists里面返回的结果行数大于1,则返回true,则外面的查询数据可以返回。
2. 查询id为10的数据: (数据不存在)
SELECT * FROM class AS c1 WHERE EXISTS(SELECT class_id FROM class AS c2 WHERE c1.class_id = 10);
因为exsits始终返回的是false,所以外层查询始终无效,也就不会产生数据。
自己的实践:in 用到了索引,exists没用到索引。。。。。。。。。。。。。
mysql> explain select id,name from user as u where exists(select class_id from c lass as c where c.class_id=u.class_id and c.class_name='3')\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: u partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1116336 filtered: 100.00 Extra: Using where *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: c partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: maven.u.class_id rows: 1 filtered: 20.00 Extra: Using where 2 rows in set, 2 warnings (0.00 sec) mysql> explain select id,name from user as u where class_id in(select class_id f rom class as c where c.class_id=u.class_id and c.class_name='3')\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: c partitions: NULL type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 Extra: Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u partitions: NULL type: ref possible_keys: class_id key: class_id key_len: 5 ref: maven.c.class_id rows: 1 filtered: 100.00 Extra: NULL 2 rows in set, 2 warnings (0.00 sec)
3.优化规则:
优化1: 在group时, 用带有索引的列来group, 速度会稍快一些,另外,用int型 比 char型 分组,也要快一些.
优化2: 在group时, 我们假设只取了A表的内容,group by 的列,尽量用A表的列,会比B表的列要快.
优化3: 从语义上去优化
select cat_id,cat_name from ecs_category where exists(select *from goods where goods.cat_id=ecs_category.cat_id)
优化4:from 型子查询:
注意::内层from语句查到的临时表, 是没有索引的.所以: from的返回内容要尽量少.
优化5:min() max()优化
例如:如下表结构:
mysql> show create table user\G *************************** 1. row *************************** Table: user Create Table: CREATE TABLE `user` ( `id` int(11) NOT NULL, `name` varchar(22) DEFAULT NULL, `class_id` int(11) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
mysql> select count(id) from user; +-----------+ | count(id) | +-----------+ | 1120013 | +-----------+ 1 row in set (2.32 sec)
现在查询班级编号为5的最小的id:
mysql> select min(id) from user where class_id=5; +---------+ | min(id) | +---------+ | 5 | +---------+ 1 row in set (2.37 sec) mysql> explain select min(id) from user where class_id=5\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: user partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1116336 filtered: 10.00 Extra: Using where 1 row in set, 1 warning (0.00 sec)
分析之后是全表扫描,需要扫描11W多行,因为class_id没有索引。
优化思路:
(1)在class_id上加上索引可以加快查询思路
(2)在不加索引的情况下优化:
试想 id是有顺序的,(默认索引是升续排列), 因此,如果我们沿着id的索引方向走,那么 第1个 pclass_id=5的索引结点,他的id就正好是最小的id.
mysql> explain select min(id) from user where class_id=5\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: user partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1116336 filtered: 10.00 Extra: Using where 1 row in set, 1 warning (0.00 sec) mysql> select id from user use index(primary) where class_id=5 limit 1; +----+ | id | +----+ | 5 | +----+ 1 row in set (0.10 sec) mysql> explain select id from user use index(primary) where class_id=5 limit 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: user partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1116336 filtered: 10.00 Extra: Using where 1 row in set, 1 warning (0.00 sec)
虽然说也没用到索引,可是扫描会在扫描到第一个就停下来,因此扫描的实际数据比上面少,因此查询速度快。
总结:min,max中如果where条件没有索引可以使查询沿着有索引的方向走,找到第一个数据后停下来即满足条件。
优化6: count()优化
1:myisam的count()非常快
答: 是比较快,.但仅限于查询表的”所有行”比较快, 因为Myisam对行数进行了存储.一旦有条件的查询, 速度就不再快了.尤其是where条件的列上没有索引.
2: 假如,id<100的商家都是我们内部测试的,我们想查查真实的商家有多少?
select count(*) from lx_com where id>=100; (1000多万行用了6.X秒) 小技巧: select count(*) from lx_com; 快 select count(*) from lx_com where id<100; 快 select count(*) frol lx_com -select count(*) from lx_com where id<100; 快 select (select count(*) from lx_com) - (select count(*) from lx_com where id<100)
优化7:巧用变量
1:用变量排名
表结构如下:
mysql> show create table test\G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `name` varchar(5) DEFAULT NULL, `score` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
表数据:
mysql> select * from test; +------+-------+ | name | score | +------+-------+ | qlq1 | 100 | | qlq2 | 100 | | qlq3 | 90 | | qlq4 | 91 | | ql5q | 91 | +------+-------+ 5 rows in set (0.00 sec)
打印名次。利用变量计算。
mysql> set @curr_cnt := 0,@prev_cnt := 0, @rank := 0; Query OK, 0 rows affected (0.00 sec) mysql> select name,(@curr_cnt:=score) as score,@rank=if(@curr_cnt<>@prev_cnt,@ra nk+1,@rank) as rank,@prev_cnt:=score from test order by score desc; +------+-------+------+------------------+ | name | score | rank | @prev_cnt:=score | +------+-------+------+------------------+ | qlq1 | 100 | 0 | 100 | | qlq2 | 100 | 1 | 100 | | qlq4 | 91 | 0 | 91 | | ql5q | 91 | 1 | 91 | | qlq3 | 90 | 0 | 90 | +------+-------+------+------------------+ 5 rows in set (0.00 sec)
2.用变量计算真正影响的行数:
表结构:(name是主键)
mysql> show create table test\G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `name` varchar(5) DEFAULT NULL, `score` int(11) DEFAULT NULL, UNIQUE KEY `name` (`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)
这时候我们插入一个重复的主键,主键重复之后修改值,将分数减一:
mysql> insert into test values('qlq1',99) on DUPLICATE KEY update score=score-1; Query OK, 2 rows affected (0.15 sec)
mysql> insert into test values('qlq1',99),('qlq2','5') on DUPLICATE KEY update s core=score-1; Query OK, 4 rows affected (0.13 sec) Records: 2 Duplicates: 2 Warnings: 0
上面显示影响的是两行,因为增、修改,但实际修改的是一行。这时候如果我们想知道实际修改的行数:(如下)
mysql> set @x:=0; Query OK, 0 rows affected (0.00 sec) mysql> insert into test values('qlq1',99),('qlq2','5') on DUPLICATE KEY update s core=score-1+(@x:=@x+1)*0; Query OK, 4 rows affected (0.15 sec) Records: 2 Duplicates: 2 Warnings: 0 mysql> select @x; +------+ | @x | +------+ | 2 | +------+ 1 row in set (0.00 sec)
通过变量可以标识实际影响的行数为2行。
3.简化union
例如两个表数据:
mysql> select * from test; +------+-------+ | name | score | +------+-------+ | qlq1 | 97 | | qlq2 | 98 | | qlq3 | 90 | | qlq4 | 91 | | ql5q | 91 | +------+-------+ 5 rows in set (0.00 sec) mysql> select * from test2; +------+-------+ | name | score | +------+-------+ | qlq1 | 100 | | qlq5 | 100 | | qlq3 | 90 | +------+-------+ 3 rows in set (0.00 sec)
采用union取不重复的数据:(all不会去重)
mysql> select * from test union select * from test2; +------+-------+ | name | score | +------+-------+ | qlq1 | 97 | | qlq2 | 98 | | qlq3 | 90 | | qlq4 | 91 | | ql5q | 91 | | qlq1 | 100 | | qlq5 | 100 | +------+-------+ 7 rows in set (0.00 sec)
explain分析上面语句:
mysql> explain select * from test union select * from test2\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: test partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 100.00 Extra: NULL *************************** 2. row *************************** id: 2 select_type: UNION table: test2 partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3 filtered: 100.00 Extra: NULL *************************** 3. row *************************** id: NULL select_type: UNION RESULT table: <union1,2> partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL filtered: NULL Extra: Using temporary 3 rows in set, 1 warning (0.01 sec)
如何利用变量让后半句select不执行 ,也就是只查询一个表,两个表都有数据但是我们想让他取一个表;
set @find:=0; select name,score,(@find := 1) from test where name='qlq2' union select name,score,(@find := 1) from test2 where name='qlq2' and (@find <= 0) union select 1,1,@find:=0 from test where 0;
解释:先让find变量置为0,如果name=qlq2在test表找到数据,name会将find置为1,此时不会进入第二个select;如果第一个没有查到,会走到第二个select并查询,查到之后将find置为1,第三个select纯粹是为了将find置为0.
结果:
+------+-------+--------------+ | name | score | (@find := 1) | +------+-------+--------------+ | qlq2 | 98 | 1 | +------+-------+--------------+ 1 row in set (0.00 sec)