【mysql】一条慢查询sql的的分析
这个是我在jobbole.com 上看到的
先给出数据表table结构
目前数据库中数据大概有2000W条,之后可能还会不断增长,现在想要查询的是:从表中找出相同姓名,相同规则ID,并且status = 1对应的最后一条记录
写个程序随机加入数据
<?php mysql_connect('10.0.0.234','root','123456'); mysql_select_db('testdb'); mysql_query("set names utf8"); $strs =''; $start = 'insert into tt values '; for($i=0;$i<3000000;$i++){ $k = range(a,z); shuffle($k); $str = implode('',$k); $num = mt_rand(8,16); $in = substr($str,0,$num); $strs .= " (NULL,'$in',$num,1),"; if($i%10000 == 0){ echo $i."\n"; $sql = trim($start.$strs,','); mysql_query($sql); $strs =''; } }
这样300W数据就有了
目前的查询语句是这样的
看一下查询计划
mysql> explain SELECT * FROM tt WHERE id IN ( SELECT max(id) AS max_id FROM tt GROUP BY name, rule_id) and status=1; +----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+ | 1 | PRIMARY | tt | ALL | NULL | NULL | NULL | NULL | 1176818 | Using where | | 2 | DEPENDENT SUBQUERY | tt | index | NULL | ttx | 52 | NULL | 1 | Using index | +----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+
索引用上了,也没有filesort,这是不是就很快了,然后执行查询,就卡主了,卡主了,最后只能被Ctrl+C了
那么这个不行,就换一个,这里我想到了,子查询换成join看一看效果
SELECT a.id,b.name,b.rule_id FROM (select max(id) as id from tt group by name, rule_id) as a left join tt b on a.id = b.id
看一下查询计划
mysql> explain SELECT a.id,b.name,b.rule_id FROM (select max(id) as id from tt group by name, rule_id) as a left join tt b on a.id = b.id; +----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1176503 | | | 1 | PRIMARY | b | eq_ref | PRIMARY | PRIMARY | 4 | a.id | 1 | | | 2 | DERIVED | tt | index | NULL | ttx | 52 | NULL | 1176818 | Using index | +----+-------------+------------+--------+---------------+---------+---------+------+---------+-------------+
执行结果 1.77579775 sec
mysql> show profiles; +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ | 6 | 1.77579775 | SELECT a.id,b.name,b.rule_id FROM (select max(id) as id from tt group by name, rule_id) as a left join tt b on a.id = b.id | +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ 7 rows in set (0.00 sec)
看一看执行 过程
mysql> show profile for query 6; +------------------------------+----------+ | Status | Duration | +------------------------------+----------+ | Sending data | 0.000596 | | Waiting for query cache lock | 0.000002 | | Sending data | 0.000579 | | Waiting for query cache lock | 0.000002 | | Sending data | 0.000534 | | Waiting for query cache lock | 0.000002 | | Sending data | 1.101490 | | end | 0.000010 | | query end | 0.000004 | | closing tables | 0.000003 | | removing tmp table | 0.001369 | | closing tables | 0.000010 | | freeing items | 0.000024 | | logging slow query | 0.000002 | | logging slow query | 0.000035 | | cleaning up | 0.000003 | +------------------------------+----------+ 100 rows in set (0.00 sec)
太多的查询缓存lock,关掉它,再看一下查询结果
mysql> show profile for query 10; +---------------------------+----------+ | Status | Duration | +---------------------------+----------+ | starting | 0.000154 | | checking permissions | 0.000006 | | checking permissions | 0.000005 | | Opening tables | 0.000124 | | System lock | 0.000122 | | optimizing | 0.000007 | | statistics | 0.000035 | | preparing | 0.000018 | | executing | 0.000012 | | Sorting result | 0.000006 | | Sending data | 0.518897 | | converting HEAP to MyISAM | 0.070147 | | Sending data | 0.067123 | | init | 0.000033 | | optimizing | 0.000003 | | statistics | 0.000020 | | preparing | 0.000009 | | executing | 0.000001 | | Sending data | 1.193679 | | end | 0.000011 | | query end | 0.000010 | | closing tables | 0.000002 | | removing tmp table | 0.001491 | | closing tables | 0.000011 | | freeing items | 0.000020 | | logging slow query | 0.000002 | | logging slow query | 0.000050 | | cleaning up | 0.000003 | +---------------------------+----------+
再次分析,我们发现 converting HEAP to MyISAM 这个很耗时
mysql> select @@max_heap_table_size/1024/1024; +---------------------------------+ | @@max_heap_table_size/1024/1024 | +---------------------------------+ | 16.00000000 | +---------------------------------+ mysql> set max_heap_table_size = 16777216*4; Query OK, 0 rows affected (0.00 sec) mysql> select @@max_heap_table_size/1024/1024; +---------------------------------+ | @@max_heap_table_size/1024/1024 | +---------------------------------+ | 64.00000000 | +---------------------------------+
再看看执行结果 1.77579775 > 1.68962725 ,还是有效果的
mysql> show profiles; +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ | 17 | 1.68962725 | SELECT a.id,b.name,b.rule_id FROM (select max(id) as id from tt group by name, rule_id) as a left join tt b on a.id = b.id | +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+
分析
mysql> show profile for query 17; +----------------------+----------+ | Status | Duration | +----------------------+----------+ | starting | 0.000120 | | checking permissions | 0.000004 | | checking permissions | 0.000003 | | Opening tables | 0.000106 | | System lock | 0.000113 | | optimizing | 0.000007 | | statistics | 0.000044 | | preparing | 0.000011 | | executing | 0.000006 | | Sorting result | 0.000002 | | Sending data | 0.567858 | | init | 0.000032 | | optimizing | 0.000004 | | statistics | 0.000017 | | preparing | 0.000015 | | executing | 0.000002 | | Sending data | 1.120159 | | end | 0.000011 | | query end | 0.000005 | | closing tables | 0.000002 | | removing tmp table | 0.001020 | | closing tables | 0.000011 | | freeing items | 0.000018 | | logging slow query | 0.000002 | | logging slow query | 0.000056 | | cleaning up | 0.000004 | +----------------------+----------+
好看多了,耗时的地方都在 Sending data 上了,如果硬盘换成PCI-SSD 估计又能提高不少
还有没有其他方法呢,当然有,那么换一种写法
先看查询计划
mysql> explain select max(concat_ws(' ',lpad(id,5,' '),status)) as res, name, rule_id from tt group by name, rule_id ; +----+-------------+-------+-------+---------------+------+---------+------+---------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+---------+-------+ | 1 | SIMPLE | tt | index | NULL | ttx | 52 | NULL | 1176818 | | +----+-------------+-------+-------+---------------+------+---------+------+---------+-------+
再看看执行结果
mysql> show profiles; +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+ | 22 | 1.82505025 | select max(concat_ws(' ',lpad(id,5,' '),status)) as res, name, rule_id from tt group by name, rule_id | +----------+------------+-------------------------------------------------------------------------------------------------------------------------------+
好像更慢了 1.82505025 > 1.68962725
分析一下
mysql> show profile for query 22; +----------------------+----------+ | Status | Duration | +----------------------+----------+ | starting | 0.000157 | | checking permissions | 0.000007 | | Opening tables | 0.000019 | | System lock | 0.000020 | | init | 0.000032 | | optimizing | 0.000005 | | statistics | 0.000016 | | preparing | 0.000012 | | executing | 0.000008 | | Sorting result | 0.000003 | | Sending data | 1.824677 | | end | 0.000012 | | query end | 0.000005 | | closing tables | 0.000009 | | freeing items | 0.000016 | | logging slow query | 0.000002 | | logging slow query | 0.000049 | | cleaning up | 0.000004 | +----------------------+----------+
时间基本上都花费在send data了
这里说一下,max + group by 完全就是一个坑,如果有多个字段要返回数据不能这样写
select max(id) ,type ,name from table where type=xx group by name
因为group by默认返回第一条记录
如果像下面那样写,肯定会遇到坑的
select max(id) as res, name, rule_id from tt group by name, rule_id ;
这样也可以,而且快多了,但是如果有其他字段怎么办
这样的语句基本上没有什么可以再优化了,只能换换其他方式了,比如:换SSD+raid10 ,分区/分表/分库+中间件了
- 作者:踏雪无痕
- 出处:http://www.cnblogs.com/chenpingzhao/
- 本文版权归作者和博客园共有,如需转载,请联系 pingzhao1990#163.com