优化sql案例 mysql关联查询的时候,因关联字段的排序规则不一致导致索引失效案例
#########################
常见索引失效场景案例:
create table tbl_name(id bigint not null auto_increment,
name varchar(255) not null,
address varchar(255) not null,
create_time datetime,
primary key (id),
key idx_name(name),
key idx_address_time(address(32),create_time))
ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
索引失效场景 | 举例说明 |
like模糊匹配的时候以%开头 |
错误:select name,address,create_time from tbl_name where name like '%zhangsan'; 正确:select name,address,create_time from tbl_name where name='zhangsan%'; |
数据类型隐式或显式转换 |
错误:select name,address,create_time from tbl_name where name=123; 正确:select name,address,create_time from tbl_name where name='123';
|
多表联合查询的关联字段的排序规则或字符集不相同 |
通过以下方法查看表的字符集和排序规则: mysql>> show create table tbl_name\G; mysql>> show full columns from os_app_info; |
在索引列上使用函数或计算 |
错误:select name,address,create_time from tbl_name where upper(name)='zhangsan'; 正确:select name,address,create_time from tbl_name where name='zhangsan'; ------------------------------------------------------------------------------------------------------- 错误:select name,address,create_time from tbl_name where id-10<0; 正确:select name,address,create_time from tbl_name where id<10; |
违反联合索引的最左匹配规则 |
错误:select name,address,create_time from tbl_name where create_time > '2021-08-03 22:22:22'; 正确:select name,address,create_time from tbl_name where address='wuhan'; 正确:select name,address,create_time from tbl_name where address='wuhan' and create_time > '2021-08-03 22:22:22'; |
发现问题并定位问题:
# 从表结构,可以看出:iphone表的query字段的字符集是utf8mb4,排序规则是uft8mb4_bin,
# ipad表的query字段的字符集是utf8,排序规则是utf8_general_ci,
# 因此这两个表用query字段进行关联来查询的时候,索引会失效。
root@10.10.10.10(apple) > show create table iphone\G *************************** 1. row *************************** Table: iphone Create Table: CREATE TABLE `iphone` ( `id` bigint(5) unsigned NOT NULL AUTO_INCREMENT, `query` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL, `device` varchar(32) CHARACTER SET utf8 NOT NULL, `mydomain` varchar(32) CHARACTER SET utf8 NOT NULL, `mydate` int(8) NOT NULL, `status` tinyint(1) NOT NULL, `username` varchar(32) CHARACTER SET utf8 DEFAULT NULL, `create_time` timestamp NOT NULL DEFAULT '2000-01-01 00:00:00', `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `pv` int(6) NOT NULL DEFAULT '1', PRIMARY KEY (`id`), UNIQUE KEY `idx_query_device` (`query`(32),`device`(16)) USING BTREE ) ENGINE=InnoDB AUTO_INCREMENT=59415 DEFAULT CHARSET=utf8mb4 1 row in set (0.00 sec) root@10.10.10.10(apple) > show create table ipad\G *************************** 1. row *************************** Table: ipad Create Table: CREATE TABLE `ipad` ( `id` int(5) unsigned NOT NULL AUTO_INCREMENT, `query` varchar(64) CHARACTER SET utf8mb4 NOT NULL, `device` varchar(32) NOT NULL, `mydomain` varchar(128) NOT NULL, `myresult` varchar(256) NOT NULL, `address` varchar(128) DEFAULT NULL, `create_time` timestamp NOT NULL DEFAULT '2000-01-01 00:00:00', `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `username` varchar(32) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `uniq_query_device` (`query`,`device`) USING HASH ) ENGINE=InnoDB AUTO_INCREMENT=58917 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) # iphone表的数据量:38899
root@10.10.10.10(apple) > select count(*) from iphone; +----------+ | count(*) | +----------+ | 38899 | +----------+ 1 row in set (0.06 sec)
# ipad表的数据量:3899
root@10.10.10.10(apple) > select count(*) from ipad; +----------+ | count(*) | +----------+ | 38899 | +----------+ 1 row in set (0.07 sec)
# iphone和ipad联合查询,关联字段是query,耗时3分10.23秒
root@10.10.10.10(apple) > select count(distinct t.query) from iphone as t left join ipad r on t.query = r.query where t.status = 1; +-------------------------+ | count(distinct t.query) | +-------------------------+ | 9722 | +-------------------------+ 1 row in set (3 min 10.23 sec)
# 查看详细的执行耗时分布: root@10.10.10.10(apple) > select @@profiling; +-------------+ | @@profiling | +-------------+ | 0 | +-------------+ 1 row in set (0.00 sec) root@10.10.10.10(apple) > set profiling = 1; Query OK, 0 rows affected (0.00 sec) root@10.10.10.10(apple) > select count(distinct t.query) from iphone as t left join ipad r on t.query = r.query where t.status = 1; +-------------------------+ | count(distinct t.query) | +-------------------------+ | 9722 | +-------------------------+ 1 row in set (3 min 2.00 sec)
# 发现在 Sending data这个阶段耗时最大,几乎所有时间都被他占用了
root@10.10.10.10(apple) > show profile for query 1; +--------------------------------+------------+ | Status | Duration | +--------------------------------+------------+ | starting | 0.000319 | | Waiting for query cache lock | 0.000091 | | checking query cache for query | 0.000117 | | checking permissions | 0.000098 | | checking permissions | 0.000091 | | Opening tables | 0.000120 | | System lock | 0.000096 | | init | 0.000102 | | optimizing | 0.000098 | | statistics | 0.000150 | | preparing | 0.000097 | | executing | 0.000115 | | Sending data | 181.991527 | | end | 0.000794 | | removing tmp table | 0.000096 | | end | 0.000092 | | query end | 0.000091 | | closing tables | 0.000106 | | freeing items | 0.000107 | | logging slow query | 0.000090 | | logging slow query | 0.000119 | | cleaning up | 0.000095 | +--------------------------------+------------+ 22 rows in set (0.00 sec)
解决问题:修改排序规则:
# 将ipad表的排序规则改成和iphone表的query字段相同,这是关键步骤。这里本来只需要将ipad表的query字段的字符集和排序规则改成相同就可以了,我这里主要是减少维护难度,因此直接将整个表的字符集和排序规则改成一样的
root@10.10.10.10(apple) > alter table ipad convert to character set utf8mb4 collate utf8mb4_bin; Query OK, 38899 rows affected (1 min 8.83 sec) Records: 38899 Duplicates: 0 Warnings: 0
# 再次查询,发现只需要0.09秒
root@10.10.10.10(apple) > select count(distinct t.query) from iphone as t left join ipad r on t.query = r.query where t.status = 1; +-------------------------+ | count(distinct t.query) | +-------------------------+ | 9722 | +-------------------------+ 1 row in set (0.09 sec)
# 查看详细执行过程,这里新开了一个会话,请根据实际确定哪个查询,此时sending data只需要0.08秒
root@10.10.10.10(apple) > show profile for query 1; +--------------------------------+----------+ | Status | Duration | +--------------------------------+----------+ | starting | 0.000323 | | Waiting for query cache lock | 0.000091 | | checking query cache for query | 0.000125 | | checking permissions | 0.000090 | | checking permissions | 0.000092 | | Opening tables | 0.000126 | | System lock | 0.000096 | | init | 0.000106 | | optimizing | 0.000097 | | statistics | 0.000155 | | preparing | 0.000100 | | executing | 0.000115 | | Sending data | 0.083838 | | end | 0.000327 | | removing tmp table | 0.000093 | | end | 0.000091 | | query end | 0.000091 | | closing tables | 0.000095 | | freeing items | 0.000111 | | logging slow query | 0.000093 | | cleaning up | 0.000090 | +--------------------------------+----------+ 21 rows in set (0.01 sec)
# 查看执行计划:均走了索引
root@10.10.10.10(apple) > desc select count(distinct t.query) from iphone as t left join ipad r on t.query = r.query where t.status = 1; +----+-------------+-------+------+-------------------+-------------------+---------+-----------------------------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+-------------------+-------------------+---------+-----------------------------+-------+-------------+ | 1 | SIMPLE | t | ref | idx_status | idx_status | 1 | const | 10528 | | | 1 | SIMPLE | r | ref | uniq_query_device | uniq_query_device | 258 | apple.t.query | 1 | Using index | +----+-------------+-------+------+-------------------+-------------------+---------+-----------------------------+-------+-------------+ 2 rows in set (0.00 sec)
我也测试过的,如果将两个表的关联字段的字符集保持相同,但是排序规则不同,那么索引会失效,那么有没有排序规则相同而字符集不同呢?哈哈哈,显然是没有的,排序规则就是根据字符集来定义的。那么字符集不同,排序规则的名称肯定也不一样,关联查询的时候关联字段也会索引失效。
#############
##########################