代码改变世界

MySQL bit类型增加索引后查询结果不正确案例浅析

2024-06-21 16:35  潇湘隐者  阅读(248)  评论(0编辑  收藏  举报

昨天同事遇到的一个案例,这里简单描述一下:一个表里面有一个bit类型的字段,同事在优化相关SQL的过程中,给这个表的bit类型的字段新增了一个索引,然后测试验证 时,居然发现SQL语句执行结果跟不加索引不一样。加了索引后,SQL语句没有查询出一条记录,删除索引后,SQL语句就能查询出几十条记录。下面我们构造一个简单 的例子,重现一下这个案例

我们先创建表student_attend,初始化一些数据。这篇文章的测试环境为MySQL 8.0.35社区版。

CREATE TABLE `student_attend` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT '自增编号',
  `std_id` int DEFAULT NULL COMMENT '学号',
  `class_id` int DEFAULT NULL COMMENT '课程编号',
  `is_attend` bit(1DEFAULT b'1' COMMENT '是否缺陷考勤',
  PRIMARY KEY (`id`)
ENGINE=InnoDB;


insert into student_attend(std_id, class_id, is_attend)
select 100111 from dual union all
select 100120 from dual union all
select 100131 from dual union all
select 100141 from dual union all
select 100151 from dual union all
select 100160 from dual union all
select 100211 from dual union all
select 100221 from dual union all
select 100310 from dual union all
select 100320 from dual;

如下所示,假设我们要查询is_attend=1的所有学生信息,那么可以有下面三种写法

mysql> select * from student_attend where is_attend=1;
+----+--------+----------+----------------------+
| id | std_id | class_id | is_attend            |
+----+--------+----------+----------------------+
|  1 |   1001 |        1 | 0x01                 |
|  3 |   1001 |        3 | 0x01                 |
|  4 |   1001 |        4 | 0x01                 |
|  5 |   1001 |        5 | 0x01                 |
|  7 |   1002 |        1 | 0x01                 |
|  8 |   1002 |        2 | 0x01                 |
+----+--------+----------+----------------------+
6 rows in set (0.00 sec)

mysql> select * from student_attend where is_attend=b'1';
+----+--------+----------+----------------------+
| id | std_id | class_id | is_attend            |
+----+--------+----------+----------------------+
|  1 |   1001 |        1 | 0x01                 |
|  3 |   1001 |        3 | 0x01                 |
|  4 |   1001 |        4 | 0x01                 |
|  5 |   1001 |        5 | 0x01                 |
|  7 |   1002 |        1 | 0x01                 |
|  8 |   1002 |        2 | 0x01                 |
+----+--------+----------+----------------------+
6 rows in set (0.00 sec)

#遇到问题的SQL写法
mysql> select * from student_attend where is_attend='1';
+----+--------+----------+----------------------+
| id | std_id | class_id | is_attend            |
+----+--------+----------+----------------------+
|  1 |   1001 |        1 | 0x01                 |
|  3 |   1001 |        3 | 0x01                 |
|  4 |   1001 |        4 | 0x01                 |
|  5 |   1001 |        5 | 0x01                 |
|  7 |   1002 |        1 | 0x01                 |
|  8 |   1002 |        2 | 0x01                 |
+----+--------+----------+----------------------+
6 rows in set (0.00 sec)

mysql> 

接下来,我们在字段is_attend上创建索引ix_student_attend_n1,如下所示

create index ix_student_attend_n1 on student_attend(is_attend);

然后我们继续测试验证,就能出现我前文所说的情况,如需所示,最后一个SQL,它的返回记录数为0.

mysql> select * from student_attend where is_attend=1;
+----+--------+----------+----------------------+
| id | std_id | class_id | is_attend            |
+----+--------+----------+----------------------+
|  1 |   1001 |        1 | 0x01                 |
|  3 |   1001 |        3 | 0x01                 |
|  4 |   1001 |        4 | 0x01                 |
|  5 |   1001 |        5 | 0x01                 |
|  7 |   1002 |        1 | 0x01                 |
|  8 |   1002 |        2 | 0x01                 |
+----+--------+----------+----------------------+
6 rows in set (0.00 sec)

mysql> select * from student_attend where is_attend=b'1';
+----+--------+----------+----------------------+
| id | std_id | class_id | is_attend            |
+----+--------+----------+----------------------+
|  1 |   1001 |        1 | 0x01                 |
|  3 |   1001 |        3 | 0x01                 |
|  4 |   1001 |        4 | 0x01                 |
|  5 |   1001 |        5 | 0x01                 |
|  7 |   1002 |        1 | 0x01                 |
|  8 |   1002 |        2 | 0x01                 |
+----+--------+----------+----------------------+
6 rows in set (0.00 sec)

mysql> select * from student_attend where is_attend='1';
Empty set (0.00 sec)

mysql> 

其实第一次见到这种情况的时候,我还是有点震惊的,因为在我的观念中,索引只会影响执行计划,不会影响查询结果,但是现在的情况是 索引的存在影响了SQL的查询结果。那么为什么会出现这种情况呢?

首先看了一下执行计划,如下所示,从执行计划看,它既没有走全表扫描也没有走索引,仅仅有"message": "no matching row in const table"提示,如果仅仅分析 执行计划,我们得不到更多的有用信息

mysql> explain
    -> select * from student_attend where is_attend='1';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | no matching row in const table |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
1 row in set1 warning (0.00 sec)

mysql> 
mysql> explain format=json
    -> select * from student_attend where is_attend='1'\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id"1,
    "message""no matching row in const table"
  } /* query_block */
}
1 row in set1 warning (0.00 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `kerry`.`student_attend`.`id` AS `id`,`kerry`.`student_attend`.`std_id` AS `std_id`,`kerry`.`student_attend`.`class_id` AS `class_id`,`kerry`.`student_attend`.`is_attend` AS `is_attend` from `kerry`.`student_attend` where (`kerry`.`student_attend`.`is_attend` = '1')
1 row in set (0.00 sec)

mysql> 

那么我们使用trace跟踪分析一下优化器如何选择执行计划。看看其详细执行过程,如下所示


mysql> SET OPTIMIZER_TRACE="enabled=on",END_MARKERS_IN_JSON=on;
Query OK, 0 rows affected (0.00 sec)

mysql> SET OPTIMIZER_TRACE_MAX_MEM_SIZE=1000000;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from student_attend where is_attend='1';
Empty set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE \G;
*************************** 1. row ***************************
                            QUERY: select * from student_attend where is_attend='1'
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#"1,
        "steps": [
          {
            "expanded_query""/* select#1 */ select `student_attend`.`id` AS `id`,`student_attend`.`std_id` AS `std_id`,`student_attend`.`class_id` AS `class_id`,`student_attend`.`is_attend` AS `is_attend` from `student_attend` where (`student_attend`.`is_attend` = '1')"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#"1,
        "steps": [
          {
            "condition_processing": {
              "condition""WHERE",
              "original_condition""(`student_attend`.`is_attend` = '1')",
              "steps": [
                {
                  "transformation""equality_propagation",
                  "resulting_condition""(`student_attend`.`is_attend` = '1')"
                },
                {
                  "transformation""constant_propagation",
                  "resulting_condition""(`student_attend`.`is_attend` = '1')"
                },
                {
                  "transformation""trivial_condition_removal",
                  "resulting_condition""(`student_attend`.`is_attend` = '1')"
                }
              ] /* steps */
            } /* condition_processing */
          },
          {
            "substitute_generated_columns": {
            } /* substitute_generated_columns */
          },
          {
            "table_dependencies": [
              {
                "table""`student_attend`",
                "row_may_be_null"false,
                "map_bit"0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "ref_optimizer_key_uses": [
              {
                "table""`student_attend`",
                "field""is_attend",
                "equals""'1'",
                "null_rejecting"true
              }
            ] /* ref_optimizer_key_uses */
          },
          {
            "rows_estimation": [
              {
                "table""`student_attend`",
                "range_analysis": {
                  "table_scan": {
                    "rows"10,
                    "cost"3.35
                  } /* table_scan */,
                  "potential_range_indexes": [
                    {
                      "index""PRIMARY",
                      "usable"false,
                      "cause""not_applicable"
                    },
                    {
                      "index""ix_student_attend_n1",
                      "usable"true,
                      "key_parts": [
                        "is_attend",
                        "id"
                      ] /* key_parts */
                    }
                  ] /* potential_range_indexes */,
                  "setup_range_conditions": [
                    {
                      "impossible_condition": {
                        "cause""value_out_of_range"
                      } /* impossible_condition */
                    }
                  ] /* setup_range_conditions */,
                  "impossible_range"true
                } /* range_analysis */,
                "rows"0,
                "cause""impossible_where_condition"
              }
            ] /* rows_estimation */
          }
        ] /* steps */,
        "empty_result": {
          "cause""no matching row in const table"
        } /* empty_result */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#"1,
        "steps": [
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.00 sec)

ERROR
No query specified

mysql> 
mysql> SET optimizer_trace="enabled=off";
Query OK, 0 rows affected (0.01 sec)

从trace的详细信息看,这个过程中发生了隐式转换:下面这个过程就是发生了类型转换

由于发生类型转换过程中(字符串转换为bit类型)遇到了数据截断错误(从value_out_of_range等信息就可以看出),如下截图所示

而优化器应该是根据一定的逻辑判断,得到这个值不存在索引中,从而就判断没有匹配的记录,直接返回空的结果集了,根本不去走扫描全表或走索引查找等操作。

        "empty_result": {
          "cause""no matching row in const table"
        } /* empty_result */

当然这里仅仅是根据trace的信息做的一个判断,如有错误或不谨慎的地方,敬请谅解。毕竟没有深入分析过源码。

那么为什么没有索引的话,SQL语句的结果就是正确的呢? 难道没有发生类型转换吗? 难度没有发生数据截断错误吗?那么我们就继续trace跟踪分析看看,如下所示

mysql> drop index ix_student_attend_n1 on student_attend;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> SET OPTIMIZER_TRACE="enabled=on",END_MARKERS_IN_JSON=on;
Query OK, 0 rows affected (0.00 sec)

mysql> SET OPTIMIZER_TRACE_MAX_MEM_SIZE=1000000;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from student_attend where is_attend='1';
+----+--------+----------+----------------------+
| id | std_id | class_id | is_attend            |
+----+--------+----------+----------------------+
|  1 |   1001 |        1 | 0x01                 |
|  3 |   1001 |        3 | 0x01                 |
|  4 |   1001 |        4 | 0x01                 |
|  5 |   1001 |        5 | 0x01                 |
|  7 |   1002 |        1 | 0x01                 |
|  8 |   1002 |        2 | 0x01                 |
+----+--------+----------+----------------------+
6 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE \G;
*************************** 1. row ***************************
                            QUERY: select * from student_attend where is_attend='1'
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#"1,
        "steps": [
          {
            "expanded_query""/* select#1 */ select `student_attend`.`id` AS `id`,`student_attend`.`std_id` AS `std_id`,`student_attend`.`class_id` AS `class_id`,`student_attend`.`is_attend` AS `is_attend` from `student_attend` where (`student_attend`.`is_attend` = '1')"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#"1,
        "steps": [
          {
            "condition_processing": {
              "condition""WHERE",
              "original_condition""(`student_attend`.`is_attend` = '1')",
              "steps": [
                {
                  "transformation""equality_propagation",
                  "resulting_condition""(`student_attend`.`is_attend` = '1')"
                },
                {
                  "transformation""constant_propagation",
                  "resulting_condition""(`student_attend`.`is_attend` = '1')"
                },
                {
                  "transformation""trivial_condition_removal",
                  "resulting_condition""(`student_attend`.`is_attend` = '1')"
                }
              ] /* steps */
            } /* condition_processing */
          },
          {
            "substitute_generated_columns": {
            } /* substitute_generated_columns */
          },
          {
            "table_dependencies": [
              {
                "table""`student_attend`",
                "row_may_be_null"false,
                "map_bit"0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "ref_optimizer_key_uses": [
            ] /* ref_optimizer_key_uses */
          },
          {
            "rows_estimation": [
              {
                "table""`student_attend`",
                "table_scan": {
                  "rows"10,
                  "cost"0.25
                } /* table_scan */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table""`student_attend`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan"10,
                      "access_type""scan",
                      "resulting_rows"10,
                      "cost"1.25,
                      "chosen"true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct"100,
                "rows_for_plan"10,
                "cost_for_plan"1.25,
                "chosen"true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition""(`student_attend`.`is_attend` = '1')",
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table""`student_attend`",
                  "attached""(`student_attend`.`is_attend` = '1')"
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "finalizing_table_conditions": [
              {
                "table""`student_attend`",
                "original_table_condition""(`student_attend`.`is_attend` = '1')",
                "final_table_condition   ""(`student_attend`.`is_attend` = '1')"
              }
            ] /* finalizing_table_conditions */
          },
          {
            "refine_plan": [
              {
                "table""`student_attend`"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#"1,
        "steps": [
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.00 sec)

ERROR
No query specified

mysql> SET optimizer_trace="enabled=off";
Query OK, 0 rows affected (0.00 sec)

从上面trace信息来看,似乎执行计划先进行全表扫描,然后过滤记录,输出信息里面没有value_out_of_range这类信息,似乎没有发生数据截断。具体步骤跟之前的trace信息有很大不同。具体只看到了下面这些信息,但是更多信息我也看不出来。不清楚底层到底做了啥。

小结

关于bit类型的字段,我们写SQL的时候,不要使用字符串,避免发生隐式类型转换。正确的写法应该是下面这种方式

select * from student_attend where is_attend=b'1';

select * from student_attend where is_attend=1;

DBA在给bit类型创建索引的时候也必须谨慎处理,跟开发和Support人员多协商沟通,告知他们可能出现这种情况,因为你可能没法控制开发人员写出这样的SQL。