代码改变世界

如果只能选择优化一条查询,应该优化哪条?

2022-12-20 14:26  abce  阅读(65)  评论(0编辑  收藏  举报

如果只能选择优化一条查询,应该优化哪条?
这个问题很难回答,可能最佳的回答就是"看情况"。
但是,让我们来尝试给出问题的最好的答案。也许只是解决一条sql并不能解决问题,同时可能会找出更多的问题语句。

第一候选项是尝试优化解决执行时间(延迟)最大的查询。

看看sys schema都给我们提供了哪些工具表

root@localhost abce>use sys
Database changed
root@localhost sys>show tables like 'statements_with%';
+---------------------------------------------+
| Tables_in_sys (statements_with%)            |
+---------------------------------------------+
| statements_with_errors_or_warnings          |
| statements_with_full_table_scans            |
| statements_with_runtimes_in_95th_percentile |
| statements_with_sorting                     |
| statements_with_temp_tables                 |
+---------------------------------------------+
5 rows in set (0.00 sec)

root@localhost sys>

我们会使用到statements_with_runtimes_in_95th_percentile,为了能按照我们想要的方式执行排序,我会使用带有原始数据(不是人类可读的格式)的视图版本x$statements_with_runtimes_in_95th_percentile。(x$statements_with_runtimes_in_95th_percentile和statements_with_runtimes_in_95th_percentile中显示的单位不一样)

SELECT schema_name, 
       format_time(total_latency) tot_lat,   
       exec_count, 
	   format_time(total_latency/exec_count) latency_per_call, 
       query_sample_text 
  FROM sys.x$statements_with_runtimes_in_95th_percentile AS t1
  JOIN performance_schema.events_statements_summary_by_digest AS t2 
    ON t2.digest=t1.digest 
 WHERE schema_name NOT in ('performance_schema', 'sys') 
ORDER BY (total_latency/exec_count) desc LIMIT 1\G
*************************** 1. row ***************************
      schema_name: library
          tot_lat: 857.29 ms
       exec_count: 1
 latency_per_call: 857.29 ms
query_sample_text: INSERT INTO `books` (`doc`) VALUES ('{\"_id\": \"00005d44289d000000000000007d\", \"title\": \"lucky luke, tome 27 : l alibi\", \"isbn10\": \"2884710086\", \"isbn13\": \"978-2884710084\", \"langue\": \"français\", \"relié\": \"48 pages\", \"authors\": [\"Guylouis (Auteur)\", \"Morris (Illustrations)\"], \"editeur\": \"lucky comics (21 décembre 1999)\", \"collection\": \"lucky luke\", \"couverture\": \" ...
1 row in set (0.2838 sec)

这语句比较复杂,有点优化的难度,因为只是一个简单的insert语句,而且只是运行一次。插入慢可能是因为磁盘响应时间、索引太多等。

这也是我推荐你再看看以下两个表的原因:

·sys.schema_redundant_indexes
·sys.schema_unused_indexes

 

其次,尝试优化解决执行全表扫描的查询。

SELECT schema_name, 
       sum_rows_examined, 
	   (sum_rows_examined/exec_count) avg_rows_call,
       format_time(total_latency) tot_lat, exec_count,
       format_time(total_latency/exec_count) AS latency_per_call,
       query_sample_text 
  FROM sys.x$statements_with_full_table_scans AS t1
  JOIN performance_schema.events_statements_summary_by_digest AS t2 
    ON t2.digest=t1.digest 
 WHERE schema_name NOT in ('performance_schema', 'sys') 
ORDER BY (total_latency/exec_count) desc LIMIT 1\G
*************************** 1. row ***************************
       schema_name: wp_lefred
 sum_rows_examined: 268075
     avg_rows_call: 3277.0419
           tot_lat: 31.31 s
        exec_count: 124
  latency_per_call: 252.47 ms
 query_sample_text: SELECT count(*) as mytotal
                 FROM wp_posts
                 WHERE (post_content LIKE '%youtube.com/%' 
                   OR post_content LIKE '%youtu.be/%')
                 AND post_status = 'publish'
 1 row in set (0.0264 sec)

这个查询执行了124次,总耗时31.31s。平均每次调用耗费252.47ms。


第三,尝试优化解决用到temp表的查询。
创建临时表也是优化的目标

SELECT schema_name, 
       format_time(total_latency) tot_lat, 
	   exec_count, 
       format_time(total_latency/exec_count) latency_per_call, 
	   query_sample_text 
  FROM sys.x$statements_with_temp_tables AS t1
  JOIN  performance_schema.events_statements_summary_by_digest AS t2
    ON t2.digest=t1.digest 
 WHERE schema_name NOT in ('performance_schema', 'sys') AND disk_tmp_tables=1
ORDER BY 2 desc,(total_latency/exec_count) desc LIMIT 1\G

这里没有找到使用temp表的查询。