mysql查询INFORMATION_SCHEMA表很慢的性能优化

最近发现，我们有些环境的tomcat应用启动非常缓慢，大部分在3-5分钟，有个测试环境更加阶段，要十几分钟才能启动完成。经过仔细分析，是一个查询INFORMATION_SCHEMA库中数据字典信息的查询异常缓慢，该语句如下：

SELECT
    c.COLUMN_NAME,
    c.TABLE_NAME
FROM
    information_schema.TABLE_CONSTRAINTS AS t,
    information_schema.KEY_COLUMN_USAGE AS c
WHERE
    t.TABLE_NAME = c.TABLE_NAME
AND t.TABLE_SCHEMA = c.CONSTRAINT_SCHEMA
AND t.CONSTRAINT_SCHEMA = 'hs_tatrade2'
AND t.CONSTRAINT_TYPE = 'PRIMARY KEY'

以前从来都没遇到这种问题，也很少关心mysql数据字典查询的性能问题，因为几乎没有遇到过。

查看show processlist，一直在after opening table等待。。。。

看了下执行计划以及information_schema中表结构的定义，因为都是内存表，都没有索引，这两张表都只有数百条记录，按说即使没有索引也不会这么慢。。。

经网上搜寻，有人有不少帖子提及是因为innodb_stats_on_metadata=ON导致查询information_schema时更新统计信息的原因。经测试，不是这个原因（其实，我现在相信网上80%以上的所谓分析帖子都是理论上的测试，并不是真正遇到，尤其是所谓的很多专家）。

再次寻找到mysql官方文档，https://dev.mysql.com/doc/refman/5.7/en/information-schema-optimization.html，如下：

8.2.3 Optimizing INFORMATION_SCHEMA Queries

1) Try to use constant lookup values for database and table names in the WHERE clause

You can take advantage of this principle as follows:

To look up databases or tables, use expressions that evaluate to a constant, such as literal values, functions that return a constant, or scalar subqueries.
Avoid queries that use a nonconstant database name lookup value (or no lookup value) because they require a scan of the data directory to find matching database directory names.
Within a database, avoid queries that use a nonconstant table name lookup value (or no lookup value) because they require a scan of the database directory to find matching table files.

This principle applies to the INFORMATION_SCHEMA tables shown in the following table, which shows the columns for which a constant lookup value enables the server to avoid a directory scan. For example, if you are selecting from TABLES, using a constant lookup value for TABLE_SCHEMA in the WHERE clause enables a data directory scan to be avoided.

Table	Column to specify to avoid data directory scan	Column to specify to avoid database directory scan
`COLUMNS`	`TABLE_SCHEMA`	`TABLE_NAME`
`KEY_COLUMN_USAGE`	`TABLE_SCHEMA`	`TABLE_NAME`
`PARTITIONS`	`TABLE_SCHEMA`	`TABLE_NAME`
`REFERENTIAL_CONSTRAINTS`	`CONSTRAINT_SCHEMA`	`TABLE_NAME`
`STATISTICS`	`TABLE_SCHEMA`	`TABLE_NAME`
`TABLES`	`TABLE_SCHEMA`	`TABLE_NAME`
`TABLE_CONSTRAINTS`	`TABLE_SCHEMA`	`TABLE_NAME`
`TRIGGERS`	`EVENT_OBJECT_SCHEMA`	`EVENT_OBJECT_TABLE`
`VIEWS`	`TABLE_SCHEMA`	`TABLE_NAME`

意思就是查询上述这些表的时候，务必带上TABLE_SCHEMA=或者XXX_SCHEMA，以避免数据目录扫描。而我们的场景刚好是TABLE_CONSTRAINTS没有使用TABLE_SCHEMA，虽然关联使用了TABLE_SCHEMA，但这是没有用的。

经过将SQL更改为如下：

SELECT
    c.COLUMN_NAME,
    c.TABLE_NAME
FROM
    information_schema.TABLE_CONSTRAINTS AS t,
    information_schema.KEY_COLUMN_USAGE AS c
WHERE
    t.TABLE_NAME = c.TABLE_NAME
AND t.TABLE_SCHEMA = c.CONSTRAINT_SCHEMA
AND t.TABLE_SCHEMA = 'hs_tatrade2'
AND c.TABLE_SCHEMA = 'hs_tatrade2'
AND t.CONSTRAINT_TYPE = 'PRIMARY KEY'

瞬间就飞快了。

posted @ 2018-07-12 22:17 zhjh256 阅读(4555) 评论(0) 收藏举报

刷新页面返回顶部

mysql查询INFORMATION_SCHEMA表很慢的性能优化

8.2.3 Optimizing INFORMATION_SCHEMA Queries

公告