ClickHouse 语法优化细节(二)
5、聚合计算外推
聚合函数内的计算,会外推,例如:
EXPLAIN SYNTAX SELECT sum(UserID * 2) FROM datasets.visits_v1; //优化后效果 SELECT sum(UserID) * 2 FROM datasets.visits_v1
6 聚合函数消除
如果对聚合键,也就是 group by key 使用 min、max、any 聚合函数,则将函数消除,例如:
EXPLAIN SYNTAX SELECT sum(UserID * 2), max(VisitID), max(UserID) FROM datasets.visits_v1 GROUP BY UserID; //返回优化后的语句 SELECT sum(UserID) * 2, max(VisitID), UserID FROM datasets.visits_v1 GROUP BY UserID
7、删除重复的 order by key
EXPLAIN SYNTAX SELECT * FROM datasets.visits_v1 ORDER BY UserID ASC, UserID ASC, VisitID ASC, VisitID ASC; //返回优化后的语句: select …… FROM visits_v1 ORDER BY UserID ASC, VisitID ASC
8、删除重复的 limit by key
例如下面的语句,重复声明的 name 字段会被去重
EXPLAIN SYNTAX SELECT * FROM datasets.visits_v1 LIMIT 3 BY VisitID, VisitID LIMIT 10 //返回优化后的语句: select …… FROM datasets.visits_v1 LIMIT 3 BY VisitID LIMIT 10
9、删除重复的 USING Key
例如下面的语句,重复的关联键 id 字段会被去重:
EXPLAIN SYNTAX SELECT a.UserID, a.UserID, b.VisitID, a.URL, b.UserID FROM datasets.hits_v1 AS a LEFT JOIN datasets.visits_v1 AS b USING (UserID, UserID) //返回优化后的语句: SELECT UserID, UserID, VisitID, URL, b.UserID FROM datasets.hits_v1 AS a ALL LEFT JOIN datasets.visits_v1 AS b USING (UserID)
10、标量替换
如果子查询只返回一行数据,在被引用的时候用标量替换,例如下面语句中的 total_disk_usage 字段
EXPLAIN SYNTAX WITH ( SELECT sum(bytes) FROM system.parts WHERE active ) AS total_disk_usage SELECT (sum(bytes) / total_disk_usage) * 100 AS table_disk_usage, table FROM system.parts GROUP BY table ORDER BY table_disk_usage DESC LIMIT 10; //返回优化后的语句: WITH CAST(0, \'UInt64\') AS total_disk_usage SELECT (sum(bytes) / total_disk_usage) * 100 AS table_disk_usage, table FROM system.parts GROUP BY table ORDER BY table_disk_usage DESC LIMIT 10
11 三元运算优化
如果开启了 optimize_if_chain_to_multiif 参数,三元运算符会被替换成 multiIf 函数,例如
EXPLAIN SYNTAX SELECT number = 1 ? 'hello' : (number = 2 ? 'world' : 'atguigu') FROM numbers(10) settings optimize_if_chain_to_multiif = 1; //返回优化后的语句: SELECT multiIf(number = 1, \'hello\', number = 2, \'world\', \'atguigu\') FROM numbers(10) SETTINGS optimize_if_chain_to_multiif = 1;