ClickHouse 语法优化细节(二)

5、聚合计算外推

聚合函数内的计算,会外推,例如:
EXPLAIN SYNTAX
SELECT sum(UserID * 2)
FROM datasets.visits_v1;
//优化后效果
SELECT sum(UserID) * 2
FROM datasets.visits_v1

6 聚合函数消除

如果对聚合键,也就是 group by key 使用 min、max、any 聚合函数,则将函数消除,例如: 
EXPLAIN SYNTAX
SELECT sum(UserID * 2),
       max(VisitID),
       max(UserID)
FROM datasets.visits_v1
GROUP BY UserID;
//返回优化后的语句
SELECT sum(UserID) * 2,
       max(VisitID),
       UserID
FROM datasets.visits_v1
GROUP BY UserID

7、删除重复的 order by key

EXPLAIN SYNTAX
SELECT *
FROM datasets.visits_v1
ORDER BY UserID ASC,
         UserID ASC,
         VisitID ASC,
         VisitID ASC;
//返回优化后的语句:
select
    ……
FROM visits_v1
ORDER BY
    UserID ASC,
    VisitID ASC

8、删除重复的 limit by key

例如下面的语句,重复声明的 name 字段会被去重
EXPLAIN SYNTAX
SELECT *
FROM datasets.visits_v1
LIMIT 3 BY
    VisitID,
    VisitID
LIMIT 10
//返回优化后的语句:
select
   ……
FROM datasets.visits_v1
LIMIT 3 BY VisitID
LIMIT 10

9、删除重复的 USING Key

例如下面的语句,重复的关联键 id 字段会被去重:
EXPLAIN SYNTAX
SELECT
    a.UserID,
    a.UserID,
    b.VisitID,
    a.URL,
    b.UserID
FROM datasets.hits_v1 AS a
LEFT JOIN datasets.visits_v1 AS b USING (UserID, UserID)
//返回优化后的语句:
SELECT 
    UserID,
    UserID,
    VisitID,
    URL,
    b.UserID
FROM datasets.hits_v1 AS a
ALL LEFT JOIN datasets.visits_v1 AS b USING (UserID)

10、标量替换

如果子查询只返回一行数据,在被引用的时候用标量替换,例如下面语句中的 total_disk_usage 字段
EXPLAIN SYNTAX
WITH
    (
        SELECT sum(bytes)
        FROM system.parts
        WHERE active
    ) AS total_disk_usage
SELECT
    (sum(bytes) / total_disk_usage) * 100 AS table_disk_usage,
    table
FROM system.parts
GROUP BY table
ORDER BY table_disk_usage DESC
LIMIT 10//返回优化后的语句:
WITH CAST(0, \'UInt64\') AS total_disk_usage
SELECT
    (sum(bytes) / total_disk_usage) * 100 AS table_disk_usage,
    table
FROM system.parts
GROUP BY table
ORDER BY table_disk_usage DESC
LIMIT 10

11 三元运算优化

如果开启了 optimize_if_chain_to_multiif 参数,三元运算符会被替换成 multiIf 函数,例如
EXPLAIN SYNTAX
SELECT number = 1 ? 'hello' : (number = 2 ? 'world' : 'atguigu')
FROM numbers(10)
settings optimize_if_chain_to_multiif = 1//返回优化后的语句:
SELECT multiIf(number = 1, \'hello\', number = 2, \'world\', \'atguigu\')
FROM numbers(10)
SETTINGS optimize_if_chain_to_multiif = 1;
posted @ 2022-11-22 06:54  晓枫的春天  阅读(128)  评论(0编辑  收藏  举报