clickhouse数据库SQL关联查询优化一例

【背景说明】

业务反馈业务查询clickhouse报表视图报错:DB::Exception: Memory limit (total) exceeded: 12G

根据这个报错,业务SQL查询内存超过了限制,增加内存是个方法,如果多个会话查询占用内存的话还是会报错,同时业务查询也是超时,看看SQL层面能否优化

查看到业务多个SQL分组查询进行union all,并且格式都是差不多,以下是其中的一个,SQL部分屏蔽一些内容,使用客户端查询6分钟以上

select 
COUNT(DISTINCT v.INPATIENT_BO) BWZRC
,toYear(v.CURR_DATE) year1
,toMonth(v.CURR_DATE) mounth1
FROM 
DWD.DWD_INPATIENT_xxx v,DWD.DWD_ITEM_xxx dis 
where  
 v.INPATIENT_BO=dis.VISIT_ID
 and v.PAT_FLAG='0'
and dis.CHARGE_ITEM_ID IN('99xxx','99xxx')
and toYear(v.CURR_DATE) in ('2024') and toMonth(v.CURR_DATE)+1 in ('12')
group by toYear(v.CURR_DATE),toMonth(v.CURR_DATE);

查看SQL的话,一般关联查询跟子查询是可以互相转换的,并且关联查询写法也修改

这个SQL中由于dis表数据过大,可以先查询后作为一个表来进行查询,效果明显,改写后如下:

SELECT
    COUNT(DISTINCT v.INPATIENT_BO) AS BWZRC, 
    toYear(v.CURR_DATE) AS year1,
    toMonth(v.CURR_DATE) AS mounth1
FROM 
DWD.DWD_INPATIENT_xxx v,(SELECT VISIT_ID FROM DWD.DWD_ITEM_xxx  WHERE CHARGE_ITEM_ID IN ('99xxx', '99xxx'))dis
WHERE
v.INPATIENT_BO=dis.VISIT_ID
and v.PAT_FLAG='0'
and toYear(v.CURR_DATE) in ('2024') and toMonth(v.CURR_DATE)+1 in ('12')
group by toYear(v.CURR_DATE),toMonth(v.CURR_DATE);

改为子查询方式,也是非常快,毫秒级别

SELECT
    COUNT(DISTINCT v.INPATIENT_BO) AS BWZRC,
    toYear(v.CURR_DATE) AS year1,
    toMonth(v.CURR_DATE) AS mounth1
FROM 
DWD.DWD_INPATIENT_xxx v
WHERE  
v.PAT_FLAG = '0'
AND v.INPATIENT_BO IN (
    SELECT dis.VISIT_ID
    FROM DWD.DWD_ITEM_xxx dis
    WHERE dis.CHARGE_ITEM_ID IN ('99xxx', '99xxx')
)
AND toYear(v.CURR_DATE) = '2024'
AND toMonth(v.CURR_DATE) = '12'
GROUP BY toYear(v.CURR_DATE), toMonth(v.CURR_DATE);
posted @ 2024-12-13 16:54  zetan·chen  阅读(11)  评论(0编辑  收藏  举报