set ENABLE_EXPR_REWRITES = 0及GROUP BY expression must not contain aggregate functions等问题解决
概述
官网:
The ENABLE_EXPR_REWRITES query option controls whether to enable or disable the query compile time optimizations. These optimizations rewrite the expression trees to a more compact and optimized form that helps avoid redundant expression evaluation at run time. Performance optimizations controlled by this query option include:
- Constant folding (added in )
- Extracting common conjuncts from disjunctions (added in )
- Simplify conditionals with constant conditions (added in )
自Impala 2.8(CDH 5.10)版本引入,默认为1,即true。设置为0,即false,可以禁用性能优化。
问题
注:Java代码通过hive-jdbc提交SQL到CDH集群:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>2.1.1-cdh6.2.1</version>
</dependency>
AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?)
报错SQL有点复杂,精简失败。在SQL可视化查询验证平台执行也是报错:
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;
AnalysisException: GROUP BY expression must not contain aggregate functions: cnt
报错SQL如下:
select concat(cast(cnt as string),'个城市使用') city_cnt,
cnt city_num,level1_name name,level2_name,count(1) cnt
from (
select commodity_id,count(1) cnt
from edw.dim_gy_area_commodity_chn
where tag in (0) and isactive = 1 and status = 0
and price < 10000 and is_valid = 1
group by 1 ) a
join edw.dim_gy_commodity_chn cc on a.commodity_id = cc.id and is_valid = 1
join edw.dim_gy_commodity_category ct on cc.category_code = ct.level3_code
group by 1,2,3,4
之前是执行成功的,在CDH集群升级到7.1.7版本后执行失败。
另外,这个SQL在公司内部的SQL执行平台(adhoc platform)是执行成功:
hive查询引擎是通过beeline
,impala查询引擎通过impala-shell
。程序代码是通过hive-jdbc提交SQL到大数据平台,版本号见文首。
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;
经过验证,升级hive-jdbc
版本到3.1.3000.7.1.7.67-1
,不能解决问题。
IllegalStateException: null
在公司adhoc查询平台也是执行失败:
经过简化的SQL如下:
drop table if exists project_b.insure_channel_campaign0 ;
create table project_b.insure_channel_campaign0 as
select
2 as campaign_link,
0 as is_pay
from project_b.insure_sem_channel_cost a
left join project_b.insure_order b
on a.dte = b.insert_date and b.channel_name = 'SEM' and b.campaign_name regexp '直加' and b.insert_date >= '2021-10-01'
where a.dte between '2021-10-01' and '2021-10-31'
group by 1,2
union all
select
2 as campaign_link,
0 as is_pay
from project_b.insure_sem_channel_cost a
left join project_b.insure_order b
on a.dte = b.insert_date and b.channel_name = 'SEM' and (b.campaign_name regexp '年金|开门红' or b.utm_campaign regexp '开门红') and b.insert_date >= '2021-10-01'
where a.dte between '2021-10-01' and '2021-12-31'
group by 1,2
;
另外把with子句抽取出来:
with core as (
-- 省略中间若干
)
where cnt >= 20
又报错:AnalysisException: GROUP BY: ordinal must be >= 1: -1
很诡异的是,后面执行这个SQL又没有报错!!!
不算结论的结论:一般地,group by 1,2,3,
遇到columns里面有常量数字的,就可能会报错。
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;
但是!!!有些用户写的SQL,通过设置set ENABLE_EXPR_REWRITES = 0;
,不一定可以解决问题:
解决方案:
联系用户手动修改其SQL:
具体而言 :
cast(2 as int) as campaign_link,
cast(0 as int) as is_pay
AnalysisException: GROUP BY: ordinal must be >= 1: 0
也是在旧版平台可以执行,升级CDH版本后不能执行的问题。SQL如下:
select a.dt
,0 as fin_guarantee_id
from fin.fin_di_warranty a
where substr(a.dt,1,7) = substr(to_date(date_sub(now(),1)),1,7)
group by 1,2;
解决方法:前置增加set ENABLE_EXPR_REWRITES = 0;
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
· 园子的第一款AI主题卫衣上架——"HELLO! HOW CAN I ASSIST YOU TODAY
· 【自荐】一款简洁、开源的在线白板工具 Drawnix