set ENABLE_EXPR_REWRITES = 0及GROUP BY expression must not contain aggregate functions等问题解决

概述

官网:

The ENABLE_EXPR_REWRITES query option controls whether to enable or disable the query compile time optimizations. These optimizations rewrite the expression trees to a more compact and optimized form that helps avoid redundant expression evaluation at run time. Performance optimizations controlled by this query option include:

  • Constant folding (added in )
  • Extracting common conjuncts from disjunctions (added in )
  • Simplify conditionals with constant conditions (added in )

自Impala 2.8(CDH 5.10)版本引入,默认为1,即true。设置为0,即false,可以禁用性能优化。

问题

注:Java代码通过hive-jdbc提交SQL到CDH集群:

<dependency>
	<groupId>org.apache.hive</groupId>
	<artifactId>hive-jdbc</artifactId>
	<version>2.1.1-cdh6.2.1</version>
</dependency>

AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?)

报错SQL有点复杂,精简失败。在SQL可视化查询验证平台执行也是报错:
在这里插入图片描述
解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;

AnalysisException: GROUP BY expression must not contain aggregate functions: cnt

报错SQL如下:

select concat(cast(cnt as string),'个城市使用') city_cnt,
cnt city_num,level1_name name,level2_name,count(1) cnt
from (
select commodity_id,count(1) cnt
from edw.dim_gy_area_commodity_chn
where tag in (0) and isactive = 1 and status = 0
and price < 10000 and is_valid = 1
group by 1 ) a
join edw.dim_gy_commodity_chn cc on a.commodity_id = cc.id and is_valid = 1
join edw.dim_gy_commodity_category ct on cc.category_code = ct.level3_code
group by 1,2,3,4

之前是执行成功的,在CDH集群升级到7.1.7版本后执行失败。

另外,这个SQL在公司内部的SQL执行平台(adhoc platform)是执行成功:
在这里插入图片描述
hive查询引擎是通过beeline,impala查询引擎通过impala-shell。程序代码是通过hive-jdbc提交SQL到大数据平台,版本号见文首。

解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;

经过验证,升级hive-jdbc版本到3.1.3000.7.1.7.67-1,不能解决问题。

IllegalStateException: null

在公司adhoc查询平台也是执行失败:
在这里插入图片描述
经过简化的SQL如下:

drop table if exists project_b.insure_channel_campaign0 ;
create table project_b.insure_channel_campaign0 as
select
      2 as campaign_link,
      0 as is_pay
from project_b.insure_sem_channel_cost a
left join project_b.insure_order b 
on a.dte = b.insert_date and b.channel_name = 'SEM' and b.campaign_name regexp '直加' and b.insert_date >= '2021-10-01'
where a.dte between '2021-10-01' and '2021-10-31'
group by 1,2
union all
select
      2 as campaign_link,
      0 as is_pay
from project_b.insure_sem_channel_cost a
left join project_b.insure_order b 
on a.dte = b.insert_date and b.channel_name = 'SEM' and (b.campaign_name regexp '年金|开门红' or b.utm_campaign regexp '开门红') and b.insert_date >= '2021-10-01'
where a.dte between '2021-10-01' and '2021-12-31'
group by 1,2
;

另外把with子句抽取出来:

with core as (
-- 省略中间若干
)
where cnt >= 20

又报错:AnalysisException: GROUP BY: ordinal must be >= 1: -1

很诡异的是,后面执行这个SQL又没有报错!!!

不算结论的结论:一般地,group by 1,2,3,遇到columns里面有常量数字的,就可能会报错。

解决方案:在SQL第一行增加:set ENABLE_EXPR_REWRITES = 0;

但是!!!有些用户写的SQL,通过设置set ENABLE_EXPR_REWRITES = 0;,不一定可以解决问题:
在这里插入图片描述
解决方案:
联系用户手动修改其SQL:
在这里插入图片描述
具体而言 :

cast(2 as int)  as campaign_link,
cast(0 as int) as  is_pay

AnalysisException: GROUP BY: ordinal must be >= 1: 0

也是在旧版平台可以执行,升级CDH版本后不能执行的问题。SQL如下:

select a.dt
       ,0 as fin_guarantee_id
from fin.fin_di_warranty a
where substr(a.dt,1,7) = substr(to_date(date_sub(now(),1)),1,7)
group by 1,2;

解决方法:前置增加set ENABLE_EXPR_REWRITES = 0;

参考

posted @   johnny233  阅读(135)  评论(0编辑  收藏  举报  
相关博文:
阅读排行:
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
· 园子的第一款AI主题卫衣上架——"HELLO! HOW CAN I ASSIST YOU TODAY
· 【自荐】一款简洁、开源的在线白板工具 Drawnix
点击右上角即可分享
微信分享提示