An additional MR job is introduced since the cardinality of grouping sets is more than hive.new.job.grouping.set.cardinality. This functionality is not supported with distincts. Either set hive.new.job.grouping.set.cardinality to a high number (higher than the number of rows per input row due to grouping sets in the query), or rewrite the query to not use distincts. The number of rows per input row due to grouping sets is 32

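Hive reports the error above when a query uses DISTINCT aggregates together with grouping sets and the number of grouping sets exceeds hive.new.job.grouping.set.cardinality (30 by default). In practice that means five or more dimensions: expanding every combination of five dimensions already gives 2^5 = 32 grouping sets. There are two ways to fix it: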

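1. Put set hive.new.job.grouping.set.cardinality=xx; in front of your HQL statement, where xx is larger than the number of grouping sets. (In my case there were 5 dimensions and 32 grouping sets in total, so I set xx to 64.)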

2. Deduplicate with GROUP BY in a subquery, so that the aggregation itself no longer needs DISTINCT.
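
A rough sketch of both fixes follows. The table `events` and the columns `a`, `b` and `user_id` are made up for illustration, and only two dimensions are used to keep it short; the pattern is the same with five. One thing to watch with the second fix: deduplicating only at the finest grain is not enough, because a user who appears under several values of `b` would be counted more than once in the rolled-up rows for `a`. The sketch therefore expands the grouping sets inside the subquery, with `user_id` added to every set, and carries `GROUPING__ID` out so that rolled-up NULLs are not confused with real NULLs.

```sql
-- Hypothetical table and column names, for illustration only.

-- Fix 1: raise the threshold above the number of grouping sets,
-- then run the original query unchanged.
SET hive.new.job.grouping.set.cardinality=64;

SELECT a, b, COUNT(DISTINCT user_id) AS uv
FROM events
GROUP BY a, b
GROUPING SETS ((a, b), (a), (b), ());

-- Fix 2: no DISTINCT in the aggregation.
-- The inner query emits one row per (grouping set, user_id), which is the dedup step.
-- The outer query counts those rows; COUNT(user_id) skips NULL user_id values,
-- just as COUNT(DISTINCT user_id) does.
SELECT a, b, gid, COUNT(user_id) AS uv
FROM (
  SELECT a, b, user_id, GROUPING__ID AS gid
  FROM events
  GROUP BY a, b, user_id
  GROUPING SETS ((a, b, user_id), (a, user_id), (b, user_id), (user_id))
) t
GROUP BY a, b, gid;
```

The first fix is a one-line change but keeps everything in a single job, so each input row is still multiplied by the number of grouping sets before aggregation. The second is more to write, but since it has no DISTINCT, Hive is free to add the extra MR job that the error message refers to.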


posted on 2018-04-02 19:20  Deep-thinker