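Migrating Hive SQL to Spark 2.4 fails with: Error in query: Window function row_number() requires window to be ordered, please add ORDER BY clause
Hive SQL is lenient about syntax checking.
For example, Hive accepts the following expression, in which PARTITION BY is not followed by ORDER BY: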
row_number() over(partition by buvid,version_code,app_id) as rn
For the reason, see the Hive source code (Hive fills in the missing ordering itself).
In Spark, the relevant code is:
/**
 * Check and add order to [[AggregateWindowFunction]]s.
 */
object ResolveWindowOrder extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
    case WindowExpression(wf: WindowFunction, spec) if spec.orderSpec.isEmpty =>
      failAnalysis(s"Window function $wf requires window to be ordered, please add ORDER BY " +
        s"clause. For example SELECT $wf(value_expr) OVER (PARTITION BY window_partition " +
        s"ORDER BY window_ordering) from table")
    case WindowExpression(rank: RankLike, spec) if spec.resolved =>
      val order = spec.orderSpec.map(_.child)
      WindowExpression(rank.withOrder(order), spec)
  }
}
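Here it is enough to comment out the first case and rebuild Spark, provided the ordering is not important to you and you do not care in which order rows within a partition are numbered: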
/**
 * Check and add order to [[AggregateWindowFunction]]s.
 */
object ResolveWindowOrder extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
    // case WindowExpression(wf: WindowFunction, spec) if spec.orderSpec.isEmpty =>
    //   failAnalysis(s"Window function $wf requires window to be ordered, please add ORDER BY " +
    //     s"clause. For example SELECT $wf(value_expr) OVER (PARTITION BY window_partition " +
    //     s"ORDER BY window_ordering) from table")
    case WindowExpression(rank: RankLike, spec) if spec.resolved =>
      val order = spec.orderSpec.map(_.child)
      WindowExpression(rank.withOrder(order), spec)
  }
}
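If patching and rebuilding Spark is not an option, the error message itself points at the other route: add an ORDER BY to the window. The following is only a minimal sketch, assuming a local SparkSession and a hypothetical temporary view named events with made-up rows; only the three column names come from the query above. Ordering by one of the partition columns keeps the numbering within each partition arbitrary, which is the same trade-off as commenting out the check.

```scala
import org.apache.spark.sql.SparkSession

object RowNumberWithOrder {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use its own configuration.
    val spark = SparkSession.builder()
      .appName("row-number-with-order")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; "events" is an assumed view name, not from the original post.
    Seq(
      ("u1", 100, 1),
      ("u1", 100, 1),
      ("u2", 200, 1)
    ).toDF("buvid", "version_code", "app_id").createOrReplaceTempView("events")

    // Same window as the Hive query, plus an ORDER BY so the analyzer check passes.
    // Ordering by a partition column does not impose any meaningful order inside a partition.
    val numbered = spark.sql(
      """SELECT buvid, version_code, app_id,
        |       row_number() OVER (PARTITION BY buvid, version_code, app_id
        |                          ORDER BY buvid) AS rn
        |FROM events""".stripMargin)

    numbered.show()
    spark.stop()
  }
}
```

If the row numbers need to be reproducible between runs, order by a column that actually distinguishes the rows within a partition instead.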