hivesql和prestosql对比
背景
最近很多时候需要将hivesql转化为prestosql ,这里面有很多不能直接复用需要调整func甚至改用其他逻辑。
为了后续方便查询,后面将总结以下经常用到的sql记录下来方便后续使用。
- 爆炸函数实现
hive:SELECT student,score FROM tests LATERAL VIEW explode(scores)t AS score presto:SELECT student,score FROM tests cross join unnest(scores)ast (score)
- map查询
presto:element_at(a,'aa') hive: a['aa']
- json解析
presto: json_extract_scalar(param, '$.tab') hive: get_json_object(param, '$.tab')
- grouping sets
hive:group by date,name grouping sets((date),(date,name),(name)) presto :group by grouping sets((date),(date,name),(name))
- 行转列
--presto select array_join(array_distinct(array_agg(name)),',') --hive select concat_ws(',',collect_set(cast(name as string)))
- 时间差计算
Presto:select date_diff('day', cast('2020-07-23 15:01:13' as timestamp), cast('2020-07-24 11:42:58' as timestamp)) Hive:select datediff('2020-07-24 11:42:58','2020-07-23 15:01:13');
- 同比环比
--presto 同环比时间获取公式 SELECT CAST('2020-12-12' AS TIMESTAMP) AS "当天", date_add('day', - 1, CAST('2020-12-12' AS TIMESTAMP)) AS "昨天", date_trunc('month', CAST('2020-12-12' AS TIMESTAMP)) AS "当月第一天", date_add('month', - 12, date_trunc('month', CAST('2020-12-12' AS TIMESTAMP))) AS "去年当月第一天", date_add('month', - 1, date_trunc('month', CAST('2020-12-12' AS TIMESTAMP))) AS "上第一天", date_add('month', - 12, CAST('2020-12-12' AS TIMESTAMP)) AS "去年当月当天", date_add('day',(DAY(CAST('2020-12-12' AS TIMESTAMP))-1),date_add('month', - 1, date_trunc('month', CAST('2020-12-12' AS TIMESTAMP)))) as "上月第一天至环比相同天数" --hive SELECT '2020-12-12' AS "当天", date_add('2020-12-12' , - 1) AS "昨天", TRUNC('2020-12-12', 'MM') AS "当月第一天", add_months(TRUNC('2020-12-12', 'MM'), - 12) AS "去年当月第一天", add_months(TRUNC('2020-12-12', 'MM'), - 1) AS "上月第一天", add_months('2020-12-12', - 12) AS "去年当月当天", date_add(add_months(TRUNC('2020-12-12', 'MM'), - 1),day('2020-12-12')-1) as "上月第一天至环比相同天数"
待续...