hive 获取 json 中的 key
需求
有一下 json 数组,要求获取其 key
[{"title":"笑傲江湖","author":"金庸"},{"title":"流星蝴蝶剑","author":"古龙"}]
需求实现
1、首先将 json 数组转换为 字符串 ;这里使用 字符串替换函数 translate
select translate('[{"title":"笑傲江湖","author":"金庸"},{"title":"流星蝴蝶剑","author":"古龙"}]', '[]{}""', '') json_str;
结果1
title:笑傲江湖,author:金庸,title:流星蝴蝶剑,author:古龙
2、使用 regexp_replace 将中间的 , 替换成 ;
select regexp_replace(translate('[{"title":"笑傲江湖","author":"金庸"},{"title":"流星蝴蝶剑","author":"古龙"}]', '[]{}""', ''), '\,', '\:') json_str;
结果2
title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙
3、使用 lateral view posexplode 炸开结果
select * from (select regexp_replace(translate('[{"title":"笑傲江湖","author":"金庸"},{"title":"流星蝴蝶剑","author":"古龙"}]', '[]{}""', ''), '\,', '\:') json_str) t lateral view posexplode(split(json_str, ':')) t1 as rn, value;
结果3
title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 0 title title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 1 笑傲江湖 title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 2 author title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 3 金庸 title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 4 title title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 5 流星蝴蝶剑 title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 6 author title:笑傲江湖:author:金庸:title:流星蝴蝶剑:author:古龙 7 古龙
4、对以上结果进行过滤
select rn, value from (select * from (select regexp_replace( translate('[{"title":"笑傲江湖","author":"金庸"},{"title":"流星蝴蝶剑","author":"古龙"}]', '[]{}""', ''), '\,', '\:') json_str) t lateral view posexplode(split(json_str, ':')) t1 as rn, value) t where rn % 2 = 0;
最终结果
0 title 2 author 4 title 6 author
另一种实现方式,使用两层translate 替换后侧写,参考实现
select value from (select translate(info, ',', ':') info from (select translate('[{"title":"笑傲江湖","author":"金庸"},{"title":"流星蝴蝶剑","author":"古龙"}]', '{}[]"', '') info) t) t lateral view posexplode(split(info, ':')) t as rn, value where rn % 2 = 0 group by value;