|NO.Z.00014|——————————|BigDataEnd|——|Hadoop&Hive.V14|——|Hive.v14|Hive函数用法.v02|
一、UDTF函数【重要】
### --- UDTF函数【重要】
~~~ # UDTF :
~~~ User Defined Table-Generating Functions。
~~~ 用户定义表生成函数,一行输入,多行输出。
### --- explode,炸裂函数
~~~ 就是将一行中复杂的 array 或者 map 结构拆分成多行
hive (mydb)> select explode(array('A','B','C')) as col;
col
A
B
C
hive (mydb)> select explode(map('a', 8, 'b', 88, 'c', 888));
key value
a 8
b 88
c 888
~~~ UDTF's are not supported outside the SELECT clause, nornested in expressions
~~~ SELECT pageid, explode(adid_list) AS myCol... is not supported
~~~ SELECT explode(explode(adid_list)) AS myCol... is not supported
~~~ lateral view 常与 表生成函数explode结合使用
~~~ lateral view 语法:
lateralView: LATERAL VIEW udtf(expression) tableAlias AS
columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*
~~~ # lateral view 的基本使用
hive (mydb)> with t1 as (
select 'OK' cola, split('www.yanqi.com', '\\.') colb
)
select cola, colc
from t1
lateral view explode(colb) t2 as colc;
--输出参数
cola colc
OK www
OK yanqi
OK com
二、UDTF 案例1:
~~~ # 数据(uid tags):
[root@linux123 ~]# vim /home/hadoop/data/tab1.txt
1 1,2,3
2 2,3
3 1,2
--编写sql,实现如下结果:
1 1
1 2
1 3
2 2
2 3
3 1
3 2
~~~ # 建表加载数据
hive (mydb)> create table market(
id int,
storage string,
allocation string,
outdt string
)
row format delimited fields terminated by '\t';
hive (mydb)> load data local inpath '/home/hadoop/data/tab1.txt' into table market;
~~~ # SQL
hive (mydb)> select uid, tag
from t1
lateral view explode(split(tags,",")) t2 as tag;
三、DTF 案例2:
~~~ # 数据准备
[root@linux123 ~]# vim /home/hadoop/data/score.dat
lisi|Chinese:90,Math:80,English:70
wangwu|Chinese:88,Math:90,English:96
maliu|Chinese:99,Math:65,English:60
~~~ # 创建表
hive (mydb)> create table studscore(
name string
,score map<String,string>)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':';
~~~ # 加载数据
hive (mydb)> load data local inpath '/home/hadoop/data/score.dat' overwrite
into table studscore;
~~~ # 需求:找到每个学员的最好成绩
~~~ 第一步,使用 explode 函数将map结构拆分为多行
hive (mydb)> select explode(score) as (subject, socre) from studscore;
subject socre
Chinese 90
Math 80
English 70
Chinese 88
Math 90
English 96
Chinese 99
Math 65
English 60
~~~ # 但是这里缺少了学员姓名,加上学员姓名后出错。下面的语句有是错的
hive (mydb)> select name, explode(score) as (subject, socre) from studscore;
~~~ # 第二步:explode常与 lateral view 函数联用,这两个函数结合在一起能关联其他字段
hive (mydb)> select name, subject, score1 as score from studscore
lateral view explode(score) t1 as subject, score1;
name subject score
lisi Chinese 90
lisi Math 80
lisi English 70
wangwu Chinese 88
wangwu Math 90
wangwu English 96
maliu Chinese 99
maliu Math 65
maliu English 60
~~~ # 第三步:找到每个学员的最好成绩
hive (mydb)> select name, max(mark) maxscore
from (select name, subject, mark
from studscore lateral view explode(score) t1 as
subject, mark) t1
group by name;
--输出参数
name maxscore
wangwu 96
lisi 90
maliu 99
hive (mydb)> with tmp as (
select name, subject, mark
from studscore lateral view explode(score) t1 as subject,
mark
)
select name, max(mark) maxscore
from tmp
group by name;
--输出参数
name maxscore
wangwu 96
lisi 90
maliu 99
四、小结:
### --- 小结:
~~~ 将一行数据转换成多行数据,可以用于array和map类型的数据;
~~~ lateral view 与 explode 联用,解决 UDTF 不能添加额外列的问题
Walter Savage Landor:strove with none,for none was worth my strife.Nature I loved and, next to Nature, Art:I warm'd both hands before the fire of life.It sinks, and I am ready to depart
——W.S.Landor
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 别再用vector<bool>了!Google高级工程师:这可能是STL最大的设计失误
· 单元测试从入门到精通