Hive分析函数

● 测试表 test

1. grouping sets 

① 未使用

② 使用grouping sets (与上面等价)

【代码实例】

查看代码
 --todo 方式一
--所有学校的人数总和
select
        '全学校' as school,
        '全年级' as grade,
        count(name) as num,
        1 as grouping__id
from test
union all
--所有学校中各年级的人数   分组id=2
select
        '全学校' as school,
        grade,
        count(name) as num,
        2 as grouping__id
from test
group by grade
union all
-- 各学校中各年级的人数   分组id=3
select
       school,
       grade,
       count(name) as num,
       3 as grouping__id
from test
group by school,grade;

---------------------------------------------------------------

--todo 方式二
select
        nvl(school,'全学校') as school,
        nvl(grade,'全年级') as grade,
        count(name) as num,
        grouping__id
from test
group by school,grade
grouping sets( (school,grade) , grade , () );

 

2. Cube

① cube 在一个group by 的聚合查询中,根据所有维度组合进行聚合(即是把所有的维度组合都放入grouping sets中)

 

 【代码示例】

查看代码
 --todo 方式一
select
        nvl(school,'全学校') as school,
        nvl(grade,'全年级') as grade,
        count(name) as num,
        grouping__id
from test
group by school,grade
grouping sets((school,grade),school,grade,())
order by grouping__id;

--------------------------------------------------

--todo 方式二
select
        nvl(school,'全学校') as school,
        nvl(grade,'全年级') as grade,
        count(name) as num,
        grouping__id
from test
group by school,grade
with cube
order by grouping__id;

 

3. rollup

● rollup 是cube 的一个子集,以左边的维度为主,进行聚合

● 即是按照,汇总所有维度->汇总左侧school维度->汇总左右维度的顺序,来聚合,不会再从右到左聚合了

 

【代码示例】

查看代码
 --todo 方式一
select
        nvl(school,'全学校') as school,
        nvl(grade,'全年级') as grade,
        count(name) as num,
        grouping__id
from test
group by school,grade
grouping sets((),school,(school,grade))
order by grouping__id;

----------------------------------------------

--todo 方式二
select
        nvl(school,'全学校') as school,
        nvl(grade,'全年级') as grade,
        count(name) as num,
        grouping__id
from test
group by school,grade
with rollup
order by grouping__id;

 

总结:

① grouping set是一个更加灵活的做分组维度聚合的函数,可以适应更多的场景

② cube和rollup在完全适用于场景时,有着更简单的写法和效率

 

posted @ 2024-05-13 23:12  白森  阅读(8)  评论(0编辑  收藏  举报