Week08_day01 (Using the Hive window function row_number(): finding the top two salaries in each department)
Data preparation:
7369,SMITH,CLERK,7902,1980-12-17,800,null,20
7499,ALLEN,SALESMAN,7698,1981-02-20,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02,2975,null,20
7654,MARTIN,SALESMAN,7698,1981-09-28,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01,2850,null,30
7782,CLARK,MANAGER,7839,1981-06-09,2450,null,10
7788,SCOTT,ANALYST,7566,1987-04-19,3000,null,20
7839,KING,PRESIDENT,null,1981-11-17,5000,null,10
7844,TURNER,SALESMAN,7698,1981-09-08,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23,1100,null,20
7900,JAMES,CLERK,7698,1981-12-03,950,null,30
7902,FORD,ANALYST,7566,1981-12-03,3000,null,20
7934,MILLER,CLERK,7782,1982-01-23,1300,null,10
Create the table in Hive (what was shown here is only the list of fields, not the actual CREATE TABLE statement).
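Since the actual DDL is not shown, the following is only a sketch of what the CREATE TABLE statement might look like. Only the names emp2, sal, and deptno appear in the queries in this note; the other column names and all of the column types are assumptions inferred from the eight comma-separated fields of the sample data.

```sql
-- Hypothetical DDL: only emp2, sal and deptno are confirmed by the queries
-- in this note; every other name and every type here is an assumption.
CREATE TABLE IF NOT EXISTS emp2 (
  empno    INT,     -- employee number (assumed name)
  ename    STRING,  -- employee name (assumed name)
  job      STRING,  -- job title (assumed name)
  mgr      INT,     -- manager's employee number (assumed name)
  hiredate STRING,  -- hire date kept as text, e.g. 1980-12-17 (assumed name)
  sal      INT,     -- salary, used in the queries below
  comm     INT,     -- commission (assumed name)
  deptno   INT      -- department number, used in the queries below
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','   -- the sample file is comma-separated
STORED AS TEXTFILE;
```

With INT columns, the literal text null in the sample file cannot be parsed as a number, so it should read back as NULL.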
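Load the data with the local load command: load data local inpath '<absolute path to the file>' into table emp2;
View the loaded data:
Now there is a requirement: find the top two salaries in each department.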
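The screenshot for this step is not reproduced, so here is a minimal sketch of how the load could be checked. It assumes the sample file above was loaded exactly once into an empty emp2 table, in which case 14 rows are expected.

```sql
-- Show every loaded row to confirm the columns were split correctly.
SELECT * FROM emp2;

-- Optional sanity check: the sample file contains 14 records.
SELECT count(*) FROM emp2;
```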
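Step 1: use the window function row_number() to number the rows within each department, with DESC so that salaries are sorted in descending order.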
select deptno,sal,row_number() over(partition by deptno order by sal desc) from emp2;
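This produces one numbered row per employee, with the numbering restarting at 1 in each department.
Then wrap that query as a subquery and keep only the rows whose number is less than 3 to get the result: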
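The original output screenshot is not available. As a sketch, the rows below were worked out by hand from the sample data rather than copied from a Hive session, so the order in which Hive prints them may differ. The rn alias is added here only for readability.

```sql
-- Same query as step 1, with an alias (rn) on the row number column.
SELECT deptno,
       sal,
       row_number() OVER (PARTITION BY deptno ORDER BY sal DESC) AS rn
FROM emp2;

-- Rows derived by hand from the sample data (deptno, sal, rn):
--   10  5000  1
--   10  2450  2
--   10  1300  3
--   20  3000  1
--   20  3000  2
--   20  2975  3
--   20  1100  4
--   20   800  5
--   30  2850  1
--   30  1600  2
--   30  1500  3
--   30  1250  4
--   30  1250  5
--   30   950  6
```

Note that the two 3000 salaries in department 20 and the two 1250 salaries in department 30 still receive different numbers, because row_number() never repeats a number within a partition.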
select w.deptno,w.sal from (select deptno,sal,row_number() over(partition by deptno order by sal desc) as rn from emp2) w where w.rn<3;
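For readability, the same query can be laid out over several lines. This is just a reformatted sketch of the statement above. The listed result was worked out by hand from the sample data, and Hive may print the rows in a different order.

```sql
-- Outer query: keep only the two highest-paid rows of each department.
SELECT w.deptno,
       w.sal
FROM (
  -- Inner query: number the rows of each department by salary, highest first.
  SELECT deptno,
         sal,
         row_number() OVER (PARTITION BY deptno ORDER BY sal DESC) AS rn
  FROM emp2
) w
WHERE w.rn < 3;

-- Result derived by hand from the sample data (deptno, sal):
--   10  5000
--   10  2450
--   20  3000
--   20  3000
--   30  2850
--   30  1600
```

The window function has to sit in a subquery because rn cannot be referenced in the WHERE clause of the same SELECT that defines it. If tied salaries should share one position instead of getting separate numbers, rank() or dense_rank() could be used in place of row_number().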