Oracle 分析函数Over(partition by...)以及开窗函数
Over(Partition by...) 为Oracle特有函数,用于计算基于组的某种聚合值。
它和聚合函数的不同之处是:对于每个组返回多行,而聚合函数对于每个组只返回一行。
说明
partition by: 按哪个字段划分组;
order by :按哪个字段排序;
常用:
row_number() over(partition by ... order by ...)
rank() over(partition by ... order by ...)
dense_rank() over(partition by ... order by ...)
count() over(partition by ... order by ...)
max() over(partition by ... order by ...)
min() over(partition by ... order by ...)
sum() over(partition by ... order by ...)
avg() over(partition by ... order by ...)
first_value() over(partition by ... order by ...)
last_value() over(partition by ... order by ...)
lag() over(partition by ... order by ...)
lead() over(partition by ... order by ...)
示例:
1. 查询表中每个emptypeid中wages最高的个体(rank()的用法)。
一般的写法:
select p.* from py_0325 p inner join (select max(wages) wages,emptypeid from py_0325 p group by p.emptypeid ) des on p.emptypeid = des.emptypeid and p.wages = des.wages
用over()函数
select * from ( select p.empid,p.orgid,p.emptypeid,p.wages, rank() over(partition by p.emptypeid order by p.wages desc) ranks from py_0325 p ) des where des.ranks = 1
用dense_over()函数
select * from ( select p.empid,p.orgid,p.emptypeid,p.wages, dense_rank() over(partition by p.emptypeid order by p.wages desc) ranks from py_0325 p ) des where des.ranks = 1
2. 查询表中每个emptypeid中wages最高的个体与最低个体的差额(max()和min()的用法)。
一般的写法:
select des.maxwages - des.minwages chae,des.emptypeid from( select max(wages) maxwages,min(wages) minwages ,emptypeid from py_0325 p group by p.emptypeid ) des
分析函数写法:
select distinct p.emptypeid, max(p.wages) over(partition by p.emptypeid ) - min(p.wages) over(partition by p.emptypeid ) chae from py_0325 p
注意:要加order by 的话
MAX() OVER(PARTITION BY .. ORDER BY .. DESC) 排序规则只能为desc,否则不起作用,将查询到目前为止排序值最高字段的对应值
MIN() OVER(PARTITION BY .. ORDER BY .. ASC ) 排序规则只能为asc,否则不起作用,将查询到目前为止排序值最低的字段的对应值
3、lead()/lag() over(partition by ... order by ...) 取前面/后面第n行记录
说明:
lead(列名,n,m): 当前记录后面第n行记录的<列名>的值,没有则默认值为m;如果不带参数n,m,则查找当前记录后面第一行的记录<列名>的值,没有则默认值为null。
lag(列名,n,m): 当前记录前面第n行记录的<列名>的值,没有则默认值为m;如果不带参数n,m,则查找当前记录前面第一行的记录<列名>的值,没有则默认值为null。
使用 lead() 和 lag() 时,必须要带order by否则非法
举例:
查询emptypeid分组下个人工资比自己高一位,低一位的差额:
select p.empid,p.orgid,p.emptypeid,p.wages, lead(p.wages,1,0) over(partition by p.emptypeid order by p.wages ) - lag(p.wages,1,0) over(partition by p.emptypeid order by p.wages ) chae from py_0325 p
4、FIRST_VALUE/LAST_VALUE() OVER(PARTITION BY ...) 取首尾记录
select p.empid,p.orgid,p.emptypeid,p.wages, FIRST_VALUE(p.wages) over(partition by p.emptypeid order by p.wages ) FIRST_VALUE, LAST_VALUE(p.wages) over(partition by p.emptypeid order by p.wages ) LAST_VALUE from py_0325 p
5、ROW_NUMBER() OVER(PARTITION BY.. ORDER BY ..) 排序(应用:分页)
row_num() 和 rank() dense_rank()的区别:
row_num() : 1,2,3,4,5
rank() :1,1,3,4,5
dense_rank():1,1,2,3,4
函数举例:
select p.wages, row_number() over(partition by p.emptypeid order by p.wages desc) nums from py_0325 p
6、sum/avg/count() over(partition by ..)
select p.*, sum(p.wages) over(partition by p.emptypeid ) sums, count(p.wages) over(partition by p.emptypeid ) counts, avg(p.wages) over(partition by p.emptypeid ) avgs from py_0325 p
计算表中最少条数之和是总条数之后50%的数据条数
select max(counts) from (
select pp.itemid,pp.salesprice,
row_number() over(order by pp.salesprice desc) counts,
sum(pp.salesprice) over(order by pp.salesprice desc)/sum(pp.salesprice) over() sums
from dict_item_charge pp )
where sums < 0.5
7、 rows/range between … preceding and … following 上下范围内求值
说明:unbounded:不受控制的,无限的
preceding:在...之前
following:在...之后
rows between … preceding and … following
举例1:获取分部门工资最高值
select p.empid, p.orgid, p.emptypeid, p.wages,
--MAX(p.wages) OVER(PARTITION BY p.orgid) aa last_value(p.wages) over(partition by p.orgid order by p.wages rows between unbounded preceding and unbounded following ) aa from py_0325 p
举例2:对各部门进行分组,并附带显示第一行至当前行的汇总
select p.empid, p.orgid, p.emptypeid, p.wages, sum(p.wages) over(partition by p.orgid order by p.empid rows between unbounded preceding and current row ) aa from py_0325 p
示例3:当前行至最后一行的汇总
select p.empid, p.orgid, p.emptypeid, p.wages, sum(p.wages) over(partition by p.orgid order by p.empid rows between current row and unbounded following) aa from py_0325 p
示例4:当前行的上一行(rownum-1)到当前行的汇总
select p.empid, p.orgid, p.emptypeid, p.wages, sum(p.wages) over(partition by p.orgid order by p.empid rows between 1 preceding and current row) aa from py_0325 p
示例5:当前行的上一行(rownum-1)到当前行的下两行(rownum+2)的汇总
select p.empid, p.orgid, p.emptypeid, p.wages, sum(p.wages) over(partition by p.orgid order by p.empid rows between 1 preceding and 2 following) aa from py_0325 p
其他
1、NULLS FIRST/LAST 将空值字段记录放到最前或最后显示
说明:
通过RANK()、DENSE_RANK()、ROW_NUMBER()对记录进行全排列、分组排列取值,但有时候,会遇到空值的情况,空值会影响得到的结果的正确性
NULLS FIRST/LAST 可以帮助我们在处理含有空值的排序排列中,将空值字段记录放到最前或最后显示,帮助我们得到期望的结果。
select p.empid, p.orgid, p.emptypeid, p.wages, rank() over(partition by p.orgid order by p.wages nulls last ) aa from py_0325 p
2、NTILE(n)
把数据分组等分,效果如下: