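lead()/lag() functions
lag and lead are two offset-based analytic functions.
With them, a single query can return the value of a column from the row N rows before the current row (lag) or N rows after it (lead) as a separate column, which makes it much easier to filter on neighboring rows.
They can replace a self-join of the table and are usually more efficient, since the table does not have to be joined to itself.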
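To illustrate the self-join comparison, here is a minimal sketch in Oracle SQL. The sales_data rows and the seq column are hypothetical, built inline so that each statement runs on its own. Both statements return the previous row's sales next to the current row.

```sql
-- Self-join version: match each row to the row whose seq is one lower.
with sales_data as (
  select 1 as seq, 100 as sales from dual union all
  select 2 as seq, 98  as sales from dual union all
  select 3 as seq, 125 as sales from dual
)
select cur.seq, cur.sales, prev.sales as prev_sales
from sales_data cur
left join sales_data prev on prev.seq = cur.seq - 1
order by cur.seq;

-- lag version: same output, no join needed.
with sales_data as (
  select 1 as seq, 100 as sales from dual union all
  select 2 as seq, 98  as sales from dual union all
  select 3 as seq, 125 as sales from dual
)
select seq, sales, lag(sales, 1) over (order by seq) as prev_sales
from sales_data
order by seq;

-- Both return:
--   seq  sales  prev_sales
--   1    100    (null)
--   2    98     100
--   3    125    98
```

The self-join only works here because seq has no gaps; lag depends only on the sort order, so it also works when the ordering column is not a dense sequence.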
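lag()/lead()
lag(field, num, defaultvalue)
lead(field, num, defaultvalue)
field: the column whose value is returned
num: the offset in rows; lag looks num rows back, lead looks num rows ahead (defaults to 1 if omitted)
defaultvalue: the value returned when the offset row does not exist, for example on the first or last row (defaults to null if omitted)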
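The effect of the three parameters can be seen in a small sketch. The three-row table t and its column n are made up for illustration; the column aliases are only labels.

```sql
with t as (
  select 1 as n from dual union all
  select 2 as n from dual union all
  select 3 as n from dual
)
select n,
       lead(n)        over (order by n) as lead_default, -- offset 1, default null
       lead(n, 2, 0)  over (order by n) as lead_2_or_0,  -- two rows ahead, 0 if none
       lag(n, 1, -1)  over (order by n) as lag_1_or_neg1 -- one row back, -1 if none
from t
order by n;

-- n  lead_default  lead_2_or_0  lag_1_or_neg1
-- 1  2             3            -1
-- 2  3             0            1
-- 3  (null)        0            2
```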
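over()
lag() and lead() operate on the set of rows defined by over(), which can contain the following clauses:
partition by clause (splits the rows into groups)
order by clause (sorts the rows, which determines which rows count as previous and next)
For example, over(partition by a order by b) groups the rows by column a, sorts each group by column b, and evaluates the function within each group.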
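The difference that partition by makes can be sketched as follows, reusing the column names a and b from the example above. The four rows are hypothetical.

```sql
with t as (
  select 'x' as a, 1 as b from dual union all
  select 'x' as a, 2 as b from dual union all
  select 'y' as a, 3 as b from dual union all
  select 'y' as a, 4 as b from dual
)
select a, b,
       lead(b) over (order by b)                as next_overall,  -- one window over all rows
       lead(b) over (partition by a order by b) as next_in_group  -- one window per value of a
from t
order by a, b;

-- a  b  next_overall  next_in_group
-- x  1  2             2
-- x  2  3             (null)
-- y  3  4             4
-- y  4  (null)        (null)
```

With partition by, the last row of each group has no following row inside its own group, so lead returns null there even though a later row exists in another group.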
Examples:
Data set (a with clause; each query below is written directly after it)
with dataset as (
  select '001' as id, 'Jack' as name, '大连' as city, '100' as sales from dual union all
  select '002' as id, 'Tom' as name, '大连' as city, '98' as sales from dual union all
  select '003' as id, 'John' as name, '大连' as city, '125' as sales from dual union all
  select '004' as id, 'Larry' as name, '大连' as city, '130' as sales from dual union all
  select '005' as id, 'Levi' as name, '沈阳' as city, '115' as sales from dual union all
  select '006' as id, 'Tomas' as name, '沈阳' as city, '170' as sales from dual union all
  select '007' as id, 'Jimmy' as name, '沈阳' as city, '130' as sales from dual union all
  select '008' as id, 'Robert' as name, '大连' as city, '103' as sales from dual union all
  select '009' as id, 'William' as name, '大连' as city, '118' as sales from dual union all
  select '010' as id, 'Joe' as name, '沈阳' as city, '108' as sales from dual
)
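Get each employee's id together with the id of the employee whose sales rank immediately below theirs
select t.id,
       -- rows are sorted by sales descending, so the next row is the next-lower seller
       lead(t.id, 1, null) over (order by t.sales desc) next_record_id,
       t.name,
       t.city,
       t.sales
from (
  select id, name, city, to_number(sales) as sales
  from dataset
) t
Result: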
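Get each employee's id together with the id of the employee whose sales rank immediately above theirs
select t.id,
       -- rows are sorted by sales descending, so the previous row is the next-higher seller
       lag(t.id, 1, null) over (order by t.sales desc) prev_record_id,
       t.name,
       t.city,
       t.sales
from (
  select id, name, city, to_number(sales) as sales
  from dataset
) t
Result: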
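One caveat for the two queries above: they order by sales alone, and in the data set Larry (004) and Jimmy (007) both have sales of 130, so which of the two comes first is not fixed. A minimal sketch of one way to make the result deterministic is to add a tie-breaker to the order by. Using id as the tie-breaker is an assumption made here for illustration, not something the original queries do.

```sql
-- Three rows taken from the data set; 004 and 007 tie at 130.
with t as (
  select '004' as id, 130 as sales from dual union all
  select '006' as id, 170 as sales from dual union all
  select '007' as id, 130 as sales from dual
)
select id, sales,
       -- id breaks the tie, so 004 always sorts before 007
       lead(id) over (order by sales desc, id) as next_record_id
from t
order by sales desc, id;

-- id   sales  next_record_id
-- 006  170    004
-- 004  130    007
-- 007  130    (null)
```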
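Get each employee's id together with the id of the employee in the same city whose sales rank immediately below theirs
select t.id,
       -- partition by city restarts the ranking for each city
       lead(t.id, 1, null) over (partition by t.city order by t.sales desc) next_record_id,
       t.name,
       t.city,
       t.sales
from (
  select id, name, city, to_number(sales) as sales
  from dataset
) t
Result: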
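Within each city, compare each employee's sales with those of the employee ranked immediately below, and drop the rows where the gap is less than 10
select tt.*
from (
  select t.id,
         t.name,
         t.sales,
         lead(t.sales, 1, null) over (partition by t.city order by t.sales desc) next_sales,
         -- gap between this employee and the next-lower seller in the same city
         (t.sales - lead(t.sales, 1, null) over (partition by t.city order by t.sales desc)) as diff,
         t.city
  from (
    select id, name, city, to_number(sales) as sales
    from dataset
  ) t
) tt
-- diff is null for the lowest seller in each city; those rows are kept
where tt.diff >= 10
   or tt.diff is null
Result: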