笔记:Oracle SQL 高级查询简介 (2) 分析函数

1、评级函数

(1). 排序rank()、dense_rank()

遇到重复的,rank()下一个加2,dense_rank() 下一个加1  

 

select s.prod_id, sum(s.amount_sold),
rank() over (order by sum(s.amount_sold) desc) as rank,
dense_rank() over (order by sum(s.amount_sold) desc) as dense_rank
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 


结果:

 

 

   PROD_ID SUM(S.AMOUNT_SOLD)       RANK DENSE_RANK
---------- ------------------ ---------- ----------
        13          936197.53          7          7
        14         2733887.43          2          2
        15         1368317.88          5          5
        16              11.99         60         60
        17         2239127.88          3          3
        18         5477218.04          1          1
        19          182670.35         20         20
        20          990525.95          6          6
        21         1535187.44          4          4
        22           31853.11         54         54
        23           85211.28         36         36
        24          163929.27         22         22
        25          522713.71         13         13
        26          567533.83         12         12
        27          107968.24         30         30
        28          644480.02          9          9
        29          578374.62         11         11
        30            59391.8         48         48
        31           64464.83         45         45
        32           124081.8         26         26

 


还可以通过 NULLS  LAST 或 NULLS  FIRST  控制null放在首位或末位

 

select s.prod_id, sum(s.amount_sold),
rank() over (order by sum(s.amount_sold) desc nulls last) as rank,
dense_rank() over (order by sum(s.amount_sold) desc nulls last) as dense_rank
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 

 


与PARTITION BY 子句结合使用 

 

select s.prod_id, s.month_id, sum(s.amount_sold),
rank() over (partition by s.month_id order by sum(s.amount_sold) desc) as rank,
dense_rank() over (partition by s.month_id order by sum(s.amount_sold) desc) as dense_rank
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id, month_id
order by s.prod_id, month_id; 


结果:

 

 

   	PROD_ID	MONTH_ID	SUM(S.AMOUNT_SOLD)	RANK	DENSE_RANK
1	13	01	125575.64	6	6
2	13	02	122325.21	5	5
3	13	03	61649.5	6	6
4	13	04	17404.26	16	16
5	13	05	61649.5	        6	6
6	13	06	20004	        16	16
7	13	07	164719.38	4	4
8	13	08	125010.33	5	5
9	13	10	112205.27	6	6
10	13	12	125654.44	5	5
11	14	01	239773.24	3	3
12	14	02	278879.97	2	2


rank()、dense_rank() 等同样可以与rollup、cube、grouping sets 等函数合用

 


(2). CUME_DIST 和 PERCENT_RANK 函数

cume_dist 计算某个值相对于一组值中的位置,即 cumulative distribution (累积分布)。

percent_rank 某个值相对于一组值的百分比

 

select s.prod_id, sum(s.amount_sold),
cume_dist() over (  order by sum(s.amount_sold) desc) as cume_dist,
percent_rank() over (  order by sum(s.amount_sold) desc) as percent_rank
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 

 


结果:

   	PROD_ID	SUM(S.AMOUNT_SOLD)	CUME_DIST	PERCENT_RANK
1	13	936197.53	0.116666666666667	0.101694915254237
2	14	2733887.43	0.0333333333333333	0.0169491525423729
3	15	1368317.88	0.0833333333333333	0.0677966101694915
4	16	11.99	        1	                1
5	17	2239127.88	0.05	                0.0338983050847458
6	18	5477218.04	0.0166666666666667	0
7	19	182670.35	0.333333333333333	0.322033898305085
8	20	990525.95	0.1	                0.0847457627118644
9	21	1535187.44	0.0666666666666667	0.0508474576271186
10	22	31853.11	0.9	                0.898305084745763
11	23	85211.28	0.6	                0.593220338983051
12	24	163929.27	0.366666666666667	0.355932203389831
13	25	522713.71	0.216666666666667	0.203389830508475
14	26	567533.83	0.2	                0.186440677966102
15	27	107968.24	0.5	                0.491525423728814
16	28	644480.02	0.15	                0.135593220338983
17	29	578374.62	0.183333333333333	0.169491525423729
18	30	59391.8	        0.8	                0.796610169491525
19	31	64464.83	0.75	                0.745762711864407
20	32	124081.8	0.433333333333333	0.423728813559322
21	33	110987.48	0.483333333333333	0.474576271186441
22	34	106525.01	0.516666666666667	0.508474576271186
23	35	269009.61	0.283333333333333	0.271186440677966
24	36	131170.12	0.416666666666667	0.406779661016949
25	37	293152.28	0.266666666666667	0.254237288135593
26	38	61209.1	        0.783333333333333	0.779661016949152
27	39	149108.82	0.4	                0.389830508474576
28	40	412274.43	0.25	               0.23728813559322 
29	41	101928.66	0.533333333333333	0.525423728813559
30	42	96450.49	0.566666666666667	0.559322033898305
31	43	63514.58	0.766666666666667	0.76271186440678
32	44	54659.47	0.833333333333333	0.830508474576271
33	45	122265.54	0.45	                0.440677966101695
34	46	73544.8    	0.65	                0.644067796610169
35	47	70288.37	0.716666666666667	0.711864406779661
36	48	69656.88	0.733333333333333	0.728813559322034
37	113	91540.88	0.583333333333333	0.576271186440678
38	114	71070.25	0.683333333333333	0.677966101694915
39	115	10505.43	0.933333333333333	0.932203389830508
40	116	56622.64	0.816666666666667	0.813559322033898            
41	117	40338.17	0.85            	0.847457627118644
42	118	226875.98	0.3	                0.288135593220339
43	119	39241.12	0.866666666666667 	0.864406779661017
44	120	27333.37	0.916666666666667	0.915254237288135
45	123	163865.43	0.383333333333333	0.372881355932203
46	124	624.82          0.983333333333333	0.983050847457627
47	125	117214.73	0.466666666666667	0.457627118644068
48	126	172907.76	0.35            	0.338983050847458
49	127	611329.86	0.166666666666667	0.152542372881356
50	128	221494.36	0.316666666666667	0.305084745762712
51	129	496483.76	0.233333333333333	0.220338983050847
52	130	691798.97	0.133333333333333	0.11864406779661
53	131	70645.28	0.7	                0.694915254237288
54	132	83353.36	0.616666666666667	0.610169491525424
55	133	76010        	0.633333333333333	0.627118644067797
56	136	8989.83         0.95            	0.949152542372881
57	140	97259.84	0.55	                0.542372881355932
58	146	35771.66	0.883333333333333	0.88135593220339
59	147	1685.89	        0.966666666666667	0.966101694915254
60	148	72058.92	0.666666666666667	0.661016949152542


(3) 分片NTILE

ntile(n), 划分为n个分片,查找属于哪个分片

 

select s.prod_id, sum(s.amount_sold),
ntile(4) over (  order by sum(s.amount_sold) desc) as ntile
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 


结果:

 

 

   	PROD_ID	SUM(S.AMOUNT_SOLD)	NTILE
1	13	936197.53	1
2	14	2733887.43	1
3	15	1368317.88	1
4	16	11.99	        4
5	17	2239127.88	1
6	18	5477218.04	1
7	19	182670.35	2
8	20	990525.95	1
9	21	1535187.44	1
10	22	31853.11	4
11	23	85211.28	3
12	24	163929.27	2
13	25	522713.71	1
14	26	567533.83	1
15	27	107968.24	2
16	28	644480.02	1
17	29	578374.62	1
18	30	59391.8	        4
19	31	64464.83	3
20	32	124081.8	2
21	33	110987.48	2
22	34	106525.01	3
23	35	269009.61	2
24	36	131170.12	2
25	37	293152.28	2
26	38	61209.1	        4
27	39	149108.82	2
28	40	412274.43	1
29	41	101928.66	3
30	42	96450.49	3
31	43	63514.58	4
32	44	54659.47	4
33	45	122265.54	2
34	46	73544.8	        3
35	47	70288.37	3
36	48	69656.88	3
37	113	91540.88	3
38	114	71070.25	3
39	115	10505.43	4
40	116	56622.64	4
41	117	40338.17	4
42	118	226875.98	2
43	119	39241.12	4
44	120	27333.37	4
45	123	163865.43	2
46	124	624.82	        4
47	125	117214.73	2
48	126	172907.76	2
49	127	611329.86	1
50	128	221494.36	2
51	129	496483.76	1
52	130	691798.97	1
53	131	70645.28	3
54	132	83353.36	3
55	133	76010	        3
56	136	8989.83	        4
57	140	97259.84	3
58	146	35771.66	4
59	147	1685.89 	4
60	148	72058.92	3

 


(4). ROW_NUMBER
row_number 从1开始,每个分组返回一个数字。

 

select s.prod_id, sum(s.amount_sold),
row_number()  over (  order by sum(s.amount_sold) desc) as row_number
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 


结果:

 

 

   	PROD_ID	SUM(S.AMOUNT_SOLD)	ROW_NUMBER
1	13	936197.53	7
2	14	2733887.43	2
3	15	1368317.88	5
4	16	11.99	        60
5	17	2239127.88	3
6	18	5477218.04	1
7	19	182670.35	20
8	20	990525.95	6
9	21	1535187.44	4
10	22	31853.11	54
11	23	85211.28	36
12	24	163929.27	22
13	25	522713.71	13
14	26	567533.83	12
15	27	107968.24	30
16	28	644480.02	9
17	29	578374.62	11
18	30	59391.8  	48
19	31	64464.83	45
20	32	124081.8	26
21	33	110987.48	29
22	34	106525.01	31
23	35	269009.61	17
24	36	131170.12	25
25	37	293152.28	16
26	38	61209.1 	47
27	39	149108.82	24
28	40	412274.43	15
29	41	101928.66	32
30	42	96450.49	34
31	43	63514.58	46
32	44	54659.47	50
33	45	122265.54	27
34	46	73544.8  	39
35	47	70288.37	43
36	48	69656.88	44
37	113	91540.88	35
38	114	71070.25	41
39	115	10505.43	56
40	116	56622.64	49
41	117	40338.17	51
42	118	226875.98	18
43	119	39241.12	52
44	120	27333.37	55
45	123	163865.43	23
46	124	624.82 	        59
47	125	117214.73	28
48	126	172907.76	21
49	127	611329.86	10
50	128	221494.36	19
51	129	496483.76	14
52	130	691798.97	8
53	131	70645.28	42
54	132	83353.36	37
55	133	76010           38
56	136	8989.83	        57
57	140	97259.84	33
58	146	35771.66	53
59	147	1685.89	        58
60	148	72058.92	40

 


2、反百分点函数

使用反百分比函数可以获得对应某个百分点的值。

percenttile_disc、percentile_cont  与 cume_dist、percent_rank 作用相反。

 

select percentile_cont(0.6) within group (  order by sum(s.amount_sold) desc) as percentile_cont,
percentile_disc(0.6) within group (  order by sum(s.amount_sold) desc) as percentile_disc
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 


结果:

 

 

   	PERCENTILE_CONT	PERCENTILE_DISC
1	84468.112	85211.28


 

3、窗口函数

(1)、累积和: rows between unbounded preceding and current row

计算某年1月到12月累计销量:

select s.month_id, sum(s.amount_sold), 
sum(sum(s.amount_sold)) over 
           (order by s.month_id  rows between unbounded preceding and current row)  
           as cumulative_amount
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.month_id
order by s.month_id; 

 

结果

 

   	MONTH_ID	SUM(S.AMOUNT_SOLD)	CUMULATIVE_AMOUNT
1	01	2277420.49	2277420.49
2	02	2372690.87	4650111.36
3	03	1830572.64	6480684
4	04	1975978.3	8456662.3
5	05	1748287.23	10204949.53
6	06	1869728.61	12074678.14
7	07	1932282.28	14006960.42
8	08	1972532.63	15979493.05
9	09	2167008.19	18146501.24
10	10	2236464.53	20382965.77
11	11	1959664.22	22342629.99
12	12	1741284.96	24083914.95

 

 

(2)、移动平均值 rows between N preceding and current row

某年当月与前三个月之间的移动平均值:

 

select s.month_id, sum(s.amount_sold), 
avg(sum(s.amount_sold)) over 
           (order by s.month_id  rows between 3 preceding and current row)  
           as moving_avg
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.month_id
order by s.month_id; 


结果:

 

 

1	01	2277420.49	2277420.49
2	02	2372690.87	2325055.68
3	03	1830572.64	2160228
4	04	1975978.3	2114165.575
5	05	1748287.23	1981882.26
6	06	1869728.61	1856141.695
7	07	1932282.28	1881569.105
8	08	1972532.63	1880707.6875
9	09	2167008.19	1985387.9275
10	10	2236464.53	2077071.9075
11	11	1959664.22	2083917.3925
12	12	1741284.96	2026105.475

 


(3)、中心平均值: between N preceding and N following

计算当月与前后1月的中心平均值

 

select s.month_id, sum(s.amount_sold), 
avg(sum(s.amount_sold)) over 
           (order by s.month_id  rows between 1 preceding and 1 following )  
           as moving_avg
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.month_id
order by s.month_id; 

 


结果:

   	MONTH_ID	SUM(S.AMOUNT_SOLD)	MOVING_AVG
1	01	2277420.49	2325055.68
2	02	2372690.87	2160228
3	03	1830572.64	2059747.27
4	04	1975978.3	1851612.72333333
5	05	1748287.23	1864664.71333333
6	06	1869728.61	1850099.37333333
7	07	1932282.28	1924847.84
8	08	1972532.63	2023941.03333333
9	09	2167008.19	2125335.11666667
10	10	2236464.53	2121045.64666667
11	11	1959664.22	1979137.90333333
12	12	1741284.96	1850474.59


(4)、FIRST_VALUE 、 LAST_VALUE

FIRST_VALUE 、 LAST_VALUE 获取窗口的首行、末行。

 

select s.month_id, sum(s.amount_sold), 
first_value(sum(s.amount_sold)) over 
           (order by s.month_id  rows between 1 preceding and 1 following )  
           as last_month_amount,
last_value(sum(s.amount_sold)) over 
           (order by s.month_id  rows between 1 preceding and 1 following )  
           as next_month_amount           
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.month_id
order by s.month_id; 


结果:

 

 

   	MONTH_ID	SUM(S.AMOUNT_SOLD)	LAST_MONTH_AMOUNT	NEXT_MONTH_AMOUNT
1	01	2277420.49	2277420.49	2372690.87
2	02	2372690.87	2277420.49	1830572.64
3	03	1830572.64	2372690.87	1975978.3
4	04	1975978.3	1830572.64	1748287.23
5	05	1748287.23	1975978.3	1869728.61
6	06	1869728.61	1748287.23	1932282.28
7	07	1932282.28	1869728.61	1972532.63
8	08	1972532.63	1932282.28	2167008.19
9	09	2167008.19	1972532.63	2236464.53
10	10	2236464.53	2167008.19	1959664.22
11	11	1959664.22	2236464.53	1741284.96
12	12	1741284.96	1959664.22	1741284.96

 


4、报表函数

包括 sum、avg、max、min、count、variance、stddev、ratio_to_report等。


(1)、总计报表

计算每月销量总和以及所有产品销量总和

 

select s.prod_id, s.month_id, 
sum(sum(s.amount_sold)) over (partition by s.month_id) as month_total,
sum(sum(s.amount_sold)) over (partition by s.prod_id) as prod_total
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id, s.month_id
order by s.prod_id, s.month_id; 

 


结果:

 

   , PROD_ID, MONTH_ID, MONTH_TOTAL, PROD_TOTAL
1  13  01  2277420.49  936197.53
2  13  02  2372690.87  936197.53
3  13  03  1830572.64  936197.53
4  13  04  1975978.3    936197.53
5  13  05  1748287.23  936197.53
6  13  06  1869728.61  936197.53
7  13  07  1932282.28  936197.53
8  13  08  1972532.63  936197.53
9  13  10  2236464.53  936197.53
10  13  12  1741284.96  936197.53
11  14  01  2277420.49  2733887.43
12  14  02  2372690.87  2733887.43
13  14  03  1830572.64  2733887.43
14  14  04  1975978.3  2733887.43
15  14  05  1748287.23  2733887.43
16  14  06  1869728.61  2733887.43
17  14  07  1932282.28  2733887.43
18  14  08  1972532.63  2733887.43
19  14  09  2167008.19  2733887.43
20  14  10  2236464.53  2733887.43
21  14  11  1959664.22  2733887.43
22  14  12  1741284.96  2733887.43
23  15  01  2277420.49  1368317.88
24  15  02  2372690.87  1368317.88
25  15  03  1830572.64  1368317.88
26  15  04  1975978.3  1368317.88
27  15  05  1748287.23  1368317.88
28  15  06  1869728.61  1368317.88
29  15  07  1932282.28  1368317.88
30  15  08  1972532.63  1368317.88
31  15  09  2167008.19  1368317.88
32  15  10  2236464.53  1368317.88
33  15  11  1959664.22  1368317.88
34  15  12  1741284.96  1368317.88
35  16  03  1830572.64  11.99
36  17  01  2277420.49  2239127.88
37  17  02  2372690.87  2239127.88
38  17  03  1830572.64  2239127.88
39  17  04  1975978.3  2239127.88
40  17  05  1748287.23  2239127.88
41  17  06  1869728.61  2239127.88
42  17  07  1932282.28  2239127.88
43  17  08  1972532.63  2239127.88
44  17  09  2167008.19  2239127.88
45  17  10  2236464.53  2239127.88
46  17  11  1959664.22  2239127.88
47  17  12  1741284.96  2239127.88
48  18  01  2277420.49  5477218.04
49  18  02  2372690.87  5477218.04
50  18  03  1830572.64  5477218.04
51  18  04  1975978.3  5477218.04
52  18  05  1748287.23  5477218.04
53  18  06  1869728.61  5477218.04
54  18  07  1932282.28  5477218.04
55  18  08  1972532.63  5477218.04
56  18  09  2167008.19  5477218.04
57  18  10  2236464.53  5477218.04
58  18  11  1959664.22  5477218.04
59  18  12  1741284.96  5477218.04
60  19  01  2277420.49  182670.35

 


(2)、RATIO_TO_REPORT 函数

RATIO_TO_REPORT 可用来计算某个值在一组值的总和中所占的比例。

 

计算每个产品每个月销量总和,以及该类型在整月中的比例。

 

select s.prod_id, s.month_id, 
sum(s.amount_sold) as prod_month_total,
ratio_to_report(sum(s.amount_sold)) over (partition by s.month_id) as prod_ratio
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id, s.month_id
order by s.prod_id, s.month_id; 
 

结果:

 


   	PROD_ID	MONTH_ID	PROD_MONTH_TOTAL	PROD_RATIO
1	13	01	125575.64	0.0551394178419814
2	13	02	122325.21	0.0515554771785336
3	13	03	61649.5	        0.0336777130024187
4	13	04	17404.26	0.00880792061329823
5	13	05	61649.5 	0.0352627983217609
6	13	06	20004	        0.0106988789137692
7	13	07	164719.38	0.085246023163862
8	13	08	125010.33	0.0633755447685547
9	13	10	112205.27	0.0501708247525839
10	13	12	125654.44	0.0721619050795684
11	14	01	239773.24	0.105282814944727
12	14	02	278879.97	0.117537422816483


5、LAG与LEAD函数

 

LAG 与 LEAD 函数获得位于距当前指定距离处那条记录中的数据。

如获得前一个月 与后一个月的数据:

 

select s.month_id, sum(s.amount_sold), 
lag(sum(s.amount_sold)) over (  order by s.month_id) as last_month_sum,
lead(sum(s.amount_sold)) over ( order by s.month_id) as next_month_sum
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.month_id
order by s.month_id; 

 

结果:

 

   MONTH_ID	SUM(S.AMOUNT_SOLD) LAST_MONTH_SUM  NEXT_MONTH_SUM
1	01	2277420.49		        2372690.87
2	02	2372690.87	2277420.49	1830572.64
3	03	1830572.64	2372690.87	1975978.3
4	04	1975978.3	1830572.64	1748287.23
5	05	1748287.23	1975978.3	1869728.61
6	06	1869728.61	1748287.23	1932282.28
7	07	1932282.28	1869728.61	1972532.63
8	08	1972532.63	1932282.28	2167008.19
9	09	2167008.19	1972532.63	2236464.53
10	10	2236464.53	2167008.19	1959664.22
11	11	1959664.22	2236464.53	1741284.96
12	12	1741284.96	1959664.22	


 

6、FIRST函数与LAST函数

first 和 last 返回排序分组中的第一个或者最后一个值。

如查询某年销量最高的、最低的月份。

 

select min(s.month_id) keep (dense_rank first order by sum(s.amount_sold)) as min_month,
max(s.month_id) keep (dense_rank last order by sum(s.amount_sold)) as max_month
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.month_id
order by s.month_id; 

 


结果:

    MIN_MONTHMAX_MONTH
112                02


7、线性回归函数



8、假设评级及分布函数

假象评级与分布函数可以计算一条新记录在表中的排名和百分比,与 rank(),dense_rank(), percent_rank(),  cume_dist() 等连用。

例如,查找 sum(amount) 为20004 的假想排名和百分比排名

 

select rank(20004) within group (  order by sum(s.amount_sold) desc) as rank,
percent_rank(20004) within group (  order by sum(s.amount_sold) desc) as pencent_rank
from all_sales s where s.year=1998 and s.amount_sold is not null
group by s.prod_id
order by s.prod_id; 

 

 

结果:

   RANK PENCENT_RANK
1560.916666666666667



 

posted @ 2015-10-03 23:14  lihui1625  阅读(260)  评论(0编辑  收藏  举报