Oracle 分析函数详解(Analytic Functions)--示例部分

Analytic functions are commonly used in data warehousing environments. In the list of analytic functions that follows, functions followed by an asterisk (*) allow the full syntax, including the windowing_clause.

分析函数一般用于数据仓库环境。以下是分析函数列表,其中带星号的表示支持窗口语句windowing_clause.

AVG *
CORR *
COVAR_POP *
COVAR_SAMP *
COUNT *
CUME_DIST
DENSE_RANK
FIRST
FIRST_VALUE *
LAG
LAST
LAST_VALUE *
LEAD
MAX *
MIN *
NTILE
PERCENT_RANK
PERCENTILE_CONT
PERCENTILE_DISC
RANK
RATIO_TO_REPORT
REGR_ (Linear Regression) Functions *
ROW_NUMBER
STDDEV *
STDDEV_POP *
STDDEV_SAMP *
SUM *
VAR_POP *
VAR_SAMP *

VARIANCE *

-------------------------------------------------------------------

1、AVG   为聚合函数用于求平均:

SELECT manager_id, last_name, hire_date, salary,
   AVG(salary) OVER (PARTITION BY manager_id ORDER BY hire_date 
   ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS c_mavg
   FROM employees;

MANAGER_ID LAST_NAME                 HIRE_DATE     SALARY     C_MAVG
---------- ------------------------- --------- ---------- ----------
       100 Kochhar                   21-SEP-89      17000      17000
       100 De Haan                   13-JAN-93      17000      15000
       100 Raphaely                  07-DEC-94      11000 11966.6667
       100 Kaufling                  01-MAY-95       7900 10633.3333
       100 Hartstein                 17-FEB-96      13000 9633.33333
       100 Weiss                     18-JUL-96       8000 11666.6667
       100 Russell                   01-OCT-96      14000 11833.3333

2、CORR 返回一对表达式的相关系数:

SELECT employee_id, job_id, 
   TO_CHAR((SYSDATE - hire_date) YEAR TO MONTH ) "Yrs-Mns",     salary, 
   CORR(SYSDATE-hire_date, salary)
   OVER(PARTITION BY job_id) AS "Correlation"
FROM employees
WHERE department_id in (50, 80)
ORDER BY job_id, employee_id;

EMPLOYEE_ID JOB_ID     Yrs-Mns     SALARY Correlation
----------- ---------- ------- ---------- -----------
        145 SA_MAN     +08-07       14000  .912385598
        146 SA_MAN     +08-04       13500  .912385598
        147 SA_MAN     +08-02       12000  .912385598
        148 SA_MAN     +05-07       11000  .912385598
        149 SA_MAN     +05-03       10500  .912385598
        150 SA_REP     +08-03       10000   .80436755
        151 SA_REP     +08-02        9500   .80436755
        152 SA_REP     +07-09        9000   .80436755
        153 SA_REP     +07-01        8000   .80436755
        154 SA_REP     +06-05        7500   .80436755
        155 SA_REP     +05-06        7000   .80436755

3、COVAR_POP  返回一对表达式的总体协方差;

4、COVAR_SAMP 返回一对表达式的样本协方差;

5、COUNT 返回总行数:(每行对应的数据窗口是之前行幅度值不超过50,之后行幅度值不超过150)

SELECT last_name, salary,
   COUNT(*) OVER (ORDER BY salary RANGE BETWEEN 50 PRECEDING
      AND 150 FOLLOWING) AS mov_count FROM employees;

LAST_NAME                     SALARY  MOV_COUNT
------------------------- ---------- ----------
Olson                           2100          3
Markle                          2200          2
Philtanker                      2200          2
Landry                          2400          8
Gee                             2400          8
Colmenares                      2500         10
Patel                           2500         10
. . .

6、dense_rank 返回排名,用于TOPN查询:

查询假设薪资15500 、佣金5%的员工在employees表中排名

SELECT DENSE_RANK(15500, .05) WITHIN GROUP 
   (ORDER BY salary DESC, commission_pct) "Dense Rank" 
   FROM employees;

         Dense Rank
-------------------
                  3
SELECT d.department_name, e.last_name, e.salary, DENSE_RANK() 
   OVER (PARTITION BY e.department_id ORDER BY e.salary) AS drank
   FROM employees e, departments d
   WHERE e.department_id = d.department_id
   AND d.department_id IN ('30', '40');

DEPARTMENT_NAME         LAST_NAME              SALARY      DRANK
----------------------- ------------------ ---------- ----------
Purchasing              Colmenares               2500          1
Purchasing              Himuro                   2600          2
Purchasing              Tobias                   2800          3
Purchasing              Baida                    2900          4
Purchasing              Khoo                     3100          5
Purchasing              Raphaely                11000          6
Human Resources         Marvis                   6500          1

7、first 当所查字段不是排序字段时返回分组范围内最大、最小值:


SELECT last_name, department_id, salary,
   MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct)
      OVER (PARTITION BY department_id) "Worst",
   MAX(salary) KEEP (DENSE_RANK LAST ORDER BY commission_pct)
      OVER (PARTITION BY department_id) "Best"
    FROM employees
    ORDER BY department_id, salary;

LAST_NAME           DEPARTMENT_ID     SALARY      Worst       Best
------------------- ------------- ---------- ---------- ----------
Whalen                         10       4400       4400       4400
Fay                            20       6000       6000      13000
Hartstein                      20      13000       6000      13000
. . .
Gietz                         110       8300       8300      12000
Higgins                       110      12000       8300      12000
Grant                                   7000       7000       7000

SELECT last_name, department_id, salary,
   MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct)
      OVER (PARTITION BY department_id) "Worst",
   MAX(salary) KEEP (DENSE_RANK LAST ORDER BY commission_pct)
      OVER (PARTITION BY department_id) "Best"
    FROM employees
    ORDER BY department_id, salary;


8、fist_value 返回一组有序值中第一个值

SELECT department_id, last_name, salary, FIRST_VALUE(last_name)
  OVER (ORDER BY salary ASC ROWS UNBOUNDED PRECEDING) AS lowest_sal
  FROM (SELECT * FROM employees WHERE department_id = 90
    ORDER BY employee_id);

DEPARTMENT_ID LAST_NAME         SALARY LOWEST_SAL
------------- ------------- ---------- -------------------------
           90 Kochhar            17000 Kochhar
           90 De Haan            17000 Kochhar
           90 King               24000 Kochhar

9、lag与lead函数是跟偏移量相关的两个分析函数,通过这两个函数我们可以取到当前行列的偏移N行列的值 lag可以看着是正的向上的偏移 lead可以认为负的向下的偏移

SELECT last_name, hire_date, salary,
   LAG(salary, 1, 0) OVER (ORDER BY hire_date) AS prev_sal
   FROM employees
   WHERE job_id = 'PU_CLERK';

select deptno,
       sal a,
       lag(sal, 1, null) over(partition by deptno order by deptno) b
  from scott.emp


SELECT last_name, hire_date, 
   LEAD(hire_date, 1) OVER (ORDER BY hire_date) AS "NextHired" 
   FROM employees WHERE department_id = 30;

10、min/max 分别用于返回分组最小值/最大值:

SELECT manager_id, last_name, salary
   FROM (SELECT manager_id, last_name, salary, 
      MAX(salary) OVER (PARTITION BY manager_id) AS rmax_sal
      FROM employees) WHERE salary = rmax_sal;

SELECT manager_id, last_name, hire_date, salary,
   MIN(salary) OVER(PARTITION BY manager_id ORDER BY hire_date
   RANGE UNBOUNDED PRECEDING) AS p_cmin
   FROM employees;

11、rank 类似于dense_rank 区别在于其排名数字不连续

SELECT RANK(15500) WITHIN GROUP 
   (ORDER BY salary DESC) "Rank of 15500" 
   FROM employees;

SELECT department_id, last_name, salary, commission_pct,
   RANK() OVER (PARTITION BY department_id
   ORDER BY salary DESC, commission_pct) "Rank"
   FROM employees WHERE department_id = 80;

12、row_number 和rownum差不多,功能更强一点(可以在各个分组内从1开始排序)

SELECT department_id, last_name, employee_id, ROW_NUMBER()
   OVER (PARTITION BY department_id ORDER BY employee_id) AS emp_id
   FROM employees;

13、 RATIO_TO_REPORT 用来计算当前记录的指标expr占开窗函数over中包含记录的所有同一指标的百分比. 这里如果开窗函数的统计结果为null或者为0,就是说占用比率的被除数为0或者为null, 则得到的结果也为0

SELECT last_name, salary, RATIO_TO_REPORT(salary) OVER () AS rr
   FROM employees
   WHERE job_id = 'PU_CLERK';

14、SUM 

SELECT manager_id, last_name, salary,
   SUM(salary) OVER (PARTITION BY manager_id ORDER BY salary
   RANGE UNBOUNDED PRECEDING) l_csum
   FROM employees;


to be continue...

------------------------

Dylan    Presents.














------------------------------------

Dylan   Presents.

posted @ 2014-02-26 21:07  一锤子技术员  阅读(5)  评论(0编辑  收藏  举报  来源