Oracle字符串行专列(字符串聚合技术)
原文链接:http://oracle-base.com/articles/misc/string-aggregation-techniques.php
1 String Aggregation Techniques 字符串聚合技术
On occasion it is necessary to aggregate data from a number ofrows into a single row, giving a list of data associated with a specific value.Using the SCOTT.EMP table as an example, we might want to retrieve a list ofemployees for each department. Below is a list of the base data and the type ofoutput we would like to return from an aggregate query.
偶尔我们需要将多行数据聚合成一行,给出指定数据值列表。下面我们以EMP表为例,我们希望按照部门聚合其下面的所有员工。
Base Data:
DEPTNO ENAME
---------- ----------
20 SMITH
30 ALLEN
30 WARD
20 JONES
30 MARTIN
30 BLAKE
10 CLARK
20 SCOTT
10 KING
30 TURNER
20 ADAMS
30 JAMES
20 FORD
10 MILLER
Desired Output:
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
This article is based on a thread from asktom.oracle.com and contains several methods toachieve the desired results.
我们将采用以下方法:
- LISTAGG Analytic Function in 11g Release 2
- WM_CONCAT Built-in Function (Not Supported)
- User-Defined Aggregate Function
- Specific Function
- Generic Function using Ref Cursor
- ROW_NUMBER() and SYS_CONNECT_BY_PATH functions in Oracle 9i
- COLLECT function in Oracle 10g
1.1 LISTAGG AnalysticFunction in 11g Release 2
The LISTAGG
analyticfunction was introduced in Oracle 11g Release 2, making it very easy toaggregate strings. The nice thing about this function is it also allows us toorder the elements in the concatenated list. If you are using 11g Release 2 youshould use this function for string aggregation.
LISTAGG函数由11gr2引入,使得字符串聚合操作更加容易。并且允许我们制定排序规则。
COLUMN employees FORMAT A50
SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
1.2 WM_CONCAT Built-inFunction (Not Supported)
If you are not running 11g Release 2, but are running a version ofthe database where the WM_CONCAT
functionis present, then it is a zero effort solution as it performs the aggregationfor you. It is actually an example of a user defined aggregate functiondescribed below, but Oracle have done all the work for you.
如果我们使用的是11g r2之前版本,那么可以选择使用WM_CONCAT内置函数。
COLUMN employees FORMAT A50
SELECT deptno, wm_concat(ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
3 rows selected.
Note. WM_CONCAT
isan undocumented function and as such is not supported by Oracle for userapplications (MOS Note ID 1336219.1). If this concerns you,use a User-Defined Aggregate Function described below.
不过注意WM_CONCAT属于非文档型函数并且使用该函数的应用程序将不被ORACLE支持(出了事Oracle不负任何责任)。如果这一点困扰到你,那么可以选择下面的用户自定义聚合函数。
1.3 User-Defined AggregateFunction
The WM_CONCAT
functiondescribed above is an example of a user-defined aggregate function that Oraclehave already created for you. If you don't want to use WM_CONCAT
, you can create your ownuser-defined aggregate function as described at asktom.oracle.com.
CREATE OR REPLACE TYPE t_string_agg AS OBJECT
(
g_string VARCHAR2(32767),
STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateIterate(self IN OUT t_string_agg,
value IN VARCHAR2 )
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateTerminate(self IN t_string_agg,
returnValue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateMerge(self IN OUT t_string_agg,
ctx2 IN t_string_agg)
RETURN NUMBER
);
/
SHOW ERRORS
CREATE OR REPLACE TYPE BODY t_string_agg IS
STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg)
RETURN NUMBER IS
BEGIN
sctx := t_string_agg(NULL);
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateIterate(self IN OUT t_string_agg,
value IN VARCHAR2 )
RETURN NUMBER IS
BEGIN
SELF.g_string := self.g_string || ',' || value;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateTerminate(self IN t_string_agg,
returnValue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER IS
BEGIN
returnValue := RTRIM(LTRIM(SELF.g_string, ','), ',');
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateMerge(self IN OUT t_string_agg,
ctx2 IN t_string_agg)
RETURN NUMBER IS
BEGIN
SELF.g_string := SELF.g_string || ',' || ctx2.g_string;
RETURN ODCIConst.Success;
END;
END;
/
SHOW ERRORS
CREATE OR REPLACE FUNCTION string_agg (p_input VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE AGGREGATE USING t_string_agg;
/
SHOW ERRORS
The aggregate function is implemented using a type and type body,and is used within a query.
此函数使用TYPE和TYPE BODY实现。
COLUMN employees FORMAT A50
SELECT deptno, string_agg(ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
3 rows selected.
1.4 Specific Function
One approach is to write a specific function to solve theproblems. The get_employees function listed below returns a list of employeesfor the specified department.
一条路是写一个特定的函数。例如下面的get_employees方法:
CREATE OR REPLACE FUNCTION get_employees (p_deptno in emp.deptno%TYPE)
RETURN VARCHAR2
IS
l_text VARCHAR2(32767) := NULL;
BEGIN
FOR cur_rec IN (SELECT ename FROM emp WHERE deptno = p_deptno) LOOP
l_text := l_text || ',' || cur_rec.ename;
END LOOP;
RETURN LTRIM(l_text, ',');
END;
/
SHOW ERRORS
The function can then be incorporated into a query as follows.
COLUMN employees FORMAT A50
SELECT deptno,
get_employees(deptno) AS employees
FROM emp
GROUP by deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
To reduce the number of calls to the function, and thereby improveperformance, we might want to filter the rows in advance.
为了减少函数调用次数,我们可以提前做好重复值过滤:
COLUMN employees FORMAT A50
SELECT e.deptno,
get_employees(e.deptno) AS employees
FROM (SELECT DISTINCT deptno
FROM emp) e;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
1.5 Generic Function usingRef Cursor
An alternative approach is to write a function to concatenatevalues passed using a ref cursor. This is essentially the same as the previousexample, except that the cursor is passed in making it generic, as shown below.
使用游标变量的通用型函数。游标变量作为函数入参。这个方法类似上面方法,只不过是使用游标变量而已。
CREATE OR REPLACE FUNCTION concatenate_list (p_cursor IN SYS_REFCURSOR)
RETURN VARCHAR2
IS
l_return VARCHAR2(32767);
l_temp VARCHAR2(32767);
BEGIN
LOOP
FETCH p_cursor
INTO l_temp;
EXIT WHEN p_cursor%NOTFOUND;
l_return := l_return || ',' || l_temp;
END LOOP;
RETURN LTRIM(l_return, ',');
END;
/
SHOW ERRORS
The CURSOR function is used to allow a query to be passed to thefunction as a ref cursor, as shown below. 这样就允许查询语句作为游标变量参数传递。
COLUMN employees FORMAT A50
SELECT e1.deptno,
concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM emp e1
GROUP BY e1.deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
Once again, the total number of function calls can be reduced byfiltering the distinct values, rather than calling the function for each row.
同样,为了减少函数调用次数,我们提前进行重复值过滤。
COLUMN employees FORMAT A50
SELECT deptno,
concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM (SELECT DISTINCT deptno
FROM emp) e1;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
1.6 ROW_NUMBER() andSYS_CONNECT_BY_PATH functions in Oracle 9i
An example on williamrobertson.net uses the ROW_NUMBER()
and SYS_CONNECT_BY_PATH
functionsto achieve the same result without the use of PL/SQL or additional typedefinitions.
另外一个例子使用ROW_NUMBER()函数和SYS_CONNECT_BY_PATH层次查询函数实现。
SELECT deptno,
LTRIM(MAX(SYS_CONNECT_BY_PATH(ename,','))
KEEP (DENSE_RANK LAST ORDER BY curr),',') AS employees
FROM (SELECT deptno,
ename,
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) AS curr,
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) -1 AS prev
FROM emp)
GROUP BY deptno
CONNECT BY prev = PRIOR curr AND deptno = PRIOR deptno
START WITH curr = 1;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.
1.7 COLLECT function inOracle 10g
An example on oracle-developer.net uses the COLLECT
functionin Oracle 10g to get the same result. This method requires a table type and afunction to convert the contents of the table type to a string. I've alteredhis method slightly to bring it in line with this article.
最后这个例子使用Oracle10g引入的COLLECT函数。使用这个方法要求先定义一个表类型和一个用于将表类型转换为字符串的函数。
CREATE OR REPLACE TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
/
CREATE OR REPLACE FUNCTION tab_to_string (p_varchar2_tab IN t_varchar2_tab,
p_delimiter IN VARCHAR2 DEFAULT ',') RETURN VARCHAR2 IS
l_string VARCHAR2(32767);
BEGIN
FOR i IN p_varchar2_tab.FIRST .. p_varchar2_tab.LAST LOOP
IF i != p_varchar2_tab.FIRST THEN
l_string := l_string || p_delimiter;
END IF;
l_string := l_string || p_varchar2_tab(i);
END LOOP;
RETURN l_string;
END tab_to_string;
/
The query below shows the COLLECT
functionin action.
COLUMN employees FORMAT A50
SELECT deptno,
tab_to_string(CAST(COLLECT(ename) AS t_varchar2_tab)) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,JONES,SCOTT,ADAMS,FORD
30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
3 rows selected.
-----------------------------------
Dylan Present.