几种构造序列的方法比较

整理自 http://www.itpub.net/thread-1796279-1-1.html

1. select rownum rn from all_objects where rownum<=xx; --已经不被推荐使用，因为all_objects视图复杂，在真实案例中容易引起错误.

2. select rownum rn from dual connect by rownum<=xx;

3. select rownum rn from xmltable('1 to xx'); --10G 之后引入了XMLtable

4. with b as (select 1 r from dual union all select 2 from dual), --单纯的表配上笛卡尔积
    c as (select rownum r from b,b,b,b,b),
    d as (select rownum r from c,c,c,c,c,b)
    select count(*) from d;

5. --一个用于构造序列的函数，来自asktom

create or replace function generator (n pls_integer) return sys.odcinumberlist pipelined is
m pls_integer := trunc(n / 10);
r pls_integer := n - 10 * m;
begin
for i in 1 .. m loop
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
    pipe row (null);
end loop;
for i in 1 .. r loop
    pipe row (null);
end loop;
end;
/

以下是性能比较：

先看connect by的方法
lastwinner@lw> select count(*) from (select rownum rn from dual connect by rownum<=power(2,19));

COUNT(*)
----------
524288

已用时间: 00: 00: 00.20
lastwinner@lw> select count(*) from (select rownum rn from dual connect by rownum<=power(2,20));
select count(*) from (select rownum rn from dual connect by rownum<=power(2,20))
*
第 1 行出现错误:
ORA-30009: CONNECT BY 操作内存不足

可见直接用connect by去构造较大的序列时，消耗的资源很多，速度也快不到哪儿去。实际上2^20并不是一个很大的数字，就是1M而已。
但xmltable方式就不会耗这么多资源
lastwinner@lw> select count(*) from (select rownum rn from xmltable('1 to 1048576'));

COUNT(*)
----------
1048576

已用时间: 00: 00: 00.95

其实除了上述三种办法，我们还可以使用笛卡尔积来构造序列。如果换成笛卡尔连接的方式，那么构造2^20时，connect by也ok

lastwinner@lw> with a as (select rownum rn from dual connect by rownum<=power(2,10))
2 select count(*) from (select rownum rn from a, a);

COUNT(*)
----------
1048576

已用时间: 00: 00: 00.09

我们试着将1M加大到1G，在connect by方式下
lastwinner@lw> with a as (select rownum rn from dual connect by rownum<=power(2,10))
2 select count(*) from (select rownum rn from a, a, a);

COUNT(*)
----------
1073741824

已用时间: 00: 01: 07.37

耗时高达1分钟还多，再看看xmltable方式，考虑到1M的时候耗时就达到0.95秒，因此这里只测试1/16*1G，即64M的情况
lastwinner@lw> select count(*) from (select rownum rn from xmltable('1 to 67108864'));

COUNT(*)
----------
67108864

已用时间: 00: 00: 37.00

如果直接构造到1G，那么时间差不多是16*37s这个级别。但如果通过笛卡尔积＋xmltable的方式来构造

lastwinner@lw> with a as (select rownum rn from xmltable('1 to 1024'))
2 select count(*) from (select rownum rn from a, a, a);

COUNT(*)
----------
1073741824

已用时间: 00: 01: 07.95

这时间和connect by的差不多。以上测试，总的可见，在构造较大序列时，笛卡尔积的方式是最佳的，
单纯使用connect by会遭遇内存不足，而单独使用xmltable则会耗费较多的时间。

现在再看看基本用纯表连接的方式来构造同样大小的序列，先来1M的

lastwinner@lw> with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b,b,
3 b,b,b,b,b,
4 b,b,b,b,b,
5 b,b,b,b,b)
6 select count(*) from c;

COUNT(*)
----------
1048576

已用时间: 00: 00: 00.33

再来64M的
lastwinner@lw> ed
已写入 file afiedt.buf

1 with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b,b,
3 b,b,b,b,b,
4 b,b,b,b,b,
5 b,b,b,b,b,
6 b,b,b,b,b,b)
7* select count(*) from c
lastwinner@lw> /

COUNT(*)
----------
67108864

已用时间: 00: 00: 16.62

这个速度并不快，但已经比直接xmltable快了。
其实64M，即64*2^20可以表示为(2^5)^5*2，那我们来改写一下64M的sql
with b as (select 1 r from dual union all select 2 from dual),
c as (select rownum r from b,b,b,b,b),
d as (select rownum r from c,c,c,c,c,b)
select count(*) from d;

lastwinner@lw> with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b,b),
3 d as (select rownum r from c,c,c,c,c,b)
4 select count(*) from d;

COUNT(*)
----------
67108864

已用时间: 00: 00: 04.53

可以看到，从16s到4s，已经快了很多。这个示例告诉我们，中间表c 在提高速度方面起到了很好的作用。
但在构造到1G时，还是要慢一些
lastwinner@lw> ed
已写入 file afiedt.buf

1 with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b,b),
3 d as (select rownum r from c,c,c,c,c,c)
4* select count(*) from d
lastwinner@lw> /

COUNT(*)
----------
1073741824

已用时间: 00: 01: 11.48

尝试相对较快的写法，多一层中间表
lastwinner@lw> ed
已写入 file afiedt.buf

1 with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b),
3 d as (select rownum r from c,c,c),
4 e as (select rownum r from d,d,d,c)
5* select count(*) from e
lastwinner@lw> /

COUNT(*)
----------
1073741824

已用时间: 00: 01: 06.89

更快一点（思路，32^2=1024, 1G=2^30=(2^5)^6=((2^5)^2)^3 。）

lastwinner@lw> ed
已写入 file afiedt.buf

1 with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b,b),
3 d as (select rownum r from c,c),
4 e as (select rownum r from d,d,d)
5* select count(*) from e
lastwinner@lw> /

COUNT(*)
----------
1073741824

已用时间: 00: 01: 05.21

这时候我们将2^5＝32换成直接构造出来的方式
lastwinner@lw> ed
已写入 file afiedt.buf

1 with b as (select rownum r from dual connect by rownum<=power(2,5)),
2 c as (select rownum r from b,b),
3 d as (select rownum r from c,c,c)
4* select count(*) from d
lastwinner@lw> /

COUNT(*)
----------
1073741824

已用时间: 00: 01: 05.07

可见所耗费的时间差不多。
由此我们还可以得出，表连接的代价其实也是昂贵的，适当的减少表连接的次数，适当的使用with里的中间表，能有效提高系统性能。
再重复一下刚才构造64M（2^26）的场景
lastwinner@lw> ed
已写入 file afiedt.buf

COUNT(*)
----------
67108864

已用时间: 00: 00: 16.62

总共25次的表连接，1层嵌套，让速度非常慢。提高一下（26=4*3*2+2*2），总共8次表连接，3层嵌套。
with b as (select 1 r from dual union all select 2 from dual),
c as (select rownum r from b,b,b,b),
d as (select rownum r from c,c,c),
e as (select rownum r from d,d,b,b)
select count(*) from e;

lastwinner@lw> ed
已写入 file afiedt.buf

1 with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b),
3 d as (select rownum r from c,c,c),
4 e as (select rownum r from d,d,b,b)
5* select count(*) from e
lastwinner@lw> /

COUNT(*)
----------
67108864

已用时间: 00: 00: 04.00

效率提升4倍。要注意在这个案例中并非表连接越少越好，嵌套层数也是需要关注的指标。执行计划有兴趣的同学自己去看吧，我就不列了，上例中，系统生成的中间表有3个。

最终结论，构造较大序列时，例如同样是构造出64M的序列，oracle在处理时，用表连接的方式明显占优。但考虑到书写的便利性，因此在构造较小序列的时候，比如不超过1K的序列，那么直接用connect by或xmltable的方式就好了。

--一个用于构造序列的函数，来自asktom

SQL> select count(*) from table(generator(67108864));

COUNT(*)
----------
67108864

Elapsed: 00:00:06.68

SQL> with b as (select 1 r from dual union all select 2 from dual),
2 c as (select rownum r from b,b,b,b),
3 d as (select rownum r from c,c,c),
4 e as (select rownum r from d,d,b,b)
5 select count(*) from e;

COUNT(*)
----------
67108864

Elapsed: 00:00:06.32

WITH
E1(N) AS (
            SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL UNION ALL
            SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL UNION ALL
            SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL
           ),                          -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^4 or 10,000 rows
E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
SELECT COUNT(*) FROM E8
;

COUNT(*)
----------
100000000

Elapsed: 00:00:07.05

select count(*) from table(generator(100000000));

COUNT(*)
----------
100000000

Elapsed: 00:00:10.88

posted @ 2014-02-16 20:17 princessd8251 阅读(459) 评论(0) 收藏举报

刷新页面返回顶部