巧用dblink 实现多进程并行查询

概述

对于分区表的大数据统计分析,由于数据量巨大,往往需要采用并行。但是数据库并行的效率相比分进程分表统计还是有比较大的差距。本文通过巧用dblink,实现分进程分分区统计数据。

例子

kingbase=# \d t751
                      分区表 "public.t751"
 栏位 |            类型             | 校对规则 |  可空的  | 预设
------+-----------------------------+----------+----------+------
 id   | bigint                      |          | not null |
 code | text                        |          |          |
 c1   | timestamp without time zone |          |          |
 c2   | text                        |          |          |
 c3   | numeric                     |          |          |
 c4   | integer                     |          |          |
 c5   | integer                     |          |          |
 c6   | integer                     |          |          |
 v2   | numeric                     |          |          |
 v3   | timestamp without time zone |          |          |
分区键值: HASH (c4, c5, c6)
分区的数量:1000(可以使用 \d+ 来列出它们)
表空间:"nvmtbs01"

kingbase=# select pg_size_pretty(sum(pg_relation_size(relid))) from pg_partition_tree('t751');
 pg_size_pretty
----------------
 20 GB

并行方式访问

kingbase=# explain analyze  select  c4, c5, c6, sum(v2) v2 from t751 group by c4, c5, c6;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=5656234.57..14706860.46 rows=8000000 width=44) (actual time=15764.781..19110.336 rows=11781 loops=1)
   Group Key: t751_872.c4, t751_872.c5, t751_872.c6
   ->  Gather Merge  (cost=5656234.57..13806860.46 rows=64000000 width=44) (actual time=15764.142..19103.239 rows=11902 loops=1)
         Workers Planned: 8
         Workers Launched: 7
         ->  Partial GroupAggregate  (cost=5655234.43..5911484.32 rows=8000000 width=44) (actual time=15478.852..18479.480 rows=1488 loops=8)
               Group Key: t751_872.c4, t751_872.c5, t751_872.c6
               ->  Sort  (cost=5655234.43..5686484.41 rows=12499991 width=22) (actual time=15478.239..17066.239 rows=12500000 loops=8)
                     Sort Key: t751_872.c4, t751_872.c5, t751_872.c6
                     Sort Method: quicksort  Memory: 1408511kB
                     Worker 0:  Sort Method: quicksort  Memory: 1371267kB
                     Worker 1:  Sort Method: quicksort  Memory: 1364497kB
                     Worker 2:  Sort Method: quicksort  Memory: 1374769kB
                     Worker 3:  Sort Method: quicksort  Memory: 1355381kB
                     Worker 4:  Sort Method: quicksort  Memory: 1352989kB
                     Worker 5:  Sort Method: quicksort  Memory: 1343974kB
                     Worker 6:  Sort Method: quicksort  Memory: 1386845kB
                     ->  Parallel Append  (cost=0.00..3156369.51 rows=12499991 width=22) (actual time=0.034..5045.596 rows=12500000 loops=8)
                           ->  Parallel Seq Scan on t751_872  (cost=0.00..5849.56 rows=81756 width=22) (actual time=0.025..60.070 rows=196215 loops=1)
                           ->  Parallel Seq Scan on t751_470  (cost=0.00..5753.10 rows=80410 width=22) (actual time=0.068..73.100 rows=192984 loops=1)
...
                           ->  Parallel Seq Scan on t751_416  (cost=0.00..556.80 rows=10380 width=22) (actual time=0.009..5.736 rows=17646 loops=1)
                           ->  Parallel Seq Scan on t751_885  (cost=0.00..315.77 rows=5877 width=22) (actual time=0.010..2.460 rows=9991 loops=1)
 Planning Time: 26.450 ms
 Execution Time: 19180.545 ms

  

注意:这里实际采用的是groupagg , 如果采用hashagg,执行效率会高很多

使用dblink方式

创建函数

create or replace function f_t_part(para int default 0)
    returns table
            (
                c4 int,
                c5 int,
                c6 int,
                v2 numeric
            )
    language plpgsql
    immutable
    parallel safe
as
$$
declare
    vtabnam text;
    atabnam text[];
begin
    perform dblink_disconnect(conn)
    from unnest(dblink_get_connections()) conn;

    perform dblink_connect('conn_' || sn, 'dbname=' || current_database())
    from generate_series(1, para) sn;

    if para <= 0 then
        for vtabnam in select relid from pg_partition_tree('t751')
            loop
                for c4,c5,c6,v2 in execute format('select c4, c5, c6, sum(v2) v2 from only %s group by c4, c5, c6 ;',vtabnam)
                    loop
                        return next ;
                    end loop;
            end loop;
    else
        for atabnam in select string_to_array(string_agg(relid, ','), ',') relid
                       from pg_partition_tree('t751') with ordinality
                               , lateral ( select ordinality / para sn)
                       where isleaf
                       group by sn
            loop

                perform dblink_is_busy(conn) = 0 from unnest(dblink_get_connections()) conn;

                perform conn,tabnam, dblink_send_query(conn,format('select c4, c5, c6, sum(v2) v2 from only %s group by c4, c5, c6 ;',tabnam)) sq
                from (select unnest(dblink_get_connections()) conn, unnest(atabnam) tabnam) v
                where tabnam is not null;

                for c4,c5,c6,v2 in
                    select tab.* from unnest(dblink_get_connections()) conn , dblink_get_result(conn) as tab(c4 int, c5 int, c6 int, v2 numeric)
                    loop
                        return next ;
                    end loop;
            end loop;
    end if;

    perform dblink_disconnect(conn) from unnest(dblink_get_connections()) conn;
    return;
end;
$$;					   

  

执行效率

ingbase=# explain analyze select * from ft751_01();
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Function Scan on ft751_01  (cost=0.25..10.25 rows=1000 width=44) (actual time=32738.962..32739.288 rows=11781 loops=1)
 Planning Time: 0.055 ms
 Execution Time: 32741.655 ms
(3 行记录)

kingbase=# explain analyze ^Jselect * from ft751_01(16);
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Function Scan on ft751_01  (cost=0.25..10.25 rows=1000 width=44) (actual time=5397.644..5398.189 rows=11781 loops=1)
 Planning Time: 0.044 ms
 Execution Time: 5398.887 ms

  

结论

实际验证,二者性能上差距并不大。用dblink 方式主要是可以更灵活的使用并行数量。这里的目的主要是向大家提供灵活使用并行的一种方法。

 

posted @ 2023-02-28 09:59  KINGBASE研究院  阅读(100)  评论(0编辑  收藏  举报