分区智能连接
KingbaseES 有两个参数 enable_partitionwise_join , enable_partitionwise_aggregate 用于分区智能连接和聚集,默认这两个参数是 off 的。开启这两个参数,对于采用hash join 或 hash aggregate 的执行方式有性能提升。
我们来看以下例子:
1、准备数据
drop table t1; create table t1(id1 integer,name1 text) partition by hash(id1); create table t1_p1 partition of t1 for values with (MODULUS 4, REMAINDER 0); create table t1_p2 partition of t1 for values with (MODULUS 4, REMAINDER 1); create table t1_p3 partition of t1 for values with (MODULUS 4, REMAINDER 2); create table t1_p4 partition of t1 for values with (MODULUS 4, REMAINDER 3); insert into t1 select generate_series(1,10000000),'abc'||generate_series(1,10000000); drop table t2; create table t2(id2 integer,name1 text) partition by hash(id2); create table t2_p1 partition of t2 for values with (MODULUS 4, REMAINDER 0); create table t2_p2 partition of t2 for values with (MODULUS 4, REMAINDER 1); create table t2_p3 partition of t2 for values with (MODULUS 4, REMAINDER 2); create table t2_p4 partition of t2 for values with (MODULUS 4, REMAINDER 3); insert into t2 select generate_series(1,3000000),'abc'||generate_series(1,3000000);
2、查看执行计划
不使用智能连接:把所有分区的数据取回,以整个表为单位进行hash 及 hash join。因为hash操作是整表,需要的内存更大。下例红色部分 Batches : 8,说明hash 的结果被临时存放在。
test=# show enable_partitionwise_join ; enable_partitionwise_join --------------------------- off (1 row) test=# explain analyze select count(*) from t1,t2 where id1=id2; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=234428.54..234428.55 rows=1 width=8) (actual time=16961.762..17036.424 rows=1 loops=1) -> Gather (cost=234428.32..234428.53 rows=2 width=8) (actual time=16953.417..17036.393 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=233428.32..233428.33 rows=1 width=8) (actual time=16948.908..16949.002 rows=1 loops=3) -> Parallel Hash Join (cost=55477.00..230303.32 rows=1250000 width=0) (actual time=2828.860..16859.289 rows=1000000 loops=3) Hash Cond: (t1_p4.id1 = t2_p4.id2) -> Parallel Append (cost=0.00..116556.00 rows=4166666 width=4) (actual time=0.030..781.854 rows=3333333 loops=3) -> Parallel Seq Scan on t1_p4 (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.049..413.020 rows=2501606 loops=1) -> Parallel Seq Scan on t1_p2 (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.028..415.877 rows=2500388 loops=1) -> Parallel Seq Scan on t1_p1 (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.027..138.329 rows=833080 loops=3) -> Parallel Seq Scan on t1_p3 (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.012..391.160 rows=2498765 loops=1) -> Parallel Hash (cost=34969.00..34969.00 rows=1250000 width=4) (actual time=467.790..467.792 rows=1000000 loops=3) Buckets: 1048576 Batches: 8 Memory Usage: 22880kB -> Parallel Append (cost=0.00..34969.00 rows=1250000 width=4) (actual time=0.046..239.138 rows=1000000 loops=3) -> Parallel Seq Scan on t2_p4 (cost=0.00..7190.62 rows=312962 width=4) (actual time=0.097..128.951 rows=751110 loops=1) -> Parallel Seq Scan on t2_p2 (cost=0.00..7188.90 rows=312890 width=4) (actual time=0.026..138.163 rows=750935 loops=1) -> Parallel Seq Scan on t2_p1 (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.009..41.807 rows=249665 loops=3) -> Parallel Seq Scan on t2_p3 (cost=0.00..7169.66 rows=312066 width=4) (actual time=0.010..115.019 rows=748959 loops=1) Planning Time: 0.211 ms Execution Time: 17036.510 ms (21 rows)
使用智能连接:因为两张表的分区模式是一样的(分区类型、分区列的数据类型、分区数量),能够保证相同的数据在同一分区。因此,只要对应的分区进行连接就可以。由于单个分区的数据量更少,更有可能使用内存排序。本例红色部分batches : 1,也就是所有的hash 的结果都在内存。
test=# show enable_partitionwise_join ; enable_partitionwise_join --------------------------- on (1 row) test=# explain analyze select count(*) from t1,t2 where id1=id2; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=171275.21..171275.22 rows=1 width=8) (actual time=2432.577..2446.396 rows=1 loops=1) -> Gather (cost=171274.99..171275.20 rows=2 width=8) (actual time=2429.590..2446.387 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=170274.99..170275.00 rows=1 width=8) (actual time=2423.596..2423.618 rows=1 loops=3) -> Parallel Append (cost=11100.02..167149.99 rows=1250000 width=0) (actual time=377.736..2351.993 rows=1000000 loops=3) -> Parallel Hash Join (cost=11102.65..40261.78 rows=312962 width=0) (actual time=393.036..1741.121 rows=751110 loops=1) Hash Cond: (t1_p4.id1 = t2_p4.id2) -> Parallel Seq Scan on t1_p4 (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.044..406.850 rows=2501606 loops=1) -> Parallel Hash (cost=7190.62..7190.62 rows=312962 width=4) (actual time=387.994..387.995 rows=751110 loops=1) Buckets: 1048576 Batches: 1 Memory Usage: 37600kB -> Parallel Seq Scan on t2_p4 (cost=0.00..7190.62 rows=312962 width=4) (actual time=0.064..140.704 rows=751110 loops=1) -> Parallel Hash Join (cost=11100.02..40244.87 rows=312890 width=0) (actual time=381.391..1729.623 rows=750935 loops=1) Hash Cond: (t1_p2.id1 = t2_p2.id2) -> Parallel Seq Scan on t1_p2 (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.046..408.713 rows=2500388 loops=1) -> Parallel Hash (cost=7188.90..7188.90 rows=312890 width=4) (actual time=375.787..375.787 rows=750935 loops=1) Buckets: 1048576 Batches: 1 Memory Usage: 37600kB -> Parallel Seq Scan on t2_p2 (cost=0.00..7188.90 rows=312890 width=4) (actual time=0.054..136.383 rows=750935 loops=1) -> Parallel Hash Join (cost=11070.84..40199.75 rows=312082 width=0) (actual time=120.170..558.199 rows=249665 loops=3) Hash Cond: (t1_p1.id1 = t2_p1.id2) -> Parallel Seq Scan on t1_p1 (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.037..139.094 rows=833080 loops=3) -> Parallel Hash (cost=7169.82..7169.82 rows=312082 width=4) (actual time=117.353..117.369 rows=249665 loops=3) Buckets: 1048576 Batches: 1 Memory Usage: 37536kB -> Parallel Seq Scan on t2_p1 (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.034..41.733 rows=249665 loops=3) -> Parallel Hash Join (cost=11070.49..40193.60 rows=312066 width=0) (actual time=358.778..1698.493 rows=748959 loops=1) Hash Cond: (t1_p3.id1 = t2_p3.id2) -> Parallel Seq Scan on t1_p3 (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.015..406.430 rows=2498765 loops=1) -> Parallel Hash (cost=7169.66..7169.66 rows=312066 width=4) (actual time=350.709..350.709 rows=748959 loops=1) Buckets: 1048576 Batches: 1 Memory Usage: 37504kB -> Parallel Seq Scan on t2_p3 (cost=0.00..7169.66 rows=312066 width=4) (actual time=0.020..113.071 rows=748959 loops=1) Planning Time: 0.201 ms Execution Time: 2446.453 ms (32 rows)
3、注意事项
- 关联表的hash分区模式必须完全相同:列类型、长度必须相同;hash 分区数量必须相同;
- 对于hash 分区表,分区数量越多,每个分区数据量越少,在连接时,越可能采用内存hash。
KINGBASE研究院