postgres 并行扫描--索引的正确打开方式
一,并行顺序扫描
01,了解顺序扫描
了解并行前应该先学下顺序扫描,也就是全表扫描,每次检索都是全表跑一边,这就导致大表扫描期间就会消耗大量的CPU ,内存 磁盘I/o,这个对数据库影响还是很大的. OLTP 事务型的数据库一般都要避免这个扫描方式
02,创建环境
CREATE TABLE test_bigl ( ID int4, NAME CHARACTER VARYING ( 32 ), create_time timestamp without TIME ZONE DEFAULT clock_TIMESTAMP ( ) ); --创建表结构 INSERT INTO test_bigl(id,name) SELECT n,n||'test' FROM generate_series(1,50000000) n; --插入百万级别的数据
03, 实操了解扫描与并行扫描
先来一个操作: kingledb=> EXPLAIN SELECT * FROM test_bigl WHERE name='test'; QUERY PLAN ------------------------------------------------------------------------------ Gather (cost=1000.00..619228.77 rows=1 width=24) Workers Planned: 2 -> Parallel Seq Scan on test_bigl (cost=0.00..618228.67 rows=1 width=24) Filter: ((name)::text = 'test'::text) (4 rows)
查看这个加深的 能了解到这个是顺序扫描
kingledb=> EXPLAIN ANALYZE SELECT * FROM test_bigl WHERE name='test'; QUERY PLAN -------------------------------------------------------------------------------------- ---------------------------------------- Gather (cost=1000.00..619228.77 rows=1 width=24) (actual time=2957.652..2961.262 row s=0 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on test_bigl (cost=0.00..618228.67 rows=1 width=24) (actual time=2951.512..2951.512 rows=0 loops=3) Filter: ((name)::text = 'test'::text) Rows Removed by Filter: 16666667 Planning time: 0.098 ms Execution time: 2961.299 ms (8 rows)
这个能查看到执行计划中两个增加的选项:
Workers Planned: 执行计划预估的并行进程数
Workers Launched: 查询实际获得的并行进程数
发现这两个都是为2
Parallel Seq Scan on test_bigl --这个代表这这个进行了并行的循序扫描
Planning time: 0.098 ms -->这个是sql生成执行计划的时间
Execution time: 2961.299 ms -->这个是sql实际执行时间
其实到这里你就发现我这个一直开启了并行度,现在我们关闭它
kingledb=> set max_parallel_workers_per_gather = 0; SET 0 代表关闭 其他正整数代表 开启的并行度
我们再次执行下
kingledb=> EXPLAIN ANALYZE SELECT * FROM test_bigl WHERE name='test'; QUERY PLAN -------------------------------------------------------------------------------------- --------------------------- Seq Scan on test_bigl (cost=0.00..982812.00 rows=1 width=24) (actual time=10028.226. .10028.226 rows=0 loops=1) Filter: ((name)::text = 'test'::text) Rows Removed by Filter: 50000000 Planning time: 0.237 ms Execution time: 10028.266 ms (5 rows) --可以发现上面加深颜色的都不存在了,而且执行时间慢了5倍!!!,相当恐怖
二,并行索引扫描
01,索引扫描
也就是在表上创建索引~~~
02,操作观察
--创建索引
kingledb=# CREATE INDEX index_test_bigl on kingle.test_bigl USING btree (id); CREATE INDEX kingledb=#
--不开启并行度查看索引扫描 kingledb=# EXPLAIN ANALYZE SELECT count(name) FROM kingle.test_bigl WHERE id <100000 ; QUERY PLAN -------------------------------------------------------------------------------------- -------------------------------------------------------- Aggregate (cost=3684.05..3684.06 rows=1 width=8) (actual time=54.445..54.445 rows=1 loops=1) -> Index Scan using index_test_bigl on test_bigl (cost=0.56..3442.86 rows=96474 w idth=12) (actual time=0.056..35.887 rows=99999 loops=1) Index Cond: (id < 100000) Planning time: 1.212 ms Execution time: 54.614 ms (5 rows)
--开启并行度扫描 kingledb=# EXPLAIN ANALYZE SELECT count(name) FROM kingle.test_bigl WHERE id <1000000 ; QUERY PLAN -------------------------------------------------------------------------------------- --------------------------------------------------------------------------------- Finalize Aggregate (cost=30200.04..30200.05 rows=1 width=8) (actual time=191.625..19 1.625 rows=1 loops=1) -> Gather (cost=30199.62..30200.03 rows=4 width=8) (actual time=191.585..195.111 rows=5 loops=1) Workers Planned: 4 Workers Launched: 4 -> Partial Aggregate (cost=29199.62..29199.63 rows=1 width=8) (actual time= 178.688..178.688 rows=1 loops=5) -> Parallel Index Scan using index_test_bigl on test_bigl (cost=0.56. .28564.97 rows=253860 width=12) (actual time=0.073..148.722 rows=200000 loops=5) Index Cond: (id < 1000000) Planning time: 0.204 ms Execution time: 195.215 ms (9 rows)
三,并行index-only扫描
kingledb=# EXPLAIN ANALYZE SELECT "count"(*) from kingle.test_bigl WHERE id <1000000; QUERY PLAN -------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------- Finalize Aggregate (cost=30200.04..30200.05 rows=1 width=8) (actual time=184.775..18 4.776 rows=1 loops=1) -> Gather (cost=30199.62..30200.03 rows=4 width=8) (actual time=181.043..184.904 rows=5 loops=1) Workers Planned: 4 Workers Launched: 4 -> Partial Aggregate (cost=29199.62..29199.63 rows=1 width=8) (actual time= 170.658..170.658 rows=1 loops=5) -> Parallel Index Only Scan using index_test_bigl on test_bigl (cost= 0.56..28564.97 rows=253860 width=0) (actual time=0.114..142.576 rows=200000 loops=5) Index Cond: (id < 1000000) Heap Fetches: 206058 Planning time: 0.340 ms Execution time: 185.024 ms (10 rows)
四,并行bitmap heap扫描
kingledb=# EXPLAIN ANALYZE SELECT count(*) from kingle.test_bigl WHERE id<1000000 or id >490000000 kingledb-# ; QUERY PLAN -------------------------------------------------------------------------------------- --------------------------------------------------------------------------- Finalize Aggregate (cost=382523.64..382523.65 rows=1 width=8) (actual time=209.846.. 209.847 rows=1 loops=1) -> Gather (cost=382523.22..382523.63 rows=4 width=8) (actual time=206.653..210.28 5 rows=5 loops=1) Workers Planned: 4 Workers Launched: 4 -> Partial Aggregate (cost=381523.22..381523.23 rows=1 width=8) (actual tim e=197.044..197.044 rows=1 loops=5) -> Parallel Bitmap Heap Scan on test_bigl (cost=19268.67..380888.57 r ows=253860 width=0) (actual time=85.352..170.740 rows=200000 loops=5) Recheck Cond: ((id < 1000000) OR (id > 490000000)) Heap Blocks: exact=1478 -> BitmapOr (cost=19268.67..19268.67 rows=1015441 width=0) (act ual time=91.773..91.773 rows=0 loops=1) -> Bitmap Index Scan on index_test_bigl (cost=0.00..18756 .37 rows=1015441 width=0) (actual time=91.749..91.749 rows=999999 loops=1) Index Cond: (id < 1000000) -> Bitmap Index Scan on index_test_bigl (cost=0.00..4.57 rows=1 width=0) (actual time=0.014..0.014 rows=0 loops=1) Index Cond: (id > 490000000) Planning time: 0.456 ms Execution time: 210.420 ms (15 rows) kingledb=#
人生就像一滴水,非要落下才后悔!
--kingle