postgresql/lightdb vacuum对性能的影响及彻底理解表膨胀

先来看测试结果
zjh@postgres=# create table big_table(id int,v text); CREATE TABLE zjh@postgres=# ALTER TABLE big_table SET (autovacuum_enabled = off); # 关闭自动vacuum ALTER TABLE zjh@postgres=# insert into big_table select id, rpad('x',64,'x') from generate_series(1,1000000) id; INSERT 0 1000000 zjh@postgres=# zjh@postgres=# \timing on Timing is on. zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 76.603 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 60.682 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 60.963 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 60.632 ms ### 稳定的执行时间 zjh@postgres=# update big_table set v = v, id = id; #生成50%死元祖 UPDATE 1000000 Time: 1006.034 ms (00:01.006) zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 307.994 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 75.222 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 115.800 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 109.309 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 76.994 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 76.219 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 75.804 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 75.834 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 76.684 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 76.299 ms ## 稳定后带死元祖的执行时间,25%以上的额外时延 zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 1923.425 ms (00:01.923) zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 352.238 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 103.585 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 211.861 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 144.573 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 99.129 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 100.284 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 99.148 ms ## 两次update后,额外负载65% zjh@postgres=# vacuum big_table ; VACUUM Time: 214.800 ms zjh@postgres=# zjh@postgres=# vacuum big_table ; VACUUM Time: 11.348 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 88.478 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 85.893 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 87.403 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 85.340 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 85.990 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 86.514 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 85.684 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 84.336 ms # vacuum之后,仍然还有40%的额外负载 zjh@postgres=# vacuum full big_table ; # full vacuum VACUUM Time: 416.220 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 66.535 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 63.514 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 63.039 ms ## 重组织后,负载基本消除,虽然也有一点点,但基本可以忽略不计了。 zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 1116.814 ms (00:01.117) zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 2250.193 ms (00:02.250) zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 1264.835 ms (00:01.265) zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 1266.069 ms (00:01.266) zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 2000.205 ms (00:02.000) zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 384.475 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 174.367 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 364.749 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 178.216 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 176.623 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 170.568 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 170.408 ms zjh@postgres=# zjh@postgres=# zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 172.460 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 171.467 ms ## 五次update之后 zjh@postgres=# update big_table set v = v, id = id; UPDATE 1000000 Time: 1114.980 ms (00:01.115) zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 455.640 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 197.581 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 335.761 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 212.413 ms zjh@postgres=# select count(1) from big_table ; count --------- 1000000 (1 row) Time: 195.236 ms ## 六次update之后

zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
pg_size_pretty
----------------
675 MB    -- 可见膨胀是非常厉害的。
(1 row)

zjh@postgres=# vacuum big_table ;
VACUUM
Time: 662.148 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 675 MB     #  不回收,但是新的DML可以复用
(1 row)

Time: 0.406 ms
zjh@postgres=# vacuum full big_table ;
VACUUM
Time: 568.773 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 96 MB            # 回收,其实就是重组织
(1 row)
zjh@postgres=# vacuum big_table ;
VACUUM
Time: 40.442 ms
zjh@postgres=# 
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.331 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 98.444 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 105.852 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 104.016 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.273 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.985 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.612 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.837 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 107.296 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 105.588 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.347 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 103.695 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 104.665 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.016 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.310 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 104.338 ms    # 此时所有死元祖回收的空间已经用完。
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 212 MB    # 即使大量的死元祖,也会导致空间膨胀
(1 row)

Time: 0.446 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 109.384 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 222 MB    # 即使大量的死元祖,也会导致空间膨胀
(1 row) Time: 0.266 ms

  从上可知,因为未清理的死元祖会占用额外空间、也会导致vm未被置为可见,所以不仅I/O更高、占用的buffer更多、也需要额外判断mvcc是否满足,进而对性能的影响还是比较大。只读特性确保避免意外操作导致表被更新,进而产生额外不必要的死元祖、vaccum活动以及vm被误干扰。

  所以,对于大表做大量的update/delete(能truncate或insert select,就不要update/delete)之后,最好执行一次vacuum full xxx,释放空间。

postgresql中更新操作的物理实现

  8.2版本引入了HOT特性,极大地提高了索引更新的性能。

 

 

   8.2及之后,支持HOT,索引条目未更新不会插入新的,如下:

  

 

   同时也会不定期的做碎片清理,如下:

  

 

   所以在10+新版中,CPU最多多两次指针访问+空间浪费,一段时间后,只会多一次指针访问+0空间浪费。相比这一次指针访问,索引叶子节点中定位索引键就好几次比较了,所以几乎可以忽略不计。

  最后,vacuum除了能够解决清理死元组消除表膨胀外,冻结事务ID也是其职责,这影响到事务回卷,可参见 postgresql中的事务回卷原理及预防措施

  具体可参考下https://zhuanlan.zhihu.com/p/379706959

posted @ 2022-10-23 12:23  zhjh256  阅读(228)  评论(0编辑  收藏  举报