PostgreSQL的大对象以及空间使用 (2)
2020-10-11 10:22 abce 阅读(489) 评论(1) 编辑 收藏 举报在上一篇文章中,展示了插入大对象后,pg_largeobject表中有多少大对象使用空间。
让我们再深入研究一下:
该表有2个大对象(总共1024个记录):
1 2 3 4 5 | lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 1441792 (1 row) |
再来添加一个随机填充的文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | lob_test=# \lo_import '/tmp/randoms' ; lo_import 16493 lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 2842624 (1 row) lob_test=# select oid, * from pg_largeobject_metadata; oid | lomowner | lomacl -------+----------+-------- 16491 | 10 | 16492 | 10 | 16493 | 10 | (3 rows ) |
如果您阅读了Frits Hoogland(https://fritshoogland.wordpress.com/category/postgresql/)撰写的一系列不错的博客文章,则应该了解pageinspect扩展名和t_infomask 16位掩码。
让我们安装它并检查pg_largeobjects页面的内容:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | lob_test=# select * from page_header(get_raw_page( 'pg_largeobject' ,0)); lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid -------------+----------+-------+-------+-------+---------+----------+---------+----------- 18/38004C10 | 0 | 0 | 452 | 488 | 8192 | 8192 | 4 | 0 (1 row) -- same result (lower 452, upper 488) for blocks 1...3 lob_test=# select * from page_header(get_raw_page( 'pg_largeobject' ,4)); lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid -------------+----------+-------+-------+-------+---------+----------+---------+----------- 18/380179F8 | 0 | 0 | 360 | 2144 | 8192 | 8192 | 4 | 0 (1 row) lob_test=# select * from page_header(get_raw_page( 'pg_largeobject' ,5)); lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid -------------+----------+-------+-------+-------+---------+----------+---------+----------- 18/381386E0 | 0 | 0 | 36 | 1928 | 8192 | 8192 | 4 | 0 (1 row) -- same result for the remaining blocks |
已经知道的一些数字,但是我们更宁愿将所有部分放在一起分析。我们知道:page header占用24个字节,并且行指针为每个元组使用4个字节。
前4个页的偏移量较低,为452个字节,这意味着我们有(452-24)/ 4 = 107个元组。
第5页(page number是4)的lower为360:(360-24)/ 4 = 84元组。
让我们检查是否正确:
1 2 3 4 5 6 7 8 9 10 11 12 | lob_test=# select generate_series as page, ( select count (*) from heap_page_items(get_raw_page( 'pg_largeobject' ,generate_series))) as tuples from generate_series(0,5); page | tuples ------+-------- 0 | 107 1 | 107 2 | 107 3 | 107 4 | 84 5 | 3 (6 rows ) |
现在,让我们删除那1Mb的文件,并再次检查空间:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | lob_test=# \lo_unlink 16492 lo_unlink 16492 lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 2842624 (1 row) lob_test=# select oid, * from pg_largeobject_metadata; oid | lomowner | lomacl -------+----------+-------- 16491 | 10 | 16493 | 10 | (2 rows ) lob_test=# select generate_series as pageno, ( select count (*) from heap_page_items(get_raw_page( 'pg_largeobject' ,generate_series)) ) from generate_series(0,12); pageno | count --------+------- 0 | 107 1 | 107 2 | 107 3 | 107 4 | 84 5 | 3 6 | 3 7 | 3 8 | 3 9 | 3 10 | 3 11 | 3 12 | 3 |
1 2 3 | […] a tuple is the latest version of its row iff XMAX is invalid or t_ctid points to itself ( in which case , if XMAX is valid, the tuple is either locked or deleted). […] ( from htup_details.h lines 87-89). |
我们必须对第12位(2048,或0x0800)检查信息进行检查。
1 | #define HEAP_XMAX_INVALID 0x0800 /* t_xmax invalid/aborted */ |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | lob_test=# select generate_series as pageno, ( select count (*) from heap_page_items(get_raw_page( 'pg_largeobject' ,generate_series)) where t_infomask:: bit (16) & x '0800' :: bit (16) = x '0800' :: bit (16)) from generate_series(0,12); pageno | count --------+------- 0 | 107 1 | 107 2 | 107 3 | 107 4 | 84 5 | 0 6 | 0 7 | 0 8 | 0 9 | 0 10 | 0 11 | 0 12 | 0 |
大对象被分割成压缩的chunk,其内部行为与常规行相同!
1 2 3 4 5 6 7 | lob_test=# \lo_import '/tmp/randoms' ; lo_import 16520 lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 4235264 (1 row) |
将元组标记为可重用是vacuum的工作:
1 2 3 4 5 6 7 8 | lob_test=# vacuum pg_largeobject; VACUUM lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 4235264 (1 row) |
常规的vacuum不会释放空出空间,但空间现在可以重复使用:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | lob_test=# select generate_series as pageno, ( select count (*) from heap_page_items(get_raw_page( 'pg_largeobject' ,generate_series)) where t_infomask:: bit (16) & x '0800' :: bit (16) = x '0800' :: bit (16)) from generate_series(0,12); pageno | count --------+------- 0 | 107 1 | 107 2 | 107 3 | 107 4 | 84 5 | 0 6 | 0 7 | 0 8 | 0 9 | 0 10 | 0 11 | 0 12 | 0 lob_test=# \lo_import '/tmp/randoms' ; lo_import 16521 lob_test=# lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 4235264 (1 row) -- same size as before! lob_test=# select generate_series as pageno, ( select count (*) from heap_page_items(get_raw_page( 'pg_largeobject' ,generate_series)) where t_infomask:: bit (16) & x '0800' :: bit (16) = x '0800' :: bit (16)) from generate_series(0,12); pageno | count --------+------- 0 | 107 1 | 107 2 | 107 3 | 107 4 | 84 5 | 3 6 | 3 7 | 3 8 | 3 9 | 3 10 | 3 11 | 3 12 | 3 |
如果我们再次unlink 这个lob对象,执行full vacuum,空闲就被释放了:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | lob_test=# \lo_unlink 16521 lo_unlink 16521 lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 4235264 (1 row) lob_test=# vacuum full pg_largeobject; VACUUM lob_test=# select pg_relation_size( 'pg_largeobject' ); pg_relation_size ------------------ 2842624 (1 row) |
原文:
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· AI与.NET技术实操系列(二):开始使用ML.NET
· 记一次.NET内存居高不下排查解决与启示
· 探究高空视频全景AR技术的实现原理
· 理解Rust引用及其生命周期标识(上)
· 浏览器原生「磁吸」效果!Anchor Positioning 锚点定位神器解析
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· .NET10 - 预览版1新功能体验(一)