What does PostgreSQL cache on Linux?
2021-06-10 10:11 abce

The basic metrics Linux provides are easy to misread: many people look at them without knowing what is actually being measured.
Download, build, and install PostgreSQL 11.12 from source:

wget https://ftp.postgresql.org/pub/source/v11.12/postgresql-11.12.tar.gz
tar xzf postgresql-11.12.tar.gz
cd postgresql-11.12
./configure --enable-debug
make
make install
adduser postgres
mkdir /usr/local/pgsql/data
chown postgres /usr/local/pgsql/data
su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
Download and install linux-ftools. The fincore tool it provides shows how files are mapped into the filesystem cache.
wget https://download.opensuse.org/repositories/home:/bmanojlovic/RHEL_7/x86_64/linux-ftools-1.3-5.1.x86_64.rpm
rpm -ivh linux-ftools-1.3-5.1.x86_64.rpm
The pg_prewarm extension can quickly load tables into the cache.
cd contrib
make && make install
/usr/local/pgsql/bin/psql postgres <<< "create extension pg_prewarm;"
Generate the test data with pgbench, scale factor 64:
$ /usr/local/pgsql/bin/pgbench -i -s 64 postgres
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data...
100000 of 6400000 tuples (1%) done (elapsed 0.08 s, remaining 5.25 s)
200000 of 6400000 tuples (3%) done (elapsed 0.22 s, remaining 6.70 s)
300000 of 6400000 tuples (4%) done (elapsed 0.36 s, remaining 7.22 s)
...
6200000 of 6400000 tuples (96%) done (elapsed 7.42 s, remaining 0.24 s)
6300000 of 6400000 tuples (98%) done (elapsed 7.54 s, remaining 0.12 s)
6400000 of 6400000 tuples (100%) done (elapsed 7.69 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done.
That is about 1.8GB:
$ du -sh data/
1.8G    data/
$ grep ^shared data/postgresql.conf
shared_buffers = 2048MB
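As a quick sanity check, the 1.8GB dataset should fit entirely in the 2048MB of shared_buffers. A back-of-the-envelope sketch using the sizes shown above:

```python
# Check whether the pgbench data directory fits in shared_buffers.
# Both sizes come from the du/postgresql.conf output above.
data_size_mb = 1.8 * 1024       # ~1.8GB data directory, in MB
shared_buffers_mb = 2048        # shared_buffers = 2048MB

fits = data_size_mb <= shared_buffers_mb
print(fits)  # the whole dataset can live in shared buffers
```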
For the experiment, I rebooted the operating system here.
$ free -mw
       total  used   free  shared  buffers  cache  available
Mem:   15885   478  14927       9        2    477      15230
Swap:  10239     0  10239
The free command shows all free and used physical memory and swap, including the buffers and caches used by the kernel. The information comes from /proc/meminfo, where:
·total: all installed memory (MemTotal and SwapTotal in /proc/meminfo)
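free reads these values from /proc/meminfo, which is a simple "Key: value kB" text file. A minimal parsing sketch; the sample text is back-derived from the free -m output above (free rounds kB down to MB), so treat the exact values as illustrative:

```python
# Parse /proc/meminfo-style text into a dict of kB values.
# Sample values are back-derived from the free -mw output above.
sample = """\
MemTotal:       16266240 kB
MemFree:        15285248 kB
Cached:           488448 kB
SwapTotal:      10485760 kB
"""

def parse_meminfo(text):
    info = {}
    for line in text.splitlines():
        key, rest = line.split(":", 1)
        info[key] = int(rest.strip().split()[0])  # value in kB
    return info

mem = parse_meminfo(sample)
print(mem["MemTotal"] // 1024)  # total memory in MB, as free -m shows it
```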
After the reboot, the system uses almost no shared memory: 477MB is cached and 14927MB is free.
$ /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
waiting for server to start.... done
server started
$ free -mw
       total  used   free  shared  buffers  cache  available
Mem:   15885   480  14853      72        2    549      15165
Swap:  10239     0  10239
After starting the PostgreSQL instance, almost nothing changes, because Linux allocates shared memory lazily: shared buffers are only allocated when they are actually used, so usage grows just a little here.
$ /usr/local/pgsql/bin/psql postgres <<< "select pg_size_pretty(current_setting('block_size')::int*pg_prewarm('pgbench_accounts','buffer'));"
 pg_size_pretty
----------------
 820 MB
(1 row)

$ free -mw
       total  used   free  shared  buffers  cache  available
Mem:   15885   479  13207     892        2   2196      14344
Swap:  10239     0  10239
$
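The query multiplies the block count returned by pg_prewarm by block_size (8kB by default) and pretty-prints the result. A sketch of that arithmetic; the pretty-printer below is a simplified stand-in for PostgreSQL's pg_size_pretty (not its exact implementation), and the block count is inferred from the 820MB result:

```python
# pg_prewarm returns the number of blocks it read into shared buffers;
# the query turns that into bytes and pretty-prints it.
BLOCK_SIZE = 8192  # default PostgreSQL block_size

def size_pretty(num_bytes):
    # Simplified stand-in for pg_size_pretty: report whole MB.
    return f"{round(num_bytes / 1024 / 1024)} MB"

blocks = 104919  # illustrative: blocks in pgbench_accounts, inferred from 820MB
print(size_pretty(blocks * BLOCK_SIZE))
```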
PostgreSQL allocates a shared memory segment in userspace to manage concurrent writes and frequently read pages. But unlike most RDBMSs, it does no further optimization on top of that, such as read prefetching or write reordering; it relies on the kernel instead. This means you must leave enough room for the filesystem cache: frequently read data that is not held in shared_buffers must be cached by the filesystem. This is PostgreSQL's double buffering, and both caches are counted in the "cache" column of free.
To predict performance you must understand the filesystem cache, and linux-ftools can help with that. Here is the cached-file information reported by linux-fincore:
$ linux-fincore -s $(find /usr/local/pgsql/data/base -type f) | grep ^total
total cached size: 860,930,048
That is 821MB. Unfortunately, you have to pass an explicit list of files to inspect the details, so only the total is shown above.
$ linux-fincore -s -S 1048576 $(find /usr/local/pgsql/data/base -type f)
filename                                 size         total_pages  min_cached_page  cached_pages  cached_size  cached_perc
--------                                 ----         -----------  ---------------  ------------  -----------  -----------
/usr/local/pgsql/data/base/13287/16400   859,496,448  209,838      0                209,838       859,496,448  100.00
/usr/local/pgsql/data/base/13287/16408   143,777,792  35,102       -1               0             0            0.00
---
total cached size: 859,496,448
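fincore counts in 4kB memory pages, so cached_size is simply cached_pages multiplied by the page size. Checking that against the first row above:

```python
# fincore's cached_size column is cached_pages times the 4kB page size.
PAGE_SIZE = 4096

cached_pages = 209838               # from the fincore row for relfilenode 16400
cached_size = cached_pages * PAGE_SIZE
print(f"{cached_size:,}")           # matches the reported cached_size
```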
You can also use the pg_buffercache extension, combined with linux-fincore, to quickly get per-relation sizes in shared buffers and in the filesystem cache:
$ /usr/local/pgsql/bin/psql postgres <<< "select relname,relkind,current_setting('block_size')::int*relpages/1024/1024 as size_MB,current_setting('block_size')::int*buffers/1024/1024 as shared_mb,relfilenode,current_setting('data_directory')||'/'||pg_relation_filepath(c.oid) as file_path from pg_class c left outer join (select relfilenode, count(*) as buffers from pg_buffercache group by relfilenode) b using(relfilenode) where relpages>100;" | awk '/[/]base[/]/{"linux-fincore -s "$NF"* | grep ^total | cut -d: -f2" | getline f;printf "%6dM %s\n",gensub(/,/,"","g",f)/1024/1024,$0;next}{printf "%6s %s\n","",$0}'
              relname         | relkind | size_mb | shared_mb | relfilenode |                file_path
       -----------------------+---------+---------+-----------+-------------+----------------------------------------
  819M pgbench_accounts       | r       |     819 |       819 |       16400 | /usr/local/pgsql/data/base/13287/16400
    0M pgbench_accounts_pkey  | i       |     137 |           |       16408 | /usr/local/pgsql/data/base/13287/16408
       (2 rows)
$
Flush the caches (writing 3 to /proc/sys/vm/drop_caches drops the page cache as well as dentries and inodes):
# echo 3 > /proc/sys/vm/drop_caches
# free -mw
       total  used   free  shared  buffers  cache  available
Mem:   15885   465  14322     894        0   1096      14362
Swap:  10239     0  10239
The shared buffers are still retained.
$ /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile stop
waiting for server to shut down.... done
server stopped
$ free -mw
       total  used   free  shared  buffers  cache  available
Mem:   15885   463  15210       9        0    211      15251
Swap:  10239     0  10239
$
If you are still wondering what the remaining 211MB of cache contains:
$ find / -type f -exec linux-fincore {} \; 2>/dev/null | awk '/^[/]/{gsub(/,/,"");m=$(NF-1)/1024/1024;gsub(/ */," ");if(m>1)printf "%10d MB %s\n",m,$0}' | sort -h | tee all.log | sort -n | tail -10
         2 MB /usr/lib64/dri/vmwgfx_dri.so 11038656 2695 0 708 2899968 26.27
         2 MB /usr/lib64/gnome-shell/libgnome-shell.so 2844392 695 0 625 2560000 89.93
         2 MB /usr/lib64/libc-2.17.so 2156352 527 0 527 2158592 100.00
         2 MB /usr/lib64/libgtk-3.so.0.2200.26 7503704 1832 0 592 2424832 32.31
         6 MB /usr/lib64/gjs/libmozjs-52.so 24401928 5958 0 1646 6742016 27.63
         6 MB /usr/local/pgsql/bin/postgres 33821416 8258 0 1559 6385664 18.88
         8 MB /usr/lib/locale/locale-archive 106176928 25923 0 2048 8388608 7.90
        12 MB /usr/lib64/libLLVM-5.0-rhel.so 37261600 9098 0 3181 13029376 34.96
        13 MB /usr/bin/containerd 49102360 11988 0 3516 14401536 29.33
        25 MB /usr/bin/dockerd 105055544 25649 0 6503 26636288 25.35
This takes a while to run, but it shows exactly which files are cached. At this point, the meaning of the cache statistics should be clear.
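The awk filter above just strips commas from fincore's cached_size column and converts bytes to whole MB. The same post-processing step in Python, using one output line copied from above:

```python
# Replicate the awk post-processing: take the cached_size column
# (second-to-last field) and convert bytes to whole MB.
line = "/usr/local/pgsql/bin/postgres 33821416 8258 0 1559 6385664 18.88"
fields = line.split()
cached_bytes = int(fields[-2].replace(",", ""))  # cached_size in bytes
print(cached_bytes // (1024 * 1024))             # whole MB, as awk's %d prints
```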
I also added a performance metric, tps (transactions per second). Running this select-only workload within shared_buffers shows stable performance, about 13,500 tps on a single thread.
When some pages have to be copied from the filesystem cache, performance degrades slowly. Once the working set exceeds physical RAM, physical reads become involved and tps drops sharply. With 14GB of RAM here, that happens around scale 570: besides the 4GB of shared_buffers, about 8GB remains for frequently read data.
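To relate the scale factor to data size: scale 64 produced a ~819MB accounts table, i.e. roughly 12.8MB per scale unit, which lets you estimate where the hot data outgrows RAM. A rough sketch (the per-scale figure is derived from the measurements above, not an exact pgbench constant):

```python
# Estimate pgbench_accounts size at a given scale factor,
# using the ~819MB measured at scale 64 above.
mb_per_scale = 819 / 64          # ~12.8MB of accounts data per scale unit

accounts_mb = 570 * mb_per_scale # accounts table size at scale 570
print(round(accounts_mb))        # roughly 7.3GB of hot table data
```

With double buffering, hot pages are held both in shared_buffers and in the filesystem cache, so a working set of this size (plus the index) is what pushes the 14GB machine past the point where reads stay in memory.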