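Analyzing PostgreSQL OOM, part 2: cgroup
OS: CentOS 7.4
PostgreSQL: 10.4
This continues the previous blog post: Analyzing PostgreSQL OOM, part 1
This post uses cgroups to control OS memory. First, a brief introduction to cgroups:
CGroups is a unified framework for managing and controlling process resources. It provides only the mechanism; the concrete policy is carried out by subsystems.
Separating mechanism from policy is a classic design idea in Linux: the mechanism answers "what capability is provided", while the policy answers "how that capability is used".
A subsystem is the component through which CGroups actually applies resource control to a group of processes.
Installation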
# yum install libcgroup libcgroup-devel libcgroup-tools
# systemctl status cgconfig.service
# cat /usr/lib/systemd/system/cgconfig.service
[Unit]
Description=Control Group configuration service
# The service should be able to start as soon as possible,
# before any 'normal' services:
DefaultDependencies=no
Conflicts=shutdown.target
Before=basic.target shutdown.target
[Service]
Type=oneshot
RemainAfterExit=yes
Delegate=yes
ExecStart=/usr/sbin/cgconfigparser -l /etc/cgconfig.conf -L /etc/cgconfig.d -s 1664
ExecStop=/usr/sbin/cgclear -l /etc/cgconfig.conf -L /etc/cgconfig.d -e
[Install]
WantedBy=sysinit.target
# systemctl start cgconfig.service
# ls -l /etc |grep -i cg
-rw-r--r-- 1 root root 676 Apr 11 10:33 cgconfig.conf
drwxr-xr-x 2 root root 6 Apr 11 10:33 cgconfig.d
-rw-r--r-- 1 root root 234 Apr 11 10:33 cgrules.conf
-rw-r--r-- 1 root root 131 Apr 11 10:33 cgsnapshot_blacklist.conf
# df -hT
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/centos-root xfs 47G 23G 25G 48% /
devtmpfs devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs tmpfs 2.0G 8.0K 2.0G 1% /dev/shm
tmpfs tmpfs 2.0G 8.9M 2.0G 1% /run
tmpfs tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/sda1 xfs 1014M 160M 855M 16% /boot
tmpfs tmpfs 396M 4.0K 396M 1% /run/user/42
tmpfs tmpfs 396M 28K 396M 1% /run/user/0
Note the cgroup mount point:
tmpfs tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
Configuration
Resources that cgroups can control
# cat /proc/cgroups
#subsys_name hierarchy num_cgroups enabled
cpuset 4 1 1
cpu 3 84 1
cpuacct 3 84 1
memory 6 85 1
devices 7 84 1
freezer 2 1 1
net_cls 10 1 1
blkio 11 84 1
perf_event 5 1 1
hugetlb 8 1 1
pids 9 1 1
net_prio 10 1 1
# lssubsys -am
cpuset /sys/fs/cgroup/cpuset
cpu,cpuacct /sys/fs/cgroup/cpu,cpuacct
memory /sys/fs/cgroup/memory
devices /sys/fs/cgroup/devices
freezer /sys/fs/cgroup/freezer
net_cls,net_prio /sys/fs/cgroup/net_cls,net_prio
blkio /sys/fs/cgroup/blkio
perf_event /sys/fs/cgroup/perf_event
hugetlb /sys/fs/cgroup/hugetlb
pids /sys/fs/cgroup/pids
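As a quick cross-check of the two listings above, here is a minimal sketch that prints only the controllers whose enabled column is 1. The helper name enabled_controllers is made up for illustration, and it assumes the four-column /proc/cgroups layout shown above.

```shell
# Print the names of enabled cgroup controllers.
# $1: file in /proc/cgroups format (defaults to /proc/cgroups itself).
# enabled_controllers is an illustrative helper, not part of libcgroup.
enabled_controllers() {
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Only run against the live file when it exists.
if [ -r /proc/cgroups ]; then
    enabled_controllers | grep -x memory || echo "memory controller is not enabled"
fi
```

If the memory line is missing here, the memory limit configured later in this post would have nothing to attach to.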
This time we configure memory control
# cd /etc/cgconfig.d
# vi postgresql.conf
group postgresql {
    perm {
        task {
            uid=postgres;
            gid=postgres;
        }
        admin {
            uid=root;
            gid=root;
        }
    }
    memory {
        memory.limit_in_bytes=2000M;
    }
}
# vi /etc/cgrules.conf
postgres memory postgresql/
# systemctl stop cgconfig.service
# systemctl start cgconfig.service
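After cgconfig has been restarted, the limit can be read back from the memory hierarchy. The sketch below assumes the group directory is /sys/fs/cgroup/memory/postgresql (the memory mount point from lssubsys plus the group name above); bytes_to_mib is a hypothetical helper name.

```shell
# Convert a byte count to whole MiB (integer division).
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Assumed path: memory hierarchy mount point plus the group name "postgresql".
limit_file=/sys/fs/cgroup/memory/postgresql/memory.limit_in_bytes

if [ -r "$limit_file" ]; then
    echo "memory.limit_in_bytes = $(bytes_to_mib "$(cat "$limit_file")") MiB"
else
    echo "$limit_file not found; was the group created?"
fi
```

With the configuration above this should report 2000 MiB, which is the same value as the 2048000kB limit that appears later in the dmesg output.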
# free -m
total used free shared buff/cache available
Mem: 3951 656 2454 9 841 3203
Swap: 2047 0 2047
Start PostgreSQL
# su - postgres
$ /usr/pgsql-10/bin/pg_ctl start -D /var/lib/pgsql/10/data
# free -m
total used free shared buff/cache available
Mem: 3951 658 2433 24 860 3185
Swap: 2047 0 2047
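Before loading data it is worth confirming that the postgres processes really landed in the postgresql group. On CentOS 7 the rules in /etc/cgrules.conf are normally applied by the cgred service (cgrulesengd), which the steps above do not show being started, so this check is not redundant. A minimal sketch, with memory_cgroup_of as an illustrative helper name:

```shell
# Print the memory-controller cgroup path from a /proc/<pid>/cgroup style file.
# Each line has the form hierarchy-id:controller-list:path (cgroup v1).
# memory_cgroup_of is an illustrative helper name.
memory_cgroup_of() {
    awk -F: '{
        n = split($2, ctrl, ",")
        for (i = 1; i <= n; i++)
            if (ctrl[i] == "memory") print $3
    }' "$1"
}

# Report the memory cgroup of every running process named "postgres".
for pid in $(pgrep -x postgres 2>/dev/null); do
    if [ -r "/proc/$pid/cgroup" ]; then
        echo "$pid $(memory_cgroup_of "/proc/$pid/cgroup")"
    fi
done
```

If the path printed is / rather than /postgresql, the process was not classified; cgclassify -g memory:postgresql <pid> from libcgroup-tools can move it by hand.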
Insert a large amount of data
postgres=# insert into test01 values(generate_series(1,5000000),repeat( chr(int4(random()*26)+65),1000));
At the OS level, the memory usage of this process stays at around 50% the whole time:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 18612 4.0 49.1 2292316 935160 ? Ds 06:10 0:14 postgres: peiyb: postgres postgres [local] INSERT
It did not last long before the process was killed. Check the OS log:
# dmesg
[26191.627741] Task in /postgresql killed as a result of limit of /postgresql
[26191.627743] memory: usage 2048000kB, limit 2048000kB, failcnt 2126672
[26191.627744] memory+swap: usage 4136808kB, limit 9007199254740988kB, failcnt 0
[26191.627745] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[26191.627746] Memory cgroup stats for /postgresql: cache:258152KB rss:1789848KB rss_huge:0KB mapped_file:247288KB swap:2088808KB inactive_anon:561100KB active_anon:1486740KB inactive_file:0KB active_file:0KB unevictable:0KB
[26191.627753] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[26191.627806] [18551] 1002 18551 113848 3607 50 217 0 postgres
[26191.627808] [18559] 1002 18559 42656 141 32 243 0 postgres
[26191.627811] [24714] 1002 24714 29155 463 14 346 0 bash
[26191.627812] [24956] 1002 24956 113951 44739 169 220 0 postgres
[26191.627814] [24957] 1002 24957 113887 36498 165 211 0 postgres
[26191.627815] [24958] 1002 24958 113848 2228 38 218 0 postgres
[26191.627817] [24959] 1002 24959 113983 395 39 242 0 postgres
[26191.627818] [24960] 1002 24960 43221 199 33 202 0 postgres
[26191.627820] [24961] 1002 24961 113955 216 36 288 0 postgres
[26191.627821] [24973] 1002 24973 39182 692 31 40 0 psql
[26191.627823] [24974] 1002 24974 532267 272088 989 193541 0 postgres
[26191.627824] [24997] 1002 24997 39182 543 31 198 0 psql
[26191.627826] [24998] 1002 24998 661215 275338 1240 324846 0 postgres
[26191.627827] Memory cgroup out of memory: Kill process 24998 (postgres) score 580 or sacrifice child
[26191.627830] Killed process 24998 (postgres) total-vm:2644860kB, anon-rss:885472kB, file-rss:2572kB, shmem-rss:213308kB
The log shows "Memory cgroup out of memory": the kill came from the cgroup limit, not from the system-wide OOM killer.
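The figures in the report are internally consistent, which helps when reading it. The sketch below only redoes the arithmetic on numbers copied from the log; the one assumption is a 4 kB page size (the x86_64 default) for the rss column of the process table, and pages_to_kb is a hypothetical helper name.

```shell
# Figures copied from the dmesg output above (all in kB).
cache_kb=258152
rss_kb=1789848
swap_kb=2088808
usage_kb=2048000
memsw_usage_kb=4136808

# cache + rss should equal the "memory: usage" figure.
echo $(( cache_kb + rss_kb ))        # 2048000
# usage + swap should equal the "memory+swap: usage" figure.
echo $(( usage_kb + swap_kb ))       # 4136808

# Convert a page count from the process table to kB.
# Assumption: 4 kB pages, the x86_64 default.
pages_to_kb() {
    echo $(( $1 * ${2:-4} ))
}

# rss column of pid 24998 versus anon-rss + file-rss + shmem-rss.
pages_to_kb 275338                   # 1101352
echo $(( 885472 + 2572 + 213308 ))   # 1101352
```

The swap figure also shows that the group had pushed about 2 GB out to swap before the kill: only memory.limit_in_bytes was set, and the memory+swap limit in the log (9007199254740988kB) is effectively unlimited.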
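A cgroup can be used to set the resource level at which a process, or a class of processes, gets killed; at that moment the OS as a whole may still have plenty of free memory.
By contrast, when the OS OOM killer fires, the OS has almost no memory left.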
References:
https://www.cnblogs.com/easton-wang/p/7656205.html
https://www.cnblogs.com/doscho/p/6041036.html
http://www.oracle.com/technetwork/articles/servers-storage-admin/resource-controllers-linux-1506602.html