The mysqld and java processes on the server ran out of memory and were killed by the kernel

The log in /var/log/messages is as follows:

Aug 10 19:47:16 VM-0-7-centos kernel: 8936 total pagecache pages
Aug 10 19:47:16 VM-0-7-centos kernel: 0 pages in swap cache
Aug 10 19:47:16 VM-0-7-centos kernel: Swap cache stats: add 0, delete 0, find 0/0
Aug 10 19:47:16 VM-0-7-centos kernel: Free swap  = 0kB
Aug 10 19:47:16 VM-0-7-centos kernel: Total swap = 0kB
Aug 10 19:47:16 VM-0-7-centos kernel: 2097016 pages RAM
Aug 10 19:47:16 VM-0-7-centos kernel: 0 pages HighMem/MovableOnly
Aug 10 19:47:16 VM-0-7-centos kernel: 94855 pages reserved
Aug 10 19:47:16 VM-0-7-centos kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Aug 10 19:47:16 VM-0-7-centos kernel: [13357]    27 13357  1244419   814184    2246        0             0 mysqld
Aug 10 19:47:16 VM-0-7-centos kernel: Out of memory: Kill process 13357 (mysqld) score 407 or sacrifice child
Aug 10 19:47:16 VM-0-7-centos kernel: Killed process 13357 (mysqld), UID 27, total-vm:4977676kB, anon-rss:3256736kB, file-rss:0kB, shmem-rss:0kB
Aug 10 19:47:16 VM-0-7-centos kernel: VM Periodic Tas invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Aug 10 19:47:16 VM-0-7-centos kernel: VM Periodic Tas cpuset=/ mems_allowed=0
Aug 10 19:47:16 VM-0-7-centos kernel: CPU: 0 PID: 18863 Comm: VM Periodic Tas Kdump: loaded Not tainted 3.10.0-1127.19.1.el7.x86_64 #1
Aug 10 19:47:16 VM-0-7-centos kernel: Hardware name: Smdbmds KVM, BIOS seabios-1.9.1-qemu-project.org 04/01/2014
Aug 10 19:47:16 VM-0-7-centos kernel: Call Trace:

Aug 10 19:47:16 VM-0-7-centos kernel: Out of memory: Kill process 13388 (mysqld) score 407 or sacrifice child
Aug 10 19:47:16 VM-0-7-centos kernel: Killed process 13388 (mysqld), UID 27, total-vm:4977676kB, anon-rss:3257420kB, file-rss:0kB, shmem-rss:0kB
Aug 10 19:47:16 VM-0-7-centos kernel: logback-3 invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Aug 10 19:47:16 VM-0-7-centos kernel: logback-3 cpuset=/ mems_allowed=0
Aug 10 19:47:16 VM-0-7-centos kernel: CPU: 1 PID: 26485 Comm: logback-3 Kdump: loaded Not tainted 3.10.0-1127.19.1.el7.x86_64 #1
Aug 10 19:47:16 VM-0-7-centos kernel: Hardware name: Smdbmds KVM, BIOS seabios-1.9.1-qemu-project.org 04/01/2014

Aug 10 19:47:16 VM-0-7-centos kernel: [18851]  1000 18851  1442473   397376    1068        0             0 java
Aug 10 19:47:16 VM-0-7-centos kernel: Out of memory: Kill process 18851 (java) score 199 or sacrifice child
Aug 10 19:47:16 VM-0-7-centos kernel: Killed process 18851 (java), UID 1000, total-vm:5769892kB, anon-rss:1589504kB, file-rss:0kB, shmem-rss:0kB
Aug 10 19:47:16 VM-0-7-centos kernel: node /home/hjda invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Aug 10 19:47:16 VM-0-7-centos kernel: node /home/hjda cpuset=/ mems_allowed=0
Aug 10 19:47:16 VM-0-7-centos kernel: CPU: 3 PID: 8277 Comm: node /home/hjda Kdump: loaded Not tainted 3.10.0-1127.19.1.el7.x86_64 #1
Aug 10 19:47:16 VM-0-7-centos kernel: Hardware name: Smdbmds KVM, BIOS seabios-1.9.1-qemu-project.org 04/01/2014

Aug 10 19:47:16 VM-0-7-centos kernel: Out of memory: Kill process 18855 (VM Thread) score 201 or sacrifice child
Aug 10 19:47:16 VM-0-7-centos kernel: Killed process 18855 (VM Thread), UID 1000, total-vm:5769892kB, anon-rss:1612704kB, file-rss:0kB, shmem-rss:0kB
Aug 10 19:47:16 VM-0-7-centos kernel: node /home/hjda invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Aug 10 19:47:16 VM-0-7-centos kernel: node /home/hjda cpuset=/ mems_allowed=0
Aug 10 19:47:16 VM-0-7-centos kernel: CPU: 3 PID: 8277 Comm: node /home/hjda Kdum

Aug 10 19:47:16 VM-0-7-centos kernel: Out of memory: Kill process 18863 (VM Periodic Tas) score 201 or sacrifice child
Aug 10 19:47:16 VM-0-7-centos systemd: mysqld.service: main process exited, code=killed, status=9/KILL
Aug 10 19:47:16 VM-0-7-centos systemd: Unit mysqld.service entered failed state.
Aug 10 19:47:16 VM-0-7-centos systemd: mysqld.service failed.
Aug 10 19:47:16 VM-0-7-centos systemd: mysqld.service holdoff time over, scheduling restart.
Aug 10 19:47:16 VM-0-7-centos systemd: Stopped MySQL Server.
Aug 10 19:47:16 VM-0-7-centos systemd: Starting MySQL Server...
Aug 10 19:47:26 VM-0-7-centos systemd: Started MySQL Server.

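At first we see: VM Periodic Tas invoked oom-killer. VM Periodic Tas is probably a JVM thread (the kernel truncates thread names to 15 characters, and HotSpot has a thread named "VM Periodic Task Thread"). Killed process 13357 (mysqld) means that mysqld, the process with the worst (highest) OOM score, was killed.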
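The score in the kill line can be roughly reproduced from the task dump. The sketch below assumes the 3.10-era heuristic (rss + nr_ptes + swapents, scaled to 0..1000 against RAM plus swap) and assumes that usable RAM is "pages RAM" minus "pages reserved". The function name is made up for illustration, and the small discount the kernel gives to root-owned processes is ignored.

```python
def approx_oom_score(rss, nr_ptes, swapents, ram_pages, reserved_pages,
                     swap_kb=0, oom_score_adj=0, page_kb=4):
    """Rough estimate of the 'score' printed by the 3.10 oom-killer.

    rss, nr_ptes, swapents, ram_pages and reserved_pages are in pages,
    exactly as they appear in the task dump and the Mem-Info block.
    Assumption: total pages = (pages RAM - pages reserved) + swap pages.
    """
    total_pages = (ram_pages - reserved_pages) + swap_kb // page_kb
    points = rss + nr_ptes + swapents
    # oom_score_adj is expressed in thousandths of total memory.
    points += oom_score_adj * total_pages // 1000
    return max(points, 0) * 1000 // total_pages


# mysqld (pid 13357) from the Aug 10 dump: rss=814184, nr_ptes=2246, no swap.
print(approx_oom_score(814184, 2246, 0, ram_pages=2097016, reserved_pages=94855))
```

For the mysqld entry this gives 407, the same value the log reports as "score 407", so under these assumptions mysqld is chosen simply because it holds the most resident memory.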

Next, logback-3 invoked oom-killer and process 13388 (mysqld) was killed. After that, node invoked oom-killer. Since SSR was introduced, the load on the backend has increased.

 

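However, monitoring showed that when the oom-killer fired, system memory usage was only about 6.8G, so there was still free memory. It happened anyway, and the logs show it had already happened several times before.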

An explanation found online: The “OOM Killer” or “Out of Memory Killer” is a process that the Linux kernel employs when the system is critically low on memory.

 

Some people online say that adding swap helps, so I created a 400M swap file.

========================================

The above did not help at all. Below is an analysis of the oom kill log.

Aug 23 22:26:50 VM-0-7-centos kernel: VM Thread invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
...........

Aug 23 22:26:51 VM-0-7-centos kernel: Mem-Info:
Aug 23 22:26:51 VM-0-7-centos kernel: active_anon:1550473 inactive_anon:321719 isolated_anon:0#012 active_file:8324 inactive_file:8839 isolated_file:32#012 unevictable:0 dirty:6 writeback:5 unstable:0#012 slab_reclaimable:29739 slab_unreclaimable:18215#012 mapped:655 shmem:157 pagetables:10440 bounce:0#012 free:27277 free_pcp:690 free_cma:0
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 DMA free:15892kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Aug 23 22:26:51 VM-0-7-centos kernel: lowmem_reserve[]: 0 2830 7805 7805
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 DMA32 free:48256kB min:24468kB low:30584kB high:36700kB active_anon:2125528kB inactive_anon:555044kB active_file:20808kB inactive_file:22456kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129192kB managed:2898760kB mlocked:0kB dirty:12kB writeback:8kB mapped:1824kB shmem:244kB slab_reclaimable:62172kB slab_unreclaimable:28556kB kernel_stack:2928kB pagetables:15376kB unstable:0kB bounce:0kB free_pcp:1308kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
Aug 23 22:26:51 VM-0-7-centos kernel: lowmem_reserve[]: 0 0 4974 4974
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 Normal free:45948kB min:42976kB low:53720kB high:64464kB active_anon:4076364kB inactive_anon:731832kB active_file:12232kB inactive_file:13024kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:5242880kB managed:5093976kB mlocked:0kB dirty:12kB writeback:12kB mapped:796kB shmem:384kB slab_reclaimable:56784kB slab_unreclaimable:44288kB kernel_stack:9600kB pagetables:26384kB unstable:0kB bounce:0kB free_pcp:912kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:39131 all_unreclaimable? yes
Aug 23 22:26:51 VM-0-7-centos kernel: lowmem_reserve[]: 0 0 0 0
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 DMA32: 1784*4kB (UEM) 938*8kB (UEM) 773*16kB (UEM) 412*32kB (UEM) 107*64kB (UE) 5*128kB (UEM) 0*256kB 1*512kB (M) 1*1024kB (M) 0*2048kB 0*4096kB = 49216kB
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 Normal: 1003*4kB (UEM) 1872*8kB (UEM) 589*16kB (UEM) 386*32kB (UEM) 52*64kB (UEM) 11*128kB (EM) 2*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 46012kB
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Aug 23 22:26:51 VM-0-7-centos kernel: 26283 total pagecache pages
Aug 23 22:26:51 VM-0-7-centos kernel: 9173 pages in swap cache
Aug 23 22:26:51 VM-0-7-centos kernel: Swap cache stats: add 216402, delete 207233, find 812854/834105
Aug 23 22:26:51 VM-0-7-centos kernel: Free swap = 0kB
Aug 23 22:26:51 VM-0-7-centos kernel: Total swap = 409596kB
Aug 23 22:26:51 VM-0-7-centos kernel: 2097016 pages RAM
Aug 23 22:26:51 VM-0-7-centos kernel: 0 pages HighMem/MovableOnly
Aug 23 22:26:51 VM-0-7-centos kernel: 94855 pages reserved

Aug 23 22:26:51 VM-0-7-centos kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Aug 23 22:26:51 VM-0-7-centos kernel: [ 661] 999 661 180376 811 67 1501 0 polkitd
Aug 23 22:26:51 VM-0-7-centos kernel: [ 664] 998 664 2145 7 10 29 0 lsmd
Aug 23 22:26:51 VM-0-7-centos kernel: [ 666] 81 666 15390 297 32 122 -900 dbus-daemon
Aug 23 22:26:51 VM-0-7-centos kernel: [ 670] 38 670 12345 42 28 140 0 ntpd
Aug 23 22:26:51 VM-0-7-centos kernel: [ 671] 0 671 1097 1 8 33 0 acpid
Aug 23 22:26:51 VM-0-7-centos kernel: [ 674] 0 674 7103 290 19 309 0 systemd-logind
Aug 23 22:26:51 VM-0-7-centos kernel: [ 940] 0 940 25726 1 49 512 0 dhclient
Aug 23 22:26:51 VM-0-7-centos kernel: [ 1144] 0 1144 6477 7 18 46 0 atd
Aug 23 22:26:51 VM-0-7-centos kernel: [ 1171] 0 1171 27552 2 9 31 0 agetty
Aug 23 22:26:51 VM-0-7-centos kernel: [ 1172] 0 1172 27552 2 10 31 0 agetty
Aug 23 22:26:51 VM-0-7-centos kernel: [32380] 0 32380 11351 2 24 120 -1000 systemd-udevd
Aug 23 22:26:51 VM-0-7-centos kernel: [12665] 0 12665 146507 278 105 2986 0 tuned
Aug 23 22:26:51 VM-0-7-centos kernel: [19250] 0 19250 10728 2 26 67 0 lvmetad
Aug 23 22:26:51 VM-0-7-centos kernel: [19427] 0 19427 166116 622 175 89 0 rsyslogd
Aug 23 22:26:51 VM-0-7-centos kernel: [32536] 0 32536 1203030 37707 225 19159 0 java
Aug 23 22:26:51 VM-0-7-centos kernel: [ 9045] 0 9045 28234 67 58 195 -1000 sshd
Aug 23 22:26:51 VM-0-7-centos kernel: [ 6066] 0 6066 42016 381 22 199 0 redis-server
Aug 23 22:26:51 VM-0-7-centos kernel: [26425] 1000 26425 1013843 97449 486 58688 0 java
Aug 23 22:26:51 VM-0-7-centos kernel: [ 9542] 0 9542 51863 7632 43 1193 0 scrapyd
Aug 23 22:26:51 VM-0-7-centos kernel: [24774] 0 24774 46243 0 42 152 0 svnserve
Aug 23 22:26:51 VM-0-7-centos kernel: [ 5928] 0 5928 24373 96 20 28 0 sgagent
Aug 23 22:26:51 VM-0-7-centos kernel: [ 3104] 0 3104 11989 431 29 53 0 nginx
Aug 23 22:26:51 VM-0-7-centos kernel: [22261] 1000 22261 37951 2040 28 3645 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [24111] 1000 24111 38482 2796 30 2930 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [24112] 1000 24112 38482 2992 30 2751 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [24113] 1000 24113 38482 2991 30 2765 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [24114] 1000 24114 38482 3017 30 2747 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [24381] 1000 24381 616987 20964 139 527 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [21067] 0 21067 267754 597 42 211 0 YDLive
Aug 23 22:26:51 VM-0-7-centos kernel: [15268] 1000 15268 228656 8219 150 0 0 PM2 v5.3.0: God
Aug 23 22:26:51 VM-0-7-centos kernel: [ 5371] 1000 5371 179277 1643 22 0 0 mgtty
Aug 23 22:26:51 VM-0-7-centos kernel: [ 2652] 1000 2652 71523 2247 25 0 0 python
Aug 23 22:26:51 VM-0-7-centos kernel: [32094] 0 32094 287474 36843 164 10611 0 YDService
Aug 23 22:26:51 VM-0-7-centos kernel: [21810] 0 21810 312052 1141 47 444 0 sh
Aug 23 22:26:51 VM-0-7-centos kernel: [ 9387] 1000 9387 227773 10275 140 0 0 node /home/hjda
Aug 23 22:26:51 VM-0-7-centos kernel: [20365] 1000 20365 247573 22232 407 0 0 node /home/hjda
Aug 23 22:26:51 VM-0-7-centos kernel: [19566] 0 19566 33442 100 70 0 0 systemd-journal
Aug 23 22:26:51 VM-0-7-centos kernel: [ 3013] 0 3013 12914 1020 31 48 0 nginx
Aug 23 22:26:51 VM-0-7-centos kernel: [ 3014] 0 3014 12996 1044 31 48 0 nginx
Aug 23 22:26:51 VM-0-7-centos kernel: [ 3015] 0 3015 13066 1147 31 48 0 nginx
Aug 23 22:26:51 VM-0-7-centos kernel: [ 3016] 0 3016 13386 1463 32 48 0 nginx
Aug 23 22:26:51 VM-0-7-centos kernel: [ 8424] 27 8424 1795335 1019225 2442 0 0 mysqld
Aug 23 22:26:51 VM-0-7-centos kernel: [17805] 0 17805 31597 171 20 0 0 crond
Aug 23 22:26:51 VM-0-7-centos kernel: [ 911] 1000 911 240453 23601 335 0 0 node /home/hjda
Aug 23 22:26:51 VM-0-7-centos kernel: [ 4180] 1000 4180 1360179 202736 698 0 0 java
Aug 23 22:26:51 VM-0-7-centos kernel: [ 4371] 1000 4371 1024871 120478 410 0 0 java
Aug 23 22:26:51 VM-0-7-centos kernel: [21245] 0 21245 38922 1676 29 0 0 barad_agent
Aug 23 22:26:51 VM-0-7-centos kernel: [21261] 0 21261 41225 1863 35 0 0 barad_agent
Aug 23 22:26:51 VM-0-7-centos kernel: [21262] 0 21262 172850 3575 58 0 0 barad_agent
Aug 23 22:26:51 VM-0-7-centos kernel: [17196] 0 17196 39203 341 80 0 0 sshd

Aug 23 22:26:51 VM-0-7-centos kernel: [ 6579]  1000  6579 11058700   216642    3017        0             0 node

Aug 23 22:26:51 VM-0-7-centos kernel: Out of memory: Kill process 8424 (mysqld) score 485 or sacrifice child
Aug 23 22:26:51 VM-0-7-centos kernel: Killed process 8424 (mysqld), UID 27, total-vm:7181340kB, anon-rss:4076900kB, file-rss:0kB, shmem-rss:0kB
Aug 23 22:26:51 VM-0-7-centos systemd: mysqld.service: main process exited, code=killed, status=9/KILL

This article explains in detail how to read the log: https://community.wandisco.com/portal/s/article/Guide-to-Out-of-Memory-OOM-events-and-decoding-their-logging

First, look at this line:

Aug 23 22:26:50 VM-0-7-centos kernel: VM Thread invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
gfp_mask: The gfp_mask is used to tell the page allocator which pages can be allocated, whether the allocator can wait for more memory to be freed, etc. All the gfp flags are defined in include/linux/gfp.h:
/* Plain integer GFP bitmasks. Do not use this directly. */
#define ___GFP_DMA      0x01u
#define ___GFP_HIGHMEM      0x02u
#define ___GFP_DMA32        0x04u
#define ___GFP_MOVABLE      0x08u
#define ___GFP_WAIT     0x10u
#define ___GFP_HIGH     0x20u
#define ___GFP_IO       0x40u
#define ___GFP_FS       0x80u
#define ___GFP_COLD     0x100u
#define ___GFP_NOWARN       0x200u
#define ___GFP_REPEAT       0x400u
#define ___GFP_NOFAIL       0x800u
#define ___GFP_NORETRY      0x1000u
#define ___GFP_MEMALLOC     0x2000u
#define ___GFP_COMP     0x4000u
#define ___GFP_ZERO     0x8000u
#define ___GFP_NOMEMALLOC   0x10000u
#define ___GFP_HARDWALL     0x20000u
#define ___GFP_THISNODE     0x40000u
#define ___GFP_RECLAIMABLE  0x80000u
#define ___GFP_KMEMCG       0x100000u
#define ___GFP_NOTRACK      0x200000u
#define ___GFP_NO_KSWAPD    0x400000u
#define ___GFP_OTHER_NODE   0x800000u
#define ___GFP_WRITE        0x1000000u
#define ___GFP_CMA      0x2000000u

/*
 * GFP bitmasks..
 *
 * Zone modifiers (see linux/mmzone.h - low three bits)
 *
 * Do not put any conditional on these. If necessary modify the definitions
 * without the underscores and use them consistently. The definitions here may
 * be used in bit comparisons.
 */
#define __GFP_DMA   ((__force gfp_t)___GFP_DMA)
#define __GFP_HIGHMEM   ((__force gfp_t)___GFP_HIGHMEM)
#define __GFP_DMA32 ((__force gfp_t)___GFP_DMA32)
#define __GFP_MOVABLE   ((__force gfp_t)___GFP_MOVABLE)  /* Page is movable */
#define __GFP_CMA   ((__force gfp_t)___GFP_CMA)
#define GFP_ZONEMASK    (__GFP_DMA|__GFP_HIGHMEM|__GFP_DMA32|__GFP_MOVABLE| \
            __GFP_CMA)

/*
 * Action modifiers - doesn't change the zoning
 *
 * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
 * _might_ fail.  This depends upon the particular VM implementation.
 *
 * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures.  This modifier is deprecated and no new
 * users should be added.
 *
 * __GFP_NORETRY: The VM implementation must not retry indefinitely.
 *
 * __GFP_MOVABLE: Flag that this page will be movable by the page migration
 * mechanism or reclaimed
 */
#define __GFP_WAIT  ((__force gfp_t)___GFP_WAIT)    /* Can wait and reschedule? */
#define __GFP_HIGH  ((__force gfp_t)___GFP_HIGH)    /* Should access emergency pools? */
#define __GFP_IO    ((__force gfp_t)___GFP_IO)  /* Can start physical IO? */
#define __GFP_FS    ((__force gfp_t)___GFP_FS)  /* Can call down to low-level FS? */
#define __GFP_COLD  ((__force gfp_t)___GFP_COLD)    /* Cache-cold page required */
#define __GFP_NOWARN    ((__force gfp_t)___GFP_NOWARN)  /* Suppress page allocation failure warning */
#define __GFP_REPEAT    ((__force gfp_t)___GFP_REPEAT)  /* See above */
#define __GFP_NOFAIL    ((__force gfp_t)___GFP_NOFAIL)  /* See above */
#define __GFP_NORETRY   ((__force gfp_t)___GFP_NORETRY) /* See above */
#define __GFP_MEMALLOC  ((__force gfp_t)___GFP_MEMALLOC)/* Allow access to emergency reserves */
#define __GFP_COMP  ((__force gfp_t)___GFP_COMP)    /* Add compound page metadata */
#define __GFP_ZERO  ((__force gfp_t)___GFP_ZERO)    /* Return zeroed page on success */
#define __GFP_NOMEMALLOC ((__force gfp_t)___GFP_NOMEMALLOC) /* Don't use emergency reserves.
                             * This takes precedence over the
                             * __GFP_MEMALLOC flag if both are
                             * set
                             */
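order: The 'order' of a page allocation is its logarithm to the base 2, and the size of the allocation is 2^order, an integral power-of-2 number of pages.
Here order=0 means that 2^0 = 1 page is being allocated, and one page = 4k. As for gfp_mask=0x201da: the last hex digit is a, which is 1010 in binary. Taking the low three bits gives 010, and among the zone modifiers #define ___GFP_HIGHMEM 0x02u corresponds to 010, so the request targets the High Memory zone (this is my current understanding). The remaining high bit of 1010 is 0x08, ___GFP_MOVABLE. On x86_64 there is no separate HighMem zone (the log shows 0 pages HighMem/MovableOnly), so such a request is served from the Normal zone first.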
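The mask can also be decoded bit by bit instead of reading only its last hex digit. A minimal sketch, using the flag values from the header excerpt above; decode_gfp_mask and allocation_kb are illustrative names, not kernel APIs, and the table only applies to kernels whose gfp.h matches that excerpt.

```python
# Flag values copied from the include/linux/gfp.h excerpt quoted above.
GFP_FLAGS = {
    0x01: "___GFP_DMA",
    0x02: "___GFP_HIGHMEM",
    0x04: "___GFP_DMA32",
    0x08: "___GFP_MOVABLE",
    0x10: "___GFP_WAIT",
    0x20: "___GFP_HIGH",
    0x40: "___GFP_IO",
    0x80: "___GFP_FS",
    0x100: "___GFP_COLD",
    0x200: "___GFP_NOWARN",
    0x400: "___GFP_REPEAT",
    0x800: "___GFP_NOFAIL",
    0x1000: "___GFP_NORETRY",
    0x2000: "___GFP_MEMALLOC",
    0x4000: "___GFP_COMP",
    0x8000: "___GFP_ZERO",
    0x10000: "___GFP_NOMEMALLOC",
    0x20000: "___GFP_HARDWALL",
    0x40000: "___GFP_THISNODE",
    0x80000: "___GFP_RECLAIMABLE",
    0x100000: "___GFP_KMEMCG",
    0x200000: "___GFP_NOTRACK",
    0x400000: "___GFP_NO_KSWAPD",
    0x800000: "___GFP_OTHER_NODE",
    0x1000000: "___GFP_WRITE",
    0x2000000: "___GFP_CMA",
}


def decode_gfp_mask(mask):
    """Return the names of all flag bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]


def allocation_kb(order, page_kb=4):
    """Size of an allocation of the given order, assuming 4 kB pages."""
    return (1 << order) * page_kb


print(decode_gfp_mask(0x201da))
print(allocation_kb(0))
```

For 0x201da the set bits are HIGHMEM, MOVABLE, WAIT, IO, FS, COLD and HARDWALL, which is an ordinary user-space page request rather than anything exotic.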
Next, look at this log line:
Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 Normal free:45948kB min:42976kB low:53720kB high:64464kB active_anon:4076364kB inactive_anon:731832kB active_file:12232kB inactive_file:13024kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:5242880kB managed:5093976kB mlocked:0kB dirty:12kB writeback:12kB mapped:796kB shmem:384kB slab_reclaimable:56784kB slab_unreclaimable:44288kB kernel_stack:9600kB pagetables:26384kB unstable:0kB bounce:0kB free_pcp:912kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:39131 all_unreclaimable? yes

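Normal is the zone that serves High Memory requests here, since x86_64 has no separate HighMem zone. In this line free:45948kB min:42976kB low:53720kB, so min < free < low. According to the author of https://community.wandisco.com/portal/s/article/Guide-to-Out-of-Memory-OOM-events-and-decoding-their-logging, memory can no longer be allocated once free < min, but that is not my situation. Is it because memory is too fragmented? Look at this log line: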

Aug 23 22:26:51 VM-0-7-centos kernel: Node 0 Normal: 1003*4kB (UEM) 1872*8kB (UEM) 589*16kB (UEM) 386*32kB (UEM) 52*64kB (UEM) 11*128kB (EM) 2*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 46012kB
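The buddy-allocator line above lists how many free blocks exist for each block size. A small sketch that parses such a line and counts the blocks able to satisfy a request of a given order; the helper names are made up for illustration and 4 kB pages are assumed.

```python
import re


def parse_buddy_line(line):
    """Map block size in kB -> free block count from a 'Node N <zone>: a*4kB ...' line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def total_free_kb(blocks):
    """Total free memory described by the line, in kB."""
    return sum(size * count for size, count in blocks.items())


def blocks_available(blocks, order, page_kb=4):
    """Number of free blocks at least as large as a request of the given order."""
    need = (1 << order) * page_kb
    return sum(count for size, count in blocks.items() if size >= need)


line = ("Node 0 Normal: 1003*4kB (UEM) 1872*8kB (UEM) 589*16kB (UEM) "
        "386*32kB (UEM) 52*64kB (UEM) 11*128kB (EM) 2*256kB (M) 0*512kB "
        "0*1024kB 0*2048kB 0*4096kB = 46012kB")
blocks = parse_buddy_line(line)
print(total_free_kb(blocks))          # should agree with the "= 46012kB" suffix
print(blocks_available(blocks, 0))    # blocks usable for an order-0 request
print(blocks_available(blocks, 7))    # blocks usable for a 512 kB request
```

Fragmentation would show up as plenty of small blocks but none at the requested order; an order-0 request only needs a single 4 kB block.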

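There are still plenty of free 4k blocks, and the request was only for a single 4k page, so fragmentation is not the cause. I searched some more and found this Alibaba Cloud page: 出现OOM Killer的原因与解决方案 (Causes of and solutions for the OOM Killer) (alibabacloud.com)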

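System-wide memory usage is too high:

In the following log from an OOM Killer occurrence, limit of host indicates that the instance's global memory ran short. In the logged data, free memory (free) has already fallen below the minimum memory watermark (low).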
[六 9月 11 12:24:42 2021] test invoked oom-killer: gfp_mask=0x62****(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0,
[六 9月 11 12:24:42 2021] Task in /user.slice killed as a result of limit of host
[六 9月 11 12:24:42 2021] Node 0 DMA32 free:155160kB min:152412kB low:190512kB high:228612kB
[六 9月 11 12:24:42 2021] Node 0 Normal free:46592kB min:46712kB low:58388kB high:70064kB
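Cause: because the instance's free memory is below the minimum watermark, the shortage cannot be resolved by memory reclaim, so the OOM Killer is triggered.

Insufficient memory on a memory node (Node):

In the following log from an OOM Killer occurrence, some of the entries indicate:
limit of host indicates that the memory node ran short of memory.
The instance has two memory nodes, Node 0 and Node 1.
Free memory (free) on memory node Node 1 is below the minimum memory watermark (low).
The instance still has a large amount of free memory (free:4111496).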
[Sat Sep 11 09:46:24 2021] main invoked oom-killer: gfp_mask=0x62****(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Sat Sep 11 09:46:24 2021] main cpuset=mm_cpuset mems_allowed=1
[Sat Sep 11 09:46:24 2021] Task in / killed as a result of limit of host
[Sat Sep 11 09:46:24 2021] Mem-Info:
[Sat Sep 11 09:46:24 2021] active_anon:172 inactive_anon:4518735 isolated_anon:
    free:4111496 free_pcp:1 free_cma:0
[Sat Sep 11 09:46:24 2021] Node 1 Normal free:43636kB min:45148kB low:441424kB high:837700kB
[Sat Sep 11 09:46:24 2021] Node 1 Normal: 856*4kB (UME) 375*8kB (UME) 183*16kB (UME) 184*32kB (UME) 87*64kB (ME) 45*128kB (UME) 16*256kB (UME) 5*512kB (UE) 14*1024kB (UME) 0*2048kB 0*4096kB = 47560kB
[Sat Sep 11 09:46:24 2021] Node 0 hugepages_total=360 hugepages_free=360 hugepages_surp=0 hugepages_size=1048576kB
[Sat Sep 11 09:46:24 2021] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Sep 11 09:46:24 2021] Node 1 hugepages_total=360 hugepages_free=360 hugepages_surp=0 hugepages_size=1048576kB
[Sat Sep 11 09:46:25 2021] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
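Cause: in NUMA mode, the operating system may have multiple memory nodes (run cat /proc/buddyinfo to view the related resource information). If the cpuset.mems parameter restricts a cgroup to the memory of specific nodes, the OOM Killer may be triggered even though the instance still has plenty of free memory.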
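The output of cat /proc/buddyinfo can be turned into free kB per node and zone, which makes a single starved node easy to spot. A sketch, assuming 4 kB pages; the sample text is not real output but was written by hand from the DMA and Normal buddy counts in the Aug 23 log, and parse_buddyinfo is an illustrative name.

```python
def parse_buddyinfo(text, page_kb=4):
    """Return {(node, zone): free kB} from /proc/buddyinfo-style text."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: Node <n>, zone <name> <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zone = parts[3]
        counts = [int(c) for c in parts[4:]]
        # Column i holds the number of free blocks of 2**i pages.
        result[(node, zone)] = sum(
            c * (1 << order) * page_kb for order, c in enumerate(counts)
        )
    return result


# Hand-written sample in /proc/buddyinfo layout (hypothetical, see lead-in).
SAMPLE = """\
Node 0, zone      DMA      1      0      1      0      2      1      1      0      1      1      3
Node 0, zone   Normal   1003   1872    589    386     52     11      2      0      0      0      0
"""

for key, free_kb in parse_buddyinfo(SAMPLE).items():
    print(key, free_kb)
```

On a live Linux machine the same function can be fed open("/proc/buddyinfo").read().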

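The Alibaba Cloud text says that memory cannot be allocated when free < low, which seems to match my situation. Perhaps the two explanations are based on different kernel versions?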
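The two readings can be compared directly by classifying free against the watermarks printed in each zone line. A sketch with illustrative helper names; it only compares the printed numbers, whereas the kernel's own check also takes lowmem_reserve and the request order into account.

```python
import re


def zone_watermarks(line):
    """Extract free/min/low/high (in kB) from a 'Node N <zone> free:...' log line."""
    fields = dict(re.findall(r"\b(free|min|low|high):(\d+)kB", line))
    return {key: int(value) for key, value in fields.items()}


def classify(w):
    """Say where free sits relative to the three watermarks."""
    if w["free"] < w["min"]:
        return "below min"
    if w["free"] < w["low"]:
        return "between min and low"
    if w["free"] < w["high"]:
        return "between low and high"
    return "above high"


mine = "Node 0 Normal free:45948kB min:42976kB low:53720kB high:64464kB free_pcp:912kB"
alibaba = "Node 0 Normal free:46592kB min:46712kB low:58388kB high:70064kB"

print(classify(zone_watermarks(mine)))
print(classify(zone_watermarks(alibaba)))
```

With the numbers quoted in this post, my Normal zone falls between min and low, while the Normal zone in the first Alibaba Cloud sample is actually below min.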

Next, a few terms explained:

total-vm: the size of the virtual memory. Part of it is really mapped into the RAM itself (allocated and used). This is "RSS".

anon-rss: Part of the RSS is allocated in real memory blocks (other than mapped into a file or device). This is anonymous memory ("anon-rss").

file-rss: RSS memory blocks that are mapped into devices and files ("file-rss").

So, if you open a huge file in vim, the file-rss would be high. On the other side, if you malloc() a lot of memory and really use it, your anon-rss would be high also. On the other side, if you allocate a lot of space (with malloc()) but never use it, the total-vm would be higher, but no real memory would be used (due to the memory overcommit), so the rss values would be low.

-------- https://stackoverflow.com/questions/18845857/what-does-anon-rss-and-total-vm-mean#
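The same quantities can be watched on a running process through /proc/<pid>/status, where VmSize corresponds to total-vm and VmRSS to the resident part. A sketch that parses the kB-valued fields; the sample text is hypothetical, built from the numbers in the Aug 23 kill line, and parse_proc_status is an illustrative name. Newer kernels also print RssAnon and RssFile there, which may be absent on an older kernel like the one in this post.

```python
import os


def parse_proc_status(text):
    """Return the kB-valued fields (VmSize, VmRSS, ...) of a /proc/<pid>/status dump."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        parts = value.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key] = int(parts[0])
    return fields


# Hypothetical sample using total-vm and anon-rss from the mysqld kill line.
SAMPLE = """\
Name:\tmysqld
VmSize:\t 7181340 kB
VmRSS:\t 4076900 kB
"""

sample = parse_proc_status(SAMPLE)
print(sample["VmSize"] - sample["VmRSS"], "kB reserved but not resident")

# On Linux, inspect the current process as well.
if os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as f:
        me = parse_proc_status(f.read())
    print(me.get("VmSize"), me.get("VmRSS"))
```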

=================After all that, what is the solution?

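So far I have tried setting mysql's innodb_buffer_pool_size = 2304M, which is 75% of 3G. My machine has 8G, but it also runs java, python and node services. In testing so far, OOM has not occurred during normal operation. The only exceptions were building and bundling the Vue SSR service, when node suddenly took a lot of memory, and a Chrome crash that left many chrome processes running at the same time.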
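To judge how much room mysqld can be given next to the other services, it can help to total the rss column of the task dump per process name. A sketch with an illustrative helper, using only four lines of the Aug 23 dump as sample input and assuming 4 kB pages.

```python
import re
from collections import defaultdict

# Columns: [pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
TASK_RE = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(.+)$"
)


def rss_by_name(lines, page_kb=4):
    """Sum the rss column (in pages) per process name and convert to kB."""
    totals = defaultdict(int)
    for line in lines:
        m = TASK_RE.search(line)
        if m:
            rss_pages = int(m.group(5))
            totals[m.group(9).strip()] += rss_pages * page_kb
    return dict(totals)


dump = [
    "Aug 23 22:26:51 VM-0-7-centos kernel: [ 8424] 27 8424 1795335 1019225 2442 0 0 mysqld",
    "Aug 23 22:26:51 VM-0-7-centos kernel: [ 4180] 1000 4180 1360179 202736 698 0 0 java",
    "Aug 23 22:26:51 VM-0-7-centos kernel: [ 4371] 1000 4371 1024871 120478 410 0 0 java",
    "Aug 23 22:26:51 VM-0-7-centos kernel: [ 6579]  1000  6579 11058700   216642    3017        0             0 node",
]

for name, kb in sorted(rss_by_name(dump).items(), key=lambda item: -item[1]):
    print(f"{name:10s} {kb / 1024 / 1024:.2f} GiB")
```

The mysqld total here is 4076900 kB, the same figure the kill line reports as anon-rss, which is why shrinking the buffer pool is the first thing to try.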

 

 

 
posted @ 2023-08-11 01:10  zjhgx