Computer Architecture: Memory Tuning (IPC, OOM Killer)
man ipc
[root@server1 proc]# man ipc
IPC(2)                 Linux Programmer's Manual                 IPC(2)
NAME
ipc - System V IPC system calls
SYNOPSIS
int ipc(unsigned int call, int first, int second, int third,
void *ptr, long fifth);
DESCRIPTION
ipc() is a common kernel entry point for the System V IPC calls for messages,
semaphores, and shared memory. call determines which IPC function to invoke; the other
arguments are passed through to the appropriate call.
Inter-process communication:
[root@server1 proc]# ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096                    // maximum number of segments allowed
max seg size (kbytes) = 67108864                 // maximum size of a single segment
max total shared memory (kbytes) = 17179869184   // maximum shared memory usable system-wide
min seg size (bytes) = 1                         // minimum size of a single segment
------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
------ Messages: Limits --------
max queues system wide = 1982                    // maximum number of queues system-wide
max size of message (bytes) = 65536              // maximum size of a single message
default max size of queue (bytes) = 65536        // maximum total bytes a queue can hold
ipcrm: if processes blocked on an IPC resource are asleep and cannot be woken, this command can remove the offending IPC object
[root@server1 proc]# man ipcrm
IPCRM(1)               Linux Programmer's Manual               IPCRM(1)
NAME
       ipcrm - remove a message queue, semaphore set or shared memory id
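The create/inspect/remove cycle can be sketched end to end. A minimal example, assuming util-linux's ipcmk is available; the 1 MiB size is arbitrary:

```shell
# Create a throwaway System V shared memory segment, inspect it, then remove it.
# ipcmk prints a line like "Shared memory id: 123"; awk grabs the id.
id=$(ipcmk -M 1048576 | awk '{print $NF}')
ipcs -m -i "$id"          # show owner, size, attach count for this segment
ipcrm -m "$id"            # remove it, as you would for a stuck segment
```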
SHM: shared memory
kernel.shmmni   // system-wide maximum number of shared memory segments
    Specifies the maximum number of shared memory segments system-wide, default = 4096
kernel.shmall   // maximum number of pages that may be allocated to shared memory system-wide
    Specifies the total amount of shared memory, in pages, that can be used at one time on the system, default = 2097152
    This should be at least kernel.shmmax/PAGE_SIZE
kernel.shmmax   // maximum size, in bytes, of a single shared memory segment
    Specifies the maximum size of a shared memory segment that can be created
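The shmall/shmmax relationship can be checked with shell arithmetic. A sketch using the segment size from the ipcs -l output above; the 4 KiB page size is an assumption (typical for x86-64):

```shell
# kernel.shmall is counted in pages, kernel.shmmax in bytes, so
# shmall must be at least shmmax / PAGE_SIZE for one maximal segment to fit.
shmmax=$(( 67108864 * 1024 ))        # 64 GiB in bytes (from ipcs -l above)
page_size=4096                       # assumed x86-64 page size
shmall_min=$(( shmmax / page_size ))
echo "kernel.shmall must be >= $shmall_min pages"   # prints 16777216
```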
MSG: message queues
kernel.msgmnb   // maximum number of bytes in a single message queue
    Specifies the maximum number of bytes in a single message queue, default = 16384
kernel.msgmni   // system-wide maximum number of message queues
    Specifies the maximum number of message queue identifiers, default = 16
kernel.msgmax   // maximum size of a single message (a queue can hold many messages)
    Specifies the maximum size of a message that can be passed between processes
    This memory cannot be swapped, default = 8192
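The live values of these limits can be read straight from /proc; a read-only sketch, no root needed (the values vary by kernel and distribution):

```shell
# Print the current System V message-queue limits from /proc.
for p in msgmnb msgmni msgmax; do
    printf 'kernel.%-6s = %s\n' "$p" "$(cat /proc/sys/kernel/$p)"
done
```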
Reclaiming dirty pages:
Using memory as cache creates a strong need to control how pages are reclaimed:
- Memory is volatile
- Dirty pages need to be written to disk
- Free up pages for use by other processes
Handled by pdflush kernel threads:
- Default minimum of two threads
- Additional threads are created or destroyed according to I/O activity
- vm.nr_pdflush_threads shows the current number of pdflush threads
- Ideally one pdflush thread per disk
pdflush: writes back (reclaims) dirty pages
Tune thresholds (amount of dirty memory):
vm.dirty_background_ratio: system-wide threshold; percentage (of total memory) of dirty pages at which pdflush starts writing
vm.dirty_ratio: percentage (of total memory) of dirty pages at which a process itself starts writing out its dirty data
Tune wait time:
vm.dirty_expire_centisecs: how long a dirty page may sit in memory before it is old enough, in 100ths of a second, to be eligible for writeout by pdflush
Tune observation period (between pdflush wakeups):
vm.dirty_writeback_centisecs: interval between pdflush wakeups, in 100ths of a second; set to zero to disable periodic writeback
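All four writeback tunables can be inspected the same way; a read-only sketch (changing them requires root, e.g. via sysctl -w or /etc/sysctl.conf):

```shell
# Show the current dirty-page writeback settings from /proc (read-only).
for p in dirty_background_ratio dirty_ratio \
         dirty_expire_centisecs dirty_writeback_centisecs; do
    printf 'vm.%-26s = %s\n' "$p" "$(cat /proc/sys/vm/$p)"
done
```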
Reclaiming clean pages:
Reclaiming clean pages and caches:
1. Flush all dirty buffers and pages. Four ways:
   - sync command
   - fsync() system call
   - Alt-SysRq-S magic system request
   - echo s > /proc/sysrq-trigger
2. Reclaim (release) the clean pages:
   echo 3 > /proc/sys/vm/drop_caches
   - 1 to free pagecache
   - 2 to free dentries and inodes
   - 3 to free pagecache, dentries and inodes
Typically:
echo 1 > /proc/sys/vm/drop_caches
Used to:
- Eliminate bad data from cache
- Reduce memory before laptop hibernate
- Benchmark a subsystem
After sync, the buffer cache is normally written through to disk and can be released completely; the page cache cannot be released completely, because some files are always held open.
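The flush-then-drop sequence can be sketched as below; writing drop_caches requires root, so that step is left commented:

```shell
# Compare cache usage before and after a sync; dropping clean pages needs root.
grep -E '^(Buffers|Cached):' /proc/meminfo    # cache usage before
sync                                          # step 1: flush dirty buffers/pages
# echo 1 > /proc/sys/vm/drop_caches           # step 2 (as root): free pagecache
grep -E '^(Buffers|Cached):' /proc/meminfo    # compare after
```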
OOM
Out-of-memory killer (OOMK)
ZONE_NORMAL: when no free memory is available there, the OOM killer is triggered
/proc/sys/vm/panic_on_oom:
    Setting this parameter to 0 instructs the kernel to call the oom_killer function when an OOM condition occurs
    Once memory is exhausted, the OOM killer runs
Kills processes if:
- All memory (incl. swap) is active
- No pages are available in ZONE_NORMAL
- No memory is available for page table mappings
Interactive processes are preserved if possible:
- A high sleep average indicates an interactive process
- View a process's level of immunity from oom-kill in /proc/PID/oom_score
Manually invoking oom-kill:
- echo f > /proc/sysrq-trigger
- Will call oom_kill to kill a memory-hog process
- Does not kill processes if memory is available
- Outputs verbose memory information in /var/log/messages
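oom_score can be read for every process to see whom the OOM killer would pick first; a read-only sketch (the head -5 cut-off is arbitrary):

```shell
# List the five processes with the highest badness scores.
for d in /proc/[0-9]*; do
    [ -r "$d/oom_score" ] || continue
    printf '%s %s\n' "$(cat "$d/oom_score" 2>/dev/null)" "${d#/proc/}"
done | sort -rn | head -5
```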
oom_adj:
- Range [-16, 15]: feeds into the badness calculation behind oom_score
- -17: the process is never killed
Tuning OOM policy:
Protect daemons from oom-kill:   // keep a given process from being OOM-killed
    echo n > /proc/PID/oom_adj
    The process to be killed in an out-of-memory situation is selected based on its badness score; oom_score gets multiplied by 2^n
    Caution: child processes inherit oom_adj from their parent
Disable oom-kill in /etc/sysctl.conf:   // panic instead of killing
    vm.panic_on_oom = 1
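A sketch of protecting a daemon; PID is a placeholder, the writes require root, and on newer kernels oom_adj is deprecated in favor of oom_score_adj:

```shell
# Shield a process from the OOM killer (run as root; PID is a placeholder).
# echo -17 > /proc/PID/oom_adj            # -17 = OOM_DISABLE, never killed
# echo -1000 > /proc/PID/oom_score_adj    # modern equivalent, range -1000..1000
cat /proc/self/oom_score                  # badness score derived from these knobs
```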
oom-kill is not a fix for memory leaks
Detecting memory leaks:
Two types of memory leaks
Virtual: process requests but does not use virtual address space (vsize)
Real: process fails to free memory (rss)
Use sar to observe system-wide memory changes
sar -R 1 120
Report memory statistics
Use watch with ps or pmap
watch -n1 'ps axo pid,comm,rss,vsize | grep httpd'
// if rss and vsize only ever grow and never shrink, a memory leak is likely
Use valgrind
valgrind --tool=memcheck cat /proc/$$/maps
==11763==
==11763== HEAP SUMMARY:
==11763== in use at exit: 0 bytes in 0 blocks
==11763== total heap usage: 31 allocs, 31 frees, 40,544 bytes allocated
==11763==
==11763== All heap blocks were freed -- no leaks are possible
==11763==
==11763== For counts of detected and suppressed errors, rerun with: -v
==11763== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)
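The run above was clean; to see memcheck actually flag a leak, a deliberately leaky program can be tested. gcc and valgrind are assumed to be installed, and /tmp/leak* are scratch names:

```shell
# Build a one-line leaker and let memcheck report the lost bytes.
cat > /tmp/leak.c <<'EOF'
#include <stdlib.h>
int main(void) {
    malloc(1024);   /* allocated, never freed: a real leak (shows up in rss) */
    return 0;
}
EOF
gcc -g -o /tmp/leak /tmp/leak.c
valgrind --leak-check=full /tmp/leak 2>&1 | grep 'definitely lost'
```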
Introduction to swap:
Swapping is the unmapping of page frames from an active process:
- Swap-out: page frames are unmapped and placed in page slots on a swap device
- Swap-in: page frames are read in from page slots on a swap device and mapped into a process address space
Which pages get swapped?
- Inactive pages
- Anonymous pages
Swap cache: contains unmodified swapped-in pages
- Pages loaded from swap into memory but not yet modified; if such a page is dropped, it can still be read back from swap
- Avoids race conditions when multiple processes access a common page frame
Improving swap performance:
Defer swap until think time:
- Frequent, small swaps   // use small swap areas
- Swap anonymous pages more readily
Reduce visit count (fewer swap accesses; add memory):
- Distribute swap areas across a maximum of 32 LUNs
- Assign an equal, high priority to multiple swap areas   // same priority across areas
- Kernel uses the highest priority first
- Kernel uses round-robin for swap areas of equal priority
Reduce service time (faster devices, e.g. SSDs, or the outermost disk tracks):
- Use partitions, not files
- Place swap areas on the lowest-numbered partitions of the fastest LUNs
Tuning swappiness:
Searching for inactive pages can consume the CPU:
- On large-memory boxes, finding and unmapping inactive pages consumes more disk and CPU resources than writing anonymous pages to disk
Prefer anonymous pages (higher value): vm.swappiness
- Linux prefers to swap anonymous pages when: % of memory mapped into page tables + vm.swappiness >= 100
- Raise it to swap anonymous pages sooner and free memory for cache; lower it to avoid swap. Database servers generally avoid swap.
Consequences:
- Reduced CPU utilization
- Reduced disk bandwidth
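The threshold rule can be checked with a worked example; 70% of memory mapped into page tables and a swappiness of 60 (the common default) are assumed numbers:

```shell
# Linux prefers swapping anonymous pages when mapped% + swappiness >= 100.
mapped_pct=70    # assumed: 70% of memory mapped into page tables
swappiness=60    # assumed: the common vm.swappiness default
if [ $(( mapped_pct + swappiness )) -ge 100 ]; then
    echo "prefer swapping anonymous pages"     # 70 + 60 = 130 >= 100
else
    echo "prefer reclaiming page cache"
fi
```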
Tuning swap size (how big should swap be?):
Considerations:
- Kernel uses two bytes in ZONE_NORMAL to track each page of swap
- Storage bandwidth cannot keep up with RAM and can lead to swaplock
- If the memory shortage is severe, the kernel will kill user-mode processes
General guidelines:
- Batch compute servers: up to 4 * RAM
- Database servers: <= 1 GiB
- Application servers: >= 0.5 * RAM
Consequence: avoids swaplock
Tuning swap access:
- Create up to 32 swap devices
- Make swap signatures: mkswap -L SWAP_LABEL /path/to/device
- Assign priority in /etc/fstab:
  /dev/sda1         swap  swap  pri=5  0 0
  /dev/sdb1         swap  swap  pri=5  0 0
  LABEL=testswap    swap  swap  pri=5  0 0
  /swaps/swapfile1  swap  swap  pri=1  0 0
- Activate: swapon -a
- View active swap devices in /proc/swaps
Monitoring swap:
Memory activity:
- vmstat -n [interval] [count]
- sar -r [interval] [count]   // report memory and swap space utilization statistics
Rate of change in memory:
- sar -R [interval] [count]   // report memory statistics
Swap activity:
- sar -W [interval] [count]   // report swapping statistics
All I/O:
- sar -B [interval] [count]   // report paging statistics
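When sar is not installed, the raw counters behind sar -W can be read from /proc/vmstat; a sketch:

```shell
# pswpin/pswpout are cumulative counts of pages swapped in/out since boot.
grep -E '^pswp(in|out) ' /proc/vmstat
```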