To make root-cause analysis easier after a problem occurs, deploy the following script on the server in advance.

 

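Deployment steps:

1. Upload the script [top_monitor.sh] to the server.

2. Log in to the virtual machine and switch to the root user.

3. Run: chmod +x <path>/top_monitor.sh  (replace <path> with the directory the script was uploaded to)

4. Run: crontab -e

5. Add the scheduled job: * * * * * <the path of the script>/top_monitor.sh <the path of outputs saved> <keeps>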

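Notes:

Example: * * * * * /root/top_monitor.sh /var/tmp/ 72 &

a. keeps is in hours; to keep three days of data, enter 72.

b. The script collects the output of top, vmstat, and iostat every five seconds, and each one-minute cron run writes a new performance data file.

c. Do not store your own data in the topMon subdirectory of the output path (/var/tmp/topMon in the example); the script deletes expired files there.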
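
Steps 4 and 5 can also be done without opening an editor. The following is a minimal sketch, not part of the original script: with_monitor_entry is a hypothetical helper that reads an existing crontab on stdin and prints it with the monitor entry appended, leaving the input unchanged if the script path is already listed. The paths and the value 72 are the example values from above.

```shell
#!/bin/bash
# Hypothetical helper: read a crontab on stdin, print it with the
# top_monitor.sh entry appended. If the script path already appears
# in the input, the input is printed unchanged.
# Arguments: <script path> <output path> <keeps in hours>
with_monitor_entry() {
    local script="$1" outdir="$2" keeps="$3"
    local current
    # Buffer the whole input first so nothing is printed before it is read.
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # grep -F treats the script path as a fixed string, not a regex.
    if ! printf '%s\n' "$current" | grep -qF -- "$script"; then
        printf '* * * * * %s %s %s\n' "$script" "$outdir" "$keeps"
    fi
}
```

One way to use it, as root: crontab -l 2>/dev/null | with_monitor_entry /root/top_monitor.sh /var/tmp/ 72 | crontab -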

6. The collected performance log looks like this:

2017-11-22 07:07:01
top - 07:07:01 up 1 day, 36 min,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 163 total,   1 running, 159 sleeping,   0 stopped,   3 zombie
Cpu(s):  1.0%us,  0.5%sy,  0.0%ni, 98.2%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3530640k total,   778368k used,  2752272k free,   127976k buffers
Swap:        0k total,        0k used,        0k free,   352288k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
   12 root      20   0     0    0    0 S  1.9  0.0   0:07.04 events/1           
    1 root      20   0 21388 1528 1204 S  0.0  0.0   0:07.80 init               
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kthreadd           
    3 root      RT   0     0    0    0 S  0.0  0.0   0:02.09 migration/0        
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.79 ksoftirqd/0        
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/0          
    6 root      RT   0     0    0    0 S  0.0  0.0   0:00.13 watchdog/0         
    7 root      RT   0     0    0    0 S  0.0  0.0   0:02.19 migration/1        
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/1          
    9 root      20   0     0    0    0 S  0.0  0.0   0:00.80 ksoftirqd/1        
   10 root      RT   0     0    0    0 S  0.0  0.0   0:00.13 watchdog/1         
   11 root      20   0     0    0    0 S  0.0  0.0   0:04.85 events/0           
   13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/0           
   14 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/1           
   15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/0      
   16 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/1      
   17 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef    
   18 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef    
   19 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup             
   20 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khelper            
   21 root      20   0     0    0    0 S  0.0  0.0   0:00.00 netns              
   22 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr          
   23 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pm                 
   24 root      20   0     0    0    0 S  0.0  0.0   0:00.39 sync_supers        
   25 root      20   0     0    0    0 S  0.0  0.0   0:00.06 bdi-default        
   26 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/0      
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 2752396 127976 352288    0    0     2    49   58   53  1  1 98  0  0       
Linux 2.6.32-696.3.2.el6.x86_64 (hlmcent69nma)  11/22/2017      _x86_64_        (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.00    0.00    0.52    0.27    0.00   98.20

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdc               0.00     0.00    0.00    0.00     0.03     0.00     8.02     0.00    1.88    1.88    0.00   1.88   0.00
sdb               0.00     6.55    0.01    0.08     0.09  3249.50 37296.35     0.00   13.02    0.60   14.82   0.50   0.00
sda               0.04    15.12    0.12    2.52     6.35   141.07    55.83     0.01    4.93    6.92    4.83   3.29   0.87
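
To skim many of these files for trouble spots, one option is to reduce each sample to its timestamp and CPU idle percentage. The following is a minimal sketch, assuming the older procps top layout shown above (Cpu(s): ... 98.2%id); newer top versions print the summary line differently. cpu_idle_per_sample is a hypothetical helper, not part of top_monitor.sh.

```shell
#!/bin/bash
# Hypothetical helper: print "<timestamp> <idle%>" for each sample
# in one topMon log file. Assumes the "Cpu(s): ... NN.N%id" layout.
cpu_idle_per_sample() {
    awk '
        # A sample starts with the line written by: date +"%Y-%m-%d %H:%M:%S"
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            ts = $0
            next
        }
        # Extract the number directly in front of "%id" on the top summary line.
        /^Cpu\(s\):/ && match($0, /[0-9.]+%id/) {
            print ts, substr($0, RSTART, RLENGTH - 3)
        }
    ' "$1"
}
```

Called with the path of a log file as its only argument, it prints one line per sample; for the sample above that line is 2017-11-22 07:07:01 98.2.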

 

The full script is as follows:

#! /bin/bash

## Require exactly two arguments: output path and retention in hours
if [ $# -ne 2 ]; then
        echo "Invalid option, try top_monitor.sh <path> <hours>" >&2
        exit 1
fi

dir=$1
keeps=$2

## Create the output folder and its topMon subfolder if they do not exist yet
mkdir -p "$dir/topMon"

## One output file per run, named by its start time
file=$dir/topMon/topMon_$(date +%Y%m%d%H%M%S)

## Clean up expired outputs. keeps is in hours, and find -mmin counts
## minutes since last modification, so convert before comparing.
find "$dir/topMon/" -type f -mmin +$((keeps * 60)) -exec rm -f {} +

## Record top, vmstat, and iostat outputs
for i in {0..10}
do
date +"%Y-%m-%d %H:%M:%S" >> $file
top -b -n 1 | head -33 >> $file
vmstat >> $file
iostat -x 1 1 >> $file
sleep 5
done
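
The retention rule can be tried out in isolation before trusting it with real logs. The following is a minimal sketch, assuming GNU find and that retention is measured in hours against each file's modification time; list_expired is a hypothetical helper, not part of top_monitor.sh.

```shell
#!/bin/bash
# Hypothetical helper: list regular files under <dir> that were last
# modified more than <hours> hours ago. find -mmin counts minutes,
# so the hours value is converted first.
list_expired() {
    local dir="$1" hours="$2"
    find "$dir" -type f -mmin +"$((hours * 60))" -print
}
```

Running list_expired /var/tmp/topMon 72 shows which files a 72-hour retention would remove, without deleting anything.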

 

posted on 2017-11-22 14:16  铭的随记