execsnoop-短时进程追踪工具
在实际工作中,偶尔会遇到系统的CPU使用率和系统平均负载很高,但却找不到高CPU的应用;
产生这个问题的原因:进程有可能在不断的崩溃、重启
通过uptime发现系统负载很高,但是通过top,mpstat,pidstat,perf等工具很难发现是什么进程导致了系统负载和CPU使用率很高;
注:通过上面工具的判断,即不是CPU密集型,也不存在IO等待,也不存在进程、线程争用的情况
execsnoop-专门用于为追踪短时进程(瞬时进程)设计的工具;
它通过 ftrace 实时监控进程的 exec() 行为,并输出短时进程的基本信息,包括进程 PID、父进程 PID、命令行参数以及执行的结果。
github地址:https://github.com/brendangregg/perf-tools/blob/master/execsnoop
如何安装使用:将上面的github的内容复制,然后写入execsnoop文件,并且加上x权限即可;
[root@localhost ~]# cat execsnoop #!/bin/bash # # execsnoop - trace process exec() with arguments. # Written using Linux ftrace. # # This shows the execution of new processes, especially short-lived ones that # can be missed by sampling tools such as top(1). # # USAGE: ./execsnoop [-hrt] [-n name] # # REQUIREMENTS: FTRACE and KPROBE CONFIG, sched:sched_process_fork tracepoint, # and either the sys_execve, stub_execve or do_execve kernel function. You may # already have these on recent kernels. And awk. # # This traces exec() from the fork()->exec() sequence, which means it won't # catch new processes that only fork(). With the -r option, it will also catch # processes that re-exec. It makes a best-effort attempt to retrieve the program # arguments and PPID; if these are unavailable, 0 and "[?]" are printed # respectively. There is also a limit to the number of arguments printed (by # default, 8), which can be increased using -a. # # This implementation is designed to work on older kernel versions, and without # kernel debuginfo. It works by dynamic tracing an execve kernel function to # read the arguments from the %si register. The sys_execve function is tried # first, then stub_execve and do_execve. The sched:sched_process_fork # tracepoint is used to get the PPID. This program is a workaround that should be # improved in the future when other kernel capabilities are made available. If # you need a more reliable tool now, then consider other tracing alternatives # (eg, SystemTap). This tool is really a proof of concept to see what ftrace can # currently do. # # From perf-tools: https://github.com/brendangregg/perf-tools # # See the execsnoop(8) man page (in perf-tools) for more info. # # COPYRIGHT: Copyright (c) 2014 Brendan Gregg. # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software Foundation, # Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. # # (http://www.gnu.org/copyleft/gpl.html) # # 07-Jul-2014 Brendan Gregg Created this. ### default variables tracing=/sys/kernel/debug/tracing flock=/var/tmp/.ftrace-lock; wroteflock=0 opt_duration=0; duration=; opt_name=0; name=; opt_time=0; opt_reexec=0 opt_argc=0; argc=8; max_argc=16; ftext= trap ':' INT QUIT TERM PIPE HUP # sends execution to end tracing section function usage { cat <<-END >&2 USAGE: execsnoop [-hrt] [-a argc] [-d secs] [name] -d seconds # trace duration, and use buffers -a argc # max args to show (default 8) -r # include re-execs -t # include time (seconds) -h # this usage message name # process name to match (REs allowed) eg, execsnoop # watch exec()s live (unbuffered) execsnoop -d 1 # trace 1 sec (buffered) execsnoop grep # trace process names containing grep execsnoop 'udevd$' # process names ending in "udevd" See the man page and example file for more info. END exit } function warn { if ! eval "$@"; then echo >&2 "WARNING: command failed \"$@\"" fi } function end { # disable tracing echo 2>/dev/null echo "Ending tracing..." 2>/dev/null cd $tracing warn "echo 0 > events/kprobes/$kname/enable" warn "echo 0 > events/sched/sched_process_fork/enable" warn "echo -:$kname >> kprobe_events" warn "echo > trace" (( wroteflock )) && warn "rm $flock" } function die { echo >&2 "$@" exit 1 } function edie { # die with a quiet end() echo >&2 "$@" exec >/dev/null 2>&1 end exit 1 } ### process options while getopts a:d:hrt opt do case $opt in a) opt_argc=1; argc=$OPTARG ;; d) opt_duration=1; duration=$OPTARG ;; r) opt_reexec=1 ;; t) opt_time=1 ;; h|?) usage ;; esac done shift $(( $OPTIND - 1 )) if (( $# )); then opt_name=1 name=$1 shift fi (( $# )) && usage ### option logic (( opt_pid && opt_name )) && die "ERROR: use either -p or -n." (( opt_pid )) && ftext=" issued by PID $pid" (( opt_name )) && ftext=" issued by process name \"$name\"" (( opt_file )) && ftext="$ftext for filenames containing \"$file\"" (( opt_argc && argc > max_argc )) && die "ERROR: max -a argc is $max_argc." if (( opt_duration )); then echo "Tracing exec()s$ftext for $duration seconds (buffered)..." else echo "Tracing exec()s$ftext. Ctrl-C to end." fi ### select awk if (( opt_duration )); then [[ -x /usr/bin/mawk ]] && awk=mawk || awk=awk else # workarounds for mawk/gawk fflush behavior if [[ -x /usr/bin/gawk ]]; then awk=gawk elif [[ -x /usr/bin/mawk ]]; then awk="mawk -W interactive" else awk=awk fi fi ### check permissions cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE? debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)" ### ftrace lock [[ -e $flock ]] && die "ERROR: ftrace may be in use by PID $(cat $flock) $flock" echo $$ > $flock || die "ERROR: unable to write $flock." wroteflock=1 ### build probe if [[ -x /usr/bin/getconf ]]; then bits=$(getconf LONG_BIT) else bits=64 [[ $(uname -m) == i* ]] && bits=32 fi (( offset = bits / 8 )) function makeprobe { func=$1 kname=execsnoop_$func kprobe="p:$kname $func" i=0 while (( i < argc + 1 )); do # p:kname do_execve +0(+0(%si)):string +0(+8(%si)):string ... kprobe="$kprobe +0(+$(( i * offset ))(%si)):string" (( i++ )) done } # try in this order: sys_execve, stub_execve, do_execve makeprobe sys_execve ### setup and begin tracing echo nop > current_tracer if ! echo $kprobe >> kprobe_events 2>/dev/null; then makeprobe stub_execve if ! echo $kprobe >> kprobe_events 2>/dev/null; then makeprobe do_execve if ! echo $kprobe >> kprobe_events 2>/dev/null; then edie "ERROR: adding a kprobe for execve. Exiting." fi fi fi if ! echo 1 > events/kprobes/$kname/enable; then edie "ERROR: enabling kprobe for execve. Exiting." fi if ! echo 1 > events/sched/sched_process_fork/enable; then edie "ERROR: enabling sched:sched_process_fork tracepoint. Exiting." fi echo "Instrumenting $func" (( opt_time )) && printf "%-16s " "TIMEs" printf "%6s %6s %s\n" "PID" "PPID" "ARGS" # # Determine output format. It may be one of the following (newest first): # TASK-PID CPU# |||| TIMESTAMP FUNCTION # TASK-PID CPU# TIMESTAMP FUNCTION # To differentiate between them, the number of header fields is counted, # and an offset set, to skip the extra column when needed. # offset=$($awk 'BEGIN { o = 0; } $1 == "#" && $2 ~ /TASK/ && NF == 6 { o = 1; } $2 ~ /TASK/ { print o; exit }' trace) ### print trace buffer warn "echo > trace" ( if (( opt_duration )); then # wait then dump buffer sleep $duration cat -v trace else # print buffer live cat -v trace_pipe fi ) | $awk -v o=$offset -v opt_name=$opt_name -v name=$name \ -v opt_duration=$opt_duration -v opt_time=$opt_time -v kname=$kname \ -v opt_reexec=$opt_reexec ' # common fields $1 != "#" { # task name can contain dashes comm = pid = $1 sub(/-[0-9][0-9]*/, "", comm) sub(/.*-/, "", pid) } $1 != "#" && $(4+o) ~ /sched_process_fork/ { cpid=$0 sub(/.* child_pid=/, "", cpid) sub(/ .*/, "", cpid) getppid[cpid] = pid delete seen[pid] } $1 != "#" && $(4+o) ~ kname { if (seen[pid]) next if (opt_name && comm !~ name) next # # examples: # ... arg1="/bin/echo" arg2="1" arg3="2" arg4="3" ... # ... arg1="sleep" arg2="2" arg3=(fault) arg4="" ... # ... arg1="" arg2=(fault) arg3="" arg4="" ... # the last example is uncommon, and may be a race. # if ($0 ~ /arg1=""/) { args = comm " [?]" } else { args=$0 sub(/ arg[0-9]*=\(fault\).*/, "", args) sub(/.*arg1="/, "", args) gsub(/" arg[0-9]*="/, " ", args) sub(/"$/, "", args) if ($0 !~ /\(fault\)/) args = args " [...]" } if (opt_time) { time = $(3+o); sub(":", "", time) printf "%-16s ", time } printf "%6s %6d %s\n", pid, getppid[pid], args if (!opt_duration fflush() if (!opt_reexec) { seen[pid] = 1 delete getppid[pid] } } $0 ~ /LOST.*EVENT[S]/ { print "WARNING: " $0 > "/dev/stderr" } ' ### end tracing end
[root@localhost ~]# ./execsnoop Tracing exec()s. Ctrl-C to end. Instrumenting sys_execve PID PPID ARGS 27570 27566 gawk -v o=1 -v opt_name=0 -v name= -v opt_duration=0 [...] 27571 27569 cat -v trace_pipe