Linux commands: basic usage of strace

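  • What is strace

  strace is a user-space tracer for Linux that can be used for diagnostics, debugging, and teaching. We use it to monitor the interaction between user-space processes and the kernel, such as system calls, signal deliveries, and changes of process state.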

  • Usage
  1. strace help output
usage: strace [-CdffhiqrtttTvVwxxy] [-I n] [-e expr]...
              [-a column] [-o file] [-s strsize] [-P path]...
              -p pid... / [-D] [-E var=val]... [-u username] PROG [ARGS]
   or: strace -c[dfw] [-I n] [-e expr]... [-O overhead] [-S sortby]
              -p pid... / [-D] [-E var=val]... [-u username] PROG [ARGS]
Output format:
-a column alignment COLUMN for printing syscall results (default 40)
-i print instruction pointer at time of syscall
-o file send trace output to FILE instead of stderr
-q suppress messages about attaching, detaching, etc.
-r print relative timestamp
-s strsize limit length of print strings to STRSIZE chars (default 32)
-t print absolute timestamp
-tt print absolute timestamp with usecs
-T print time spent in each syscall
-x print non-ascii strings in hex
-xx print all strings in hex
-y print paths associated with file descriptor arguments
-yy print protocol specific information associated with socket file descriptors

Statistics:
-c count time, calls, and errors for each syscall and report summary
-C like -c but also print regular output
-O overhead set overhead for tracing syscalls to OVERHEAD usecs
-S sortby sort syscall counts by: time, calls, name, nothing (default time)
-w summarise syscall latency (default is system time)

Filtering:
-e expr a qualifying expression: option=[!]all or option=[!]val1[,val2]...
options: trace, abbrev, verbose, raw, signal, read, write
-P path trace accesses to path

Tracing:
-b execve detach on execve syscall
-D run tracer process as a detached grandchild, not as parent
-f follow forks
-ff follow forks with output into separate files
-I interruptible
1: no signals are blocked
2: fatal signals are blocked while decoding syscall (default)
3: fatal signals are always blocked (default if '-o FILE PROG')
4: fatal signals and SIGTSTP (^Z) are always blocked
(useful to make 'strace -o FILE PROG' not stop on ^Z)

Startup:
-E var remove var from the environment for command
-E var=val put var=val in the environment for command
-p pid trace process with process id PID, may be repeated
-u username run command as username handling setuid and/or setgid

Common usage

strace -tt  -e trace=process -o log.txt -f  command args

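Common trace classes:
-e trace=file     calls related to file access (those that take a file name as an argument)
-e trace=process  calls related to process management, such as fork/exec/exit_group
-e trace=network  calls related to network communication, such as socket/sendto/connect
-e trace=signal   calls related to sending and handling signals, such as kill/sigaction
-e trace=desc     calls related to file descriptors, such as write/read/select/epoll
-e trace=ipc      calls related to inter-process communication, such as shmget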
  • Scenarios where I have used it
  1. How system() works when it runs a single command, ls -lh
#include<stdio.h>
#include<stdlib.h>

int main()
{
    system("ls -lh");
    return 0;
}

  Compile the source with gcc systemcall.c -o systemcall, then trace it with strace -tt -o log.txt -e trace=process -f ./systemcall. The result is as follows:

33816 04:34:23.873655 execve("./systemcall", ["./systemcall"], [/* 25 vars */]) = 0
33816 04:34:23.875018 arch_prctl(ARCH_SET_FS, 0x7f81656b2740) = 0
33816 04:34:23.875312 clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7fff68104620) = 33817
33816 04:34:23.875399 wait4(33817,  <unfinished ...>
33817 04:34:23.875492 execve("/bin/sh", ["sh", "-c", "ls -lh"], [/* 25 vars */]) = 0
33817 04:34:23.876447 arch_prctl(ARCH_SET_FS, 0x7fd146dc9740) = 0
33817 04:34:23.879322 execve("/usr/bin/ls", ["ls", "-lh"], [/* 24 vars */]) = 0
33817 04:34:23.881106 arch_prctl(ARCH_SET_FS, 0x7f199c28a840) = 0
33817 04:34:23.887491 exit_group(0)     = ?
33817 04:34:23.887627 +++ exited with 0 +++
33816 04:34:23.887654 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 33817
33816 04:34:23.887734 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=33817, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
33816 04:34:23.887775 exit_group(0)     = ?
33816 04:34:23.887839 +++ exited with 0 +++

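The trace shows that system() calls clone to create child process 33817, while the parent 33816 blocks in wait4 until that child exits. The child calls execve to run /bin/sh -c "ls -lh". After sh parses the -c argument, it calls execve again to run ls -lh, so ls replaces sh inside the same process (still PID 33817) and no further child is created.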

  2. How system() works when it runs two commands, ls -lh;ls

#include<stdio.h>
#include<stdlib.h>
int main()
{
        system("ls -lh;ls");
        return 0;
}

  Compile the source with gcc systemcall2.c -o systemcall2, then trace it with strace -tt -o log.txt -e trace=process -f ./systemcall2. The result is as follows:

35076 04:43:55.567202 execve("./systemcall2", ["./systemcall2"], [/* 25 vars */]) = 0
35076 04:43:55.568049 arch_prctl(ARCH_SET_FS, 0x7f8e13d49740) = 0
35076 04:43:55.568376 clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7ffe05e7bd30) = 35077
35076 04:43:55.568472 wait4(35077,  <unfinished ...>
35077 04:43:55.568578 execve("/bin/sh", ["sh", "-c", "ls -lh;ls"], [/* 25 vars */]) = 0
35077 04:43:55.569651 arch_prctl(ARCH_SET_FS, 0x7f3ef8974740) = 0
35077 04:43:55.572759 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f3ef8974a10) = 35078
35077 04:43:55.573011 wait4(-1,  <unfinished ...>
35078 04:43:55.573253 execve("/usr/bin/ls", ["ls", "-lh"], [/* 24 vars */]) = 0
35078 04:43:55.575190 arch_prctl(ARCH_SET_FS, 0x7f07f5e0d840) = 0
35078 04:43:55.581789 exit_group(0)     = ?
35078 04:43:55.581945 +++ exited with 0 +++
35077 04:43:55.581972 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 35078
35077 04:43:55.582030 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=35078, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
35077 04:43:55.582060 wait4(-1, 0x7fff86e41950, WNOHANG, NULL) = -1 ECHILD (No child processes)
35077 04:43:55.582435 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f3ef8974a10) = 35079
35077 04:43:55.582768 wait4(-1,  <unfinished ...>
35079 04:43:55.583048 execve("/usr/bin/ls", ["ls"], [/* 24 vars */]) = 0
35079 04:43:55.585341 arch_prctl(ARCH_SET_FS, 0x7fd65912d840) = 0
35079 04:43:55.587148 exit_group(0)     = ?
35079 04:43:55.587264 +++ exited with 0 +++
35077 04:43:55.587291 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 35079
35077 04:43:55.587383 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=35079, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
35077 04:43:55.587413 wait4(-1, 0x7fff86e41990, WNOHANG, NULL) = -1 ECHILD (No child processes)
35077 04:43:55.587508 exit_group(0)     = ?
35077 04:43:55.587626 +++ exited with 0 +++
35076 04:43:55.587649 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 35077
35076 04:43:55.587721 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=35077, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
35076 04:43:55.587757 exit_group(0)     = ?
35076 04:43:55.587826 +++ exited with 0 +++

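  According to the trace, system() creates child process 35077 to run /bin/sh -c "ls -lh;ls". Process 35077 then creates child 35078 to run ls -lh, waits for it, and after 35078 exits it creates another child, 35079, to run ls.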

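  Taken together, the two traces show that system() first creates a process to run /bin/sh, and sh then parses its arguments. If there is only one command, sh calls execve on it directly; otherwise the sh process creates a child process for each command, and each child calls execve to run its command.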

 

posted @ 2020-02-06 16:21  ZhaoKevin