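How to use coredump
1. coredump
When a user-space process crashes, the kernel (with the default core_pattern) writes the coredump file into the process's current working directory. To change where coredump files are written and how they are named, apply the following settings.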
echo "/home/core-%e-%p-%u-%g-%t" > /proc/sys/kernel/core_pattern
echo 0x000003ff > /proc/self/coredump_filter
ulimit -c unlimited
(to make the limit persistent, add the ulimit line to the profile file and reload it)
source /etc/profile
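The value written to coredump_filter is a bit mask that selects which kinds of memory mappings go into the core file. The sketch below decodes such a mask using the bit meanings listed in the core(5) man page (bits 0 to 8). The function name decode_coredump_filter is made up for this illustration. Note that 0x3ff also sets bit 9, which core(5) does not list, so this sketch does not print anything for it.

```shell
# Hypothetical helper: list the mapping types selected by a coredump_filter mask.
# Bit meanings follow core(5); bits above 8 are ignored by this sketch.
decode_coredump_filter() (
    mask=$(( $1 ))
    bit=0
    for name in \
        "anonymous private mappings" \
        "anonymous shared mappings" \
        "file-backed private mappings" \
        "file-backed shared mappings" \
        "ELF headers" \
        "private huge pages" \
        "shared huge pages" \
        "private DAX pages" \
        "shared DAX pages"
    do
        # Print the name only if the corresponding bit is set in the mask.
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            printf 'bit %d: %s\n' "$bit" "$name"
        fi
        bit=$(( bit + 1 ))
    done
)

# 0x33 is the kernel default documented in core(5); 0x3ff is the value used above.
decode_coredump_filter 0x33
decode_coredump_filter 0x3ff
```

The mask is inherited by child processes, so writing it to /proc/self/coredump_filter from a shell affects programs started from that shell afterwards.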
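Specifiers that can be used in core_pattern:
%%  a literal %
%p  PID of the dumped process
%u  UID of the dumped process
%g  GID of the dumped process
%s  number of the signal that caused the dump
%t  time of the dump (seconds since the Epoch)
%e  name of the executable file
%h  hostname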
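To see what file name a given pattern would produce, the substitution can be imitated in the shell. This is only a sketch of the idea, not the kernel's implementation: the function name expand_core_pattern is invented here, it handles only %%, %e, %p, %u, %g and %t, and any other specifier is copied through unchanged. The executable name, PID and timestamp in the usage line are taken from the core file shown later in this article; UID 0 and GID 0 are assumptions.

```shell
# Hypothetical helper: expand a core_pattern template for a few specifiers.
# Arguments: pattern, executable name, pid, uid, gid, timestamp.
expand_core_pattern() (
    pattern=$1; exe=$2; pid=$3; uid=$4; gid=$5; ts=$6
    out=
    while [ -n "$pattern" ]; do
        case $pattern in
            %%*) out="$out%";    pattern=${pattern#"%%"} ;;
            %e*) out="$out$exe"; pattern=${pattern#"%e"} ;;
            %p*) out="$out$pid"; pattern=${pattern#"%p"} ;;
            %u*) out="$out$uid"; pattern=${pattern#"%u"} ;;
            %g*) out="$out$gid"; pattern=${pattern#"%g"} ;;
            %t*) out="$out$ts";  pattern=${pattern#"%t"} ;;
            *)   # Copy one ordinary character to the output.
                 rest=${pattern#?}
                 out="$out${pattern%"$rest"}"
                 pattern=$rest ;;
        esac
    done
    printf '%s\n' "$out"
)

expand_core_pattern "/home/core-%e-%p-%u-%g-%t" sysmonitor 98927 0 0 1468144735
# prints /home/core-sysmonitor-98927-0-0-1468144735
```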
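Commonly used gdb commands for analyzing a coredump:
bt  print the call stack
f num  select stack frame num and show it
disassemble 0x000000000040b9f0  disassemble the code at the given address
i r  show the register contents
p *(struct link_map*)0x7fab515ff690  print the structure stored at that address
x /40xb 0x7fab515ff690  dump the memory at that address (40 bytes, in hex)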
info proc mappings
(gdb) info proc mappings    (shows the process memory map, as in /proc/<pid>/maps)
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x410000    0x10000        0x0 /usr/bin/sysmonitor
            0x60f000           0x610000     0x1000     0xf000 /usr/bin/sysmonitor
            0x610000           0x611000     0x1000    0x10000 /usr/bin/sysmonitor
      0x7fab509ee000     0x7fab50ba4000   0x1b6000        0x0 /usr/lib64/libc-2.17.so
      0x7fab50ba4000     0x7fab50da3000   0x1ff000   0x1b6000 /usr/lib64/libc-2.17.so
      0x7fab50da3000     0x7fab50da7000     0x4000   0x1b5000 /usr/lib64/libc-2.17.so
      0x7fab50da7000     0x7fab50da9000     0x2000   0x1b9000 /usr/lib64/libc-2.17.so
      0x7fab50dae000     0x7fab50daf000     0x1000        0x0 /usr/lib64/libalarm.so
      0x7fab50daf000     0x7fab50faf000   0x200000     0x1000 /usr/lib64/libalarm.so
      0x7fab50faf000     0x7fab50fb0000     0x1000     0x1000 /usr/lib64/libalarm.so
info files
(gdb) info files
Symbols from "/home/sysmonitor".
Local core dump file:
    `/home/core.sysmonitor_98927_1468144735', file type elf64-x86-64.
    0x0000000000400000 - 0x0000000000410000 is load1
    0x000000000060f000 - 0x0000000000610000 is load2
    0x0000000000610000 - 0x0000000000611000 is load3
    0x0000000000611000 - 0x0000000000644000 is load4
    0x00000000023c0000 - 0x00000000023e1000 is load5
    0x00007fab3c000000 - 0x00007fab3c021000 is load6
    0x00007fab3c021000 - 0x00007fab40000000 is load7
    0x00007fab44000000 - 0x00007fab44021000 is load8
    0x00007fab44021000 - 0x00007fab48000000 is load9
    0x00007fab4a9e2000 - 0x00007fab4a9e3000 is load10
    0x00007fab4a9e3000 - 0x00007fab4b1e3000 is load11
    0x00007fab4b1e3000 - 0x00007fab4b1e4000 is load12
    0x00007fab4b1e4000 - 0x00007fab4b9e4000 is load13
    0x00007fab50da3740 - 0x00007fab50da3748 is .init_array in /usr/lib64/libc.so.6
    0x00007fab50da3748 - 0x00007fab50da3838 is __libc_subfreeres in /usr/lib64/libc.so.6
    0x00007fab50da3838 - 0x00007fab50da3840 is __libc_atexit in /usr/lib64/libc.so.6
    0x00007fab50da3840 - 0x00007fab50da3858 is __libc_thread_subfreeres in /usr/lib64/libc.so.6
    0x00007fab50da3860 - 0x00007fab50da6b80 is .data.rel.ro in /usr/lib64/libc.so.6
    0x00007fab50da6b80 - 0x00007fab50da6d70 is .dynamic in /usr/lib64/libc.so.6
    0x00007fab50da6d70 - 0x00007fab50da6ff0 is .got in /usr/lib64/libc.so.6
Invoking gdb: gdb /usr/bin/cmd core_xxx, where /usr/bin/cmd is the executable with debuginfo (or one built with -g) and core_xxx is the coredump file.
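watch *(int*)<address> sets a watchpoint on the 4 bytes at that address, to catch the user-space code that modifies them. A watchpoint only triggers while a live process is running under gdb, not when inspecting a core file.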
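For a first look at a core file, the commands above can also be run non-interactively. The wrapper below is a sketch: the function name core_report is invented for illustration, and the chosen set of commands is just one reasonable selection. It relies only on gdb's -batch and -ex options.

```shell
# Hypothetical wrapper: run a fixed set of gdb commands against a core file
# and print the results, without entering the interactive prompt.
core_report() {
    if [ $# -ne 2 ]; then
        echo "usage: core_report <binary-with-debuginfo> <core-file>" >&2
        return 2
    fi
    # -batch exits after the commands finish; each -ex runs one gdb command.
    gdb -batch \
        -ex "bt" \
        -ex "info registers" \
        -ex "info files" \
        "$1" "$2"
}

# Example (paths as in the text above):
#   core_report /usr/bin/cmd core_xxx > core_report.txt
```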
2. dmesg
When a user process dumps core, the exception is also recorded in the messages log, and in the dmesg log in particular. For example:
php-fpm[22053]: segfault at 2559 ip 000000398a6145b2 sp 00007fffad1d4b78 error 4 in ld-2.5.so[398a600000+1c000]
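The dmesg log also gives us useful information: the faulting address (0x2559), the instruction pointer (IP, 0x000000398a6145b2), the current stack pointer (SP, 0x00007fffad1d4b78), the error code (4), and the fact that the segfault occurred inside ld-2.5.so. Because we have no debug symbols for ld-2.5.so, we cannot map the segfault directly to a line of source code. With the IP value, however, we can still locate the exact assembly instruction.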
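objdump -d /lib64/ld-2.5.so > ld.asm

000000398a6145b0 <strcmp>:
  398a6145b0:   8a 07      mov    (%rdi),%al
  398a6145b2:   3a 06      cmp    (%rsi),%al
  398a6145b4:   75 0d      jne    398a6145c3 <strcmp+0x13>

The IP 398a6145b2 falls on cmp (%rsi),%al inside strcmp, which reads memory through %rsi. This suggests that the second argument passed to strcmp was an invalid pointer (0x2559).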
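Two numbers from the segfault line can be worked out mechanically. In this example the addresses in the objdump listing match the IP directly. When a library's listing starts from a different base, the value to search for is the offset of the IP inside the mapping, that is, IP minus the base address shown after the library name in the segfault line. The error code is the x86 page-fault error code: bit 0 distinguishes a protection violation from a page that is not present, bit 1 a write from a read, and bit 2 user mode from kernel mode. The sketch below computes both. The function names are made up for illustration, and higher bits of the error code are ignored.

```shell
# Hypothetical helper: offset of an address inside a mapping, printed in hex.
# Arguments: address, base address of the mapping.
fault_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Hypothetical helper: decode the low three bits of an x86 page-fault error code.
decode_fault_error() (
    err=$(( $1 ))
    if [ $(( err & 4 )) -ne 0 ]; then mode="user-mode"; else mode="kernel-mode"; fi
    if [ $(( err & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( err & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    printf '%s %s, %s\n' "$mode" "$access" "$cause"
)

# Values copied from the segfault line: ip, mapping base, error code.
fault_offset 0x398a6145b2 0x398a600000   # prints 0x145b2
decode_fault_error 4                     # prints user-mode read, page not present
```

The offset 0x145b2 is smaller than the mapping size 0x1c000, which confirms that the IP lies inside ld-2.5.so. Error code 4 matches the disassembly: a read from user mode of an address that is not mapped.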