Debugging core files on Linux (Good)

Core dump files

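I. What is a core dump

A core dump normally contains the program's memory at the time of the failure, the register state, the stack pointer, memory-management information, and so on. It can be thought of as the program's current working state saved to a file. Many programs and operating systems automatically generate a core file when something goes wrong.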

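A program can dump core for many reasons. The following summary is based on past experience:

1 Out-of-bounds memory access

a) Using a wrong index, so that an array is accessed out of bounds.
b) Scanning a string and relying on the terminator to detect its end, when the string was never properly NUL-terminated.
c) Using string functions such as strcpy, strcat, sprintf, strcmp and strcasecmp in a way that reads or writes past the end of a buffer. Prefer strncpy, strlcpy, strncat, strlcat, snprintf, strncmp and strncasecmp, which take a length bound. Note that strncpy does not NUL-terminate the destination when the source is too long, and that strlcpy and strlcat are BSD extensions that are not available in every C library.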
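As a rough illustration of point c), the sketch below copies a string into a fixed-size buffer with snprintf and reports truncation instead of overflowing. The helper name copy_bounded is hypothetical and not part of any library; only snprintf itself is standard C.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes)
 * without ever writing past the end of dst.
 * Returns 0 on success, -1 if the arguments are invalid or the
 * string had to be truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* snprintf writes at most dst_size bytes, terminator included,
     * and returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);

    /* needed >= dst_size means the output did not fit. */
    return (needed < 0 || (size_t)needed >= dst_size) ? -1 : 0;
}
```

With an 8-byte buffer, copy_bounded(buf, sizeof buf, "hello") returns 0, while a longer source returns -1 and leaves a truncated but still NUL-terminated string in buf.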

2 Using thread-unsafe functions in a multithreaded program

Use the reentrant functions listed below instead, and take care, because several of them are easy to misuse:
asctime_r(3c)       gethostbyname_r(3n)     getservbyname_r(3n)
ctermid_r(3s)       gethostent_r(3n)        getservbyport_r(3n)
ctime_r(3c)         getlogin_r(3c)          getservent_r(3n)
fgetgrent_r(3c)     getnetbyaddr_r(3n)      getspent_r(3c)
fgetpwent_r(3c)     getnetbyname_r(3n)      getspnam_r(3c)
fgetspent_r(3c)     getnetent_r(3n)         gmtime_r(3c)
gamma_r(3m)         getnetgrent_r(3n)       lgamma_r(3m)
getauclassent_r(3)  getprotobyname_r(3n)    localtime_r(3c)
getauclassnam_r(3)  getprotobynumber_r(3n)  nis_sperror_r(3n)
getauevent_r(3)     getprotoent_r(3n)       rand_r(3c)
getauevnam_r(3)     getpwent_r(3c)          readdir_r(3c)
getauevnum_r(3)     getpwnam_r(3c)          strtok_r(3c)
getgrent_r(3c)      getpwuid_r(3c)          tmpnam_r(3s)
getgrgid_r(3c)      getrpcbyname_r(3n)      ttyname_r(3c)
getgrnam_r(3c)      getrpcbynumber_r(3n)
gethostbyaddr_r(3n) getrpcent_r(3n)
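To show what a reentrant variant looks like in use, here is a minimal sketch built on strtok_r. Unlike strtok, which keeps its position in hidden static storage shared by all threads, strtok_r stores it in a caller-supplied pointer. The helper name count_fields is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: count the comma-separated fields in a writable
 * buffer. strtok_r modifies the buffer, so do not pass a string literal. */
int count_fields(char *line)
{
    char *save = NULL;   /* per-call tokenizer state, owned by the caller */
    int count = 0;

    for (char *tok = strtok_r(line, ",", &save);
         tok != NULL;
         tok = strtok_r(NULL, ",", &save))
        count++;

    return count;
}
```

Because each call has its own save pointer, two threads can tokenize different buffers at the same time without interfering with each other.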

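3 Data read and written by multiple threads without lock protection

Global data that several threads access at the same time should be protected by a lock. Otherwise data races can corrupt it and easily lead to a core dump.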
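A minimal sketch of such lock protection with POSIX threads follows. The names shared_counter, worker and run_locked_counter, as well as the thread and iteration counts, are assumptions chosen for illustration. On older toolchains the program may need to be linked with -pthread.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker increments the shared counter, holding the lock
 * around every read-modify-write. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final counter
 * value, or -1 if not all threads could be created. */
long run_locked_counter(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    shared_counter = 0;
    while (started < NUM_THREADS &&
           pthread_create(&threads[started], NULL, worker, NULL) == 0)
        started++;

    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return (started == NUM_THREADS) ? shared_counter : -1;
}
```

With the lock in place the result is always NUM_THREADS * INCREMENTS; without it, increments can be lost because the threads race on shared_counter.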

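4 Invalid pointers

a) Dereferencing a null pointer.
b) Casting pointers carelessly. Unless a block of memory was originally allocated as a given structure or type, or as an array of it, do not cast a pointer to that memory into a pointer to that structure or type. Instead, copy the memory into an object of that structure or type and access the object. The reason is that if the start address of the memory is not aligned as the structure or type requires, accessing it can easily cause a core dump from a bus error. A bus error is usually caused by a pointer cast that makes the CPU read data in violation of certain bus rules (see Expert C Programming).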

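#include <stdlib.h>
#include <stdio.h>

int main(void) {
    /* The asm must run inside a function; at file scope it is never executed. */
#if defined(__GNUC__)
# if defined(__i386__)
    /* Enable alignment checking on x86 (set the AC bit in EFLAGS) */
    __asm__("pushf\norl $0x40000,(%esp)\npopf");
# elif defined(__x86_64__)
    /* Enable alignment checking on x86_64 (set the AC bit in RFLAGS) */
    __asm__("pushf\norl $0x40000,(%rsp)\npopf");
# endif
#endif

    union {
        char a[10];
        int i;
    } u;

    /* u.a[1] is one byte past an int-aligned address, so p is misaligned */
    int *p = (int *)&(u.a[1]);
    *p = 17;                /* misaligned store */
    printf("%d\n", *p);
    return 0;
}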

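The reason is as follows.

On x86, the processor handles a misaligned access by performing two aligned accesses and splicing the tail of the first together with the head of the second.

On architectures other than x86 where the hardware does not do this automatically, the misaligned access makes the program dump core.

If alignment checking is turned on in the code, running the program reports a bus error.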
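The copy-instead-of-cast advice from point b) can be sketched as follows. The helpers load_u32 and store_u32 are hypothetical names; they use memcpy, which works byte by byte and therefore has no alignment requirement on either pointer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from any address,
 * aligned or not, by copying its bytes into a properly aligned object. */
uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical helper: write a 32-bit value to any address,
 * aligned or not. */
void store_u32(void *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}
```

Replacing the cast in the example above with store_u32(&u.a[1], 17) followed by load_u32(&u.a[1]) gives the same result without ever dereferencing a misaligned int pointer.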

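5 Stack overflow

Avoid large local variables, because locals are allocated on the stack. They can easily overflow the stack and corrupt the program's stack and heap structures, leading to baffling errors.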
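One common way to follow this advice is to allocate large buffers on the heap. The sketch below assumes a 16 MiB buffer, which is larger than the 8 MiB default stack limit found on many Linux systems; the size and the helper name use_big_buffer are illustrative only.

```c
#include <stdlib.h>
#include <string.h>

#define BIG_BUFFER_SIZE (16u * 1024u * 1024u)   /* 16 MiB, assumed for illustration */

/* Hypothetical helper: work with a large buffer without putting it on
 * the stack. Returns 0 on success, -1 if the allocation failed. */
int use_big_buffer(void)
{
    /* "char buf[BIG_BUFFER_SIZE];" here would be a local variable on the
     * stack and could overflow it; malloc takes the memory from the heap. */
    char *buf = malloc(BIG_BUFFER_SIZE);
    if (buf == NULL)
        return -1;

    memset(buf, 0, BIG_BUFFER_SIZE);
    buf[BIG_BUFFER_SIZE - 1] = 'x';
    int ok = (buf[0] == 0 && buf[BIG_BUFFER_SIZE - 1] == 'x') ? 0 : -1;

    free(buf);
    return ok;
}
```

Unlike a stack overflow, a failed heap allocation is reported through the NULL return value, so the program can handle it instead of crashing.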

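1. Enabling core file generation and limiting its size
---------------------------------
1) Run ulimit -c to see whether core file generation is enabled. A result of 0 means the feature is off and no core file will be generated.

2) Run ulimit -c filesize to limit the size of core files (in bash, filesize is in units of 1024 bytes).

# ulimit -c 300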

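If the dump is larger than this limit, it is truncated and the resulting core file is incomplete; gdb reports an error when debugging such a file. ulimit -c unlimited removes the size limit on core files.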
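The same limit can also be adjusted from inside a program with getrlimit and setrlimit on RLIMIT_CORE. The sketch below raises the soft limit to the hard limit for the calling process; the helper name enable_core_dumps is hypothetical. Note that these calls work in bytes, unlike the shell's ulimit -c.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit of the
 * calling process to its hard limit.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to,
     * but not beyond, the hard limit. */
    lim.rlim_cur = lim.rlim_max;

    return setrlimit(RLIMIT_CORE, &lim);
}
```

If the hard limit itself is 0, this call succeeds but core files remain disabled; raising the hard limit requires privileges.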
2. Core file name and location
----------------------------

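II. Setting core file parameters

2.1 Temporary changes

If core files are generated without any extension, they are all named core, and each new core file overwrites the previous one.
1) /proc/sys/kernel/core_uses_pid controls whether the PID is appended to the core file name. If the file contains 1, the PID is added as an extension and core files are named core.xxxx; if it contains 0, every core file is named core.
The file can be changed with the following command: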

echo "1" > /proc/sys/kernel/core_uses_pid

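2) /proc/sys/kernel/core_pattern controls where core files are saved and how they are named. The file can be changed with the following command:

echo "/corefile/core-%e-%p-%t" > /proc/sys/kernel/core_pattern

This puts all core files in the /corefile directory, with names of the form core-<command name>-<pid>-<timestamp>.

To have the core file written to the directory the program runs in (its current working directory), use a pattern without a directory: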

echo "core-%e-%p-%t" > /proc/sys/kernel/core_pattern

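2.2 Permanent changes

Method 1: use the sysctl -w name=value command. For example: /sbin/sysctl -w kernel.core_pattern=/corefile/core-%e-%p-%t. Note that sysctl -w changes the running kernel only, so the setting is lost on reboot unless it is also written to /etc/sysctl.conf as in Method 2.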

Method 2:

Set the core dump file location by editing /etc/sysctl.conf:

vi /etc/sysctl.conf

Modify (or add) the following two variables:

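kernel.core_pattern = /var/core/core_%e_%p

kernel.core_uses_pid = 0

Note: with kernel.core_uses_pid = 1, the PID is appended to the core file name even if core_pattern does not contain %p.

Here the core files go to /var/core/; %e stands for the program name and %p for the process ID.

To have the core file written to the program's current working directory instead, omit the directory prefix:

kernel.core_pattern = core_%e_%p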

Then make the change take effect:

sysctl -p /etc/sysctl.conf

Note: omitting the directory prefix reportedly had no effect on a remote machine; this remains unverified.

The available format specifiers are:

%p - insert pid into filename
%u - insert current uid into filename
%g - insert current gid into filename
%s - insert signal that caused the coredump into the filename
%t - insert UNIX time that the coredump occurred into filename
%h - insert hostname where the coredump happened into filename
%e - insert coredumping executable name into filename
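To make the effect of these specifiers concrete, the sketch below expands a pattern in user space. It is only an illustration: the real expansion is done by the kernel, and the function expand_core_pattern, its parameters, and its handling of just %e, %p, %t and %% are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative only: mimic how a core_pattern such as "core-%e-%p-%t"
 * turns into a file name. Supports %e, %p, %t and %%.
 * Returns 0 on success, -1 on an unsupported specifier, a trailing '%',
 * or if the result does not fit in out. */
int expand_core_pattern(char *out, size_t out_size, const char *pattern,
                        const char *exe, long pid, long timestamp)
{
    size_t used = 0;

    if (out == NULL || pattern == NULL || exe == NULL || out_size == 0)
        return -1;
    out[0] = '\0';

    for (const char *c = pattern; *c != '\0'; c++) {
        size_t room = out_size - used;
        int n;

        if (*c != '%') {
            n = snprintf(out + used, room, "%c", *c);   /* literal character */
        } else {
            c++;                                        /* look at the specifier */
            switch (*c) {
            case 'e': n = snprintf(out + used, room, "%s", exe);        break;
            case 'p': n = snprintf(out + used, room, "%ld", pid);       break;
            case 't': n = snprintf(out + used, room, "%ld", timestamp); break;
            case '%': n = snprintf(out + used, room, "%%");             break;
            default:  return -1;
            }
        }

        if (n < 0 || (size_t)n >= room)
            return -1;                                  /* output would overflow */
        used += (size_t)n;
    }
    return 0;
}
```

For the pattern core-%e-%p-%t, a program named test with PID 22773 that crashed at UNIX time 1476869340 would produce core-test-22773-1476869340.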

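III. Inspecting a core file with gdb

After a core dump occurs, use gdb to examine the core file and locate the line that triggered the dump:
gdb [execfile] [core file]

For example: # gdb ./test core.22773

Once inside gdb, use the bt command to print the backtrace, which shows where the program was executing and so pinpoints the line on which the core dump occurred (build the program with -g so that line numbers are available).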
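To have something to practice on, a deliberately crashing process can be produced as in the sketch below. It forks a child that raises SIGSEGV, which stands in here for a real bug such as a null-pointer dereference, and the parent reports which signal ended the child. The function name crash_in_child is hypothetical. Whether a core file is actually written still depends on the ulimit and core_pattern settings described earlier.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that dies from SIGSEGV.
 * Returns the number of the signal that terminated the child,
 * or -1 if fork/waitpid failed or the child exited normally. */
int crash_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the default action for SIGSEGV is to terminate the
         * process and, if permitted, dump core. */
        raise(SIGSEGV);
        _exit(0);   /* not reached under the default signal action */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

If a core file is produced, it can be loaded together with the executable exactly as in the gdb command shown above.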

posted @ 2016-10-19 17:29 PKICA
