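36. Linux driver debugging: locating the faulty line of code from an oops
1. When a driver has a bug, for example it accesses an invalid memory address, the kernel prints a long oops message
1.1 Take an LED driver as an example
Comment out the ioremap() call in the open() function and use the physical address of GPIOF directly, as shown in the figure below:
1.2 Then build and load 26th_segmentfault and run the test program. The kernel prints an oops, as shown in the figure below:
2. Next, let's analyze the oops: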
Unable to handle kernel paging request at virtual address 56000050   //the kernel cannot handle a paging request for virtual address 56000050
pgd = c3850000
[56000050] *pgd=00000000
Internal error: Oops: 5 [#1]   //internal error: oops
Modules linked in: 26th_segmentfault   //modules currently loaded; 26th_segmentfault.ko is the only one here
CPU: 0  Not tainted  (2.6.22.6 #2)
PC is at first_drv_open+0x78/0x12c [26th_segmentfault]   //PC: address of the instruction that triggered the fault, at offset 0x78 inside first_drv_open(), whose total size is 0x12c; the name in brackets is the module
LR is at 0xc0365ed8   //LR value
/* register values at the time of the fault */
pc : [<bf000078>]  lr : [<c0365ed8>]  psr: 80000013
sp : c3fcbe80  ip : c0365ed8  fp : c3fcbe94
r10: 00000000  r9 : c3fca000  r8 : c04df960
r7 : 00000000  r6 : 00000000  r5 : bf000de4  r4 : 00000000
r3 : 00000000  r2 : 56000050  r1 : 00000001  r0 : 00000052
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: c000717f  Table: 33850000  DAC: 00000015
Process 26th_segmentfau (pid: 813, stack limit = 0xc3fca258)   //process running when the fault occurred: 26th_segmentfault (the name is truncated in the log)
Stack: (0xc3fcbe80 to 0xc3fcc000)   //stack dump
be80: c06d7660 c3e880c0 c3fcbebc c3fcbe98 c008d888 bf000010 00000000 c04df960
bea0: c3e880c0 c008d73c c0474e20 c3fb9534 c3fcbee4 c3fcbec0 c0089e48 c008d74c
bec0: c04df960 c3fcbf04 00000003 ffffff9c c002c044 c380a000 c3fcbefc c3fcbee8
bee0: c0089f64 c0089d58 00000000 00000002 c3fcbf68 c3fcbf00 c0089fb8 c0089f40
bf00: c3fcbf04 c3fb9534 c0474e20 00000000 00000000 c3851000 00000101 00000001
bf20: 00000000 c3fca000 c04c90a8 c04c90a0 ffffffe8 c380a000 c3fcbf68 c3fcbf48
bf40: c008a16c c009fc70 00000003 00000000 c04df960 00000002 be84ce38 c3fcbf94
bf60: c3fcbf6c c008a2f4 c0089f88 00008588 be84ce84 00008718 0000877c 00000005
bf80: c002c044 4013365c c3fcbfa4 c3fcbf98 c008a3a8 c008a2b0 00000000 c3fcbfa8
bfa0: c002bea0 c008a394 be84ce84 00008718 be84ce30 00000002 be84ce38 be84ce30
bfc0: be84ce84 00008718 0000877c 00000003 00008588 00000000 4013365c be84ce58
bfe0: 00000000 be84ce28 0000266c 400c98e0 60000010 be84ce30 30002031 30002431
Backtrace:   //backtrace
[<bf000000>] (first_drv_open+0x0/0x12c [26th_segmentfault]) from [<c008d888>] (chrdev_open+0x14c/0x164)
 r5:c3e880c0 r4:c06d7660
[<c008d73c>] (chrdev_open+0x0/0x164) from [<c0089e48>] (__dentry_open+0x100/0x1e8)
 r8:c3fb9534 r7:c0474e20 r6:c008d73c r5:c3e880c0 r4:c04df960
[<c0089d48>] (__dentry_open+0x0/0x1e8) from [<c0089f64>] (nameidata_to_filp+0x34/0x48)
[<c0089f30>] (nameidata_to_filp+0x0/0x48) from [<c0089fb8>] (do_filp_open+0x40/0x48)
 r4:00000002
[<c0089f78>] (do_filp_open+0x0/0x48) from [<c008a2f4>] (do_sys_open+0x54/0xe4)
 r5:be84ce38 r4:00000002
[<c008a2a0>] (do_sys_open+0x0/0xe4) from [<c008a3a8>] (sys_open+0x24/0x28)
[<c008a384>] (sys_open+0x0/0x28) from [<c002bea0>] (ret_fast_syscall+0x0/0x2c)
Code: bf000094 bf0000b4 bf0000d4 e5952000 (e5923000)
Segmentation fault
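2.1 The backtrace above shows the whole chain of function calls
For example, the backtrace above shows:
- sys_open()->do_sys_open()->do_filp_open()->nameidata_to_filp()->__dentry_open()->chrdev_open()->first_drv_open();
The fault finally occurs in first_drv_open();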
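Reading the call chain off the backtrace by hand is easy to get wrong, so here is a minimal Python sketch that does it mechanically. The backtrace lines are copied from the oops above with the register lines left out; the helper name call_chain and the regular expression are illustrative assumptions, not part of any kernel tool.

```python
import re

# Backtrace lines copied from the oops above (register lines omitted).
BACKTRACE = """\
[<bf000000>] (first_drv_open+0x0/0x12c [26th_segmentfault]) from [<c008d888>] (chrdev_open+0x14c/0x164)
[<c008d73c>] (chrdev_open+0x0/0x164) from [<c0089e48>] (__dentry_open+0x100/0x1e8)
[<c0089d48>] (__dentry_open+0x0/0x1e8) from [<c0089f64>] (nameidata_to_filp+0x34/0x48)
[<c0089f30>] (nameidata_to_filp+0x0/0x48) from [<c0089fb8>] (do_filp_open+0x40/0x48)
[<c0089f78>] (do_filp_open+0x0/0x48) from [<c008a2f4>] (do_sys_open+0x54/0xe4)
[<c008a2a0>] (do_sys_open+0x0/0xe4) from [<c008a3a8>] (sys_open+0x24/0x28)
[<c008a384>] (sys_open+0x0/0x28) from [<c002bea0>] (ret_fast_syscall+0x0/0x2c)
"""

# Each frame line starts with "[<addr>] (callee+0xoffset/0xsize ...".
FRAME_RE = re.compile(r"^\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_chain(backtrace):
    """Return the function names from the outermost caller to the faulting function."""
    callees = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            callees.append(match.group(1))
    # The oops lists the innermost frame first, so reverse the order.
    return list(reversed(callees))


print(" -> ".join(call_chain(BACKTRACE)))
# sys_open -> do_sys_open -> do_filp_open -> nameidata_to_filp -> __dentry_open -> chrdev_open -> first_drv_open
```

Note that __dentry_open() appears between nameidata_to_filp() and chrdev_open(), exactly as it is listed in the backtrace.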
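If the kernel is not configured to show backtraces, the call chain is not printed. To enable it, edit the kernel's .config file and add:
CONFIG_FRAME_POINTER=y //frame pointer support; the frame pointer is kept in the fp register
With this option, each function saves its caller's fp and its own return address on the stack on entry, so the stack frames form a chain. When something goes wrong, the kernel follows that chain starting from the current fp, recovers the call relationships, and prints the backtrace.
(PS: without this option you can still work out the call chain directly from the stack dump, which the next chapter covers: http://www.cnblogs.com/lifexy/p/8011966.html)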
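To make the frame-pointer chain concrete, the sketch below walks it through the stack dump from the oops above, starting at the fp value c3fcbe94 shown in the register dump. It assumes the APCS frame layout that ARM kernels of this era built with CONFIG_FRAME_POINTER are commonly described as using: fp points at the saved pc slot, the return address (lr) is at fp-4, and the caller's fp is at fp-12. That layout is an assumption here, not something the oops states; the function names load_stack and return_addresses are likewise made up for illustration.

```python
STACK_BASE = 0xC3FCBE80   # from "Stack: (0xc3fcbe80 to 0xc3fcc000)"
FP_AT_FAULT = 0xC3FCBE94  # from "fp : c3fcbe94"

# Stack rows copied from the oops above.
STACK_DUMP = """\
be80: c06d7660 c3e880c0 c3fcbebc c3fcbe98 c008d888 bf000010 00000000 c04df960
bea0: c3e880c0 c008d73c c0474e20 c3fb9534 c3fcbee4 c3fcbec0 c0089e48 c008d74c
bec0: c04df960 c3fcbf04 00000003 ffffff9c c002c044 c380a000 c3fcbefc c3fcbee8
bee0: c0089f64 c0089d58 00000000 00000002 c3fcbf68 c3fcbf00 c0089fb8 c0089f40
bf00: c3fcbf04 c3fb9534 c0474e20 00000000 00000000 c3851000 00000101 00000001
bf20: 00000000 c3fca000 c04c90a8 c04c90a0 ffffffe8 c380a000 c3fcbf68 c3fcbf48
bf40: c008a16c c009fc70 00000003 00000000 c04df960 00000002 be84ce38 c3fcbf94
bf60: c3fcbf6c c008a2f4 c0089f88 00008588 be84ce84 00008718 0000877c 00000005
bf80: c002c044 4013365c c3fcbfa4 c3fcbf98 c008a3a8 c008a2b0 00000000 c3fcbfa8
bfa0: c002bea0 c008a394 be84ce84 00008718 be84ce30 00000002 be84ce38 be84ce30
"""


def load_stack(dump, base):
    """Map each stack address to its 32-bit word, reading the rows in order."""
    memory = {}
    addr = base
    for line in dump.splitlines():
        _, _, words = line.partition(":")
        for word in words.split():
            memory[addr] = int(word, 16)
            addr += 4
    return memory


def return_addresses(memory, fp):
    """Follow the saved-fp chain and collect each frame's return address."""
    result = []
    while fp in memory and (fp - 12) in memory:
        result.append(memory[fp - 4])  # assumed slot of the saved lr
        fp = memory[fp - 12]           # assumed slot of the caller's fp
    return result


memory = load_stack(STACK_DUMP, STACK_BASE)
for address in return_addresses(memory, FP_AT_FAULT):
    print(f"{address:08x}")
```

The addresses it prints (c008d888, c0089e48, c0089f64, c0089fb8, c008a2f4, c008a3a8, c002bea0) are the same "from [<...>]" addresses listed in the backtrace, which supports the assumed layout for this particular dump.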
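2.2 Depending on the kernel environment, the oops may not print these lines:
Modules linked in: 26th_segmentfault
PC is at first_drv_open+0x78/0x12c [26th_segmentfault]
If only the raw PC value is printed, you cannot tell at a glance whether the fault is in a driver module or in one of the kernel's own functions.
So the most important part of the oops is still this line: pc : [<bf000078>]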
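As a small illustration, the following Python sketch pulls the raw pc value out of saved oops text, and also the symbolic "PC is at" line when the kernel printed one. The function names and regular expressions are assumptions made for this example; they are written against the exact line formats shown in the oops above.

```python
import re

# Three lines copied from the oops above.
OOPS_SNIPPET = """\
PC is at first_drv_open+0x78/0x12c [26th_segmentfault]
LR is at 0xc0365ed8
pc : [<bf000078>]    lr : [<c0365ed8>]    psr: 80000013
"""


def extract_pc(oops_text):
    """Return the raw pc value from the register dump, or None if absent."""
    match = re.search(r"^pc : \[<([0-9a-f]{8})>\]", oops_text, re.MULTILINE)
    return int(match.group(1), 16) if match else None


def extract_symbolic_pc(oops_text):
    """Return (function, offset, size, module) from "PC is at", or None if absent."""
    match = re.search(
        r"^PC is at (\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)(?: \[(\w+)\])?",
        oops_text,
        re.MULTILINE,
    )
    if not match:
        return None
    name, offset, size, module = match.groups()
    return name, int(offset, 16), int(size, 16), module


print(hex(extract_pc(OOPS_SNIPPET)))
print(extract_symbolic_pc(OOPS_SNIPPET))
```

When extract_symbolic_pc() returns None, the raw value from extract_pc() is all there is to work with, which is the situation the following sections deal with.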
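2.3 So how do we tell whether the PC address belongs to a kernel function or to a driver module we loaded?
Answer:
Open System.map in the root of the kernel source tree with "vi System.map". This file lists the virtual address of every symbol (functions and variables) in the kernel image, for example the kernel function root_dev_setup() shown in the figure below:
Using the vi commands :0 and :$ to jump to the first and last lines, we can see that the kernel's virtual addresses span c0004000~c03cebf4
The pc value bf000078 lies outside this range, so it is an address inside a driver module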
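The range check can be written down in a few lines of Python. The two bounds are the first and last System.map addresses quoted above for this particular kernel build; on any other build they would have to be read from that build's System.map. The function name is_in_kernel_image is an assumption for this sketch.

```python
# First and last addresses seen in System.map for this kernel build.
KERNEL_IMAGE_START = 0xC0004000
KERNEL_IMAGE_END = 0xC03CEBF4


def is_in_kernel_image(addr, start=KERNEL_IMAGE_START, end=KERNEL_IMAGE_END):
    """True if addr falls inside the address range covered by System.map."""
    return start <= addr <= end


for label, addr in [("pc", 0xBF000078), ("chrdev_open return address", 0xC008D888)]:
    where = "kernel image" if is_in_kernel_image(addr) else "outside the kernel image (e.g. a module)"
    print(f"{label} {addr:08x}: {where}")
```

An address outside the range is not proof by itself that it belongs to a module; /proc/kallsyms, covered next, confirms which module and function it falls in.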
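2.4 When several drivers are loaded, how do we tell which driver's function the PC value belongs to?
Answer: look it up in /proc/kallsyms:
#cat /proc/kallsyms //(kernel all symbols) lists the address of every kernel symbol (kernel functions, functions of loaded drivers, variables, and so on)
Or:
#cat /proc/kallsyms > /kallsyms.txt //save the addresses to kallsyms.txt
As shown in the figure below, kallsyms.txt shows that the pc value bf000078 falls inside first_drv_open() of the 26th_segmentfault driver, at bf000000+0x78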
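The same lookup can be scripted: find the symbol with the highest address that is still at or below the PC. The Python sketch below does this on a three-line sample. The addresses and names in the sample come from the backtrace above, but the symbol type letters (T, t) are assumptions, since the original kallsyms.txt is only shown as a figure. The helper names parse_kallsyms and resolve are also made up for this example.

```python
import bisect

# Sample in /proc/kallsyms format: "address type name [module]".
# Addresses and names are taken from the backtrace; type letters are assumed.
KALLSYMS_SAMPLE = """\
c008a384 T sys_open
c008d73c t chrdev_open
bf000000 t first_drv_open\t[26th_segmentfault]
"""


def parse_kallsyms(text):
    """Return a list of (address, name, module) tuples sorted by address."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        module = fields[3].strip("[]") if len(fields) > 3 else None
        symbols.append((int(fields[0], 16), fields[2], module))
    symbols.sort(key=lambda symbol: symbol[0])
    return symbols


def resolve(symbols, addr):
    """Return (name, offset, module) of the nearest symbol at or below addr."""
    addresses = [symbol[0] for symbol in symbols]
    index = bisect.bisect_right(addresses, addr) - 1
    if index < 0:
        return None
    base, name, module = symbols[index]
    return name, addr - base, module


symbols = parse_kallsyms(KALLSYMS_SAMPLE)
print(resolve(symbols, 0xBF000078))  # ('first_drv_open', 120, '26th_segmentfault')
```

Because kallsyms does not list symbol sizes, this picks the nearest lower symbol; with a complete kallsyms.txt that is normally the enclosing function, but the result should still be checked against the disassembly.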
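2.5 Then disassemble the driver:
arm-linux-objdump -D 26th_segmentfault.ko > 26th_segmentfault.dis //disassemble
2.6 Open the disassembly:
As shown in the figure below, kallsyms.txt is on the left and the disassembly 26th_segmentfault.dis is on the right
first_drv_open() sits at offset 0 in the disassembly and at bf000000 in kallsyms.txt, so the pc value bf000078 corresponds to address 78 in the disassembly: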
Disassembly of section .text:   //.text starts at address 0x00
00000000 <first_drv_open>:
38: e59fc0e8 ldr ip, [pc, #232] ; 128 <.text+0x128>   //ip = the word stored at .text+0x128, i.e. 0x56000050
... ...
50: e585c000 str ip, [r5]   //store ip (0x56000050) into the memory location r5 points to
... ...
74: e5952000 ldr r2, [r5]   //r2 = the value just stored, 0x56000050
78: e5923000 ldr r3, [r2]   //r3 = the word at address 0x56000050; this is the load that faults
7c: e3c33c3f bic r3, r3, #16128 ; 0x3f00   //clear bits 8~13 of the value read from 0x56000050
... ...
128: 56000050 undefined   //literal stored at .text+0x128: 0x56000050
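As shown above, the instruction at address 78 loads the contents of address 0x56000050 (held in r2) into r3.
But 0x56000050 is a physical address. The kernel runs with the MMU enabled and treats it as a virtual address, and nothing is mapped there (the oops shows *pgd=00000000), so the access faults.
This pinpoints the fault inside first_drv_open():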
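The oops also carries the faulting instruction itself: on the "Code:" line, the word in parentheses, (e5923000), is the instruction at the PC. As a cross-check against the disassembly, the Python sketch below decodes that word by hand. It only covers the one ARM encoding needed here, a word-sized LDR/STR with an immediate offset, pre-indexed and without writeback; it ignores the condition field, and the function name decode_ldr_str is an assumption for this example.

```python
REGISTER_NAMES = [
    "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
    "r8", "r9", "r10", "fp", "ip", "sp", "lr", "pc",
]


def decode_ldr_str(word):
    """Decode a word-sized ARM LDR/STR with an immediate offset (limited sketch)."""
    # Bits 27:26 must be 01 (single data transfer); bit 25 clear means immediate offset.
    if (word >> 26) & 0b11 != 0b01 or (word >> 25) & 1:
        raise ValueError("not an immediate-offset LDR/STR")
    pre_indexed = (word >> 24) & 1
    add_offset = (word >> 23) & 1
    byte_access = (word >> 22) & 1
    writeback = (word >> 21) & 1
    is_load = (word >> 20) & 1
    if not pre_indexed or byte_access or writeback:
        raise ValueError("addressing mode not covered by this sketch")
    base = REGISTER_NAMES[(word >> 16) & 0xF]
    dest = REGISTER_NAMES[(word >> 12) & 0xF]
    offset = word & 0xFFF
    mnemonic = "ldr" if is_load else "str"
    if offset == 0:
        return f"{mnemonic} {dest}, [{base}]"
    sign = "" if add_offset else "-"
    return f"{mnemonic} {dest}, [{base}, #{sign}{offset}]"


# The word in parentheses on the "Code:" line of the oops.
print(decode_ldr_str(0xE5923000))  # ldr r3, [r2]
```

The result matches line 78 of the disassembly, and the register dump shows r2 : 56000050, the same address the kernel reported it could not handle.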
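3. When the faulting driver lies within the kernel's own address range
3.1 Again using 26th_segmentfault.c as the example, first build it into the kernel:
#cp 26th_segmentfault.c /linux-2.6.22.6/drivers/char/ //copy the faulty driver into the character driver directory
#vi Makefile
Add:
obj-y += 26th_segmentfault.o //y: build this driver into the kernel
3.2 Then run make uImage and boot the new kernel. Running the test program again prints the oops
3.3 In the root of the kernel source tree, run:
# arm-none-linux-gnueabi-objdump -D vmlinux > vmlinux.dis
This disassembles the whole kernel. vmlinux is the uncompressed kernel image
3.4 Open vmlinux.dis with vi, then search for the PC value from the oops directly
The next chapter analyzes the function call chain from the stack dump: http://www.cnblogs.com/lifexy/p/8011966.html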