Extracting the Kernel from an Android Device and Reverse Engineering It
Reposted from http://blog.csdn.net/qq1084283172/article/details/57074695
I. Device Environment
- Model number: Nexus 5
- OS Version: Android 4.4.4 KTU84P
- Kernel Version: 3.4.0-gd59db4e
II. Extracting the Android Kernel
- adb shell
- su
- cd /dev/block/platform/msm_sdcc.1/by-name
- ls -l boot
boot is a symbolic link; its target, /dev/block/mmcblk0p19, is the boot partition.
Use dd to dump it to the sdcard directory on the Nexus 5:
- dd if=/dev/block/mmcblk0p19 of=/sdcard/boot.img
Then, on the host, use adb pull to copy the dumped boot.img to /home/androidcode/AndroidDevlop/Nexus5Boot:
- adb pull /sdcard/boot.img /home/androidcode/AndroidDevlop/Nexus5Boot
Analyze boot.img with Binwalk
1. Detailed Binwalk usage guide: "Binwalk: a powerful tool for backdoor (firmware) analysis"
2. Binwalk on GitHub: https://github.com/devttys0/binwalk
3. Binwalk official site: http://binwalk.org/
4. Binwalk wiki (usage notes): https://github.com/devttys0/binwalk/wiki
5. IDA plugins and scripts collected by the Binwalk author: https://github.com/devttys0/ida
6. Binwalk installation instructions: https://github.com/devttys0/binwalk/blob/master/INSTALL.md
Install Binwalk and analyze boot.img
- cd /home/androidcode/AndroidDevlop/Nexus5Boot/binwalk-master
- # Install binwalk following its installation instructions
- sudo python setup.py install
- # Analyze boot.img
- sudo binwalk ../boot.img >log
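The log written by the last command can also be read back by a script rather than by eye. The following is a minimal Python sketch, assuming the usual Binwalk text layout in which each result row holds a decimal offset, a hexadecimal offset, and a description. The function names parse_binwalk_log and gzip_offsets are illustrative and are not part of Binwalk.

```python
def parse_binwalk_log(text):
    """Return (decimal_offset, description) pairs from Binwalk's text output.

    Assumes each result row looks like: DECIMAL  HEXADECIMAL  DESCRIPTION.
    Header, separator, and blank lines are skipped.
    """
    results = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Result rows start with a decimal offset; anything else is skipped.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        results.append((int(fields[0]), fields[2].strip()))
    return results


def gzip_offsets(results):
    """Offsets of every entry whose description mentions gzip data."""
    return [off for off, desc in results if "gzip compressed data" in desc]
```

For example, gzip_offsets(parse_binwalk_log(open("log").read())) lists the candidate offsets to pass to dd in a later step.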
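Screenshot of the analysis result:
After the 2 KB header, boot.img contains two gzip archives: boot.img-kernel.gz, which is the Linux kernel, and boot.img-ramdisk.cpio.gz, which is the ramdisk.
The approximate layout is shown below. For details, see android/platform/system/core/master/mkbootimg/bootimg.h in the Android source; bootimg.h can be viewed online at https://android.googlesource.com/platform/system/core/+/master/mkbootimg/bootimg.h .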
- /* tools/mkbootimg/bootimg.h
- **
- ** Copyright 2007, The Android Open Source Project
- **
- ** Licensed under the Apache License, Version 2.0 (the "License");
- ** you may not use this file except in compliance with the License.
- ** You may obtain a copy of the License at
- **
- ** http://www.apache.org/licenses/LICENSE-2.0
- **
- ** Unless required by applicable law or agreed to in writing, software
- ** distributed under the License is distributed on an "AS IS" BASIS,
- ** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- ** See the License for the specific language governing permissions and
- ** limitations under the License.
- */
- #include <stdint.h>
- #ifndef _BOOT_IMAGE_H_
- #define _BOOT_IMAGE_H_
- typedef struct boot_img_hdr boot_img_hdr;
- #define BOOT_MAGIC "ANDROID!"
- #define BOOT_MAGIC_SIZE 8
- #define BOOT_NAME_SIZE 16
- #define BOOT_ARGS_SIZE 512
- #define BOOT_EXTRA_ARGS_SIZE 1024
- struct boot_img_hdr
- {
- uint8_t magic[BOOT_MAGIC_SIZE];
- uint32_t kernel_size; /* size in bytes */
- uint32_t kernel_addr; /* physical load addr */
- uint32_t ramdisk_size; /* size in bytes */
- uint32_t ramdisk_addr; /* physical load addr */
- uint32_t second_size; /* size in bytes */
- uint32_t second_addr; /* physical load addr */
- uint32_t tags_addr; /* physical addr for kernel tags */
- uint32_t page_size; /* flash page size we assume */
- uint32_t unused; /* reserved for future expansion: MUST be 0 */
- /* operating system version and security patch level; for
- * version "A.B.C" and patch level "Y-M-D":
- * ver = A << 14 | B << 7 | C (7 bits for each of A, B, C)
- * lvl = ((Y - 2000) & 127) << 4 | M (7 bits for Y, 4 bits for M)
- * os_version = ver << 11 | lvl */
- uint32_t os_version;
- uint8_t name[BOOT_NAME_SIZE]; /* asciiz product name */
- uint8_t cmdline[BOOT_ARGS_SIZE];
- uint32_t id[8]; /* timestamp / checksum / sha1 / etc */
- /* Supplemental command line data; kept here to maintain
- * binary compatibility with older versions of mkbootimg */
- uint8_t extra_cmdline[BOOT_EXTRA_ARGS_SIZE];
- } __attribute__((packed));
- /*
- ** +-----------------+
- ** | boot header | 1 page
- ** +-----------------+
- ** | kernel | n pages
- ** +-----------------+
- ** | ramdisk | m pages
- ** +-----------------+
- ** | second stage | o pages
- ** +-----------------+
- **
- ** n = (kernel_size + page_size - 1) / page_size
- ** m = (ramdisk_size + page_size - 1) / page_size
- ** o = (second_size + page_size - 1) / page_size
- **
- ** 0. all entities are page_size aligned in flash
- ** 1. kernel and ramdisk are required (size != 0)
- ** 2. second is optional (second_size == 0 -> no second)
- ** 3. load each element (kernel, ramdisk, second) at
- ** the specified physical address (kernel_addr, etc)
- ** 4. prepare tags at tag_addr. kernel_args[] is
- ** appended to the kernel commandline in the tags.
- ** 5. r0 = 0, r1 = MACHINE_TYPE, r2 = tags_addr
- ** 6. if second_size != 0: jump to second_addr
- ** else: jump to kernel_addr
- */
- #if 0
- typedef struct ptentry ptentry;
- struct ptentry {
- char name[16]; /* asciiz partition name */
- unsigned start; /* starting block number */
- unsigned length; /* length in blocks */
- unsigned flags; /* set to zero */
- };
- /* MSM Partition Table ATAG
- **
- ** length: 2 + 7 * n
- ** atag: 0x4d534d70
- ** <ptentry> x n
- */
- #endif
- #endif
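The header above can be read directly from a dumped boot.img. The following is a minimal Python sketch, assuming the packed little-endian layout of boot_img_hdr exactly as declared above; the names HEADER_FORMAT, BootHeader, parse_boot_header, and section_offsets are illustrative and do not come from the Android source.

```python
import struct
from collections import namedtuple

# magic[8], ten uint32 fields, name[16], cmdline[512], id[8] as 32 raw bytes,
# extra_cmdline[1024]; "<" means little-endian with no padding (packed).
HEADER_FORMAT = "<8s10I16s512s32s1024s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

BootHeader = namedtuple("BootHeader", [
    "magic", "kernel_size", "kernel_addr", "ramdisk_size", "ramdisk_addr",
    "second_size", "second_addr", "tags_addr", "page_size", "unused",
    "os_version", "name", "cmdline", "id", "extra_cmdline"])


def parse_boot_header(data):
    """Unpack boot_img_hdr from the first bytes of a boot image."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is shorter than the boot image header")
    hdr = BootHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
    if hdr.magic != b"ANDROID!":
        raise ValueError("not an Android boot image")
    return hdr


def pages(size, page_size):
    """Number of pages needed for size bytes, as in the header comment."""
    return (size + page_size - 1) // page_size


def section_offsets(hdr):
    """File offsets of the kernel, ramdisk, and second stage."""
    kernel_off = hdr.page_size  # the boot header occupies one page
    ramdisk_off = kernel_off + pages(hdr.kernel_size, hdr.page_size) * hdr.page_size
    second_off = ramdisk_off + pages(hdr.ramdisk_size, hdr.page_size) * hdr.page_size
    return kernel_off, ramdisk_off, second_off
```

With a page size of 2048 the kernel starts at offset 2048, which matches the 2 KB header mentioned above.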
For how boot.img is generated, see the mkbootimg script in the Android source (android/platform/system/core/master/mkbootimg/mkbootimg), which can be viewed online at https://android.googlesource.com/platform/system/core/+/master/mkbootimg/mkbootimg .
- #!/usr/bin/env python
- # Copyright 2015, The Android Open Source Project
- #
- # Licensed under the Apache License, Version 2.0 (the "License");
- # you may not use this file except in compliance with the License.
- # You may obtain a copy of the License at
- #
- # http://www.apache.org/licenses/LICENSE-2.0
- #
- # Unless required by applicable law or agreed to in writing, software
- # distributed under the License is distributed on an "AS IS" BASIS,
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- # See the License for the specific language governing permissions and
- # limitations under the License.
- from __future__ import print_function
- from sys import argv, exit, stderr
- from argparse import ArgumentParser, FileType, Action
- from os import fstat
- from struct import pack
- from hashlib import sha1
- import sys
- import re
- def filesize(f):
-     if f is None:
-         return 0
-     try:
-         return fstat(f.fileno()).st_size
-     except OSError:
-         return 0
- def update_sha(sha, f):
-     if f:
-         sha.update(f.read())
-         f.seek(0)
-         sha.update(pack('I', filesize(f)))
-     else:
-         sha.update(pack('I', 0))
- def pad_file(f, padding):
-     pad = (padding - (f.tell() & (padding - 1))) & (padding - 1)
-     f.write(pack(str(pad) + 'x'))
- def write_header(args):
-     BOOT_MAGIC = 'ANDROID!'.encode()
-     args.output.write(pack('8s', BOOT_MAGIC))
-     args.output.write(pack('10I',
-         filesize(args.kernel),                          # size in bytes
-         args.base + args.kernel_offset,                 # physical load addr
-         filesize(args.ramdisk),                         # size in bytes
-         args.base + args.ramdisk_offset,                # physical load addr
-         filesize(args.second),                          # size in bytes
-         args.base + args.second_offset,                 # physical load addr
-         args.base + args.tags_offset,                   # physical addr for kernel tags
-         args.pagesize,                                  # flash page size we assume
-         0,                                              # future expansion: MUST be 0
-         (args.os_version << 11) | args.os_patch_level)) # os version and patch level
-     args.output.write(pack('16s', args.board.encode())) # asciiz product name
-     args.output.write(pack('512s', args.cmdline[:512].encode()))
-     sha = sha1()
-     update_sha(sha, args.kernel)
-     update_sha(sha, args.ramdisk)
-     update_sha(sha, args.second)
-     img_id = pack('32s', sha.digest())
-     args.output.write(img_id)
-     args.output.write(pack('1024s', args.cmdline[512:].encode()))
-     pad_file(args.output, args.pagesize)
-     return img_id
- class ValidateStrLenAction(Action):
-     def __init__(self, option_strings, dest, nargs=None, **kwargs):
-         if 'maxlen' not in kwargs:
-             raise ValueError('maxlen must be set')
-         self.maxlen = int(kwargs['maxlen'])
-         del kwargs['maxlen']
-         super(ValidateStrLenAction, self).__init__(option_strings, dest, **kwargs)
-     def __call__(self, parser, namespace, values, option_string=None):
-         if len(values) > self.maxlen:
-             raise ValueError('String argument too long: max {0:d}, got {1:d}'.
-                 format(self.maxlen, len(values)))
-         setattr(namespace, self.dest, values)
- def write_padded_file(f_out, f_in, padding):
-     if f_in is None:
-         return
-     f_out.write(f_in.read())
-     pad_file(f_out, padding)
- def parse_int(x):
-     return int(x, 0)
- def parse_os_version(x):
-     match = re.search(r'^(\d{1,3})(?:\.(\d{1,3})(?:\.(\d{1,3}))?)?', x)
-     if match:
-         a = int(match.group(1))
-         b = c = 0
-         if match.lastindex >= 2:
-             b = int(match.group(2))
-         if match.lastindex == 3:
-             c = int(match.group(3))
-         # 7 bits allocated for each field
-         assert a < 128
-         assert b < 128
-         assert c < 128
-         return (a << 14) | (b << 7) | c
-     return 0
- def parse_os_patch_level(x):
-     match = re.search(r'^(\d{4})-(\d{2})-(\d{2})', x)
-     if match:
-         y = int(match.group(1)) - 2000
-         m = int(match.group(2))
-         # 7 bits allocated for the year, 4 bits for the month
-         assert y >= 0 and y < 128
-         assert m > 0 and m <= 12
-         return (y << 4) | m
-     return 0
- def parse_cmdline():
-     parser = ArgumentParser()
-     parser.add_argument('--kernel', help='path to the kernel', type=FileType('rb'),
-                         required=True)
-     parser.add_argument('--ramdisk', help='path to the ramdisk', type=FileType('rb'))
-     parser.add_argument('--second', help='path to the 2nd bootloader', type=FileType('rb'))
-     parser.add_argument('--cmdline', help='extra arguments to be passed on the '
-                         'kernel command line', default='', action=ValidateStrLenAction, maxlen=1536)
-     parser.add_argument('--base', help='base address', type=parse_int, default=0x10000000)
-     parser.add_argument('--kernel_offset', help='kernel offset', type=parse_int, default=0x00008000)
-     parser.add_argument('--ramdisk_offset', help='ramdisk offset', type=parse_int, default=0x01000000)
-     parser.add_argument('--second_offset', help='2nd bootloader offset', type=parse_int,
-                         default=0x00f00000)
-     parser.add_argument('--os_version', help='operating system version', type=parse_os_version,
-                         default=0)
-     parser.add_argument('--os_patch_level', help='operating system patch level',
-                         type=parse_os_patch_level, default=0)
-     parser.add_argument('--tags_offset', help='tags offset', type=parse_int, default=0x00000100)
-     parser.add_argument('--board', help='board name', default='', action=ValidateStrLenAction,
-                         maxlen=16)
-     parser.add_argument('--pagesize', help='page size', type=parse_int,
-                         choices=[2**i for i in range(11,15)], default=2048)
-     parser.add_argument('--id', help='print the image ID on standard output',
-                         action='store_true')
-     parser.add_argument('-o', '--output', help='output file name', type=FileType('wb'),
-                         required=True)
-     return parser.parse_args()
- def write_data(args):
-     write_padded_file(args.output, args.kernel, args.pagesize)
-     write_padded_file(args.output, args.ramdisk, args.pagesize)
-     write_padded_file(args.output, args.second, args.pagesize)
- def main():
-     args = parse_cmdline()
-     img_id = write_header(args)
-     write_data(args)
-     if args.id:
-         if isinstance(img_id, str):
-             # Python 2's struct.pack returns a string, but py3 returns bytes.
-             img_id = [ord(x) for x in img_id]
-         print('0x' + ''.join('{:02x}'.format(c) for c in img_id))
- if __name__ == '__main__':
-     main()
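The script packs the OS version and patch level into a single 32-bit os_version field. The following is a small Python sketch of the reverse direction, assuming the bit layout described in the bootimg.h comment (7 bits each for A, B, and C, then 7 bits for the year and 4 bits for the month); decode_os_version is an illustrative name, not a function from the Android source.

```python
def decode_os_version(os_version):
    """Split the packed os_version field into ("A.B.C", "YYYY-MM")."""
    ver = os_version >> 11      # upper 21 bits: A, B, C
    lvl = os_version & 0x7FF    # lower 11 bits: year and month
    a = (ver >> 14) & 0x7F
    b = (ver >> 7) & 0x7F
    c = ver & 0x7F
    year = (lvl >> 4) + 2000
    month = lvl & 0xF
    return "%d.%d.%d" % (a, b, c), "%04d-%02d" % (year, month)
```

A field value of 0, which is the default of both options in the script above, decodes to version 0.0.0 with month 00, meaning the field was simply not set.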
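Based on the information above, extract the compressed kernel from boot.img. The skip value, 20660, is the offset of the kernel's gzip stream as reported by Binwalk: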
- cd ../
- dd if=boot.img of=kernel.gz bs=1 skip=20660
Because the Android kernel is gzip-compressed, it must be decompressed to obtain the final kernel file:
- gzip -d kernel.gz
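The dd and gzip steps can also be done in one pass from a script. The sketch below is a minimal Python version, assuming the kernel is stored as an ordinary gzip stream somewhere inside boot.img; the helper names find_gzip_offsets and gunzip_at are illustrative. Because the data cut out by dd runs to the end of the image, gzip may report trailing data after the stream; zlib's decompressobj stops at the end of the first stream and leaves the remainder in unused_data.

```python
import zlib

# gzip member header: magic bytes 1f 8b followed by method 08 (deflate)
GZIP_MAGIC = b"\x1f\x8b\x08"


def find_gzip_offsets(data):
    """Return every offset at which the gzip magic occurs."""
    offsets = []
    pos = data.find(GZIP_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(GZIP_MAGIC, pos + 1)
    return offsets


def gunzip_at(data, offset):
    """Decompress the gzip stream that starts at offset.

    Bytes after the end of the stream are ignored. A false-positive
    magic match raises zlib.error.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return decompressor.decompress(data[offset:])
```

Usage would look like reading boot.img in binary mode, calling gunzip_at(data, 20660), and writing the result to a file named kernel. The three-byte magic can occur by chance, so offsets from find_gzip_offsets are only candidates.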
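Note: there are many ways to extract and decompress an Android kernel, and many tools for the job. Any of the following tools can also be used to unpack boot.img and decompress the gzip data.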
- bootimg.exe https://github.com/cofface/android_bootimg
- bootimg-tools https://github.com/pbatard/bootimg-tools.git
- unpackbootimg http://bbs.pediy.com/showthread.php?t=197334
- abootimg https://github.com/ggrandou/abootimg
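The decompressed kernel file, kernel, contains no symbol information, so the symbols must also be extracted from the device. /proc/kallsyms holds all kernel symbols, but by default every address in it reads as 0, a measure that prevents kernel address leaks. Running the following commands on the device that boot.img was dumped from shows that all kernel symbol addresses are masked.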
- adb shell
- cat /proc/kallsyms
To obtain the real kernel symbol information, change the value of /proc/sys/kernel/kptr_restrict on the Android device as root, which turns off the masking.
- adb shell
- su
- # Check the default value
- cat /proc/sys/kernel/kptr_restrict
- # Disable kernel symbol masking
- echo 0 > /proc/sys/kernel/kptr_restrict
- # Check the modified value
- cat /proc/sys/kernel/kptr_restrict
- cat /proc/kallsyms
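With masking disabled, running cat /proc/kallsyms again shows the previously hidden kernel symbol addresses.
As root, dump the kernel symbols on the device and pull them to /home/androidcode/AndroidDevlop/Nexus5Boot/syms.txt. All kernel symbols for the extracted kernel are then stored in syms.txt.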
- # cat /proc/kallsyms > /sdcard/syms.txt
- # exit
- $ exit
- $ adb pull /sdcard/syms.txt syms.txt
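Before importing syms.txt anywhere, it is worth confirming that the dump holds real addresses rather than zeros. The following is a minimal Python sketch, assuming each line of the dump has the form "address type name", optionally followed by a module name; parse_kallsyms and addresses_masked are illustrative names.

```python
def parse_kallsyms(lines):
    """Parse kallsyms-style lines into (address, type, name) tuples."""
    symbols = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(fields[0], 16), fields[1], fields[2]))
    return symbols


def addresses_masked(symbols):
    """True when every symbol address is 0, i.e. kptr_restrict was active."""
    return bool(symbols) and all(addr == 0 for addr, _, _ in symbols)
```

For example, addresses_masked(parse_kallsyms(open("syms.txt"))) returning True would mean the dump was taken before kptr_restrict was set to 0 and should be repeated.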
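III. Analyzing the Extracted Android Kernel in IDA
Drag the extracted Android kernel file, kernel, into IDA Pro 6.8 and set the processor type to ARM Little-endian.
Enter 0xc0008000 for both ROM start address and Loading address, then click OK.
*Why set ROM start address and Loading address to 0xc0008000? See the boot header data structure. A few of its values matter here; they are defined in boot/boardconfig.h, and each chip has its own boardconfig under vendor. In this case the values (the physical addresses at which the kernel, ramdisk, and tags are loaded into RAM) are: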
- #define PHYSICAL_DRAM_BASE 0x00200000
- #define KERNEL_ADDR (PHYSICAL_DRAM_BASE + 0x00008000)
- #define RAMDISK_ADDR (PHYSICAL_DRAM_BASE + 0x01000000)
- #define TAGS_ADDR (PHYSICAL_DRAM_BASE + 0x00000100)
- #define NEWTAGS_ADDR (PHYSICAL_DRAM_BASE + 0x00004000)
Of these values, KERNEL_ADDR is ZTEXTADDR, RAMDISK_ADDR is INITRD_PHYS, and TAGS_ADDR is PARAMS_PHYS. The bootloader reads the kernel and ramdisk from the boot.img partition into RAM at the addresses defined above, then jumps to ZTEXTADDR to start execution.
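The address arithmetic can be checked in a few lines of Python. Note that the addresses defined above are physical, while the addresses in /proc/kallsyms and the value entered in IDA are virtual. The sketch below assumes the common 32-bit ARM Linux layout in which the kernel's virtual base (PAGE_OFFSET) is 0xC0000000 and the kernel text sits 0x8000 above it; that assumption is not stated in the original post, and the constant and function names are illustrative.

```python
# Offsets taken from the boardconfig.h excerpt above
OFFSETS = {
    "KERNEL_ADDR": 0x00008000,
    "RAMDISK_ADDR": 0x01000000,
    "TAGS_ADDR": 0x00000100,
    "NEWTAGS_ADDR": 0x00004000,
}

# Assumption: typical 3G/1G split on 32-bit ARM Linux
PAGE_OFFSET = 0xC0000000
TEXT_OFFSET = 0x00008000


def physical_addresses(dram_base, offsets=OFFSETS):
    """Physical load addresses for a given PHYSICAL_DRAM_BASE."""
    return dict((name, dram_base + off) for name, off in offsets.items())


def kernel_virtual_base(page_offset=PAGE_OFFSET, text_offset=TEXT_OFFSET):
    """Virtual address at which the kernel text starts."""
    return page_offset + text_offset
```

With PHYSICAL_DRAM_BASE set to 0x00200000, KERNEL_ADDR works out to 0x00208000, and under the stated assumption kernel_virtual_base() gives 0xC0008000, the value entered in IDA.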
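For details, see: Android boot.img structure
Now the Android kernel code can be viewed and analyzed in IDA, but the function names are unfriendly: many kernel function names are missing, and IDA shows only its default auto-generated names.
This is where the kernel symbols dumped earlier come in. Importing them into IDA makes the corresponding function names appear. The following Python script does this: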
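- # IDAPython script (IDA 6.8): import the symbols dumped from /proc/kallsyms
- ksyms = open(r"C:\Users\Fly2016\Desktop\Binwalk工具\Nexus5_kernel\syms.txt")
- for line in ksyms:
-     fields = line.split()  # "<address> <type> <name> [module]"
-     if len(fields) < 3:
-         continue
-     addr = int(fields[0], 16)
-     name = fields[2]
-     idaapi.set_debug_name(addr, name)
-     MakeNameEx(addr, name, SN_NOWARN)
-     Message("%08X:%s\n" % (addr, name))
- ksyms.close()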
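Run the script above via File->Script Command in IDA. IDA then has the kernel symbols and displays the correct function names, including those of the system calls.
That's it. The Android kernel code can now be analyzed comfortably:
Summary: dumping device firmware in this way makes it possible to reverse engineer firmware binaries for which no source code is available. On Android devices, the same approach can be used to modify the kernel for anti-debugging or other purposes. Repack the modified kernel into boot.img with one of the boot.img pack/unpack tools, then run fastboot flash boot boot.img to update the kernel on the device.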
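For the repacking step, the options of the mkbootimg script listed earlier can be derived from the header values of the original image. The following Python sketch only builds the argument list; it assumes the kernel offset is the script's default of 0x8000, so that the base is the kernel address minus that offset. The function name mkbootimg_args and the "mkbootimg" executable path are hypothetical, and the list must be adapted before being passed to a real tool.

```python
def mkbootimg_args(kernel_addr, ramdisk_addr, tags_addr, page_size, cmdline,
                   kernel_path, ramdisk_path, output_path,
                   kernel_offset=0x8000):
    """Build a mkbootimg command line that mirrors an existing header.

    All addresses are the physical addresses read from boot_img_hdr.
    Assumption: kernel_addr = base + kernel_offset, as in write_header().
    """
    base = kernel_addr - kernel_offset
    return [
        "mkbootimg",  # hypothetical path to the script
        "--kernel", kernel_path,
        "--ramdisk", ramdisk_path,
        "--cmdline", cmdline,
        "--base", "0x%x" % base,
        "--kernel_offset", "0x%x" % kernel_offset,
        "--ramdisk_offset", "0x%x" % (ramdisk_addr - base),
        "--tags_offset", "0x%x" % (tags_addr - base),
        "--pagesize", str(page_size),
        "-o", output_path,
    ]
```

The resulting list could be handed to subprocess.check_call once the paths point at real files. Keeping the addresses, page size, and command line identical to the original header is the design goal, since the bootloader relies on them.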
Further reading:
Extracting the kernel from an Android phone (main reference)