IDAPython Tutorial 6

Using IDAPython to Make Your Life Easier: Part 6

Original: https://unit42.paloaltonetworks.com/unit42-using-idapython-to-make-your-life-easier-part-6/

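In Part 5 of our IDAPython blog series, we used IDAPython to extract embedded executables from malicious samples. For this sixth part, I want to discuss a highly automated way of using IDA. Specifically, we will look at how to load a file into IDA without spawning a GUI, automatically run an IDAPython script, and extract the results. With this technique we can process many samples very quickly, without manually opening each file in a new IDA instance and running the IDAPython script.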

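Many people may be surprised to learn that IDA can run entirely from the command line without spawning a GUI. To do so, run the IDA executable with the "-A" switch. This switch tells IDA to run in autonomous mode, which ensures that no windows or dialog boxes are shown to the user.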

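The following command-line examples demonstrate this technique on both OSX and Microsoft Windows. In these examples, the "-c" switch generates a new IDB file even if one already exists, and the "-S" switch specifies the IDAPython script to run on startup. We will use these switches later in this post.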

"/Applications/IDA Pro 6.9/idaq.app/Contents/MacOS/idaq" -c -A -S/tmp/script.py file.exe

"C:\Program Files\IDA 6.9\idaq.exe" -c -A -SC:\script.py C:\file.exe

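The same command can also be assembled programmatically, which is handy when the script path or target file changes from run to run. The following is a minimal sketch, not part of the original post: `build_ida_command` and `run_ida` are hypothetical helper names. It relies only on the three switches already described ("-c", "-A", "-S"), and it assumes the script path contains no spaces. Note that the script path is attached to "-S" with no space in between.

```python
import subprocess


def build_ida_command(ida_path, script_path, sample_path):
    """Return the argument list for one headless IDA run.

    Hypothetical helper: it only mirrors the command lines shown above.
    """
    return [
        ida_path,
        "-c",                # create a new IDB even if one already exists
        "-A",                # autonomous mode: no windows or dialogs
        "-S" + script_path,  # script to run; no space after -S
        sample_path,
    ]


def run_ida(ida_path, script_path, sample_path):
    """Launch IDA and block until it exits; returns the exit code."""
    return subprocess.call(build_ida_command(ida_path, script_path, sample_path))
```

A call such as `run_ida("/Applications/IDA Pro 6.9/idaq.app/Contents/MacOS/idaq", "/tmp/script.py", "file.exe")` would reproduce the OSX example. Passing the arguments as a list means the space in the IDA install path needs no shell quoting.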
Scenario

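For this example, I will use the Cmstar malware family, which Unit 42 has discussed previously. For those unfamiliar with it, Cmstar is a downloader: it retrieves a file hosted at a specific URL over HTTP(S) and executes it on the victim's system. The URLs in question are obfuscated, and the following routine reverses that obfuscation.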

def decode(data):
    out = ""
    c = 0
    for d in data:
        out += chr(ord(d) - c - 10)
        c += 1
    return out
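To see what the routine does, it helps to run it on a known value. The sketch below is an illustration only: `encode` is a hypothetical inverse written for this example (it is not taken from Cmstar), and the sample string is made up. Each character is shifted by its position plus 10, so decoding subtracts the same amount.

```python
def decode(data):
    """Undo the obfuscation: subtract (position + 10) from each character."""
    out = ""
    for position, ch in enumerate(data):
        out += chr(ord(ch) - position - 10)
    return out


def encode(text):
    """Hypothetical inverse of decode(), used here only to make test input."""
    out = ""
    for position, ch in enumerate(text):
        out += chr(ord(ch) + position + 10)
    return out


sample = "abc"             # made-up plaintext
obfuscated = encode(sample)
print(repr(obfuscated))    # 'kmo'
print(decode(obfuscated))  # abc
```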

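Knowing this, our next task is to identify where this data lives in a Cmstar sample. By comparing several samples, we concluded that two encrypted strings are stored into variables via a call to memcpy. One of the strings contains the domain or IP address that the malware will connect to, while the other contains the URI.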

We also noticed that the same sequence of instructions is executed whenever this memcpy occurs:

mov esi, [offset]
pop ecx
lea edi, [variable]
rep movsd

Figure 1 Function containing encoded strings in Cmstar

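Armed with this information, we can try to identify this instruction sequence using IDAPython. To do so, we iterate over every function IDA has identified, skipping any function flagged as a thunk (jump function) or as belonging to a known library. The remaining functions are then walked with a sliding window that examines four instructions at a time, looking for the markers identified above to determine whether there is a match: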

import idc, idautils

for func in idautils.Functions():
    flags = idc.GetFunctionFlags(func)
    # Ignore THUNK (jump function) or library functions
    if flags & idc.FUNC_LIB or flags & idc.FUNC_THUNK:
        continue
    dism_addr = list(idautils.FuncItems(func))
    for c in range(len(dism_addr)):
        try:
            # Look at four instructions at a time
            v1 = dism_addr[c]
            v2 = dism_addr[c+1]
            v3 = dism_addr[c+2]
            v4 = dism_addr[c+3]
            # Look for known markers indicating we're seeing the encoded strings
            # being copied to a variable.
            if idc.GetMnem(v1) == 'mov' and idc.GetOpnd(v1, 0) == 'esi':
                if idc.GetMnem(v2) == 'pop' and idc.GetOpnd(v2, 0) == 'ecx':
                    if idc.GetMnem(v3) == 'lea' and idc.GetOpnd(v3, 0) == 'edi':
                        if idc.GetDisasm(v4) == 'rep movsd':
                            print("[*] Found instruction starting at 0x{address:x}".format(address=v1))
        except IndexError:
            # Sliding window went past the end of the function
            pass

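In the example above, I simply print a debug string when a match is found. Running this code in IDA against the sample with MD5 hash 4BEFA0F5B3F981E498ACD676EB352D45 gives the output below. As shown, we have successfully identified the addresses of the two obfuscated strings.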

Figure 2 Running the script against a Cmstar sample
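The sliding-window match can also be tried outside IDA. The sketch below is a simplified stand-in, assuming each instruction has already been reduced to an (address, mnemonic, first operand) tuple; the listing and its addresses are invented for the example, and `find_copy_sequences` is a hypothetical name. Bounding the loop so the window never runs past the end of the list removes the need for IndexError handling.

```python
# Pattern from the article: (mnemonic, first operand) for four
# consecutive instructions. The last entry stands for the full
# disassembly text "rep movsd", which has no operand to compare.
PATTERN = [("mov", "esi"), ("pop", "ecx"), ("lea", "edi"), ("rep movsd", "")]


def find_copy_sequences(instructions):
    """Return the start address of every window matching PATTERN.

    `instructions` is a list of (address, mnemonic, first_operand) tuples.
    """
    hits = []
    width = len(PATTERN)
    for start in range(len(instructions) - width + 1):
        window = instructions[start:start + width]
        if all((mnem, opnd) == expected
               for (_, mnem, opnd), expected in zip(window, PATTERN)):
            hits.append(window[0][0])
    return hits


# Invented listing for illustration; not taken from a real sample.
listing = [
    (0x1000, "push", "ebp"),
    (0x1001, "mov", "esi"),
    (0x1006, "pop", "ecx"),
    (0x1007, "lea", "edi"),
    (0x100A, "rep movsd", ""),
    (0x100C, "retn", ""),
]
print([hex(address) for address in find_copy_sequences(listing)])  # ['0x1001']
```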

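At this point, we can take the offsets we have identified and extract the strings they point to. These strings can then be decoded with the previously defined decode() function.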

addr = idc.GetOperandValue(v1, 1)
data = ""
while idc.Byte(addr) != 0x0:
    data += chr(idc.Byte(addr))
    addr += 1
decoded = decode(data)
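The extraction loop reads bytes until it meets a NUL terminator. Here is a minimal sketch of that logic outside IDA, assuming the image is available as a `bytes` object and that `offset` plays the role of the operand value. `read_c_string` and the buffer contents are made up for illustration. The sketch also stops at the end of the buffer, a guard the loop above does not have.

```python
def read_c_string(image, offset):
    """Collect bytes from `offset` up to (not including) the first NUL."""
    data = bytearray()
    while offset < len(image) and image[offset:offset + 1] != b"\x00":
        data += image[offset:offset + 1]
        offset += 1
    return bytes(data)


def decode(raw):
    """Same scheme as the article's decode(), applied to raw bytes."""
    return "".join(chr(value - position - 10)
                   for position, value in enumerate(bytearray(raw)))


# Made-up buffer: four padding bytes, the encoded form of "abc", then NUL.
image = b"\x90\x90\x90\x90kmo\x00\xcc"
print(decode(read_c_string(image, 4)))  # abc
```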

Putting all of this together, we arrive at the following script:

import idautils, idc, idaapi

def decode(data):
    out = ""
    c = 0
    for d in data:
        out += chr(ord(d) - c - 10)
        c += 1
    return out

# Wait for auto-analysis to finish before running script
idaapi.autoWait()

url = ""
for func in idautils.Functions():
    flags = idc.GetFunctionFlags(func)
    # Ignore THUNK (jump function) or library functions
    if flags & idc.FUNC_LIB or flags & idc.FUNC_THUNK:
        continue
    dism_addr = list(idautils.FuncItems(func))
    for c in range(len(dism_addr)):
        try:
            # Look at four instructions at a time
            v1 = dism_addr[c]
            v2 = dism_addr[c+1]
            v3 = dism_addr[c+2]
            v4 = dism_addr[c+3]
            # Look for known markers indicating we're seeing the encoded strings
            # being copied to a variable.
            if idc.GetMnem(v1) == 'mov' and idc.GetOpnd(v1, 0) == 'esi':
                if idc.GetMnem(v2) == 'pop' and idc.GetOpnd(v2, 0) == 'ecx':
                    if idc.GetMnem(v3) == 'lea' and idc.GetOpnd(v3, 0) == 'edi':
                        if idc.GetDisasm(v4) == 'rep movsd':
                            # Read the NUL-terminated encoded string and decode it
                            addr = idc.GetOperandValue(v1, 1)
                            data = ""
                            while idc.Byte(addr) != 0x0:
                                data += chr(idc.Byte(addr))
                                addr += 1
                            decoded = decode(data)
                            url += decoded
        except IndexError:
            # Sliding window went past the end of the function
            pass

# Append the result for this sample, then close IDA
current_file = idaapi.get_root_filename()
f = open("/tmp/output.txt", 'ab')
if url != "":
    f.write("[+] {0} : {1}\n".format(current_file, url))
f.close()

idc.Exit(0)

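At this stage, we can apply the automation technique of running IDA in non-GUI mode with the script above. This lets us run the script against a large number of samples without user interaction. We will run the script on an OSX machine as follows: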

for x in *; do /Applications/IDA\ Pro\ 6.9/idaq.app/Contents/MacOS/idaq -c -A -S/tmp/script.py "$x"; done

After a few minutes, /tmp/output.txt, the location where we instructed the script to store its results, contains the following.

Figure 3 Output of /tmp/output.txt
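Since every result line follows the `[+] <file> : <url>` layout written by the script, the output file is easy to post-process. The sketch below is a minimal example that assumes that layout; `parse_results` is a hypothetical helper, and the sample line is invented rather than real output.

```python
def parse_results(lines):
    """Map each sample file name to the URL recovered from it."""
    results = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("[+] "):
            continue  # skip blank or unrelated lines
        # Split on the first " : " only, so colons inside the URL survive.
        name, separator, url = line[4:].partition(" : ")
        if separator:
            results[name] = url
    return results


# Invented line in the script's output format.
example = ["[+] sample.exe : example.com/path/file.bin\n"]
print(parse_results(example))  # {'sample.exe': 'example.com/path/file.bin'}
```

In practice the function could be fed the file directly, for example `with open("/tmp/output.txt") as handle: parse_results(handle)`.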

Conclusion

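By leveraging IDAPython together with IDA's command-line switches, we successfully automated the extraction of download locations from many Cmstar samples. The technique scales easily to large sample sets, letting us perform IDAPython operations without manually opening each file in IDA. For readers who were unaware of this IDA capability, I urge you to look into it: it not only saves time but also makes working with large numbers of files much easier.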

posted @ 2020-02-14 12:49 DirWangK
