Proj. CAR Paper Reading: Neural reverse engineering of stripped binaries using augmented control flow graphs
Abstract
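Background:
Challenges of reverse engineering: a stripped binary carries very little semantic information, and different compiler optimizations produce widely varying assembly code patterns
This paper:
Tool: Nero
Method: use static analysis to extract features of call sites, then combine them with the call-site structure of the control-flow graph to generate the target name. The paper applies this encoding in graph-based, LSTM-based, and transformer-based architectures
Task: predict procedure names in stripped binaries
Experiments:
Results:
- vs. existing methods: +28%
- vs. models that do not use any static analysis: +100%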
1. Intro
P1: Introduction to reverse engineering
P2: The importance of procedure names
P3: Most existing work focuses on high-level languages rather than binaries
P4:
Challenge 1: Little syntactic information and token coreference.
Challenge 2: Long procedure names
Steps of the approach:
- Build a control-flow graph (CFG) from the disassembled binary procedure input.
- Reconstruct a call-site-like structure for each call instruction present in the disassembled code.
- Use pointer-aware slicing to augment these call sites by finding concrete values or approximating abstracted values.
- Transform the CFG into an augmented call sites graph.
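The four steps above can be sketched on a toy example. This is only an illustration under loud assumptions, not Nero's implementation: the instruction tuples, the `CallSite` class, and the `CONST` / `ABSTRACT` / `UNKNOWN` labels are all hypothetical, the argument registers assume the System V x86-64 calling convention, and a naive backward scan inside one basic block stands in for the paper's pointer-aware slicing.

```python
from dataclasses import dataclass

# Assumption: first three System V x86-64 integer argument registers.
ARG_REGS = ["rdi", "rsi", "rdx"]


@dataclass(frozen=True)
class CallSite:
    block: str     # basic block containing the call
    index: int     # position of the call inside the block
    target: str    # callee name (e.g. an imported API)
    args: tuple    # one label per register in ARG_REGS


def resolve_arg(instrs, call_idx, reg):
    """Naive stand-in for slicing: find the last write to `reg` before the call."""
    for op, *operands in reversed(instrs[:call_idx]):
        if op == "call":  # an earlier call may clobber argument registers
            break
        if op in ("mov", "lea") and operands[0] == reg:
            src = operands[1]
            if op == "mov" and isinstance(src, int):
                return f"CONST:{src}"  # concrete value found
            return "ABSTRACT"          # value exists but is only approximated
    return "UNKNOWN"


def extract_call_sites(blocks):
    """Steps 2 and 3: rebuild a call-site-like structure and augment its arguments."""
    sites = {}
    for name, instrs in blocks.items():
        found = []
        for i, (op, *operands) in enumerate(instrs):
            if op == "call":
                args = tuple(resolve_arg(instrs, i, r) for r in ARG_REGS)
                found.append(CallSite(name, i, operands[0], args))
        sites[name] = found
    return sites


def first_reachable_sites(block, cfg, sites, seen):
    """First call sites reachable from the start of `block`, skipping call-free blocks."""
    if block in seen:
        return []
    seen.add(block)
    if sites[block]:
        return [sites[block][0]]
    out = []
    for succ in cfg.get(block, []):
        out.extend(first_reachable_sites(succ, cfg, sites, seen))
    return out


def build_call_site_graph(cfg, sites):
    """Step 4: keep only call sites as nodes, preserving CFG ordering as edges."""
    edges = set()
    for block, found in sites.items():
        edges.update(zip(found, found[1:]))  # consecutive calls in one block
        if found:
            seen = set()
            for succ in cfg.get(block, []):
                for nxt in first_reachable_sites(succ, cfg, sites, seen):
                    edges.add((found[-1], nxt))
    return edges


# Step 1 (assumed already done by a disassembler): basic blocks plus CFG edges.
blocks = {
    "entry": [("mov", "rdi", "rbx"), ("mov", "rsi", 0), ("call", "open"), ("jz", "fail")],
    "ok": [("mov", "rdi", "rax"), ("lea", "rsi", "[rsp+8]"), ("mov", "rdx", 64), ("call", "read")],
    "fail": [("mov", "rdi", 1), ("call", "exit")],
}
cfg = {"entry": ["ok", "fail"], "ok": [], "fail": []}

sites = extract_call_sites(blocks)
edges = build_call_site_graph(cfg, sites)
for src, dst in sorted(edges, key=lambda e: (e[0].target, e[1].target)):
    print(f"{src.target}{src.args} -> {dst.target}{dst.args}")
```

In this toy input the resulting graph has three nodes (`open`, `read`, `exit`) and two edges leaving `open`, so the non-call instructions disappear and only the call sites with their approximated arguments remain as input for a name-prediction model.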
Nero: A framework for binary name prediction. Nero is composed of:
- a static binary analyzer for producing the augmented call sites representation from stripped binary procedures
- three different neural networks for generating and evaluating name prediction