论文笔记：Symbolic Execution for Software Testing: Three Decades Later

作者 Cristian Cadar 是英国帝国理工学院SRG（Software Reliability Group）小组负责人，斯坦福大学计算机科学博士，KLEE的开创者。

这是一篇发表于 2013年的综述性质的文章，主要讨论了符号执行的关键技术。

1. Introduction

an effective technique for generating highcoverage test suites and for finding deep errors in complex software applications.

we give an overview of modern symbolic execution
techniques, and discuss their challenges in terms of path exploration, constraint solving, and memory modeling.

choose to illustrate
some of the key challenges and proposed solutions by using
examples drawn primarily from the authors’ own work

符号执行的工作流程：

generate a set of concrete input values exercising that
path.
check for the presence of various kinds of errors
including assertion violations, uncaught exceptions, security
vulnerabilities, and memory corruption.

符号执行的优势：

符号执行能够生成高覆盖率的测试用例
符号执行提供了具体的测试用例来触发 bug，并且独立于符号执行程序。

这里有个问题，如何保证生成的测试用例的正确性？

这里我认为不用保证，对于生成的测试用例，不管正确与否，只要能被原程序执行后，发现原程序的错误就可以。也就是上述提到的第二点。

which can be used to confirm and debug the error independently of the symbolic execution tool that generated it

对于根据给定路径，寻找错误，符号执行比传统的动态执行技术像(valgrind 或者 purify)更好用。不像其他的程序分析技术，符号执行不仅仅可以寻找 generic error，像缓冲区溢出，而且可以 reason about 更高等级的程序性质，像复杂的程序断言。

2. Overview Of Classical Symbolic Execution

The key idea behind symbolic execution is to use
symbolic values, instead of concrete data values as input
and to represent the values of program variables as symbolic
expressions over the symbolic input values. As a result, the
output values computed by a program are expressed as a
function of the symbolic input values.

不使用具体的数值作为输入，而是使用符号表达式表示的符号值。
输出结果被表示为符号输入值的函数。

In software testing,
symbolic execution is used to generate a test input for each
execution path of a program.

如下图所示的例子中：

函数 testme(int x, int y) 有 3 个执行路径，如下图所示：

使用 $ {x = 0, y = 1}, {x = 2, y = 1}, {x = 30, y = 15} $ 这三个例子可以使得三条路径都被执行。所以符号执行的目的就是生成一个输入集合，使得在可接受的时间范围内，尽可能的执行更多的路径。

符号执行的内部数据结构状态如下所述：

Symbolic execution maintains a symbolic state σ, which
maps variables to symbolic expressions, and a symbolic
path constraint PC, which is a quantifier-free first-order
formula over symbolic expressions. At the beginning of a
symbolic execution, σ is initialized to an empty map and
PC is initialized to true. Both σ and PC are updated during
the course of symbolic execution. At the end of a symbolic
execution along an execution path of the program, PC is
solved using a constraint solver to generate concrete input
values. If the program is executed on these concrete input
values, it will take exactly the same path as the symbolic
execution and terminate in the same way.

要维持两个数据结构的状态：

PC (path constraint) 被初始化为 True
σ (变量与其符号值的映射) 初始化为空

如图所示，$ x 与 x_0 $ 就是将 $ x $ map 到 $ x_0 $ 这个符号值，同理 $ y $ 也进行这样的 map。

当遇到赋值操作，就需要将这个赋值操作的造成的影响更新到 σ 集合。

在遇到分支语句时 PC 需要更新为 $ PC \wedge σ(e) $。
同时创建一个新的 PC’ 用于表示 else 分支的，其值为 $ PC \wedge \neg σ(e) $。

如果 PC 是可以满足的，配合 σ 就可以继续进行符号执行。如果 PC 或者 PC‘ 是不可满足的终止当前路径的符号执行。

如何终止？

If a symbolic execution instance hits an exit statement or
an error (e.g., the program crashes or violates an assertion),
the current instance of symbolic execution is terminated and
a satisfying assignment to the current symbolic path constraint is generated, using an off-the-shelf constraint solver.

当符号执行遇到循环的结构的时候，会产生无数的路径，如下图：

循环会产生例如 $ N_1 = 3, N_2 = -1 $ 的路径，或者 $ N_1 = 1, N_2 = 2, N_3 = 3, ... N_9 = -10 $ 等

使用下面的式子进行表示

传统符号执行的一大问题

A key disadvantage of classical symbolic execution is
that it cannot generate an input if the symbolic path constraint along an execution path contains formulas that cannot be (efficiently) solved by a constraint solver.

我们将图一中的 twice 函数，换成下图所示的函数：

如果求解器不能计算非线性函数那么，$x_0 = (y_0y_0) mod 50 和 x_0 \neq (y_0y_0) mod 50 $ 这样的表达式就无法进行求解。

或者当 twice 函数是一个不知道如何现实的函数，求解器也无法进行运算。

下文将介绍两个现代的符号执行技术缓解上述问题，至少能保证对经过修改的程序，生成一些测试输入。

3. Modern Symbolic Execution Techniques

Concolic Testing

有记录变量状态的数据结构变为两个

concrete state maps (变量和具体值得映射)
symbolic stat maps (没有具体值就记录其符号值)

主要思路：

Concolic testing executes a program starting with some given or random
input, gathers symbolic constraints on inputs at conditional
statements along the execution, and then uses a constraint
solver to infer variants of the previous inputs in order to
steer the next execution of the program towards an alternative execution path.

Execution-Generated Testing (EGT)

KLEE 和 EXE 实现的主要方式

To this end, EGT intermixes concrete and symbolic execution by dynamically checking before every operation if the values involved are
all concrete.

在执行每个操作之前都要检查变量所有变量的状态，如果都是具体值，就执行原程序。
如果有一个值为符号值，就执行符号操作，更新路径以及约束条件。

One of the key advantages in mixing concrete and
symbolic execution is that imprecision caused by the interaction with external code or constraint solving timeouts can
be alleviated using concrete values.

the use of concrete values allows dynamic symbolic execution to recover from that imprecision, albeit at the cost of missing some execution paths,
and thus sacrificing completeness.

使用具体值得符号执行可以缓解外部代码和约束求解超时造成的不精确的问题，但是也会丧失完整性，导致因为所有变量使用的是具体值，从而某些路径无法执行到。

4. Key Challenges And Some Solutions

4.1 Path Explosion

程序的路径呈指数增长，在给定的时间内尽可能的发现更多的路径。

符号执行首先会过滤掉不依赖于符号输入的路径，然后是不可达的路径。

There are two
key approaches that have been used to address this problem:
heuristically prioritizing the exploration of the most promising path,
and using sound program analysis techniques to
reduce the complexity of the path exploration.

Heuristic techniques

使用控制流图指导路径发掘
从程序的输入开始，每次分支都随机选择一边
Another successful approach is to interleave symbolic exploration with random testing
More recently, symbolic execution was combined with evolutionary search, in which a fitness function is used to drive the exploration of the input space
Mutation testing has also been combined successfully with dynamic symbolic execution

Sound program analysis techniques

The other key way
in which the path explosion problem has been approached
was to use various ideas from program analysis and software
verification to reduce the complexity of the path exploration
in a sound way

merge explored paths statically, using select expressions that are then passed directly to the constraint
solver
compositional techniques improve symbolic execution by caching and reusing the analysis of lower-level functions in subsequent computations
lazy test generation

为了避免重复访问语句，要自动剪枝冗余路径。

通过约束和到达的位置情况，如果相同就计算一次。
可以通过将不相关的路径分区，分别进行处理。

4.2 Constraint Solving

约束求解是符号执行的瓶颈，需要优化生成的约束条件。

Irrelevant constraint elimination

determine the feasibility of taking a certain branch side

举了个例子，在混合符号执行过程中，会给一个判断条件进行取反来得到另一个分支的约束。通过观察发现，一个程序的分支判断条件，往往只依赖于一小部分变量。因此一个优化点就是删除掉不相关的变量。

More formally, the algorithm computes the transitive closure of all the constraints on which the negated constraint depends, by looking whether they share any variables between them.

Incremental solving

One important characteristic of the
constraint sets generated during symbolic execution is that
they are expressed in terms of a fixed set of static branches
from the program source code. For this reason, many paths
have similar constraint sets, and thus allow for similar solutions.

可以通过使用之前求出的结果，来加快约束求解的速度。

在 KLEE 中，如果碰到一个已经 cache 过的约束的子集，就可以删掉这个子集所代表的约束了，因为这个约束不影响最终父集所决定的结果。如果碰到一个已经 cache 过的约束的父集， KLEE 会通过检查 cache 的结果是否可以满足父集来决定是否继续求解。

在实践中约束通常不会使现有解决方案失效。

4.3 Memory Modeling

内存模型影响程序语句被翻译为符号约束的精确度。
对指针建模的处理，DART，CUTE 和 CREST对指针的处理能力很弱。EXE 和 KLEE 对指针类型进行建模。
精度和可扩展性之间的 trade-off，要根据被分析的代码和使用的约束求解器来决定。

In addition, note that in dynamic symbolic
execution, one can tune both scalability and precision by
customizing the use of concrete values in symbolic formulas.

4.4 Handling Concurrency

并发程序很难进行测试，动态符号执行能够高效的进行测试。

5. Tools

介绍了 5 个工具。

DART(Directed Automated Random Testing) 第一个动态符号执行工具，由贝尔实验室开发用于测试 C 程序。其目的是尽可能系统性地执行程序的所有路径。
CUTE(A Concolic Unit Testing Engine) 扩展了 DART 工具，由伊利诺伊大学厄巴纳-香槟分校开发，能够处理多线程程序，生成测试输入，和安排线程调度。适用于 C 语言程序。 jCute 适用于 Java 程序。
CREST 用于测试 C 程序，是一个可扩展的构建平台，用来尝试通过启发式的方法进行路径探索。
EXE 为 C 语言编写的复杂系统软件进行符号执行的工具。使用 bit 级别的内存模型，使用 STP 约束求解器。
KLEE 使用 LLVM 重新设计的 EXE，相对于 EXE，其最大的提升就是能够存储大量的并发状态，加强了与外部环境的交互能力，通过为此目的设计的模型。

还有介绍了一些工业界的符号执行工具。

6. Conclusion

符号执行已经变成一种高效的程序测试技术。能够自动化的生成触发程序错误的输入。错误包括 low-level 的程序 crash，到 higher-level 的语义性质错误。

generate test suite that achieve high program coverage; and provide per-path correctness guarantees.

existing tools have already proved effective in testing and finding errors in a variety of software,
varying from low-level network and operating systems code
to higher-level applications code.

词汇

draw from （从…中）得到，获得

reason about 关于...的原因

off-the-shelf 现成的; 非专门设计（或定制）的

alleviate 减轻; 缓和; 缓解

conjunct adj. 结合的；共同的；连接的

to this end 为了这个目的; 为此目的; 为此计; 为了达到这个目标; 为这目的;

intermix （使）混合；（使）混杂

creep in 开始发生（或影响）

albeit 同 although尽管；虽然

caveat 警告；告诫

interleave （尤指将片状物）插入，夹进

adequacy 足够的；合格的；合乎需要的

compositional 成分；构成；组合方式

counterexample-guided 反例引导

refinement （精细的）改进，改善

sound technique 可靠的技术

exploits 利用（…为自己谋利）

Reference

原文链接

posted on 2023-01-10 22:52 LambdaQ 阅读(193) 评论(0) 编辑收藏举报

刷新页面返回顶部

LiJQLOG