Proj THUDBFuzz Paper Reading: Singularity: Pattern Fuzzing for Worst Case Complexity

Abstract

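Task: determine (an approximation of) the worst-case complexity of a given application.
Idea: look for an input pattern, rather than a single concrete input, that maximizes the asymptotic resource usage of the target program.
This turns the testing problem into an optimal program synthesis problem.
Method: Recurrent Computation Graphs (RCGs) combined with a genetic algorithm to solve the optimal program synthesis problem.
Experiments: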

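  1. Effectively discovers the worst-case complexity of a variety of algorithms
  2. Is more scalable
  3. Finds previously unknown performance bugs and availability vulnerabilities in programs such as Google Guava and JGraphT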

1. Intro

Worst-case complexity analysis and worst-performance inputs (WPIs) can be used to debug performance problems and to confirm the presence of security vulnerabilities. They explain what causes the worst-case resource usage and help avoid DoS attacks.

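Assumption:
WPIs almost always follow a specific pattern that can be expressed as a simple program. For example, to trigger the worst-case performance of insertion sort, the input array must be reverse-sorted, which can be generated programmatically by appending larger and larger numbers to an empty list.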
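As a small illustration of why input order matters for insertion sort, the sketch below counts element shifts for an already sorted input and a reverse-sorted input of the same size. The function name and the choice of "element shifts" as the cost metric are assumptions made for this note, not taken from the paper.

```python
def insertion_sort_shifts(values):
    """Sort a copy of `values` in ascending order; return the number of element shifts."""
    a = list(values)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Move every larger element one slot to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            shifts += 1
            j -= 1
        a[j + 1] = key
    return shifts


n = 100
already_sorted = list(range(n))
reverse_sorted = list(range(n, 0, -1))

print(insertion_sort_shifts(already_sorted))  # 0
print(insertion_sort_shifts(reverse_sorted))  # 4950, i.e. n * (n - 1) / 2
```

The reverse-sorted input forces every element to travel past all elements placed before it, which is the quadratic behaviour a WPI pattern is meant to expose.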

Method:
Convert the complexity testing problem into a program synthesis problem whose goal is to find a program that expresses the common pattern shared by all WPIs.
In particular, given a target program P whose resource usage we want to maximize, our algorithm synthesizes another program G, called a generator, such that the outputs of G correspond precisely to the WPIs of P.
In the simplest case, a generator G consists of an initial input seed s together with a function f whose output is larger than its input. Since size(f^i(s)) > size(f^j(s)) whenever i > j, the sequence s, f(s), f(f(s)), ... consists of inputs of strictly increasing size.
For instance, the input pattern ([0], f = λx. append(x, last(x))) corresponds to an infinite sequence of inputs of the form {[0], [0, 0], [0, 0, 0], ...}. Thus, we can determine the worst-case complexity of the target program by using the synthesized generator to obtain many WPIs and then fitting a curve through these data points.
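The generator-plus-curve-fitting idea can be sketched in a few lines. This is only an illustration under stated assumptions: the target program (a Lomuto-partition quicksort that counts comparisons) is a hypothetical stand-in chosen because all-equal inputs are a known bad case for it, and the log-log least-squares fit is one simple way to estimate a polynomial exponent, not necessarily what Singularity does.

```python
import math


def generate_inputs(seed, f, count):
    """Return [seed, f(seed), f(f(seed)), ...] with `count` elements."""
    inputs, x = [], seed
    for _ in range(count):
        inputs.append(x)
        x = f(x)
    return inputs


def quicksort_comparisons(values):
    """Hypothetical target: Lomuto quicksort (last element as pivot), counting comparisons."""
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]  # explicit stack avoids deep recursion
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            comparisons += 1
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return comparisons


def loglog_slope(sizes, costs):
    """Least-squares slope of log(cost) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# The pattern ([0], f = λx. append(x, last(x))) from the text.
all_inputs = generate_inputs([0], lambda x: x + [x[-1]], 200)
inputs = [x for x in all_inputs if len(x) >= 50]  # skip tiny inputs
sizes = [len(x) for x in inputs]
costs = [quicksort_comparisons(x) for x in inputs]
slope = loglog_slope(sizes, costs)
print(f"estimated exponent: {slope:.2f}")  # close to 2 for this target
```

An exponent near 2 suggests quadratic growth on this input pattern, whereas random inputs to the same quicksort would typically grow much more slowly.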

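Optimal synthesis problem. Specifically, generators are represented in a family of DSLs called recurrent computation graphs (RCGs), which are (a) expressive enough to model most input patterns of interest and (b) restrictive enough to keep the search space manageable.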

2. Overview

2.2 Motivating Example

Figure 1
Figure 2
Genetic programming starts from a set of random program fragments that roughly conform to a context-free grammar.

3. Recurrent computation graphs

Figure 3

Definition 4 (Recurrent Computation Graph). A recurrent computation graph G is a triple (I, F, O) where I is a tuple of initialization expressions, F is a tuple of update expressions (where |I| = |F|), and O is a tuple of output expressions.
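One plausible reading of this definition can be sketched as follows. It assumes that I produces the initial internal state, F maps the current state to the next one, and O maps each state to one generated input; expressions are modelled as plain Python callables rather than DSL terms. The class name `RCG` and the method `run` are hypothetical names for this note.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

State = Tuple[Any, ...]


@dataclass(frozen=True)
class RCG:
    init: Tuple[Callable[[], Any], ...]        # I: initialization expressions
    update: Tuple[Callable[[State], Any], ...]  # F: update expressions
    output: Tuple[Callable[[State], Any], ...]  # O: output expressions

    def __post_init__(self):
        if len(self.init) != len(self.update):
            raise ValueError("|I| must equal |F|")

    def run(self, steps):
        """Return the outputs produced by the first `steps` states."""
        state = tuple(e() for e in self.init)
        results = []
        for _ in range(steps):
            results.append(tuple(o(state) for o in self.output))
            # All update expressions read the same (old) state.
            state = tuple(f(state) for f in self.update)
        return results


# State = (counter, list): the list grows by appending the counter each step.
growing_list = RCG(
    init=(lambda: 0, lambda: []),
    update=(lambda s: s[0] + 1, lambda s: s[1] + [s[0]]),
    output=(lambda s: s[1],),
)
print(growing_list.run(4))  # [([],), ([0],), ([0, 1],), ([0, 1, 2],)]
```

Keeping the state separate from the output lets a graph carry helper values (here the counter) that shape the inputs without being part of them.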

Figure 5

5. Finding optimal RCGs using GP

5.2 Genetic Operators

Four operators: mutation, crossover, reproduction, and ConstFold.
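To make the four operator names concrete, here is a generic genetic-programming sketch over tiny expression trees. It is not the paper's implementation: the tuple-based tree encoding, all function names, and the interpretation of ConstFold as ordinary constant folding are assumptions.

```python
import random

# Expressions are nested tuples:
#   ("const", 3), ("var", "x"), ("add", e1, e2), ("mul", e1, e2)
BINARY = ("add", "mul")


def subtrees(expr, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, expr
    if expr[0] in BINARY:
        for k in (1, 2):
            yield from subtrees(expr[k], path + (k,))


def replace_at(expr, path, new):
    """Return a copy of `expr` with the node at `path` replaced by `new`."""
    if not path:
        return new
    children = list(expr)
    children[path[0]] = replace_at(expr[path[0]], path[1:], new)
    return tuple(children)


def random_expr(rng, depth=2):
    if depth == 0 or rng.random() < 0.3:
        return ("const", rng.randint(0, 9)) if rng.random() < 0.5 else ("var", "x")
    return (rng.choice(BINARY), random_expr(rng, depth - 1), random_expr(rng, depth - 1))


def mutate(expr, rng):
    """Mutation: replace one random subtree with a fresh random one."""
    path, _ = rng.choice(list(subtrees(expr)))
    return replace_at(expr, path, random_expr(rng))


def crossover(a, b, rng):
    """Crossover: graft a random subtree of `b` into a random position of `a`."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace_at(a, path, donor)


def reproduce(expr):
    """Reproduction: copy the individual unchanged into the next generation."""
    return expr


def const_fold(expr):
    """ConstFold (assumed meaning): evaluate subtrees made only of constants."""
    if expr[0] in BINARY:
        left, right = const_fold(expr[1]), const_fold(expr[2])
        if left[0] == "const" and right[0] == "const":
            value = left[1] + right[1] if expr[0] == "add" else left[1] * right[1]
            return ("const", value)
        return (expr[0], left, right)
    return expr


rng = random.Random(0)
parent = ("add", ("var", "x"), ("mul", ("const", 3), ("const", 4)))
print(const_fold(parent))  # ('add', ('var', 'x'), ('const', 12))
child = crossover(mutate(parent, rng), random_expr(rng), rng)
```

Because trees are immutable tuples, every operator returns a new individual and leaves its parents intact, which keeps a population safe to reuse across generations.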

5.3 Fitness Function

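1) It should be consistent with the measurement model M
2) It should penalize RCGs with a very large AST size
3) When two RCGs have similar size and resource usage, it should prefer the simpler one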
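A fitness function with these three properties might look like the sketch below. The concrete formula, the parameter names `size_limit` and `tie_weight`, and their default values are all assumptions for illustration; these notes do not record the paper's actual definition.

```python
def fitness(resource_usage, ast_size, size_limit=50, tie_weight=1e-3):
    """Score an RCG from its measured resource usage and its AST size (ast_size >= 1).

    1) For a fixed size, the score grows with the measured usage,
       so the ranking agrees with the measurement model.
    2) Sizes above `size_limit` are scaled down proportionally.
    3) A tiny bonus for smaller ASTs breaks near-ties in favour
       of the simpler graph.
    """
    penalty = min(1.0, size_limit / ast_size)
    simplicity_bonus = 1.0 - tie_weight * min(ast_size, size_limit) / size_limit
    return resource_usage * penalty * simplicity_bonus


print(fitness(100, 10) > fitness(100, 20))   # True: same usage, simpler wins
print(fitness(100, 500) < fitness(100, 50))  # True: oversized AST is penalized
print(fitness(200, 10) > fitness(100, 10))   # True: higher usage wins
```

Keeping `tie_weight` small means the simplicity bonus only matters when two candidates are nearly tied on resource usage.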

posted @ 2022-05-25 01:12  雪溯