Proj CDeepFuzz Paper Reading: Exposing numerical bugs in deep learning via gradient back-propagation

Abstract

Background: numerical bugs can produce abnormal values such as NaN or Inf; once propagated, these values eventually cause functions such as log() to crash
This paper: GRIST
Github: https://github.com/Jacob-yen/GRIST
Task: generate a small input that exposes numerical bugs in DL models
Method: leverage the gradients (computed through back-propagation) to understand how the external values should be changed to induce an exception.

  1. iterative external values: for each suspect operation (e.g., log(x)), use the distance from the current operand value to the valid-range boundary (defined as the suspect loss) to guide the iteration; minimizing this loss steers the external values toward an invalid input and triggers the numerical bug
  2. non-iterative external values: approximate their one-time effect with a simple function

Experiments:
Datasets: 63 real-world DL programs on TensorFlow and PyTorch
Competitors: DEBAR
Results:

  1. detects 79 bugs, 56 of them previously unknown; 8 confirmed and 3 fixed by developers
  2. the reduced inputs it generates save 8.79x execution time compared with the original inputs
  3. misses 1 bug, with no false positives
  • DEBAR: 12 false positives, misses 31 true bugs

1. Intro

Numerical bugs in deep learning programs manifest as NaN (meaning the value is not a number), INF (meaning the value is an infinite number), or crashes during training or validation [56]. They are usually caused by violations of mathematical properties or by floating-point representation errors.

If a numerical bug is non-deterministic (meaning it may or may not be triggered depending on the input and the specific run), it must be directly or transitively related to some external values, which may be training samples or values generated by random functions (e.g., random initial weights). These external values lead to invalid operands in numerical operations (e.g., division) or invalid arguments to mathematical functions (e.g., log()), which produce NaN/INF.
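A concrete illustration (my own minimal example, not from the paper) of how such invalid operands and arguments surface as NaN/INF in PyTorch:

```python
import torch

print(torch.log(torch.tensor(0.0)))           # -inf: argument outside log()'s valid range x > 0
print(torch.tensor(0.0) / torch.tensor(0.0))  # nan: invalid operand pair for division
print(torch.log(torch.tensor(-1.0)))          # nan: mathematical property violation
```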

DEBAR requires manually created models for other languages or for third-party libraries without source code. It also relies on a DL program's static computation graph, so it cannot handle DL programs with dynamic computation graphs.

Q: the underlying infrastructures such as TensorFlow and PyTorch have a powerful mechanism to compute the gradients of arbitrary operands and function parameters with respect to external inputs. As such, we do not need to derive the explicit symbolic form of data flow as in [56].

GRIST computes gradients between arbitrary external values and the parameters of internal mathematical operations.
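A minimal sketch of this mechanism (the variable names and the toy computation are my assumptions, not GRIST's code): autograd yields the gradient of an internal operation's operand with respect to an external value directly, with no symbolic data-flow derivation:

```python
import torch

x = torch.randn(4, requires_grad=True)  # external value (e.g., a training sample)
h = torch.softmax(x, dim=0)             # intermediate computation in the DL program
operand = h.min()                       # the value fed into a vulnerable log()

# d(operand)/dx via back-propagation: tells us how to change x to move the
# operand toward the invalid boundary of log()
grad_x, = torch.autograd.grad(operand, x)
print(grad_x)
```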

3. Approach

3.1 Overview

Static Analyzer: identifies the external values and the vulnerable operations

  1. mark the variables corresponding to external values as trainable, so that TensorFlow/PyTorch record their gradients
  2. for each vulnerable operation, construct a function called the suspect loss, which measures the distance from the current operand value to an invalid value; the goal is to minimize this loss (see the sketch after this list)
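A hedged sketch of these two steps (the toy computation and constants are assumptions, not GRIST's actual instrumentation):

```python
import torch

# Step 1: mark the external value (here a random initial weight) as trainable
# so that the framework records gradients for it.
w = torch.nn.Parameter(torch.rand(3) + 0.5)

operand = (w * w).sum() - 0.5  # flows into log(); valid range: operand > 0
out = torch.log(operand)       # vulnerable operation

# Step 2: suspect loss = distance from the current operand value to the
# invalid boundary (here c = 0, so the loss is simply the operand itself).
suspect_loss = operand - 0.0
suspect_loss.backward()        # gradients on w indicate how to minimize the loss
print(w.grad)
```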

Gradient back-propagation:

  1. iterative external values: for external inputs that influence the program state across multiple training iterations (e.g., training samples, random weight perturbations), GRIST uses data-flow analysis to find, for each suspect operation, all the external values contributing to its suspect loss, and then updates them by gradient back-propagation (a minimal sketch follows this list)
  2. non-iterative external values: for external inputs that influence the program only once, typically at load time, GRIST approximates their effect directly with a simple function
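A minimal sketch of the iterative case, assuming a toy operand and constants of my own choosing: the external input itself becomes the optimization variable, and gradient descent on the suspect loss pushes it across the valid-range boundary:

```python
import torch

x = torch.rand(8, requires_grad=True)  # external input (e.g., a training sample)
opt = torch.optim.SGD([x], lr=0.1)     # note: we update the input, not the weights

for step in range(200):
    operand = (x ** 2).sum() - 1.0     # value fed into log(); valid range: operand > 0
    if operand.item() <= 0:
        print(f"invalid argument reached at step {step}: log() would produce NaN/INF")
        break
    suspect_loss = operand             # f(i) = x_i - c with c = 0
    opt.zero_grad()
    suspect_loss.backward()            # gradients flow back to the external input
    opt.step()
```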

Driver:
replaces only a small portion of the unimportant samples and retains those that are important (and hence must have gone through non-trivial changes by gradient back-propagation). Fresh samples are needed to prevent the failure-inducing input generation process from being trapped in a local optimum that cannot trigger the numerical bug. A sketch of this replacement heuristic follows.
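A hedged sketch of the replacement heuristic (the importance measure and the replacement fraction are my assumptions):

```python
import torch

def refresh_batch(batch, original, sampler, frac=0.1):
    """Replace the least important samples in `batch` with fresh ones.

    Importance is approximated by how much gradient back-propagation has
    changed each sample relative to `original`.
    """
    change = (batch - original).abs().sum(dim=1)  # per-sample accumulated change
    k = max(1, int(frac * batch.shape[0]))
    idx = torch.argsort(change)[:k]               # least-changed = least important
    batch = batch.clone()
    batch[idx] = sampler(k)                       # fresh samples to escape local optima
    return batch

# Example: refresh_batch(batch, original_batch, lambda k: torch.rand(k, batch.shape[1]))
```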

Defining Suspect Loss. For a vulnerable operation T(x), GRIST constructs its suspect loss automatically according to its valid ranges.
In the simplest scenario, let T(x) have a valid input range x > c; GRIST constructs the suspect loss f(i) = x_i - c, where x_i is the operand value at the i-th iteration.
For T(x) with multiple valid ranges, denoted (l_1, u_1) ∪ (l_2, u_2) ∪ ... ∪ (l_k, u_k) without loss of generality, GRIST constructs a loss function for each boundary value as follows:
f_{l_t}(i) = x_i - l_t,  f_{u_t}(i) = u_t - x_i,  with t ∈ [1, k]
At run time, the boundary of the valid interval nearest to the current value is used.
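A sketch of the multi-range case (the helper and its signature are my own, not from the paper's artifact): at run time only the interval currently containing the operand matters, and the loss is the distance to its nearer boundary:

```python
import torch

def suspect_loss(x, ranges):
    """x: scalar tensor operand; ranges: list of (l, u) valid intervals."""
    for l, u in ranges:
        if l < x.item() < u:  # the interval currently containing x
            # f_{l_t}(i) = x_i - l_t and f_{u_t}(i) = u_t - x_i; keep the nearer boundary
            return torch.minimum(x - l, u - x)
    return None  # x is already outside every valid range, so the bug is triggerable

# Example: suspect_loss(torch.tensor(0.3, requires_grad=True), [(0.0, 1.0), (2.0, 3.0)])
```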
