Proj THUDBFuzz Paper Reading: Fuzzing Challenges and Reflections
Abstract
Fuzzing: 1. symbolic execution 2. random input generation
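To make "random input generation" concrete, here is a minimal sketch of a purely random fuzzing loop. The target `parse_header` and its crash condition are hypothetical and invented for illustration; they do not come from the paper.

```python
import random


def parse_header(data: bytes) -> int:
    """Hypothetical target: raises on one specific malformed header."""
    if len(data) >= 2 and data[0] == 0x7F and data[1] > 0xF0:
        raise ValueError("length field overflow")
    return len(data)


def random_fuzz(target, trials: int, max_len: int, seed: int):
    """Feed random byte strings to `target`; return the inputs that raised."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    crashes = []
    for _ in range(trials):
        length = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except Exception as exc:  # any exception counts as a "crash" here
            crashes.append((data, repr(exc)))
    return crashes
```

Whether a crash is found depends on the seed and the number of trials, which is exactly the weakness of undirected random generation: the two-byte condition above is only hit by chance.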
Intro
- Blackbox fuzzing
- mutational
- generational: Peach
- Greybox fuzzing: requires program instrumentation
- Sanitizers inject assertions into the program
- e.g., AFL, LibFuzzer, Honggfuzz
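- Whitebox fuzzing: typically uses program analysis and constraint solvers to explore meaningful code paths
- The constraint solver is usually an SMT (Satisfiability Modulo Theories) solver
- From the conditions that must hold to reach a given code path, extract a first-order logic formula (functions, predicate symbols, etc.), then use the constraint solver to compute an input that reaches that path
- e.g., KLEE and SAGE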
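The whitebox loop described above (collect the conditions guarding a path, solve them, obtain an input that goes one branch deeper) can be sketched as follows. This is an illustration under loud assumptions: the branch conditions in `BRANCHES` are hypothetical and hand-written, and `solve` is a brute-force search over a small integer domain standing in for an SMT solver. Tools such as KLEE and SAGE instead derive the constraints from the program itself and pass them to a real solver.

```python
from itertools import product

# Hypothetical nested branch conditions of a target program over inputs (x, y).
# The deepest code is reached only if every condition holds.
BRANCHES = [
    lambda x, y: x > 10,
    lambda x, y: y == 2 * x + 1,
    lambda x, y: (x + y) % 7 == 0,
]


def run(x, y):
    """Execute the target concretely; return the branch outcomes on the path."""
    trace = []
    for cond in BRANCHES:
        taken = cond(x, y)
        trace.append(taken)
        if not taken:  # execution leaves the nested path here
            break
    return trace


def solve(conditions, domain=range(64)):
    """Stand-in for an SMT solver: exhaustive search over a small domain."""
    for x, y in product(domain, repeat=2):
        if all(cond(x, y) for cond in conditions):
            return x, y
    return None


def explore(seed=(0, 0)):
    """Repeatedly solve for an input that passes one more branch than the last."""
    x, y = seed
    inputs = [seed]
    while True:
        depth = sum(run(x, y))  # number of branches passed so far
        if depth == len(BRANCHES):
            return inputs
        solution = solve(BRANCHES[: depth + 1])
        if solution is None:  # path is infeasible within the domain
            return inputs
        x, y = solution
        inputs.append(solution)


print(explore())  # [(0, 0), (11, 0), (11, 23), (16, 33)]
```

Each generated input passes exactly one more branch than its predecessor. A purely random generator would need to guess `y == 2 * x + 1` by luck, which is why constraint solving pays off on such conditions.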
Recent Impact
The term "fuzzing" has been around since 1990, but only recently has it been adopted at large scale in industry
Challenges
- How can we efficiently fuzz more types of software systems?
- e.g., software that interacts with its environment, or machine learning systems
- How do we fuzz stateful software?
- How do we fuzz polyglot software?
- How do we fuzz GUI-based programs?
- How can a symbolic execution tool fuzz a highly-structured-input software?
- How can the fuzzer identify more types of vulnerabilities (which can be viewed as assertions over program state)?
- e.g., potential side-channel attacks
- How can we find "deep bugs" for which efficient oracles exist, but which nevertheless evade detection?
- complex conditions
- What is the nature of vulnerabilities that have evaded discovery despite long fuzzing campaigns?
- How can fuzzers leverage the ingenuity of the auditor?
- How can the auditor instruct the fuzzer to overcome the roadblock?
- How can we improve the usability of fuzzing tools?
- How can we prepare the output of a fuzzer for human consumption?
- How can we assess residual security risk if the fuzzing campaign was unsuccessful?
- What are the theoretical limitations of blackbox, greybox and whitebox fuzzing?
- Given a program and a time budget, how can we select the fuzzing technique, or combination of techniques, that finds the most vulnerabilities within the time budget?
- How do program size and complexity affect the scalability and performance of each technique?
- How can we evaluate specialized fuzzers?
- How can we prevent overfitting to a specific benchmark?
- Are synthetic bugs representative?
- Are real bugs, which have previously been discovered with other fuzzers, representative?
- Is coverage a good measure of fuzzer effectiveness?
- What is a fair choice of time budget?
- How do we evaluate techniques instead of implementations?