Proj CDeepFuzz Paper Reading: Detecting and Understanding Real-World Differential Performance Bugs in Machine Learning Libraries
Abstract
This paper: DPFuzz
Github: https://github.com/Tizpaz/DPFuzz
Task: Differential performance analysis over groups of inputs (input classes), plus an explanation for the performance difference between the groups
Method:
- each input class has a performance function that maps input size to performance
- to discover classes: evolutionary fuzzing + clustering
- to explain: discriminant learning with clustering and decision tree
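A minimal sketch of the two ideas above, not DPFuzz's actual implementation. It assumes each execution path already has a few (input size, cost) measurements. It fits a per-path linear performance function, groups paths whose slopes are close, and then picks the single boolean feature that best separates the groups (a depth-1 stand-in for the decision tree). The path ids, the features `flag_x`/`flag_y`, the numbers, and the `gap` threshold are all hypothetical.

```python
from collections import Counter

def fit_slope(points):
    """Least-squares slope k for the model cost = k * size (line through the origin)."""
    num = sum(size * cost for size, cost in points)
    den = sum(size * size for size, _ in points)
    return num / den

def cluster_slopes(slopes, gap=2.0):
    """Walk the paths in order of slope; open a new cluster when the slope
    grows by more than a factor of `gap` (assumed threshold)."""
    order = sorted(slopes, key=slopes.get)
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if slopes[cur] > gap * slopes[prev]:
            clusters.append([cur])
        else:
            clusters[-1].append(cur)
    return clusters

def best_split(features, labels):
    """Depth-1 decision tree: the boolean feature whose split misclassifies
    the fewest paths when each side predicts its majority cluster."""
    def errors(name):
        total = 0
        for side in (True, False):
            group = [labels[p] for p in features if features[p][name] == side]
            if group:
                total += len(group) - Counter(group).most_common(1)[0][1]
        return total
    names = sorted(next(iter(features.values())))
    return min(names, key=errors)

# Hypothetical measurements: path id -> [(input size, cost), ...]
measurements = {
    "a": [(10, 10), (20, 21), (40, 39)],
    "b": [(10, 11), (20, 19), (40, 41)],
    "c": [(10, 100), (20, 205), (40, 395)],
}
# Hypothetical program-internal features observed on each path.
features = {
    "a": {"flag_x": False, "flag_y": True},
    "b": {"flag_x": False, "flag_y": False},
    "c": {"flag_x": True, "flag_y": True},
}

slopes = {path: fit_slope(pts) for path, pts in measurements.items()}
clusters = cluster_slopes(slopes)
labels = {path: i for i, group in enumerate(clusters) for path in group}
explanation = best_split(features, labels)
```

Here paths `a` and `b` end up in one class and `c` in another, and `flag_x` is the feature that separates them. A real setup would likely use a proper clustering algorithm and a full decision tree learner rather than these hand-rolled stand-ins.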
Steps:
- run an evolutionary fuzzing algorithm to generate inputs, then use clustering to merge similar input classes
- to explain the differential performance using program inputs and internals, use discriminant learning with clustering and decision tree to localize suspicious code regions
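The fuzzing step can be illustrated with a toy evolutionary loop. This is a hedged sketch, not the paper's algorithm: `toy_cost` is a made-up program under test with a hidden slow path, and the fitness rule (keep the cheapest and the costliest input seen for each input size) is a simplifying assumption chosen to widen the cost gap between same-size inputs.

```python
import random

def toy_cost(data):
    """Hypothetical program under test: abstract step count.
    Inputs containing a 0 hit a quadratic slow path, others are linear."""
    n = len(data)
    return n * n if 0 in data else n

def mutate(rng, data, max_len=8):
    """Return a mutated copy: overwrite, insert, or delete one element."""
    child = list(data)
    op = rng.choice(("flip", "grow", "shrink"))
    if op == "flip":
        child[rng.randrange(len(child))] = rng.randrange(4)
    elif op == "grow" and len(child) < max_len:
        child.insert(rng.randrange(len(child) + 1), rng.randrange(4))
    elif op == "shrink" and len(child) > 1:
        del child[rng.randrange(len(child))]
    return child

def fuzz(seed_input, iterations=2000, seed=0):
    """Evolve inputs, keeping per size the cheapest and costliest input found."""
    rng = random.Random(seed)
    archive = {}  # input size -> [cheapest input, costliest input]

    def consider(inp):
        size, cost = len(inp), toy_cost(inp)
        if size not in archive:
            archive[size] = [inp, inp]
            return
        if cost < toy_cost(archive[size][0]):
            archive[size][0] = inp
        if cost > toy_cost(archive[size][1]):
            archive[size][1] = inp

    consider(list(seed_input))
    for _ in range(iterations):
        # Every archived input is a candidate parent for the next mutation.
        parents = [inp for pair in archive.values() for inp in pair]
        consider(mutate(rng, rng.choice(parents)))
    return archive

archive = fuzz([1, 1, 1, 1])
# Cost gap between the costliest and cheapest input of each size.
spread = {size: toy_cost(hi) - toy_cost(lo) for size, (lo, hi) in archive.items()}
```

A large `spread[size]` means two inputs of the same size take very different amounts of work. Such pairs are the raw material that the clustering and explanation steps then operate on.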
Experiments:
datasets: a set of micro-benchmarks and real-world machine learning libraries (scikit-learn 0.20.3, subprocess32, numpy, argparse; Rscript (util))
Results:
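- outperforms state-of-the-art fuzzers on the micro-benchmarks at finding inputs that characterize differential performance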
- explains multiple performance bugs, e.g., in logistic regression in scikit-learn
- 4 bugs, 1 fixed