Imbalanced data problem
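This post is a brief set of notes on "Learning from class-imbalanced data: Review of methods and applications". The paper was published fairly early and mainly covers non-deep-learning methods, so treat it as a reference only.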
1. Imbalanced data classification approaches
1.1 Basic strategies for dealing with imbalanced learning
1.1.1 Resampling: directly select which samples to train on
- Over-sampling
- Under-sampling
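As a rough illustration of the two resampling directions, here is a minimal sketch of purely random over-sampling and under-sampling. The function names `random_oversample` and `random_undersample` are made up for this example, and random selection is only the simplest possible variant, not necessarily the specific methods the paper reviews.

```python
import random
from collections import defaultdict


def _group_by_class(labels):
    """Map each class label to the list of sample indices carrying it."""
    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[label].append(i)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(labels)
    target = max(len(ix) for ix in groups.values())
    chosen = []
    for ix in groups.values():
        chosen.extend(ix)  # keep every original sample
        chosen.extend(rng.choices(ix, k=target - len(ix)))  # draw extras with replacement
    return [samples[i] for i in chosen], [labels[i] for i in chosen]


def random_undersample(samples, labels, seed=0):
    """Drop randomly chosen samples until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(labels)
    target = min(len(ix) for ix in groups.values())
    chosen = []
    for ix in groups.values():
        chosen.extend(rng.sample(ix, target))  # draw without replacement
    return [samples[i] for i in chosen], [labels[i] for i in chosen]


if __name__ == "__main__":
    X = [[float(i)] for i in range(10)]
    y = [0] * 8 + [1] * 2  # 8 majority samples, 2 minority samples
    _, y_over = random_oversample(X, y)
    _, y_under = random_undersample(X, y)
    print(len(y_over), len(y_under))
```

Over-sampling keeps all the original data but repeats minority samples, which can encourage overfitting; under-sampling keeps the set small but throws away majority samples.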
1.1.2 Feature selection and extraction
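When the data is imbalanced, minority-class samples may be treated as noise and ignored. This strategy works at the feature level, through feature selection or feature extraction.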
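To make the feature-level idea a little more concrete, here is a small sketch of one possible filter-style criterion: score each feature by how well it separates the minority class from the rest, then keep the top-scoring ones. The Fisher-style score and the names `fisher_scores` and `select_top_k` are assumptions chosen for illustration; the notes above do not say which selection methods the paper uses.

```python
from statistics import fmean, pvariance


def fisher_scores(samples, labels, minority, eps=1e-12):
    """Per-feature score: squared gap between class means over the summed class variances."""
    pos = [x for x, y in zip(samples, labels) if y == minority]
    neg = [x for x, y in zip(samples, labels) if y != minority]
    scores = []
    for j in range(len(samples[0])):
        a = [x[j] for x in pos]
        b = [x[j] for x in neg]
        gap = (fmean(a) - fmean(b)) ** 2
        spread = pvariance(a) + pvariance(b) + eps  # eps avoids division by zero
        scores.append(gap / spread)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


if __name__ == "__main__":
    # Feature 0 separates the classes; feature 1 is mostly noise.
    X = [[0.0, 1.0], [0.1, 5.0], [0.2, 3.0], [0.1, 2.0], [5.0, 2.0], [5.1, 4.0]]
    y = [0, 0, 0, 0, 1, 1]
    print(select_top_k(fisher_scores(X, y, minority=1), k=1))
```

Because the score is computed from per-class statistics, a small minority class contributes as much to the numerator as the majority class does, rather than being drowned out by sample count.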
1.1.3 Cost-sensitive learning
Weighted cross-entropy loss and focal loss should both fall into this category.
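For reference, here is a minimal per-sample sketch of the two losses mentioned above, written in terms of `p_true`, the probability the model assigns to the correct class. The function names and the defaults `alpha=0.25` and `gamma=2.0` are choices made for this example (those defaults are the values commonly quoted for focal loss), not something taken from the reviewed paper.

```python
import math


def weighted_cross_entropy(p_true, class_weight):
    """Cross entropy scaled by a per-class weight; p_true must lie in (0, 1]."""
    return -class_weight * math.log(p_true)


def focal_loss(p_true, alpha=0.25, gamma=2.0):
    """Cross entropy down-weighted by (1 - p_true)^gamma, so easy samples count less."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


if __name__ == "__main__":
    # A larger weight on the minority class makes its errors more costly.
    print(weighted_cross_entropy(0.3, class_weight=1.0))
    print(weighted_cross_entropy(0.3, class_weight=5.0))
    # A confident correct prediction contributes far less than a hard one.
    print(focal_loss(0.9), focal_loss(0.1))
```

Both losses change how much each mistake costs instead of changing the data: weighted cross entropy does so per class, while focal loss does so per sample, based on how hard the sample is. With `gamma=0` and `alpha=1`, focal loss reduces to plain cross entropy.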