Density-based sampling
These strategies take the data distribution and local density into account. The intuition is that instances lying in denser regions of the input space are more worth querying; in other words, the selected instances should follow a distribution similar to that of the unlabeled pool.
Representative strategies:
Information density
RALF
k-Center-Greedy (Core-set): Only considers representativeness; a minimal sketch follows this list.
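Below is a minimal sketch of the k-Center-Greedy selection rule, assuming `features` holds model embeddings of the whole pool and `labeled_idx` indexes the already-labeled points; the names are illustrative and not taken from the paper's released code.

```python
import numpy as np

def k_center_greedy(features: np.ndarray, labeled_idx: list, budget: int) -> list:
    """Greedily pick `budget` pool points, each maximizing its distance to the
    nearest already-selected (or labeled) point: the classic 2-approximation
    for the k-center problem."""
    n = features.shape[0]
    # Distance from every point to its nearest labeled center so far.
    min_dist = np.full(n, np.inf)
    for i in labeled_idx:
        dist = np.linalg.norm(features - features[i], axis=1)
        min_dist = np.minimum(min_dist, dist)
    selected = []
    for _ in range(budget):
        idx = int(np.argmax(min_dist))  # farthest point from all current centers
        selected.append(idx)
        dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected
```

Note that the rule only needs embeddings and pairwise distances, which is why the method does not depend on the current model's predicted outputs.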
Works:
An analysis of active learning strategies for sequence labeling tasks [2008, EMNLP]: Information density framework. The main idea is that informative instances should not only be uncertain, but also "representative" of the underlying distribution (i.e., inhabit dense regions of the input space); a sketch of the scoring rule appears after this works list. (659 citations)
RALF: A Reinforced Active Learning Formulation for Object Class Recognition [2012, CVPR]: RALF uses a time-varying combination of exploration and exploitation sampling criteria, and includes graph density among the exploitation strategies. (59 citations)
Active learning for convolutional neural networks: A core-set approach [2018, ICLR]: The core-set loss is simply the difference between the average empirical loss over the set of points for which we have labels and the average empirical loss over the entire dataset, including the unlabeled points. Minimizing an upper bound of the core-set loss can be cast as a k-center problem in practice. Does not need the output of the current model.
Minimax Active Learning [2020]: Develops a semi-supervised, minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
Multiple-criteria Based Active Learning with Fixed-size Determinantal Point Processes [2021]
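A minimal sketch of the information density scoring rule from the 2008 EMNLP paper above: an uncertainty score is weighted by the instance's average similarity to the rest of the pool, raised to a trade-off power beta. Cosine similarity and the variable names here are illustrative assumptions.

```python
import numpy as np

def information_density(uncertainty: np.ndarray, features: np.ndarray,
                        beta: float = 1.0) -> np.ndarray:
    """score(x) = uncertainty(x) * (mean similarity of x to the pool) ** beta."""
    # Row-normalize so a dot product gives cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T            # pairwise cosine similarities
    density = sim.mean(axis=1)     # average similarity to the pool
    return uncertainty * density ** beta

# Query the instance with the highest combined score:
# query_idx = int(np.argmax(information_density(u_scores, pool_features)))
```

With beta = 0 this reduces to plain uncertainty sampling; larger beta values push the query toward denser regions of the pool.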
Uncertainty-based sampling
This is the most basic strategy in AL. It aims to select the instances that the current model is most uncertain about. There are three common sub-strategies.
Classification uncertainty
Select the instance closest to the decision boundary, i.e., the one whose most likely label has the lowest predicted probability (least confidence).
Classification margin
Select the instance whose predicted probabilities for the two most likely classes are closest.
Classification entropy
Select the instance with the largest classification entropy over all the classes.
The equations for these three measures are given below; a short numeric sketch follows the works list.
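For a model with predictive distribution P(y | x), the three measures can be written as follows (the notation is a common convention, not taken from a single source):

```latex
% Least confidence: query the instance the model is least sure about.
x^*_{LC} = \operatorname*{argmax}_x \; 1 - P_\theta(\hat{y} \mid x),
\qquad \hat{y} = \operatorname*{argmax}_y P_\theta(y \mid x)

% Margin: query the instance whose two most likely labels are closest.
x^*_{M} = \operatorname*{argmin}_x \; P_\theta(\hat{y}_1 \mid x) - P_\theta(\hat{y}_2 \mid x)

% Entropy: query the instance with the highest predictive entropy.
x^*_{H} = \operatorname*{argmax}_x \; -\sum_y P_\theta(y \mid x) \log P_\theta(y \mid x)
```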
Works:
Heterogeneous uncertainty sampling for supervised learning [1994, ICML]: The most basic uncertainty strategy; can be used with probabilistic classifiers. (1071 citations)
Support Vector Machine Active Learning with Applications to Text Classification [2001, JMLR]: Version space reduction with SVM. (2643 citations)
How to measure uncertainty in uncertainty sampling for active learning [2021, Machine Learning]
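As a concrete illustration of the three measures above, here is a small NumPy sketch, assuming `probs` is an (n_instances, n_classes) array of softmax outputs; the function names are illustrative.

```python
import numpy as np

def least_confidence(probs: np.ndarray) -> np.ndarray:
    return 1.0 - probs.max(axis=1)         # higher = more uncertain

def margin(probs: np.ndarray) -> np.ndarray:
    top2 = np.sort(probs, axis=1)[:, -2:]  # two largest probabilities
    return top2[:, 1] - top2[:, 0]         # smaller = more uncertain

def entropy(probs: np.ndarray) -> np.ndarray:
    return -(probs * np.log(np.clip(probs, 1e-12, None))).sum(axis=1)

probs = np.array([[0.5, 0.3, 0.2],
                  [0.9, 0.05, 0.05]])
# The first instance is queried under all three criteria.
print(least_confidence(probs), margin(probs), entropy(probs))
```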