Data Augmentation

Heuristics-driven

Pad and Crop

He K., Zhang X., Ren S. and Sun J. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

即:

import torchvision.transforms as T

T.Compose([
	T.Pad(4, padding_mode='reflect'),
	T.RandomCrop(32), # for CIFAR-10 and CIFAR-100
	T.RandomHorizontalFlip(),
	T.ToTensor()
])

image-20210609171026364

注: T.RandomCrop(200)

Cutout

DeVries T. and Taylor G. W. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017

原文代码

形象点说, 就是给图片挖几个洞(不过作者的解释挺有趣的, 输入层的连续dropout, 这个想法有趣).

# uoguelph-mlrg cutout
# https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
# 原文代码

  
import torch
import numpy as np

class Cutout(object):
    """Randomly mask out one or more patches from an image.
    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        """
        Args:
            img (Tensor): Tensor image of size (C, H, W).
        Returns:
            Tensor: Image with n_holes of dimension length x length cut out of it.
        """
        h = img.size(1)
        w = img.size(2)

        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img

image-20210609172940195

MixUp

Zhang H., Cisse M., Dauphin Y. N. and Lopez-Paz D. mixup: Beyond empirical risk minimization. In International Conference on Learning Representations (ICLR), 2018.

代码

MixUp 就是将两个图片按照一定比例相加

\[\tilde{x} = \lambda x_A + (1 - \lambda) x_B, \]

\(\lambda \sim \mathrm{Beta}(\alpha, \alpha), \alpha \in (0, +\infty)\), 同时

\[\tilde{y} = \lambda y_A + (1 - \lambda) y_B. \]

image-20210609180616376

CutMix

Yun S., Han D., Oh S. J., Chun S., Choe J. and Yoo Y. Cutmix: Regularization strategy to train strong classifiers with localizable features. In International Conference on Vision (ICCV), 2019.

原文代码

CutMix 就是将一个图片裁剪掉一块, 然后用另一张图片的对应位置填补:

\[\tilde{x} = M \circ x_A + (1 - M) \circ x_B, \]

其中\(M \in \{0, 1\}^{H \times W}\) 是掩码矩阵, 按照如下方式生成:

\[r_x \sim U[0, W], \: r_y = U[0, H], \\ r_w = W\sqrt{1 - \lambda} , \: r_h = H\sqrt{1 - \lambda}, \]

得到一个bounding box, 在其中的\(M_{ij}= 0\), 否则为\(1\), \(\lambda\)控制框的大小的:

\[ \frac{r_w r_h}{HW} = 1 - \lambda. \]

此时, 标签需要发生相应变化:

\[\tilde{y} = \lambda y_{A} + (1 - \lambda)y_B, \]

这里标签是one-hot形式(或者概率向量).

image-20210609175227418

AugMix

AugMix

Data-driven

AutoAugment

Cubuk E. D., Zoph B., Man'{e}, D., Vasudevan V. and Le Q. V. AutoAugment: learning augmentation strategies from data. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

根据数据集选择合理的augmentations策略.

RandAugment

Cubuk E. D., Zoph B., Shlens J. and Le Q. V. RandAugment: practical automated data augmentation with a reduced search space.

\(K\)中augmentations中随机选择\(N\)(replace=True), 每个的magnitude都是\(M\).
其中\(M, N\)是通过网格搜索找到的(太粗暴了吧)

DeepAugment

Hendrycks D., Basart S., Mu N., Kadavath S., Wang F., Dorundo E., Desai R., Zhu T., Parajuli S., Guo M., Song D., Steinhardt J. Gilmer J. The many faces of robustness: a critical analysis of out-of-distribution generalization. arXiv preprint arXiv:2006.16241, 2020.

假设\(f(\cdot;\theta)\)是一个Image-to-Image的网络, 则DeepAugment是指对\(\theta\)添加扰动, 比如扰动权重, 改变激活函数等等.

和这个思想类似的还有AdversarialAugment.

posted @ 2021-06-14 11:58  馒头and花卷  阅读(93)  评论(0编辑  收藏  举报