MobileNet Series: MobileNetV2

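While using SSD for object detection recently, a project requirement called for a lighter model, so I looked into replacing the original VGG16 backbone with a lightweight network. After going through some references, I settled on MobileNetV2, open-sourced by Google. This post summarizes what I learned while studying the MobileNet series. MobileNetV1 is a lightweight deep neural network released by Google in 2017; its main feature is that it replaces standard convolutions with depthwise separable convolutions. MobileNetV2, proposed in 2018, builds on V1 by introducing the linear bottleneck and the inverted residual to improve the network's representational power.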

2. MobileNetV2

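MobileNetV2 improves on MobileNetV1 and is likewise a lightweight neural network. To keep the nonlinearity (ReLU) from destroying part of the information, it introduces the linear bottleneck layer. In addition, since ResNet, DenseNet, and similar networks have achieved strong results with shortcut connections, the authors combined that idea with the characteristics of depthwise convolution and proposed the inverted residual.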

Paper: https://128.84.21.199/abs/1801.04381

2.1 Linear bottleneck layer

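In MobileNetV1's depthwise separable convolution, the M-dimensional space left after compression by the width multiplier passes through a nonlinear ReLU. By the definition of ReLU, a negative input is set to zero in that channel; since the features have already been compressed, this discards even more information. A positive input passes through unchanged, so for those inputs ReLU acts as a linear (identity) transformation.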

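The figure below shows an example of ReLU applied to a low-dimensional manifold embedded in a higher-dimensional space. The original features are mapped into an n-dimensional space by a random matrix $T$ followed by a ReLU, and then mapped back to the original space by the inverse transformation $T^{-1}$. When n = 2 or 3, the information loss is severe and some features collapse onto each other; when n = 15 to 30, much less information is lost, but the resulting transformation is highly non-convex. Because a nonlinear layer loses part of the information, a linear bottleneck layer is used instead.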
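The effect of n can be illustrated with a small sketch. It is a simplified proxy, not the paper's experiment: instead of inverting with $T^{-1}$ and plotting the result, it counts how many 2-D points keep fewer than two positive coordinates after the random projection, since such points can no longer be recovered from the ReLU output. The spiral, the Gaussian entries of the random matrix, and the function name `lost_fraction` are all assumptions made for illustration.

```python
import math
import random


def lost_fraction(n, num_points=1000, seed=0):
    """Fraction of 2-D inputs that cannot be recovered from ReLU(T x).

    T is a random n x 2 matrix. A point counts as lost when fewer than two
    of its n projected coordinates stay positive, because the surviving
    coordinates then no longer determine both input coordinates.
    """
    rng = random.Random(seed)
    # Rows of the random projection matrix T (assumption: Gaussian entries).
    rows = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

    lost = 0
    for i in range(num_points):
        # Points on a one-turn spiral stand in for the low-dimensional manifold.
        theta = 2.0 * math.pi * i / num_points
        radius = 0.5 + theta / (2.0 * math.pi)
        x = (radius * math.cos(theta), radius * math.sin(theta))
        # Count the channels whose pre-activation is positive (ReLU keeps them).
        active = sum(1 for a, b in rows if a * x[0] + b * x[1] > 0.0)
        if active < 2:
            lost += 1
    return lost / num_points


for n in (2, 3, 15, 30):
    print(n, lost_fraction(n))
```

With n = 2 at least about half of the points are lost regardless of the random matrix, because both coordinates must be positive at once; larger n makes it far more likely that enough channels survive.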

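So what is a linear bottleneck layer? Panels b and d of the figure below make it clear: the ReLU after the last 1x1 convolution of the residual block is replaced with a linear function.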

2.2 Inverted residual

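Residual blocks have been shown to improve accuracy, so MobileNetV2 adopts a similar block. A classic residual block follows the order 1x1 (reduce) --> 3x3 (conv) --> 1x1 (expand). A depthwise convolution layer, however, can only extract features from the channels it is given. If the classic residual block were used, the 1x1 pointwise convolution would first compress the input feature map (typically by a ratio of 0.25), and the depthwise convolution that follows would extract even fewer features. MobileNetV2 therefore first uses the 1x1 pointwise convolution to expand the number of channels, enriching the features and thereby improving accuracy. This order is exactly the reverse of the residual block, which is where the name inverted residual comes from: 1x1 (expand) --> 3x3 (dw conv + ReLU) --> 1x1 (reduce, linear).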
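The two orderings can be compared by listing the channel width after each layer. This is only a sketch: the helper names and the example widths 256 and 24 are illustrative assumptions, and the output width is set equal to the input width for simplicity.

```python
def residual_widths(channels, compression=0.25):
    """Channel width after each layer of a classic bottleneck residual block."""
    mid = int(channels * compression)
    # input -> 1x1 reduce -> 3x3 conv -> 1x1 expand
    return [channels, mid, mid, channels]


def inverted_residual_widths(channels, expand_ratio=6):
    """Channel width after each layer of an inverted residual block."""
    mid = channels * expand_ratio
    # input -> 1x1 expand -> 3x3 depthwise conv -> 1x1 linear projection
    return [channels, mid, mid, channels]


print(residual_widths(256))          # [256, 64, 64, 256]
print(inverted_residual_widths(24))  # [24, 144, 144, 24]
```

In the classic block the 3x3 layer sees the narrowest tensor, while in the inverted block the depthwise 3x3 layer sees the widest one.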

2.3 Network structure

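The network is built mainly from bottleneck residual blocks; the structure of a single block is given in Table 1 below. The input first passes through a 1x1 conv + ReLU layer that expands the channels from k to tk. A 3x3 depthwise conv + ReLU then downsamples the feature map (when stride > 1) while the channel count stays at tk. Finally, a 1x1 conv without ReLU reduces the channels from tk to the block's output width (k' in the paper's Table 1, which equals k only when the block keeps the channel count unchanged).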
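As a rough illustration of the three steps, the sketch below traces the tensor shape through one block and counts multiply-adds directly from the layer shapes. The function name `bottleneck_trace` and its parameters are hypothetical; batch norm, bias, and activation costs are ignored, and the spatial size is assumed to be divisible by the stride.

```python
def bottleneck_trace(h, w, k, t, k_out, stride=1, kernel=3):
    """Return (shapes, mult_adds) for one bottleneck block.

    Shapes are (height, width, channels) after each of the three layers.
    """
    hidden = t * k
    h_out, w_out = h // stride, w // stride  # assumes h, w divisible by stride

    shapes = [
        (h, w, hidden),          # 1x1 conv + ReLU: expand k -> t*k
        (h_out, w_out, hidden),  # 3x3 depthwise conv + ReLU: optional downsampling
        (h_out, w_out, k_out),   # 1x1 conv, linear: project t*k -> k_out
    ]
    mult_adds = (
        h * w * k * hidden                          # expansion
        + h_out * w_out * hidden * kernel * kernel  # depthwise: one filter per channel
        + h_out * w_out * hidden * k_out            # projection
    )
    return shapes, mult_adds


shapes, cost = bottleneck_trace(56, 56, k=24, t=6, k_out=32, stride=2)
print(shapes)  # [(56, 56, 144), (28, 28, 144), (28, 28, 32)]
print(cost)
```

For stride 1 the three terms collapse to h * w * t * k * (k + kernel² + k_out), which shows that the cost grows linearly with the expansion factor t.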

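Note that every bottleneck layer in the model uses t = 6 except the first, which uses t = 1 (Table 2 in the paper); that is, the first bottleneck layer does not expand the channels internally.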

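In addition, a bottleneck layer adds its input to its output with an elementwise sum only when stride = 1 (left side of the figure below); when stride = 2 there is no shortcut between input and output (right side of the figure below).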

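The MobileNetV2 model is shown in the figure below, where $t$ is the expansion factor inside the bottleneck layer, $c$ is the number of output channels, $n$ is the number of times the bottleneck layer is repeated, and $s$ is the stride of the first bottleneck layer in each repeated group (applied in its 3x3 depthwise conv).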

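Points to note:

When $n>1$ (the bottleneck layer is repeated more than once), only the first bottleneck layer uses the stride $s$ given in the table; the repeated layers all use stride 1;
The elementwise sum that adds the input to the output is used only when $stride=1$ and the output features have the same shape as the input features;
When $n>1$, the number of channels changes only in the first bottleneck layer; the repeated layers keep the channel count unchanged.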

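For example, take the row in the figure whose input is $56^2 \times 24$. It contains 3 bottleneck layers, and only the first uses stride = 2; the other two use stride = 1. The first bottleneck layer has no shortcut because its input and output sizes differ, while the other two, with stride = 1 and matching input and output sizes, use a shortcut to sum input and output elementwise. Only the final 1x1 conv of the first bottleneck layer increases the number of channels; the output width of the other two stays the same (not to be confused with the expansion inside a bottleneck layer). The input to this row is 56*56*24. The first bottleneck layer outputs 28*28*32 (smaller spatial size, more channels, no shortcut), and the second and third take 28*28*32 as both input and output (here c = 32, s = 1, with shortcut).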
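The rules above can be checked by expanding each (t, c, n, s) row into its individual blocks. The sketch below is plain Python with no framework; the rows are the ones used in the listing in section 2.4, while the function name `expand_blocks` and the starting shape 112*112*32 (the output of the first stride-2 conv for a 224*224 input) are assumptions made for illustration.

```python
# (t, c, n, s) rows, matching the bottleneck settings in the section 2.4 listing.
SETTINGS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def expand_blocks(settings, size=112, in_channels=32):
    """List every bottleneck block with its shapes, stride, and shortcut flag."""
    blocks = []
    for t, c, n, s in settings:
        for i in range(n):
            stride = s if i == 0 else 1  # only the first repeat uses stride s
            out_size = size // stride
            # Shortcut needs identical input and output shapes.
            shortcut = stride == 1 and in_channels == c
            blocks.append({
                "in": (size, size, in_channels),
                "out": (out_size, out_size, c),
                "t": t,
                "stride": stride,
                "shortcut": shortcut,
            })
            size, in_channels = out_size, c
    return blocks


for block in expand_blocks(SETTINGS):
    print(block)
```

The three blocks of the 56*56*24 row come out as 56*56*24 --> 28*28*32 without a shortcut, followed by two 28*28*32 --> 28*28*32 blocks with a shortcut. The first block of the c = 96 row also has no shortcut even though its stride is 1, because its channel count changes from 64 to 96.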

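The table also contains a $k$, which appears as the output width of the table's last row. Regarding layer widths in general, MobileNetV1 introduced a width multiplier whose role is to slim down the number of channels (features) in every layer of the network. In MobileNetV2, when this multiplier is less than 1 it is not applied to the last 1x1 conv (the one that outputs 1280 channels); otherwise it is applied there as well.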
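A minimal sketch of how the width multiplier might be applied is shown below. Rounding up to a multiple of 8 follows the `make_divisible` helper in the section 2.4 listing and is an implementation choice rather than something the text above prescribes. The function `scaled_widths` is hypothetical, and unlike the listing it scales every stage, including the first one.

```python
import math


def make_divisible(x, divisible_by=8):
    """Round x up to the nearest multiple of `divisible_by`."""
    return int(math.ceil(x / divisible_by) * divisible_by)


def scaled_widths(width_multi, base=(16, 24, 32, 64, 96, 160, 320), last=1280):
    """Return (stage_channels, last_channel) for a given width multiplier."""
    stages = [make_divisible(c * width_multi) for c in base]
    # The last 1x1 conv keeps 1280 channels unless the multiplier exceeds 1.
    last_channel = make_divisible(last * width_multi) if width_multi > 1.0 else last
    return stages, last_channel


print(scaled_widths(0.5))  # ([8, 16, 16, 32, 48, 80, 160], 1280)
print(scaled_widths(2.0))  # ([32, 48, 64, 128, 192, 320, 640], 2560)
```

Keeping the last layer at 1280 channels for small multipliers leaves the final feature vector at full width even when the rest of the network is slimmed down.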

2.4 Implementation

PyTorch implementation: https://github.com/tonylins/pytorch-mobilenet-v2
TensorFlow implementation: https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet

import numpy as np
import torch
import torch.nn as nn


# Bottleneck block (inverted residual with a linear projection)
class Bottlenect(nn.Module):
    def __init__(self, inplanes, outplanes, stride=1, expand_ratio=1):
        super(Bottlenect, self).__init__()

        assert stride in [1, 2]
        self.stride = stride
        # Use the shortcut only when input and output shapes match
        self.use_res_connect = (self.stride == 1) and (inplanes == outplanes)

        hidden_dim = inplanes * expand_ratio  # width of the expanded middle layers
        # 1x1 pointwise conv: expand the channels
        self.conv1 = nn.Conv2d(inplanes, hidden_dim, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(hidden_dim)

        # 3x3 depthwise conv (groups == channels); downsamples when stride == 2
        self.conv2 = nn.Conv2d(hidden_dim, hidden_dim, 3, stride, padding=1, groups=hidden_dim, bias=False)
        self.bn2 = nn.BatchNorm2d(hidden_dim)

        # 1x1 pointwise conv: linear projection, no activation afterwards
        self.conv3 = nn.Conv2d(hidden_dim, outplanes, 1, 1, 0, bias=False)
        self.bn3 = nn.BatchNorm2d(outplanes)

        # Note: the paper uses ReLU6; plain ReLU is kept here to match the text
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)  # linear bottleneck: no ReLU here

        if self.use_res_connect:
            out = out + residual

        return out


def make_divisible(x, divisible_by=8):
    # Round x up to the nearest multiple of divisible_by
    return int(np.ceil(x * 1. / divisible_by) * divisible_by)


class MobileNetV2(nn.Module):
    def __init__(self, n_class=1000, input_size=224, width_multi=1.0):
        super(MobileNetV2, self).__init__()

        input_channel = 32
        last_channel = 1280

        bottlenet_setting = [
            # t, c, n, s
            [1, 16, 1, 1],
            [6, 24, 2, 2],
            [6, 32, 3, 2],
            [6, 64, 4, 2],
            [6, 96, 3, 1],
            [6, 160, 3, 2],
            [6, 320, 1, 1]
        ]

        assert input_size % 32 == 0
        # The last 1x1 conv is widened only when the multiplier exceeds 1
        self.last_channel = make_divisible(last_channel * width_multi) if width_multi > 1.0 else last_channel

        # first conv layer 1
        self.conv1 = nn.Conv2d(3, input_channel, 3, 2, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(input_channel)
        self.relu = nn.ReLU(inplace=True)

        # bottleneck layers 2-->8
        layers = []
        for t, c, n, s in bottlenet_setting:
            output_channel = make_divisible(c * width_multi) if t > 1 else c
            for i in range(n):
                # The first block of a group uses stride s; the repeats use stride 1
                stride = s if i == 0 else 1
                layers.append(Bottlenect(input_channel, output_channel, stride, expand_ratio=t))
                input_channel = output_channel
        self.bottlenect_layer = nn.Sequential(*layers)

        # conv layer 9
        self.conv2 = nn.Conv2d(input_channel, self.last_channel, 1, 1, 0, bias=False)
        self.bn2 = nn.BatchNorm2d(self.last_channel)

        # avg pool layer: the feature map is input_size / 32 wide (7 for a 224 input)
        self.avg_pool = nn.AvgPool2d(input_size // 32, stride=1)

        # last conv layer: 1x1 conv acting as the classifier.
        # The original applied BN and ReLU to its output, which clips negative
        # class scores; the raw scores are returned instead.
        self.conv9 = nn.Conv2d(self.last_channel, n_class, 1, 1)

    def forward(self, x):
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.bottlenect_layer(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.avg_pool(out)

        out = self.conv9(out)

        # Flatten (N, n_class, 1, 1) to (N, n_class)
        out = out.view(x.size(0), -1)

        return out


model = MobileNetV2(width_multi=1)


posted @ 2020-02-22 21:39 半夜打老虎
