MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

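Paper download link: https://arxiv.org/pdf/1704.04861.pdf

Start with the title: Efficient Convolutional Neural Networks for Mobile Vision Applications. First, "efficient": are ordinary convolutions not efficient? Is there some way to extract features other than sliding a filter across the image step by step? Then "mobile vision": this is designed for vision on mobile phones.

Next, the introduction: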

The general trend has been to make deeper and more complicated networks in order to achieve higher accuracy. However, these advances to improve accuracy are not necessarily making networks more efficient with respect to size and speed. In many real world applications such as robotics, self-driving car and augmented reality, the recognition tasks need to be carried out in a timely fashion on a computationally limited platform.

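Networks keep getting deeper and more complex. But the techniques that raise accuracy do not make networks any more efficient. Real-world applications, meanwhile, need recognition to be "timely" on platforms with limited computing resources.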

This paper describes an efficient network architecture and a set of two hyper-parameters in order to build very small, low latency models that can be easily matched to the design requirements for mobile and embedded vision applications.

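One network, two hyper-parameters.

Next, look at the flowchart. Can't follow it? There is no flowchart? Then go straight to the heart of the paper: its contribution!

First find a few blog posts to see how it is explained in Chinese (if none turn up, read the English directly);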

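The main structural difference between MobileNet and a traditional CNN is this: in a traditional CNN, a 3×3 convolution layer sits in front of batch normalization and ReLU (rectified linear unit), whereas MobileNet splits that convolution into a 3×3 depthwise convolution and a 1×1 pointwise convolution.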
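To make the split concrete, here is a minimal pure-Python sketch of a depthwise convolution followed by a 1×1 pointwise convolution. It is an illustration only, not the paper's implementation: the function names are hypothetical, tensors are plain nested lists, and it assumes stride 1, no padding, no bias, and omits batch normalization and ReLU.

```python
# Sketch of a depthwise separable convolution on nested lists.
# Assumptions: stride 1, no padding, no bias, no batch norm / ReLU.

def depthwise_conv(x, dw_kernels):
    """x: [M][H][W] input; dw_kernels: [M][k][k], one filter per input channel."""
    k = len(dw_kernels[0])
    out_h = len(x[0]) - k + 1
    out_w = len(x[0][0]) - k + 1
    out = []
    # Each channel is filtered by its own kernel; channels are never mixed here.
    for channel, kernel in zip(x, dw_kernels):
        plane = [
            [
                sum(
                    channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k)
                    for dj in range(k)
                )
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        out.append(plane)
    return out


def pointwise_conv(x, pw_weights):
    """x: [M][H][W]; pw_weights: [N][M]. A 1x1 convolution that mixes channels."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wts[m] * x[m][i][j] for m in range(len(x))) for j in range(w)]
            for i in range(h)
        ]
        for wts in pw_weights
    ]


def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Filter each channel spatially, then combine channels with a 1x1 conv."""
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)
```

The depthwise step does the spatial filtering and the pointwise step does the channel mixing; a standard convolution does both at once with N full Dk×Dk×M kernels.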

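So what is MobileNet's weakness? Accuracy. Compared with the large, resource-hungry networks we are familiar with, MobileNet is less accurate. Its strength is that it strikes a good balance between power consumption and performance.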

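MobileNet has two hyper-parameters, the width multiplier and the resolution multiplier, and tuning them adapts the model to a specific problem. The width multiplier makes the network thinner (fewer channels in every layer), while the resolution multiplier changes the resolution of the input image and thereby shrinks the internal representation of every layer.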
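The effect of the two multipliers on cost can be sketched in a few lines. This assumes the cost expression from the MobileNets paper for one depthwise separable layer, Dk·Dk·αM·ρDf·ρDf + αM·αN·ρDf·ρDf, where α is the width multiplier, ρ the resolution multiplier, M and N the input and output channel counts, Dk the kernel size and Df the feature-map size. The function name and the layer sizes in the usage note are hypothetical.

```python
def mobilenet_layer_cost(dk, m, n, df, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer.

    alpha (width multiplier) thins the channel counts m and n;
    rho (resolution multiplier) shrinks the feature-map side df.
    """
    m_thin = int(alpha * m)
    n_thin = int(alpha * n)
    df_small = int(rho * df)
    depthwise = dk * dk * m_thin * df_small * df_small
    pointwise = m_thin * n_thin * df_small * df_small
    return depthwise + pointwise


baseline = mobilenet_layer_cost(3, 512, 512, 14)
half_width = mobilenet_layer_cost(3, 512, 512, 14, alpha=0.5)
half_resolution = mobilenet_layer_cost(3, 512, 512, 14, rho=0.5)
```

With these sizes, halving the resolution cuts the cost to exactly one quarter (a factor of ρ²), and halving the width cuts it to roughly one quarter (the dominant pointwise term scales with α²).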

Now the most time-consuming part begins: reading the contribution in English!

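Here M is the number of input channels, N is the number of convolution kernels (output channels), and each kernel is Dk×Dk.

The computational cost drops from Dk×Dk×M×N×Df×Df to Dk×Dk×M×Df×Df + M×N×Df×Df, where Df is the side length of the feature map (notation as in the paper). The ratio of the two is 1/N + 1/Dk².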
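A quick numeric check of the saving, assuming the paper's cost formulas: Dk·Dk·M·N·Df·Df for a standard convolution and Dk·Dk·M·Df·Df + M·N·Df·Df for the depthwise separable version, with Df the feature-map side length. The function names and the layer sizes used below are hypothetical and chosen only for illustration.

```python
def standard_conv_cost(dk, m, n, df):
    """Multiply-adds of a standard convolution layer."""
    return dk * dk * m * n * df * df


def depthwise_separable_cost(dk, m, n, df):
    """Multiply-adds of a depthwise conv plus a 1x1 pointwise conv."""
    return dk * dk * m * df * df + m * n * df * df


def cost_ratio(dk, m, n, df):
    """Separable cost divided by standard cost; algebraically 1/n + 1/dk**2."""
    return depthwise_separable_cost(dk, m, n, df) / standard_conv_cost(dk, m, n, df)


ratio = cost_ratio(3, 512, 512, 14)
```

For a 3×3 kernel the 1/Dk² term dominates, so the separable layer needs roughly one eighth to one ninth of the computation of the standard layer.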

posted on 2017-09-12 17:21 MissSimple
