FCN Paper Reading Notes & Tips

Paper link: http://101.96.8.164/www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

<Experimental related>

(1) Whole-image training vs. patchwise training

Sampling in patchwise training can correct class imbalance and mitigate the spatial correlation of dense patches. In fully convolutional training, class balance can also be achieved by weighting the loss, and loss sampling can be used to address spatial correlation.
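The "loss sampling" idea above can be sketched as randomly masking spatial terms of the per-pixel loss. This is a minimal numpy illustration (the function name and keep probability are hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_pixel_loss(per_pixel_loss, keep_prob=0.25, rng=rng):
    """Keep a random subset of spatial loss terms; a hypothetical
    sketch of loss sampling, not the paper's actual code."""
    mask = rng.random(per_pixel_loss.shape) < keep_prob
    kept = per_pixel_loss * mask
    # Average over the kept terms only, so the loss scale is stable.
    return kept.sum() / max(mask.sum(), 1)

loss_map = rng.random((4, 8, 8))  # e.g. per-pixel cross-entropy, N x H x W
print(sampled_pixel_loss(loss_map))
```

With `keep_prob=1.0` this reduces to the ordinary mean loss, so sampling only changes which spatial terms contribute, not the loss scale.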
(2) Class Balancing
Fully convolutional training can balance classes by weighting or sampling the loss, but the authors found this made little difference in practice.
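One common way to implement the weighting option is inverse-frequency class weights on the per-pixel loss. The scheme below is an assumption for illustration; the paper only says "weighting":

```python
import numpy as np

def class_weights(labels, n_classes, eps=1e-8):
    """Inverse-frequency weights so rare classes count more;
    normalized so the average weight is ~1."""
    freq = np.bincount(labels.ravel(), minlength=n_classes) / labels.size
    w = 1.0 / (freq + eps)
    return w / w.sum() * n_classes

labels = np.array([[0, 0, 0, 1],
                   [0, 0, 0, 2]])  # background-heavy toy label map
w = class_weights(labels, n_classes=3)
print(w)  # rare classes 1 and 2 get larger weights than background class 0
```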
(3) Dense Prediction
A deconvolution (transposed convolution) layer, which is essentially bilinear interpolation, upsamples the prediction back to the input image size. The deconvolution layer's parameters can be learned (in the paper they are initialized to bilinear interpolation weights).
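The bilinear initialization for the deconvolution weights can be built with the standard 2-D bilinear kernel construction (a sketch; the kernel is then copied into each channel of the transposed-convolution weight):

```python
import numpy as np

def bilinear_kernel(size):
    """2-D bilinear interpolation kernel of a given (square) size,
    the usual initializer for FCN-style deconvolution weights."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

k = bilinear_kernel(4)  # e.g. a 4x4 kernel for 2x upsampling
print(k)
```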
(4) Optimization
SGD with momentum; the learning rate is scaled by 10^-2 at each stage from 32s to 16s to 8s (i.e., decreased by roughly 100-1000x in turn).
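The update rule and the staged learning-rate scaling above can be sketched as follows (the base learning rate here is a hypothetical value for illustration, not the paper's exact setting):

```python
def sgd_momentum_step(w, grad, v, lr, mu=0.9):
    """One SGD-with-momentum update: v <- mu*v - lr*grad; w <- w + v."""
    v = mu * v - lr * grad
    return w + v, v

# Toy run: minimize (w - 3)^2 with the update above.
w, v = 0.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, grad=2 * (w - 3), v=v, lr=0.1)
print(w)  # close to the minimizer 3

# Staged learning rates, scaled by 10^-2 per stage (hypothetical base value):
stage_lr = {s: 1e-4 * (1e-2) ** i for i, s in enumerate(["32s", "16s", "8s"])}
print(stage_lr)
```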
 
<Architecture related>
(1) Segmentation Architecture: mostly based on classic classification networks such as AlexNet, VGG16, and GoogLeNet; only the fully connected layers are converted into convolutional layers, so the whole network consists of convolutional layers, hence the name "fully convolutional network".
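Converting a fully connected layer to a convolution is just a reshape of its weights: an FC layer over a C x H x W feature map equals an H x W convolution evaluated at one position. A small numpy check (toy sizes):

```python
import numpy as np

rng = np.random.default_rng(0)

# A fully connected layer over a 2x2x3 feature map...
C, H, W, n_out = 3, 2, 2, 5
fc_w = rng.standard_normal((n_out, C * H * W))
feat = rng.standard_normal((C, H, W))
fc_out = fc_w @ feat.ravel()

# ...is the same operation as a conv layer with HxW kernels
# (the reshaped FC weights) applied at a single spatial position:
conv_w = fc_w.reshape(n_out, C, H, W)
conv_out = (conv_w * feat).sum(axis=(1, 2, 3))

print(np.allclose(fc_out, conv_out))  # True
```

On larger inputs the convolutional form simply slides, producing a spatial map of scores instead of a single vector, which is what makes dense prediction possible.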
(2) To improve performance, the authors pass the output of the final convolutional layer through a small upsampling step (followed by a crop for alignment) so that it matches the size of an earlier pooling layer's output, and then sum the two. This makes full use of the information in the earlier layers (since the final output alone is too coarse).
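The upsample-crop-add fusion in (2) can be sketched in numpy. Nearest-neighbor upsampling stands in for the learned deconvolution here, and the shapes are illustrative only:

```python
import numpy as np

def center_crop(x, th, tw):
    """Crop a (C, H, W) score map to (C, th, tw) around its center."""
    _, h, w = x.shape
    i, j = (h - th) // 2, (w - tw) // 2
    return x[:, i:i + th, j:j + tw]

def upsample2x_nn(x):
    """2x nearest-neighbor upsampling; stand-in for the learned deconv."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

coarse = np.ones((21, 5, 5))       # scores from the deeper, coarser layer
pool_scores = np.ones((21, 9, 9))  # scores predicted from an earlier pool
up = upsample2x_nn(coarse)                   # (21, 10, 10)
fused = center_crop(up, 9, 9) + pool_scores  # align sizes, then sum
print(fused.shape)  # (21, 9, 9)
```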
Refinement by other means:
(1) Decreasing the stride of pooling layers is the most straightforward way to obtain finer predictions (but the training cost is too high, and the resulting quality is poor).
(2) Shift-and-stitch: obtain dense predictions by running the net on shifted copies of the input and interlacing the coarse outputs; the authors consider it but do not use it in the final model.
(3) In the end, the authors use direct upsampling by deconvolution plus skip fusion with the pooling layers, which works best and is the most direct.
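The shift-and-stitch trick in (2) is easiest to see on a toy one-layer "net": running the stride-s layer on every (i, j)-shifted input and interlacing the s^2 coarse outputs reproduces the dense stride-1 evaluation. A minimal numpy sketch (toy pooling layer, not the actual FCN model):

```python
import numpy as np

def pool_s2(x):
    """A toy 'net': one 2x2 max-pooling layer with stride 2."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def dense_eval(x):
    """Reference: the same layer evaluated densely (stride 1)."""
    h, w = x.shape
    return np.array([[x[i:i + 2, j:j + 2].max()
                      for j in range(w - 1)] for i in range(h - 1)])

def shift_and_stitch(x, s=2):
    """Run the stride-s net on all s*s shifted inputs and interlace."""
    h, w = x.shape
    out = np.empty((h - 1, w - 1))
    for i in range(s):
        for j in range(s):
            out[i::s, j::s] = pool_s2(x[i:, j:])
    return out

x = np.random.default_rng(0).standard_normal((6, 6))
print(np.array_equal(shift_and_stitch(x), dense_eval(x)))  # True
```

The cost is s^2 forward passes per image, which is why the paper prefers learned deconvolution plus skip fusion instead.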
 
<Follow-up issues related>
Comparing the results visually with the ground truth, the method tends to lose small targets, such as the car in the first image and the crowd of spectators in the second. If one wanted to improve on this work, this point should leave some room for improvement.
 
 
posted @ 2017-02-21 16:22  吕吕吕吕吕