[CVPR2017] Weakly Supervised Cascaded Convolutional Networks: Paper Notes

https://www.csee.umbc.edu/~hpirsiav/papers/cascade_cvpr17.pdf

Weakly Supervised Cascaded Convolutional Networks, Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash and Luc Van Gool

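Highlights

Stacking multiple tasks (classification, segmentation) improves the accuracy of weakly supervised detection of multiple objects.
Using segmentation to filter for clean proposals yields more robust results.
A fairly robust loss is designed for the weakly supervised segmentation task:

It considers only the global classification result and the regions with relatively high confidence.
The weights in the loss direct attention to the parts that most need it.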

Related Work

One of the most common approaches [7] consists of the following steps:

generates object proposals,
extracts features from the proposals,
applies multiple instance learning (MIL) to the features and finds the box labels from the weak bag (image) labels.
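
The third step above can be illustrated with a minimal sketch. It assumes per-proposal class scores are already available and uses max pooling over proposals as the MIL aggregation, which is a common choice and not necessarily the one used in the paper. The function names are hypothetical.

```python
def mil_image_scores(proposal_scores):
    """Aggregate per-proposal class scores into image-level (bag) scores.

    proposal_scores: one row per proposal, one column per class.
    Assumption: max pooling over proposals, a common MIL aggregation.
    """
    num_classes = len(proposal_scores[0])
    return [max(row[c] for row in proposal_scores) for c in range(num_classes)]


def mil_box_labels(proposal_scores, image_labels):
    """For each class present in the image, pick the top-scoring proposal index.

    image_labels: one 0/1 entry per class (the weak, image-level label).
    """
    return {
        c: max(range(len(proposal_scores)), key=lambda i: proposal_scores[i][c])
        for c, present in enumerate(image_labels)
        if present
    }


# Three proposals, two classes; only class 0 is labeled as present.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.4]]
bag_scores = mil_image_scores(scores)
box_labels = mil_box_labels(scores, [1, 0])
```

Here the image-level score for each class is the score of its best proposal, and the box label for a present class is the index of that proposal.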

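Difficulty of weakly supervised object detection: it depends heavily on initialization, and a poor initialization can trap the network in a local optimum. The main remedies are: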

improving the initialization [31, 9, 28, 29]
regularizing the optimization strategies [4, 5, 7]
[17] employs an iterative self-learning strategy that adds harder samples to a small set of initial samples
[15] uses a convex relaxation of the soft-max loss

The majority of previous works [25, 32] use a large collection of noisy object proposals to train their object detectors. In contrast, the method in this paper focuses on a small, clean collection of object proposals that is far more reliable, robust, and computationally efficient, and gives better performance.

Method

Two-stage: proposal generation and image classification (conv1 through conv5, global pooling) + multiple instance learning (2 fc layers, score layer)

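1. image classification: a CNN with global average pooling (GAP), introduced in [36]. The weights of the fc layer used for classification serve as weights on the output channels of the preceding convolutional layer, and the weighted sum over all channels is taken as the class activation map. This step also produces a classification loss L_GAP.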
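
To make the class activation map concrete, here is a small sketch in plain Python. It assumes the fc layer has no bias term and that feature maps are given as nested lists (K channels of H x W); the function names are hypothetical and this is only an illustration of the idea from [36], not the paper's code.

```python
def global_average_pool(feature_maps):
    """Mean activation of each channel; feature_maps is K channels of H x W."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def gap_class_scores(feature_maps, fc_weights):
    """Class scores from GAP followed by a bias-free fc layer (fc_weights is C x K)."""
    pooled = global_average_pool(feature_maps)
    return [sum(w * p for w, p in zip(row, pooled)) for row in fc_weights]


def class_activation_map(feature_maps, fc_weights, class_idx):
    """Weighted sum of the channels, using the fc weights of one class."""
    weights = fc_weights[class_idx]
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for k, channel in enumerate(feature_maps):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weights[k] * channel[y][x]
    return cam


# Two 2x2 channels and two classes (toy values).
feature_maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
fc_weights = [[1.0, 0.5], [0.0, 1.0]]
cam_class0 = class_activation_map(feature_maps, fc_weights, 0)
class_scores = gap_class_scores(feature_maps, fc_weights)
```

With a bias-free fc layer, the spatial mean of a class's activation map equals that class's score, which is why the map can be read as a per-location contribution to the classification.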

[36] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In CVPR, 2016.

2. multiple instance learning

Proposals: EdgeBoxes [37] is used to generate an initial set of object proposals. The class activation map [36] is then thresholded to obtain a mask. Finally, the initial boxes with the largest overlap with the mask are chosen.
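
The selection step can be sketched as follows. Several details here are assumptions made for illustration: the threshold is a fraction (`ratio`, hypothetical) of the peak activation, the peak is assumed to be positive, and "overlap with the mask" is measured as IoU between a proposal and the tight bounding box of the mask. Boxes are (x1, y1, x2, y2) in inclusive pixel coordinates.

```python
def threshold_mask(cam, ratio=0.5):
    """Binary mask of pixels whose activation is at least ratio * peak.

    Assumes the peak activation is positive; ratio is a hypothetical setting.
    """
    peak = max(max(row) for row in cam)
    return [[1 if v >= ratio * peak else 0 for v in row] for row in cam]


def mask_bounding_box(mask):
    """Tight box (x1, y1, x2, y2) around all foreground pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a, b):
    """Intersection over union of two inclusive-pixel boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def select_proposals(proposals, mask, top_k=1):
    """Keep the top_k proposals that best overlap the mask's bounding box."""
    target = mask_bounding_box(mask)
    return sorted(proposals, key=lambda p: iou(p, target), reverse=True)[:top_k]


cam = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.0],
    [0.1, 0.7, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
proposals = [(0, 0, 3, 3), (1, 1, 2, 2), (0, 0, 1, 1)]
mask = threshold_mask(cam)
kept = select_proposals(proposals, mask, top_k=1)
```

In this toy case the mask covers the central 2x2 block, so the proposal that matches it exactly is the one kept.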

Three-stage: more information about the objects’ boundary learned in a segmentation task can lead to acquisition of a better appearance model and then better object localization.

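Main idea: the segmentation supervision signal helps improve localization accuracy.
Weak segmentation supervision signal: the mask obtained from the previous stage.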
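
These notes do not give the formula of the segmentation loss. Purely as an illustration of a loss that uses the previous stage's mask as a pseudo label and per-pixel weights to emphasize the trusted regions, here is a weighted pixel-wise binary cross-entropy. The function name, the weighting scheme, and the normalization are all assumptions, not the paper's definition.

```python
import math


def weighted_pixel_bce(pred_probs, pseudo_mask, weights, eps=1e-7):
    """Weighted binary cross-entropy between predictions and a pseudo mask.

    pred_probs:  H x W foreground probabilities from the segmentation branch.
    pseudo_mask: H x W 0/1 mask from the previous stage (weak supervision).
    weights:     H x W non-negative weights (hypothetical; e.g. larger where
                 the previous stage is more confident, 0 to ignore a pixel).
    The result is normalized by the total weight.
    """
    total, weight_sum = 0.0, 0.0
    for p_row, m_row, w_row in zip(pred_probs, pseudo_mask, weights):
        for p, m, w in zip(p_row, m_row, w_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -w * (m * math.log(p) + (1 - m) * math.log(1.0 - p))
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# The second pixel has weight 0, so its poor prediction is ignored.
loss = weighted_pixel_bce([[0.5, 0.9]], [[1, 0]], [[1.0, 0.0]])
```

Setting a pixel's weight to zero removes it from the loss entirely, which is one simple way to restrict training to high-confidence regions.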

Experimental Results

PASCAL VOC 2007

+3.3% classification compared with [18]
+1.6% correct localization compared with [27]
+0.6% compared with [6]
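
"Correct localization" here presumably refers to the CorLoc metric commonly used in weakly supervised localization: the fraction of images containing a class in which the top-scoring box for that class overlaps some ground-truth box of the class with IoU of at least 0.5. A minimal sketch for a single class, with hypothetical function names and boxes as (x1, y1, x2, y2) in continuous coordinates:

```python
def box_iou(a, b):
    """Intersection over union of two boxes in continuous coordinates."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def corloc(top_boxes, ground_truths, iou_threshold=0.5):
    """Fraction of images whose top-scoring box hits any ground-truth box.

    top_boxes:     one predicted box per image that contains the class.
    ground_truths: for each of those images, the list of GT boxes of the class.
    """
    hits = sum(
        1
        for pred, gts in zip(top_boxes, ground_truths)
        if any(box_iou(pred, gt) >= iou_threshold for gt in gts)
    )
    return hits / len(top_boxes)


# One image localized correctly, one missed.
score = corloc(
    [(0, 0, 10, 10), (0, 0, 10, 10)],
    [[(0, 0, 10, 10)], [(20, 20, 30, 30)]],
)
```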

PASCAL VOC 2010

+3.3% compared with [6]

PASCAL VOC 2012

+8.8% compared with [18]

ILSVRC 2013

+5.5% compared with [18]

Object detection training

PASCAL VOC 2007 test set: Faster R-CNN trained on the pseudo ground-truth (GT) bounding boxes generated by the cascaded networks performs slightly better (+0.3%) than the transferred model.

[6] H. Bilen and A. Vedaldi. Weakly supervised deep detection networks. In CVPR, 2016.

[18] D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang. Weakly supervised object localization with progressive domain adaptation. In CVPR, 2016.

[27] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.

posted @ 2018-03-26 15:55 TinaSmile
