HRNet Study Notes
-
Why was HRNet proposed?
-
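For position-sensitive vision problems such as human pose estimation, semantic segmentation, and object detection, high-resolution representations are essential. Earlier approaches first lower the resolution and then raise it again: encoder-decoder networks such as SegNet and UNet reduce the resolution with a backbone and then recover it through upsampling or transposed convolution (deconvolution), while other methods use dilated convolutions to avoid some of the downsampling. HRNet is a new architecture that keeps a high-resolution representation through the whole process instead of recovering it at the end.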
-
What is the structure of HRNet?
-
Overall structure: First, a stem reduces the resolution to 1/4: "We input the image into a stem, which consists of two stride-2 3 × 3 convolutions decreasing the resolution to 1/4".

Then, starting from a high-resolution convolution stream, the network gradually adds high-to-low resolution streams one at a time. Each time a new stream is added, a multi-resolution fusion is performed so that the streams exchange, or share, information across scales: "We start from a high resolution convolution stream, gradually add high-to-low resolution convolution streams one by one, and connect the multi-resolution streams in parallel. The resulting network consists of several (4 in this paper) stages as depicted in Figure 2, and the nth stage contains n streams corresponding to n resolutions. We conduct repeated multi-resolution fusions by exchanging the information across the parallel streams over and over."
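To make the resolutions concrete, here is a small sketch that computes the spatial size after the stem and the stream sizes in each stage. It is bookkeeping only, not the network itself. Two things are assumptions of mine rather than statements from the notes above: the convolutions use padding 1, and each newly added stream has half the resolution of the previous one. The function names are made up for illustration.

```python
from typing import List


def conv_out_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a convolution (standard floor formula).

    padding=1 is an assumption; the quoted text only gives kernel and stride.
    """
    return (size + 2 * padding - kernel) // stride + 1


def stem_out_size(size: int) -> int:
    """Apply two stride-2 3x3 convolutions in sequence, as in the stem."""
    return conv_out_size(conv_out_size(size))


def stage_resolutions(input_size: int, num_stages: int = 4) -> List[List[int]]:
    """Stream sizes per stage: the nth stage has n streams.

    Assumption: every new stream halves the resolution of the previous stream.
    """
    base = stem_out_size(input_size)
    return [[base // (2 ** r) for r in range(n)] for n in range(1, num_stages + 1)]


if __name__ == "__main__":
    for n, sizes in enumerate(stage_resolutions(256), start=1):
        print(f"stage {n}: {sizes}")
```

For a 256-pixel input side, the stem yields 64, and under the halving assumption the fourth stage holds streams of side 64, 32, 16, and 8.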
-
Repeated Multi-Resolution Fusions: The goal of the fusion module is to exchange the information across multi-resolution representations.
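A toy sketch of what "exchanging information" across streams could look like, using plain Python lists as single-channel feature maps. It assumes each output stream is the element-wise sum of all input streams after they are resized to that stream's resolution. The resizing here uses 2x2 average pooling and nearest-neighbour copying purely as stand-ins for the learned downsampling and upsampling layers of a real network, and all function names are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def downsample(x: Grid) -> Grid:
    """Halve the resolution with 2x2 average pooling (stand-in for a learned strided conv)."""
    h, w = len(x) // 2, len(x[0]) // 2
    return [
        [
            (x[2 * i][2 * j] + x[2 * i][2 * j + 1]
             + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def upsample(x: Grid) -> Grid:
    """Double the resolution by nearest-neighbour copy (stand-in for learned upsampling)."""
    return [[x[i // 2][j // 2] for j in range(2 * len(x[0]))] for i in range(2 * len(x))]


def resize(x: Grid, src: int, dst: int) -> Grid:
    """Move a map from stream index src to dst (a higher index means lower resolution)."""
    for _ in range(dst - src):   # runs only when dst is a lower-resolution stream
        x = downsample(x)
    for _ in range(src - dst):   # runs only when dst is a higher-resolution stream
        x = upsample(x)
    return x


def fuse(streams: List[Grid]) -> List[Grid]:
    """Each output stream is the sum of every input stream resized to its resolution."""
    outputs = []
    for dst in range(len(streams)):
        resized = [resize(s, src, dst) for src, s in enumerate(streams)]
        h, w = len(resized[0]), len(resized[0][0])
        outputs.append([[sum(r[i][j] for r in resized) for j in range(w)] for i in range(h)])
    return outputs


if __name__ == "__main__":
    high = [[1.0] * 4 for _ in range(4)]   # 4x4 high-resolution stream
    low = [[2.0] * 2 for _ in range(2)]    # 2x2 low-resolution stream
    fused_high, fused_low = fuse([high, low])
    print(len(fused_high), len(fused_low))
```

After fusion every stream keeps its own resolution but now contains contributions from all the others, which is the point of repeating the fusion between stages.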
-
Representation Head
-
-
Advantages of HRNet
-
posted on 2020-10-25 18:51 ZhicongHou