[Paper] PointNet++


https://zhuanlan.zhihu.com/p/266324173

Few prior works study deep learning on point sets. PointNet [20] is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes.

3. Method

Our work can be viewed as an extension of PointNet [20] with added hierarchical structure. We first review PointNet (Sec. 3.1) and then introduce a basic extension of PointNet with hierarchical structure (Sec. 3.2). Finally, we propose our PointNet++ that is able to robustly learn features even in non-uniformly sampled point sets (Sec. 3.3).

3.1 Review of PointNet [20]: A Universal Continuous Set Function Approximator

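Given an unordered point set {x_1, x_2, ..., x_n} with x_i ∈ R^d, one can define a set function f : X → R that maps a set of points to a vector:

$$f(x_1, x_2, \dots, x_n) = \gamma\left(\max_{i=1,\dots,n} \{h(x_i)\}\right) \tag{1}$$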

where γ and h are usually multi-layer perceptron (MLP) networks.
The set function f in Eq. 1 is invariant to input point permutations and can arbitrarily approximate any continuous set function [20]. Note that the response of h can be interpreted as the spatial encoding of a point (see [20] for details). PointNet achieved impressive performance on a few benchmarks. However, it lacks the ability to capture local context at different scales.
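
As a rough illustration of the permutation invariance claimed for Eq. 1, here is a minimal NumPy sketch. The single-layer h and γ, the weight names W_h and W_gamma, and the random untrained weights are all assumptions made for the example; the paper only says that γ and h are usually MLPs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, untrained weights standing in for the two networks.
W_h = rng.normal(size=(3, 16))      # h: R^3 -> R^16, shared across all points
W_gamma = rng.normal(size=(16, 4))  # gamma: R^16 -> R^4


def h(points):
    """Per-point feature: one linear layer followed by ReLU."""
    return np.maximum(points @ W_h, 0.0)


def gamma(feature):
    """Map the pooled global feature to the output vector."""
    return feature @ W_gamma


def f(points):
    """Set function of Eq. 1: gamma(max_i h(x_i)), with a channel-wise max."""
    return gamma(h(points).max(axis=0))


points = rng.normal(size=(128, 3))   # n = 128 points in R^3
shuffled = rng.permutation(points)   # same set, rows in a different order
print(np.allclose(f(points), f(shuffled)))
```

Because the max is taken over the point axis, reordering the rows cannot change the pooled vector, so the two outputs agree. The same max also discards how points are arranged within a neighborhood, which is the limitation Sec. 3.2 addresses.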

3.2 Hierarchical Point Set Feature Learning

[Figure 2: the hierarchical feature learning architecture of PointNet++]

While PointNet uses a single max pooling operation to aggregate the whole point set, our new architecture builds a hierarchical grouping of points and progressively abstracts larger and larger local regions along the hierarchy.
Our hierarchical structure is composed of a number of set abstraction levels (Fig. 2). At each level, a set of points is processed and abstracted to produce a new set with fewer elements. The set abstraction level is made of three key layers: Sampling layer, Grouping layer and PointNet layer.

  • The Sampling layer selects a set of points from the input points, which defines the centroids of local regions.
  • The Grouping layer then constructs local region sets by finding “neighboring” points around the centroids.
  • The PointNet layer uses a mini-PointNet to encode local region patterns into feature vectors.
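
For the Sampling layer, the paper uses iterative farthest point sampling (FPS). The following is a minimal NumPy sketch of that idea, not the authors' implementation; the function name, the `start` parameter and the choice of index 0 as the first centroid are assumptions made for the example.

```python
import numpy as np


def farthest_point_sampling(xyz, n_samples, start=0):
    """Return indices of n_samples points chosen by iterative farthest point sampling.

    xyz: (N, d) array of coordinates.
    start: index of the first centroid (an arbitrary choice assumed here).
    """
    n_points = xyz.shape[0]
    chosen = np.empty(n_samples, dtype=np.int64)
    # Squared distance from every point to its nearest already-chosen centroid.
    min_dist = np.full(n_points, np.inf)
    current = start
    for i in range(n_samples):
        chosen[i] = current
        dist = np.sum((xyz - xyz[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
        # The next centroid is the point farthest from the chosen set.
        current = int(np.argmax(min_dist))
    return chosen


rng = np.random.default_rng(0)
xyz = rng.uniform(size=(1024, 3))
centroid_idx = farthest_point_sampling(xyz, n_samples=64)
centroids = xyz[centroid_idx]  # (64, 3): the centers of the local regions
```

The function only returns indices into the input, so the centroids are always a subset of the original points.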

A set abstraction level takes an N × (d + C) matrix as input, which comes from N points with d-dim coordinates and C-dim point features. It outputs an N' × (d + C') matrix of N' subsampled points with d-dim coordinates and new C'-dim feature vectors summarizing local context.

image

  • Ball query: finds all points that are within a radius of the query point (an upper limit of K is set in the implementation).
  • K nearest neighbor (kNN) search: an alternative range query that finds a fixed number of neighboring points.
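
The two range queries can be compared with a small NumPy sketch. It uses brute-force pairwise distances; the function names and the padding rule (repeat the first in-range index when a ball holds fewer than K points) are assumptions of this example, and the code also assumes each centroid is itself one of the input points so that every ball is non-empty.

```python
import numpy as np


def pairwise_sq_dist(centroids, xyz):
    """Squared Euclidean distances between centroids and points, shape (N', N)."""
    diff = centroids[:, None, :] - xyz[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def knn_query(centroids, xyz, k):
    """Indices of the k nearest points to each centroid, shape (N', k)."""
    return np.argsort(pairwise_sq_dist(centroids, xyz), axis=1)[:, :k]


def ball_query(centroids, xyz, radius, k):
    """Indices of up to k points within `radius` of each centroid, shape (N', k).

    Balls with fewer than k in-range points are padded by repeating the first
    in-range index (a padding convention assumed for this sketch).
    """
    sq_dist = pairwise_sq_dist(centroids, xyz)
    n_points = xyz.shape[0]
    idx = np.tile(np.arange(n_points), (centroids.shape[0], 1))
    idx[sq_dist > radius ** 2] = n_points   # mark out-of-range points
    idx = np.sort(idx, axis=1)[:, :k]       # in-range indices sort to the front
    first = idx[:, :1]                      # valid because each ball is non-empty
    return np.where(idx == n_points, first, idx)


rng = np.random.default_rng(0)
xyz = rng.uniform(size=(256, 3))
centroids = xyz[:8]                         # stand-in for sampled centroids
ball_idx = ball_query(centroids, xyz, radius=0.2, k=16)
knn_idx = knn_query(centroids, xyz, k=16)
```

Every index returned by `ball_query` lies within the radius, while `knn_query` always returns k distinct neighbors no matter how far away they are.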

Compared with kNN, ball query’s local neighborhood guarantees a fixed region scale, which makes local region features more generalizable across space. This is preferred for tasks requiring local pattern recognition (e.g. semantic point labeling).

image

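My current summary: overall, PointNet++ designs a feature extractor that focuses on local regions within a hierarchical structure. Concretely, given the N × (d + C) input from the previous level, the sampling step uses FPS to pick N' centroids. The grouping step then takes the centroids' coordinates (N' × d) together with the original features, and uses ball query or kNN to gather the neighbors of each centroid into local regions, giving N' × K × (d + C). Note that these two steps only perform geometric computation and data rearrangement according to our design; nothing in them is learned. After that comes the PointNet, i.e. the usual point-wise shared MLP + max pool + MLP (the last MLP is optional, in my view). The max pool collapses the K neighbors of each region, so the output is N' × (d + C').
It is easy to see that the original PointNet simply takes a set of points (a point set, or a feature set), passes each point through an MLP, and applies a single max pool that keeps the most salient value in each channel to get one vector, which is quite crude. PointNet++ instead repeats this process hierarchically: sample and group local regions, extract features with a PointNet, then do the same again at the next level until the final feature is obtained.
Note also that FPS should not take part in gradient computation or backpropagation, since it only selects point indices from the coordinates. One way to think about it: PointNet++ downsamples the point cloud with FPS at several scales, as if this data were prepared in advance and then fed to the network for training. It is effectively a preprocessing step.
Also, although the paper does not seem to mention it, PointNet++ apparently does not use the T-Net.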
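
The shape bookkeeping in the summary above can be checked with a minimal NumPy sketch of one set abstraction level. This is an illustration under assumptions, not the authors' code: the weights are random and untrained, the MLP has no biases, the padding rule in `ball_query` is a convention chosen for the sketch, and all function names are hypothetical. Translating each group into coordinates relative to its centroid follows the paper's description of the PointNet layer.

```python
import numpy as np

rng = np.random.default_rng(0)


def fps(xyz, n_samples):
    """Farthest point sampling; returns (n_samples,) indices, starting from index 0."""
    chosen = np.zeros(n_samples, dtype=np.int64)
    min_dist = np.full(xyz.shape[0], np.inf)
    for i in range(1, n_samples):
        dist = np.sum((xyz - xyz[chosen[i - 1]]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
        chosen[i] = np.argmax(min_dist)
    return chosen


def ball_query(centroids, xyz, radius, k):
    """Up to k in-range indices per centroid, padded with the first one; (N', k)."""
    sq_dist = np.sum((centroids[:, None] - xyz[None]) ** 2, axis=-1)
    idx = np.tile(np.arange(xyz.shape[0]), (centroids.shape[0], 1))
    idx[sq_dist > radius ** 2] = xyz.shape[0]
    idx = np.sort(idx, axis=1)[:, :k]
    return np.where(idx == xyz.shape[0], idx[:, :1], idx)


def set_abstraction(xyz, feats, n_centroids, radius, k, weights):
    """One level: (N, d) coordinates + (N, C) features -> (N', d) + (N', C')."""
    centroid_idx = fps(xyz, n_centroids)                # sampling: nothing learned
    new_xyz = xyz[centroid_idx]                         # (N', d)
    group_idx = ball_query(new_xyz, xyz, radius, k)     # grouping: nothing learned
    # Coordinates relative to each centroid, concatenated with point features.
    local_xyz = xyz[group_idx] - new_xyz[:, None, :]    # (N', K, d)
    grouped = np.concatenate([local_xyz, feats[group_idx]], axis=-1)  # (N', K, d + C)
    for w in weights:                                   # shared point-wise MLP
        grouped = np.maximum(grouped @ w, 0.0)
    new_feats = grouped.max(axis=1)                     # max pool over K -> (N', C')
    return new_xyz, new_feats


N, d, C = 512, 3, 6
xyz = rng.uniform(size=(N, d))
feats = rng.normal(size=(N, C))
weights = [rng.normal(size=(d + C, 32)), rng.normal(size=(32, 64))]  # hypothetical
new_xyz, new_feats = set_abstraction(
    xyz, feats, n_centroids=128, radius=0.2, k=32, weights=weights
)
print(new_xyz.shape, new_feats.shape)  # (128, 3) (128, 64)
```

Only the `weights` list would receive gradients in a trained model; the FPS and ball query steps just produce indices, which matches the observation that they contain nothing learned.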

3.3 Robust Feature Learning under Non-Uniform Sampling Density

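This section would be quick to read, but I am feeling a bit lazy today, so I have pasted the corresponding part of that Zhihu post here for reference. I will replace it with the paper's own text when I come back to it.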

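PointNet++ is essentially about representing local neighborhoods, so it has to face one challenge: non-uniform sampling density. In sparse parts of a point cloud, training on local neighborhoods may fail to capture the local structure well. PointNet++'s answer: learn to combine features from regions of different scales when the input sampling density changes.
The paper therefore proposes two schemes: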
[Figure 3: multi-scale grouping (MSG) and multi-resolution grouping (MRG)]

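  • Multi-scale grouping (MSG)
    For each centroid of the current level, take query balls of several different radii. This gives several concentric balls of different sizes, i.e. several local neighborhoods that share a center but differ in scale. Each neighborhood is encoded separately and all the encodings are concatenated. In code, this is just a loop over radius_list that processes each radius separately and concatenates the results at the end.
  • Multi-resolution grouping (MRG)
    (Quoted from the paper) features of a region at some level L_i is a concatenation of two vectors. One vector (left in figure) is obtained by summarizing the features at each subregion from the lower level using the set abstraction level. The other vector (right) is the feature that is obtained by directly processing all raw points in the local region using a single PointNet.
    Put simply, the local-region representation at the current set abstraction level has two parts. The left part aggregates the features of the subregions (i.e. the centroids) produced by the previous set abstraction level (recall that the previous level has more points). The right part runs a single PointNet directly on the raw points in the local region.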
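
The multi-scale grouping described above can be sketched in NumPy as a loop over radii followed by a concatenation. Everything here is an assumption made for illustration: the names `radius_list`, `k_list` and `weight_list`, the one-layer untrained MLP per scale, the padding rule in `ball_query`, and the strided index selection used in place of FPS to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)


def ball_query(centroids, xyz, radius, k):
    """Up to k in-range indices per centroid, padded with the first one; (N', k)."""
    sq_dist = np.sum((centroids[:, None] - xyz[None]) ** 2, axis=-1)
    idx = np.tile(np.arange(xyz.shape[0]), (centroids.shape[0], 1))
    idx[sq_dist > radius ** 2] = xyz.shape[0]
    idx = np.sort(idx, axis=1)[:, :k]
    return np.where(idx == xyz.shape[0], idx[:, :1], idx)


def msg_features(xyz, feats, centroid_idx, radius_list, k_list, weight_list):
    """Encode each centroid at several radii and concatenate the per-scale features."""
    new_xyz = xyz[centroid_idx]                        # (N', d)
    per_scale = []
    for radius, k, w in zip(radius_list, k_list, weight_list):
        idx = ball_query(new_xyz, xyz, radius, k)      # (N', K_s)
        local_xyz = xyz[idx] - new_xyz[:, None, :]     # relative coordinates
        grouped = np.concatenate([local_xyz, feats[idx]], axis=-1)
        grouped = np.maximum(grouped @ w, 0.0)         # one-layer shared MLP for this scale
        per_scale.append(grouped.max(axis=1))          # (N', C_s)
    return np.concatenate(per_scale, axis=-1)          # (N', sum of C_s)


N, d, C = 512, 3, 6
xyz = rng.uniform(size=(N, d))
feats = rng.normal(size=(N, C))
centroid_idx = np.arange(0, N, 4)                      # stand-in for FPS, 128 centroids
radius_list = [0.1, 0.2, 0.4]
k_list = [16, 32, 64]
weight_list = [rng.normal(size=(d + C, c_out)) for c_out in (32, 64, 128)]
msg = msg_features(xyz, feats, centroid_idx, radius_list, k_list, weight_list)
print(msg.shape)  # (128, 224)
```

Each scale has its own weights and its own neighbor cap, and the output width is the sum of the per-scale widths (32 + 64 + 128 here).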

Author: traviscui

Source: https://www.cnblogs.com/traviscui/p/16559918.html

License: this work is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International license.
