Loop Closure Detection (Place Recognition): A Paper Survey

I. VLAD

1. NetVLAD

A classic work in visual place recognition.

Paper: https://arxiv.org/pdf/1511.07247.pdf

Code: https://github.com/Nanne/pytorch-NetVlad (testing is now complete)

 

A quick aside on GhostVLAD: it down-weights low-quality (unclear) images during aggregation.
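The core of NetVLAD is a differentiable VLAD layer: each local descriptor is soft-assigned to K cluster centers, the weighted residuals are accumulated, and the result is normalized. A minimal pure-Python sketch of that aggregation (the descriptors, centers, and `alpha` below are made-up toy values; the real layer learns the soft-assignment with a 1x1 convolution rather than computing distances explicitly):

```python
import math

def netvlad_aggregate(descriptors, centers, alpha=1.0):
    """Soft-assign each D-dim descriptor to K centers and accumulate
    residuals; returns a K*D VLAD vector, L2-normalized (toy version
    of the NetVLAD aggregation layer)."""
    K, D = len(centers), len(centers[0])
    vlad = [[0.0] * D for _ in range(K)]
    for x in descriptors:
        # soft-assignment: softmax over negative squared distances to centers
        logits = [-alpha * sum((x[d] - c[d]) ** 2 for d in range(D))
                  for c in centers]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        s = sum(exps)
        weights = [e / s for e in exps]
        # accumulate the weighted residual (x - c_k) into row k
        for k, c in enumerate(centers):
            for d in range(D):
                vlad[k][d] += weights[k] * (x[d] - c[d])
    flat = [v for row in vlad for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]

descs = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]   # toy local descriptors
cents = [[1.0, 0.0], [0.0, 1.0]]               # toy cluster centers
vec = netvlad_aggregate(descs, cents, alpha=5.0)
print(len(vec))  # K*D = 4
```

GhostVLAD extends exactly this step: extra "ghost" clusters absorb the assignment weight of uninformative descriptors and are dropped from the output.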


2. Patch-NetVLAD

A 2021 follow-up to NetVLAD.

Paper: https://arxiv.org/pdf/2103.01486v1.pdf

Code: https://github.com/QVPR/Patch-NetVLAD (likewise waiting on the dataset)

The dataset file './Pittsburgh250k/001/001048_pitch1_yaw7.jpg' is corrupted; a neighboring image is used as a stand-in for now, so re-download it when needed.

The batch size must be reduced to 2 for it to run.

Re-downloading fixes the dataset except for part 09; the other corrupted files are temporarily replaced with neighboring images.

 

Results with the speed configuration:

====> Recall NetVLAD@1: 0.0135
====> Recall NetVLAD@5: 0.0389
====> Recall NetVLAD@10: 0.0497
====> Recall NetVLAD@20: 0.0580
====> Recall NetVLAD@50: 0.0634
====> Recall NetVLAD@100: 0.0660
====> Recall PatchNetVLAD@1: 0.0343
====> Recall PatchNetVLAD@5: 0.0483
====> Recall PatchNetVLAD@10: 0.0541
====> Recall PatchNetVLAD@20: 0.0593
====> Recall PatchNetVLAD@50: 0.0647
====> Recall PatchNetVLAD@100: 0.0660
Writing recalls to results/recalls.txt
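For reference, Recall@N in the log above is the fraction of queries for which at least one correct database match appears among the top N retrievals. A minimal sketch of that computation (variable names are illustrative, not taken from the Patch-NetVLAD evaluation code):

```python
def recall_at_n(predictions, ground_truth, n_values):
    """predictions: per query, a ranked list of database indices.
    ground_truth: per query, the set of correct database indices.
    Returns {n: fraction of queries with a hit in the top n}."""
    recalls = {}
    for n in n_values:
        hits = sum(
            1 for preds, gt in zip(predictions, ground_truth)
            if any(p in gt for p in preds[:n])
        )
        recalls[n] = hits / len(predictions)
    return recalls

preds = [[3, 7, 1], [5, 2, 9]]   # ranked retrievals for 2 queries
gt = [{7}, {0}]                  # correct matches per query
print(recall_at_n(preds, gt, [1, 5]))  # {1: 0.0, 5: 0.5}
```

This also explains why the curves above are monotonically non-decreasing in N: enlarging the candidate list can only add hits.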

 

3. PatchNet+

An improved version of Patch-NetVLAD, though its reported numbers differ from those in the original Patch-NetVLAD paper; I have not read it closely.

Paper: https://arxiv.org/pdf/2202.05738.pdf

 

4. MultiRes-NetVLAD

A 2022 follow-up to NetVLAD.

Paper: https://arxiv.org/pdf/2202.09146.pdf

Code: https://github.com/Ahmedest61/MultiRes-NetVLAD

5.

Paper: https://arxiv.org/pdf/2010.09228.pdf

 

 

6. VLAD-SLAM

A 2016 work that integrates VLAD into loop closure detection and compares it against SDC.

Paper: https://sci-hub.mksa.top/10.1109/icinfa.2016.7831876

 

7. Spatial Pyramid-Enhanced NetVLAD (2019)

A spatially enhanced VLAD, somewhat similar to an image pyramid; it also adapts the loss weighting per epoch, increasing the weight of samples that converge poorly.

Paper: https://sci-hub.st/10.1109/TNNLS.2019.2908982

 

8. DELG

Paper: https://arxiv.org/pdf/2001.05027.pdf

Code: https://github.com/tensorflow/models/tree/master/research/delf

 

9. CRN

A VLAD variant with learned local (contextual) weighting.

Paper: https://openaccess.thecvf.com/content_cvpr_2017/papers/Kim_Learned_Contextual_Feature_CVPR_2017_paper.pdf

A similar paper:

https://www.researchgate.net/publication/329857970_Learning_to_Fuse_Multiscale_Features_for_Visual_Place_Recognition

 

10. APA

A pyramid VLAD with attention.

Paper: https://arxiv.org/pdf/1808.00288v1.pdf

Here the experiments apply PCA to NetVLAD before the rest of the pipeline; I am not sure whether I can do the same.

To read:

https://arxiv.org/pdf/2107.02440.pdf

https://blog.csdn.net/qq_24954345/article/details/86176862 (includes a spatio-temporal VLAD)

 

11. VSA

Encodes semantic information into the vector; heavy on the math.

Paper: http://www.roboticsproceedings.org/rss17/p083.pdf

 

12. DELF

Attention-based local features applied to retrieval.

Paper: https://arxiv.org/pdf/1612.06321.pdf

Code: https://github.com/nashory/DeLF-pytorch

 

II. Transformer

1. Fundamentals

Attention mechanism:

https://zhuanlan.zhihu.com/p/52119092

Transformer:

https://zhuanlan.zhihu.com/p/82312421

https://blog.csdn.net/longxinchen_ml/article/details/86533005

CVPR 2021: https://blog.csdn.net/amusi1994/article/details/117433649?spm=1001.2101.3001.6650.4&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-4.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-4.pc_relevant_default&utm_relevant_index=9

Essential ViT reading: https://blog.csdn.net/u014546828/article/details/117657912?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&utm_relevant_index=2

Transformer explainers:

https://luweikxy.gitbook.io/machine-learning-notes/self-attention-and-transformer#skip%20connection%E5%92%8CLayer%20Normalization

https://zhuanlan.zhihu.com/p/48508221
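As a companion to the explainers above, scaled dot-product attention (the building block of the Transformer) fits in a few lines of plain Python; the Q/K/V values below are toy examples, not from any paper:

```python
import math

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V, with Q, K, V as lists of row vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of the query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]  # attention weights over the keys
        # output: weighted sum of the value rows
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
print(attention(Q, K, V))  # weighted toward the first value row
```

Self-attention is this same operation with Q, K, and V all derived from the same token sequence via learned projections.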

 

2. PPT-Net

Paper: https://blog.csdn.net/weixin_43882112/article/details/121440070

Code: https://github.com/fpthink/PPT-Net

LiDAR-based; can be skipped.

3.

Paper: https://arxiv.org/pdf/2203.03397.pdf

 

4. TransVPR (CVPR 2022)

A Transformer applied to VPR; performs better than the original Patch-NetVLAD.

Paper: https://arxiv.org/pdf/2201.02001.pdf

Commentary: https://zhuanlan.zhihu.com/p/461437620

Code: none

 

5. ViT (2020)

Paper: https://arxiv.org/pdf/2010.11929v1.pdf

Code: https://github.com/google-research/vision_transformer , https://github.com/lucidrains/vit-pytorch , https://github.com/likelyzhao/vit-pytorch

Commentary: https://blog.csdn.net/qq_44055705/article/details/113825863
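ViT's first step is to split the image into fixed-size patches and flatten each one into a token for the Transformer. A minimal sketch of that patchify step (toy 4x4 single-channel image with 2x2 patches; the real model then linearly projects each token and adds position embeddings):

```python
def patchify(image, p):
    """Split an HxW image (list of rows) into flattened p x p patch tokens,
    in raster order -- the token sequence a ViT feeds to its Transformer."""
    H, W = len(image), len(image[0])
    tokens = []
    for i in range(0, H, p):
        for j in range(0, W, p):
            tokens.append([image[i + di][j + dj]
                           for di in range(p) for dj in range(p)])
    return tokens

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy image
toks = patchify(img, 2)
print(len(toks), len(toks[0]))  # 4 tokens, each of length 4
```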

 

6. Self-supervising Fine-grained Region Similarities for Large-scale Image Localization

Self-supervised image-similarity learning with iterative training: the results of one epoch are fed into the next.

Paper: https://arxiv.org/pdf/2006.03926.pdf

Code: https://github.com/yxgeee/OpenIBL

Commentary: https://zhuanlan.zhihu.com/p/169596514

 

7. https://arxiv.org/abs/2201.005201

 

8. Swin Transformer

A ViT with a special windowed-attention scheme.

Paper: https://arxiv.org/pdf/2103.14030.pdf

Code: https://github.com/microsoft/Swin-Transformer

Commentary: https://zhuanlan.zhihu.com/p/367111046

 

9. Conformer

Bridges CNNs and Transformers.

Paper: https://arxiv.org/pdf/2105.03889.pdf

Code: https://github.com/pengzhiliang/Conformer

Commentary: https://blog.csdn.net/qq_15698613/article/details/119723545?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_title~default-1.pc_relevant_antiscanv2&spm=1001.2101.3001.4242.2&utm_relevant_index=4

Runs successfully; this one is worth using as a reference.

The similar Mobile-Former is still to be looked at.

 

10. EViT

Code: https://github.com/youweiliang/evit

Commentary: https://zhuanlan.zhihu.com/p/440294002

 

 

11. Transformer + CNN

Paper: https://arxiv.org/pdf/2106.03180.pdf

Code: https://github.com/yun-liu/TransCNN

 

12. PVT

A pyramid Transformer.

Paper: https://arxiv.org/pdf/2102.12122.pdf

Code: https://github.com/whai362/PVT

 

13. MViT

A multi-level (hierarchical) Transformer.

Paper: https://arxiv.org/pdf/2104.11227.pdf

Code: https://github.com/facebookresearch/SlowFast

 

14. DeiT

Knowledge distillation.

Paper: https://arxiv.org/pdf/2012.12877.pdf

Code: https://github.com/facebookresearch/deit/issues?q=is%3Aclosed

DeiT III:

Paper: https://arxiv.org/pdf/2204.07118.pdf

Results when plugged into VLAD (timm had issues):

====> Recall@1: 0.0503
====> Recall@5: 0.1284
====> Recall@10: 0.1976
====> Recall@20: 0.2946
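DeiT's central idea, distilling a teacher network into the ViT student, can be sketched with a soft-label distillation loss: cross-entropy (KL divergence) between temperature-softened teacher and student distributions. A plain-Python sketch under that simplification (the actual DeiT additionally uses a dedicated distillation token and a hard-label variant; the logits below are made up):

```python
import math

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T."""
    m = max(logits)
    exps = [math.exp((l - m) / T) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def soft_distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in Hinton-style distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * sum(pi * math.log(pi / qi)
                       for pi, qi in zip(p, q) if pi > 0)

loss = soft_distill_loss([2.0, 0.5, 0.1], [1.8, 0.6, 0.2])
print(round(loss, 4))  # small, since the student roughly matches the teacher
```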

 

15. PatchConvNet

Modifies the ViT architecture; from the same team as DeiT.

Paper: https://arxiv.org/pdf/2112.13692.pdf

Code: same as DeiT

 

16. DOLG-EfficientNet

S-Transformer VPR.

https://arxiv.org/pdf/2110.03786.pdf

Winner of the 2021 retrieval challenge; see https://jishuin.proginn.com/p/763bfbd6b138

 

 

III. Other Forms of VPR

1. NYU-VPR

IROS 2021

Paper: https://arxiv.org/pdf/2110.09004.pdf

This is a dataset.

2. HSD

ITSC 2021

Paper: https://arxiv.org/pdf/2109.14916.pdf

I could not quite follow this one.

 

3. LiDAR point-cloud approaches

Demonstrates the feasibility of VPR based on point-cloud intensity images.

Paper title: Visual Place Recognition using LiDAR Intensity Information

Reference: https://xuwuzhou.top/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB63/

 

IV. Papers from Other Fields Worth Referencing

1. https://proceedings.neurips.cc/paper/2021/file/27d52bcb3580724eb4cbe9f2718a9365-Paper.pdf

See More for Scene: Pairwise Consistency Learning for Scene Classification

Scene classification using focus regions.

2. Selective Convolutional Descriptor Aggregation for Fine-Grained Image Retrieval

Image retrieval based on key regions.

https://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Destruction_and_Construction_Learning_for_Fine-Grained_Image_Recognition_CVPR_2019_paper.pdf

 

posted @ 2022-03-23 21:04 小咸鱼在看博客