Detailed Reproduction Results for the VLAD Component

Pure VLAD reproduction

1. NetVLAD

 

2022/5/11

Checkpoint provided by the authors

Test set:

====> Recall@1: 0.8190
====> Recall@5: 0.9123
====> Recall@10: 0.9366
====> Recall@20: 0.9575
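
For reference, every Recall@N figure in this log follows the standard place-recognition definition: a query counts as a hit at N if any of its top-N retrieved database images is a true positive. A minimal sketch of the computation, assuming L2-normalized descriptor matrices and a hypothetical ground_truth lookup:

```python
import numpy as np

def recall_at_n(db_desc, q_desc, ground_truth, n_values=(1, 5, 10, 20)):
    """Fraction of queries whose top-N retrieval contains a true positive."""
    # Cosine similarity equals the dot product for L2-normalized descriptors.
    sims = q_desc @ db_desc.T                       # (num_queries, num_db)
    top = np.argsort(-sims, axis=1)[:, :max(n_values)]
    recalls = {}
    for n in n_values:
        hits = sum(
            len(set(top[q, :n]) & set(ground_truth[q])) > 0
            for q in range(len(q_desc))
        )
        recalls[n] = hits / len(q_desc)
    return recalls
```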

 

A second test run surprisingly scored even higher:

====> Recall@1: 0.8527
====> Recall@5: 0.9468
====> Recall@10: 0.9700
====> Recall@20: 0.9838

There is a gap from the original paper's numbers, but the author said in a reply that a 3-4 point difference is normal.

This result should be correct; it matches the numbers reported in the TransVPR paper, so they must have used this same model.

 

Tested Patch-NetVLAD with the same method (global descriptor only, no patches)

Pitts30k test set:

====> Recall@1: 0.7742
====> Recall@5: 0.9048
====> Recall@10: 0.9403
====> Recall@20: 0.9671

 

2. Modified version

Replaced VGG16 with ViT; the modified version lives under packages/slam_loop (the name was not changed).

Descriptor dimension follows VLAD's 32768 throughout.
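
The 32768 figure is NetVLAD's 64 clusters times 512-d local features. How the ViT tokens are fed into the VLAD layer is not spelled out here; below is a plausible sketch, assuming the 14x14 patch tokens of a 224x224 ViT are reshaped into a feature map and projected from 768 to 512 channels (all class and attribute names are hypothetical):

```python
import torch
import torch.nn as nn

class ViTVLADBackbone(nn.Module):
    """Reshape ViT patch tokens into a CNN-style feature map for NetVLAD."""
    def __init__(self, vit, token_dim=768, out_dim=512, grid=14):
        super().__init__()
        self.vit = vit                    # hypothetical ViT returning token sequences
        self.proj = nn.Conv2d(token_dim, out_dim, kernel_size=1)
        self.grid = grid

    def forward(self, x):
        tokens = self.vit.forward_features(x)   # (B, 1 + grid*grid, token_dim)
        tokens = tokens[:, 1:]                   # drop the CLS token
        b, n, c = tokens.shape
        assert n == self.grid * self.grid
        fmap = tokens.transpose(1, 2).reshape(b, c, self.grid, self.grid)
        return self.proj(fmap)                   # (B, 512, 14, 14)

# A NetVLAD head with 64 clusters over these 512-d locals gives 64 * 512 = 32768 dims.
```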

First, completely untrained random weights:

====> Recall@1: 0.0712
====> Recall@5: 0.1353
====> Recall@10: 0.1910
====> Recall@20: 0.2846

 

1) First, test with VLAD's original parameters, the ViT adapted to VLAD's dimensions

At this point the Transformer is untrained:

====> Recall@1: 0.0720
====> Recall@5: 0.1428
====> Recall@10: 0.2038
====> Recall@20: 0.2915

There seems to be no real difference.

2) Try the pretrained ViT version

Both resize_flag switches need to be set to true, and the VLAD checkpoint was deleted. main also gained a resize_flag; this needs cleaning up later.
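
What the resize_flag presumably gates: pretrained ViT weights expect a fixed 224x224 input, so dataset images have to be resized before the forward pass. A minimal sketch of that preprocessing, assuming torchvision transforms and standard ImageNet normalization:

```python
from torchvision import transforms

# Resize to the ViT's pretrained input resolution; ImageNet normalization
# is the usual choice for ImageNet-pretrained backbones.
vit_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```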

Descriptor dimension 32768:

====> Recall@1: 0.4833
====> Recall@5: 0.7349
====> Recall@10: 0.8232
====> Recall@20: 0.8895

Descriptor dimension reduced to 256 via PCA:

====> Recall@1: 0.4371
====> Recall@5: 0.6812
====> Recall@10: 0.7873
====> Recall@20: 0.8647

Slightly lower; PCA has a small impact.
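
For reference, a minimal sketch of the PCA step, assuming the projection is fit on the database descriptors and applied to both sides (whether the original pipeline whitens is an assumption):

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_reduce(db_desc, q_desc, dim=256):
    """Fit PCA on the database descriptors, project both sets, re-normalize."""
    pca = PCA(n_components=dim, whiten=True)  # whitening is an assumption here
    db_r = pca.fit_transform(db_desc)
    q_r = pca.transform(q_desc)
    # Re-normalize so that dot products remain cosine similarities.
    db_r /= np.linalg.norm(db_r, axis=1, keepdims=True)
    q_r /= np.linalg.norm(q_r, axis=1, keepdims=True)
    return db_r, q_r
```

If the PCA is fit on a different descriptor set or split, the projected space changes and Recall@N shifts with it, which is one plausible source of the discrepancy noted next.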

However, I recomputed with a different method, and the result became:

====> Recall@1: 0.3888
====> Recall@5: 0.6389
====> Recall@10: 0.7447
====> Recall@20: 0.8374

It dropped by quite a few points again, which in principle should not happen.

This is great: it is already this high before training even starts, so resizing inputs and using the pretrained ViT is clearly the way to go.

 

Same version, evaluated on the Pitts250k val set:

====> Recall@1: 0.3496
====> Recall@5: 0.5664
====> Recall@10: 0.6640
====> Recall@20: 0.7589

 

Epoch 1:

Pitts30k val:

====> Recall@1: 0.2220
====> Recall@5: 0.4265
====> Recall@10: 0.5393
====> Recall@20: 0.6600

The same problem shows up again: worse than the untrained random weights. Perhaps some parts of the network could be kept out of training.

-------------------------------------------------------

Still the fully-trained model, at epoch 5, on Pitts30k:

====> Recall@1: 0.3076
====> Recall@5: 0.5465
====> Recall@10: 0.6529
====> Recall@20: 0.7605

It is genuinely improving, so training does seem worthwhile. Trying the same model on Pitts250k:

There is improvement, and it is about to overtake the random-parameter baseline; so starting from a pretrained model is well worth it, and it may take many epochs before training really pays off:

====> Recall@1: 0.2909
====> Recall@5: 0.5193
====> Recall@10: 0.6222
====> Recall@20: 0.7308

 

Next checkpoint (lost to an accident, not saved):

====> Recall@1: 0.3879
====> Recall@5: 0.6475
====> Recall@10: 0.7534
====> Recall@20: 0.8410
Already looking quite good.

After 18 epochs:

====> Recall@1: 0.4327
====> Recall@5: 0.6985
====> Recall@10: 0.8051
====> Recall@20: 0.8779

 

Progress feels too slow, so lr was raised to 0.001. Also confirmed in practice: distill_loss cannot be scaled up 100x, or training diverges.
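
The 100x scaling mentioned here is presumably a weight on the distillation term in a combined objective; a minimal sketch under that assumption (names hypothetical):

```python
def combined_loss(base_loss, distill_loss, distill_weight=1.0):
    """Weighted sum of the retrieval loss and the distillation term.

    Empirically (see note above), distill_weight = 100 made training
    diverge, so the weight stays at 1.0.
    """
    return base_loss + distill_weight * distill_loss
```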

 

Best checkpoint within the first 27 epochs, on Pitts30k:

====> Recall@1: 0.4512
====> Recall@5: 0.7188
====> Recall@10: 0.8197
====> Recall@20: 0.8993

-------------------------------------------------------

 

If the ViT part is put under no_grad (frozen):

====> Recall@1: 0.2233
====> Recall@5: 0.4298
====> Recall@10: 0.5505
====> Recall@20: 0.6675

Same version, switched to the Pitts250k val set:

====> Recall@1: 0.2008
====> Recall@5: 0.3899
====> Recall@10: 0.5058
====> Recall@20: 0.6213

More than ten points below the random-weight baseline.

 

Yet the result is the same as when everything is trained jointly, which feels wrong; test on the training set to check for overfitting.

Verified: only the VLAD part was actually being trained, yet for some reason performance still drops below the random baseline. The same problem has appeared repeatedly in earlier training runs.
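
For clarity, a minimal sketch of what training "only the VLAD part" with a frozen ViT amounts to, assuming the model exposes vit and net_vlad submodules (hypothetical names):

```python
import torch

def freeze_vit(model):
    """Train only the NetVLAD head; the ViT encoder stays fixed."""
    for p in model.vit.parameters():
        p.requires_grad_(False)
    model.vit.eval()  # also fixes dropout / norm-layer behavior

def make_optimizer(model, lr=1e-6):
    # Only parameters that still require grad (the VLAD head) get updated.
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=lr)
```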

Detailed training results are in the photo album.

 

-------------------------------------------------------

2022/5/25

May 25-16-37-12

SGD, lr = 0.000001

Resumed from epoch 33, dataset pitts30k, encoder not supervised.

Loss is decreasing.

 

May 25-17-01-08

SGD, lr = 0.000001

Resumed from epoch 33, dataset pitts30k, encoder supervised.

Loss is decreasing.

 

 

3. Network tuning

1. Adam + lr 0.0001: the learning rate is probably too large; the loss keeps oscillating.

Epoch 1:

====> Recall@1: 0.0754
====> Recall@5: 0.1884
====> Recall@10: 0.2748
====> Recall@20: 0.3895

Unreasonably low.
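
Runs 1 through 4 in this section differ only in the Adam learning rate; a minimal sketch of that sweep, assuming everything else in the training setup stays fixed (names hypothetical):

```python
import torch

def make_adam(model, lr):
    """One optimizer per run; only the learning rate changes between runs 1-4."""
    return torch.optim.Adam(model.parameters(), lr=lr)

# Learning rates swept in this section; 1e-4 made the loss oscillate.
SWEEP_LRS = (1e-4, 1e-5, 1e-6, 1e-7)
```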

2. Adam + lr 0.00001, distill_loss not scaled up:

====> Recall@1: 0.2186
====> Recall@5: 0.4644
====> Recall@10: 0.5878
====> Recall@20: 0.7140

Compared with the earlier epoch-1 results, this is fairly promising.

3. Adam + lr 0.000001, distill unchanged:

====> Recall@1: 0.3759
====> Recall@5: 0.6180
====> Recall@10: 0.7203
====> Recall@20: 0.8174

However, the distill loss inexplicably goes negative; needs further observation.

Continuing training from this point:

Epoch 2:

====> Recall@1: 0.3586
====> Recall@5: 0.6049
====> Recall@10: 0.7221
====> Recall@20: 0.8181

 

Epoch 3:

====> Recall@1: 0.3571
====> Recall@5: 0.6175
====> Recall@10: 0.7307
====> Recall@20: 0.8272

 

Epoch 4:

====> Recall@1: 0.3601
====> Recall@5: 0.6208
====> Recall@10: 0.7353
====> Recall@20: 0.8299

 

Epoch 5:

====> Recall@1: 0.3697
====> Recall@5: 0.6346
====> Recall@10: 0.7434
====> Recall@20: 0.8344

 

Epoch 6:

====> Recall@1: 0.3623
====> Recall@5: 0.6205
====> Recall@10: 0.7295
====> Recall@20: 0.8294

 

Epoch 7:

====> Recall@1: 0.3625
====> Recall@5: 0.6297
====> Recall@10: 0.7391
====> Recall@20: 0.8343

 

Epoch 8:

====> Recall@1: 0.3788
====> Recall@5: 0.6530
====> Recall@10: 0.7610
====> Recall@20: 0.8503

 

4. Adam + lr 0.0000001, distill unchanged:

====> Recall@1: 0.3845
====> Recall@5: 0.6241
====> Recall@10: 0.7309
====> Recall@20: 0.8223

 

 

TODO LIST:

1. Consider training with both the original method and the distillation method.

2. PCA question; descriptor dimensionality question.

3. L2Norm question: the encoder was originally followed by an L2Norm layer, which I removed for now.

4. About distill_loss: my values are 3 orders of magnitude off from base_loss; see https://github.com/facebookresearch/deit/issues/134. Perhaps try hard_loss.

More on the soft vs. hard loss question: https://github.com/facebookresearch/deit/issues/56
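
Both DeiT issues above concern soft vs. hard distillation. For context, a sketch of DeiT-style soft distillation (KL divergence between temperature-scaled student and teacher outputs); whether the distillation here acts on logits or features is not stated, so this assumes logits. Note that KL divergence is mathematically non-negative, so the negative distill values seen earlier usually point at an implementation detail such as a missing log_softmax:

```python
import torch.nn.functional as F

def soft_distill_loss(student_logits, teacher_logits, T=3.0):
    """DeiT-style soft distillation: KL(student || teacher) at temperature T.

    Both inputs must be converted to (log-)probabilities; otherwise the
    value can go negative, which a true KL divergence never should.
    """
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```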

5. Issues that may come up during training:

Loss going NaN: forgot to record the issue link.

Weight decay question: https://github.com/facebookresearch/deit/issues/68

Where kdloss could potentially be modified: https://github.com/facebookresearch/deit/issues/61

 

6. Improve the console loss output: add timestamps and the size of each epoch.
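
A minimal sketch of the console-output change proposed in item 6, assuming Python's logging module:

```python
import logging

# Timestamped loss logging; batches-per-epoch printed alongside.
logging.basicConfig(format="%(asctime)s %(message)s", level=logging.INFO)

def log_loss(epoch, batch_idx, batches_per_epoch, loss_value):
    logging.info(
        "epoch %d [%d/%d] loss=%.4f",
        epoch, batch_idx, batches_per_epoch, loss_value,
    )
```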

 

7.

 

Change details:

1. Regarding 250k

Every change is marked with the comment ##change:250k.

3. Experiment analysis:

Recent SOTA:

Images appear to be resized.

1. PITTS30K VAL

1) VLAD

Descriptor dim: 32768

====> Recall@1: 0.8190
====> Recall@5: 0.9123
====> Recall@10: 0.9366
====> Recall@20: 0.9575

 

match_time: 8147.40478515625
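
The match_time / extract_time figures in this section are wall-clock measurements (milliseconds where a unit is marked, and presumably elsewhere too). A minimal sketch of one way to take them, assuming GPU work needs an explicit sync before reading the clock:

```python
import time
import torch

def timed_ms(fn, *args):
    """Wall-clock time of fn(*args) in milliseconds; syncs CUDA if present."""
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    out = fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return out, (time.perf_counter() - t0) * 1000.0
```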

2) PATCH-NETVLAD

Descriptor dim: 512

====> Recall@1: 0.7742
====> Recall@5: 0.9048
====> Recall@10: 0.9403
====> Recall@20: 0.9671

match_time: 160.3480987548828

3) Ours

Descriptor dim: 256

calculate_time

15.406815528869629
16.089759826660156
15.35318374633789
15.888575553894043
15.57817554473877
15.533535957336426
15.269087791442871
15.363360404968262
15.466752052307129
15.417375564575195
15.604928016662598

match_time: 136.85328674316406


 

4) DELG

Descriptor dim: 512

====> Recall@1: 0.5798
====> Recall@5: 0.7693
====> Recall@10: 0.8268
====> Recall@20: 0.8633

match_time: 835.2278442382812

extract_time:

30.439456939697266
30.195775985717773
30.43881607055664
30.27030372619629
30.372512817382812
30.267072677612305
30.44816017150879
29.61907196044922
30.250240325927734
30.172447204589844
30.360992431640625
30.19660758972168
30.330944061279297
30.242048263549805
30.271839141845703

 

5) SFRS

Images appear to be resized; this was fixed later, so these results are not used.

Without PCA:

match_time: 8896.9111328125

====> Recall@1: 0.4026
====> Recall@5: 0.6126
====> Recall@10: 0.7014
====> Recall@20: 0.7847

There is a problem with this data: a lot of layers are missing.

extract_time, PCA excluded:

60.04828643798828
59.72060775756836
60.447200775146484
60.522335052490234
60.50899124145508
60.42073440551758
60.49766540527344
60.42166519165039
60.51308822631836
60.844417572021484
60.53388977050781

 

6) Multires-VLAD

match_time: 1330.575927734375

====> Recall@1: 0.8938
====> Recall@5: 0.9660
====> Recall@10: 0.9817
====> Recall@20: 0.9892

extract_time:

111.94406127929688
112.33484649658203
111.87155151367188
112.25910186767578
112.72732543945312
112.26844787597656
112.12531280517578
112.54713439941406
112.44147491455078
112.49273681640625
112.44268798828125
112.41622161865234

This paper's code comes with its own timing function, but the methodology differs.

 

7) DBoW method

extract_time (in seconds):

0.013574361801147461
0.012018680572509766
0.01383066177368164
0.020437240600585938
0.021411657333374023
0.05488920211791992
0.021865367889404297
0.02072429656982422
0.01495504379272461
0.016642093658447266
0.024945497512817383
0.022006988525390625
0.012542963027954102
0.008678913116455078
0.008888006210327148
0.01682734489440918
0.017751693725585938
0.019913196563720703
0.0222165584564209
0.021813631057739258
0.07405996322631836
0.02047872543334961
0.026796340942382812
0.021975278854370117
0.01502370834350586
0.008692026138305664
0.011438131332397461
0.019820213317871094
0.02095174789428711
0.020492076873779297
0.02255845069885254
0.01819467544555664
0.020274877548217773
0.022342920303344727
0.09872317314147949
0.02173924446105957
0.00989842414855957
0.00745081901550293
0.00677180290222168
0.011330842971801758
0.01789116859436035
0.02043914794921875
0.02269911766052246
0.021180152893066406
0.02588033676147461
0.027025938034057617
0.024840831756591797
0.020226240158081055
0.01088571548461914
0.010780811309814453
0.014653444290161133
0.010091304779052734
0.11080074310302734
0.022695541381835938
0.023570537567138672
0.023064851760864258
0.022131681442260742
0.023566246032714844
0.0262906551361084
0.015903711318969727
0.008333444595336914
0.007961511611938477
0.010892629623413086
0.010495662689208984
0.011915922164916992
0.020488500595092773

 

8) GeM

Without whitening PCA:

====> Recall@1: 0.7373
====> Recall@5: 0.8729
====> Recall@10: 0.9131
====> Recall@20: 0.9460
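
For reference, GeM (generalized-mean) pooling raises each activation to a power p, averages spatially, and takes the p-th root; p = 3 is the usual initialization, and p is learnable in the original paper. A minimal PyTorch sketch (the code actually benchmarked here may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeM(nn.Module):
    """Generalized-mean pooling: (mean(x^p))^(1/p) over spatial dims."""
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))  # learnable exponent
        self.eps = eps

    def forward(self, x):                        # x: (B, C, H, W)
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.adaptive_avg_pool2d(x, 1)          # spatial mean
        return x.pow(1.0 / self.p).flatten(1)    # (B, C)
```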


2. PITTS250K VAL

1) VLAD

 

calculate_time

61.26380920410156
61.38684844970703
61.17171096801758
61.360958099365234
61.534080505371094
61.08665466308594
61.02115249633789
61.401729583740234
61.36284637451172
61.369598388671875
61.3520622253418
61.36713409423828
61.53398513793945

 

2) PATCH-NETVLAD

====> Recall@1: 0.7458
====> Recall@5: 0.8705
====> Recall@10: 0.9017
====> Recall@20: 0.9271 

 

match_time: 1811.72021484375

 

calculate_time

15.583264350891113
14.759072303771973
15.042176246643066
15.01699161529541
14.90236759185791
15.883071899414062
15.401472091674805
15.909536361694336
15.60086441040039
14.560511589050293
15.225855827331543
14.903519630432129
15.892895698547363
15.987263679504395

 

3) Ours

match_time: 903.8621826171875 ms

 

4) DELG

====> Recall@1: 0.5300
====> Recall@5: 0.6918
====> Recall@10: 0.7412
====> Recall@20: 0.7827

match_time: 6062.5439453125

 
