Detailed reproduction results for the VLAD part
Pure VLAD reproduction
1. NetVLAD
2022/5/11
Checkpoint provided by the authors
Test set:
====> Recall@1: 0.8190
====> Recall@5: 0.9123
====> Recall@10: 0.9366
====> Recall@20: 0.9575
A second test run surprisingly scored even higher:
====> Recall@1: 0.8527
====> Recall@5: 0.9468
====> Recall@10: 0.9700
====> Recall@20: 0.9838
There is a gap relative to the original paper, but the authors said in an issue reply that a difference of 3-4 points is normal.
This result should be correct: it matches the numbers reported in the TransVPR paper, so they must have used this same model.
Tested patch-net-vlad the same way (global descriptor only, no patch matching).
Pitts30k test set:
====> Recall@1: 0.7742
====> Recall@5: 0.9048
====> Recall@10: 0.9403
====> Recall@20: 0.9671
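For reference, the Recall@N metric used throughout these notes counts a query as correct at N if any of its top-N retrieved database images is a true positive. A minimal NumPy sketch (the names `preds` and `gt` are hypothetical, not from the actual evaluation code):

```python
import numpy as np

def recall_at_n(preds, gt, n_values=(1, 5, 10, 20)):
    """preds: (num_queries, k) ranked database indices per query.
    gt: per-query arrays of ground-truth positive database indices.
    A query scores at N if any of its top-N predictions is a positive."""
    correct = np.zeros(len(n_values))
    for qi, pred in enumerate(preds):
        for i, n in enumerate(n_values):
            if np.any(np.isin(pred[:n], gt[qi])):
                correct[i:] += 1  # a hit at N also counts for all larger N
                break
    return correct / len(preds)

# toy example: 2 queries, top-5 ranked predictions each
preds = np.array([[3, 7, 1, 0, 9],
                  [5, 2, 8, 4, 6]])
gt = [np.array([7]), np.array([4])]
print(recall_at_n(preds, gt, n_values=(1, 5)))  # → [0. 1.]
```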
2. Modified version
Replaced VGG16 with ViT; the modified version lives under packages/slam_loop (name unchanged).
Descriptor dimension kept at VLAD's 32768 throughout.
First, a completely untrained model with random weights:
====> Recall@1: 0.0712
====> Recall@5: 0.1353
====> Recall@10: 0.1910
====> Recall@20: 0.2846
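A minimal sketch of the swap described above: the ViT's patch tokens stand in for the VGG16 feature map that NetVLAD pooling expects. The `NetVLAD` module here is a generic re-implementation for illustration, not the actual code under packages/slam_loop:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetVLAD(nn.Module):
    """Minimal NetVLAD pooling over a set of local descriptors."""
    def __init__(self, num_clusters=64, dim=512):
        super().__init__()
        self.centroids = nn.Parameter(torch.randn(num_clusters, dim))
        self.assign = nn.Linear(dim, num_clusters)

    def forward(self, x):                                 # x: (B, N, D)
        soft = F.softmax(self.assign(x), dim=-1)          # (B, N, K)
        resid = x.unsqueeze(2) - self.centroids           # (B, N, K, D)
        vlad = (soft.unsqueeze(-1) * resid).sum(dim=1)    # (B, K, D)
        vlad = F.normalize(vlad, dim=-1)                  # intra-normalization
        return F.normalize(vlad.flatten(1), dim=-1)       # (B, K*D)

# ViT patch tokens (class token dropped) replace the VGG16 feature map;
# 64 clusters x 512-D tokens gives the 32768-D descriptor used here
tokens = torch.randn(2, 196, 512)      # batch of 2, 14x14 patches
desc = NetVLAD(num_clusters=64, dim=512)(tokens)
print(desc.shape)                      # → torch.Size([2, 32768])
```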
1) First tested with VLAD's original parameters, the ViT fitted to VLAD's dimensions.
The transformer here is untrained:
====> Recall@1: 0.0720
====> Recall@5: 0.1428
====> Recall@10: 0.2038
====> Recall@20: 0.2915
There seems to be no real difference.
2) Tried the pretrained ViT weights.
Both resize_flag switches need to be set to true, and the VLAD checkpoint was deleted. A resize_flag was also added to main; this should be cleaned up later.
Descriptor dimension 32768:
====> Recall@1: 0.4833
====> Recall@5: 0.7349
====> Recall@10: 0.8232
====> Recall@20: 0.8895
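The resize step implied by resize_flag can be sketched as follows. This is a guess at the flag's behavior: pretrained ViTs typically expect a fixed input size such as 224x224, and `maybe_resize` is a hypothetical helper, not the actual code:

```python
import torch
import torch.nn.functional as F

def maybe_resize(images, resize_flag=True, size=(224, 224)):
    """Resize a batch to the fixed input size a pretrained ViT expects."""
    if resize_flag:
        return F.interpolate(images, size=size, mode="bilinear",
                             align_corners=False)
    return images

batch = torch.randn(4, 3, 480, 640)        # e.g. Pittsburgh-sized images
print(maybe_resize(batch).shape)           # → torch.Size([4, 3, 224, 224])
```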
Descriptor dimension reduced to 256 with PCA:
====> Recall@1: 0.4371
====> Recall@5: 0.6812
====> Recall@10: 0.7873
====> Recall@20: 0.8647
Slightly lower; PCA has a small impact.
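The 32768 → 256 PCA reduction can be sketched with a plain SVD-based projection. Whether the original pipeline also whitens is not recorded here, so this is just one plausible variant with hypothetical function names:

```python
import numpy as np

def fit_pca(descs, out_dim=256):
    """Fit PCA on database descriptors; returns mean and projection."""
    mean = descs.mean(axis=0)
    # SVD of centered data: rows of vt are the principal directions
    _, _, vt = np.linalg.svd(descs - mean, full_matrices=False)
    return mean, vt[:out_dim]

def apply_pca(descs, mean, proj):
    reduced = (descs - mean) @ proj.T
    # re-normalize so L2/cosine retrieval still behaves as before
    return reduced / np.linalg.norm(reduced, axis=1, keepdims=True)

descs = np.random.randn(100, 1024).astype(np.float32)  # stand-in for 32768-D
mean, proj = fit_pca(descs, out_dim=16)
print(apply_pca(descs, mean, proj).shape)              # → (100, 16)
```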
However, recomputing with a different method gave:
====> Recall@1: 0.3888
====> Recall@5: 0.6389
====> Recall@10: 0.7447
====> Recall@20: 0.8374
That is a drop of several more points, which in principle should not happen.
Still, this is encouraging: the score is already high before any training, so resizing inputs and using the pretrained ViT is clearly the way to go.
Same version, on the Pitts250k val set:
====> Recall@1: 0.3496
====> Recall@5: 0.5664
====> Recall@10: 0.6640
====> Recall@20: 0.7589
epoch1:
Pitts30k val:
====> Recall@1: 0.2220
====> Recall@5: 0.4265
====> Recall@10: 0.5393
====> Recall@20: 0.6600
The same problem appeared again: worse than the random-weight baseline. Perhaps part of the network could be left untracked (frozen).
-------------------------------------------------------
Fully-trained model again, epoch 5, on Pitts30k:
====> Recall@1: 0.3076
====> Recall@5: 0.5465
====> Recall@10: 0.6529
====> Recall@20: 0.7605
It is indeed improving, so training does seem necessary. Trying the same model on Pitts250k:
There is improvement, about to surpass the random-weight baseline, so starting from a pretrained model is clearly necessary; it may also take many epochs before the effect really shows.
====> Recall@1: 0.2909
====> Recall@5: 0.5193
====> Recall@10: 0.6222
====> Recall@20: 0.7308
Next checkpoint (lost in an accident, not saved):
====> Recall@1: 0.3879
====> Recall@5: 0.6475
====> Recall@10: 0.7534
====> Recall@20: 0.8410
Already looking decent.
After 18 epochs:
====> Recall@1: 0.4327
====> Recall@5: 0.6985
====> Recall@10: 0.8051
====> Recall@20: 0.8779
Improvement feels too slow, so the lr was raised to 0.001. Training also showed that distill_loss cannot be scaled up 100x, or it diverges.
Best of the first 27 epochs, on Pitts30k:
====> Recall@1: 0.4512
====> Recall@5: 0.7188
====> Recall@10: 0.8197
====> Recall@20: 0.8993
-------------------------------------------------------
If the ViT part is switched to noguard (presumably no-grad / frozen):
====> Recall@1: 0.2233
====> Recall@5: 0.4298
====> Recall@10: 0.5505
====> Recall@20: 0.6675
Same version, on the pitts250k val set:
====> Recall@1: 0.2008
====> Recall@5: 0.3899
====> Recall@10: 0.5058
====> Recall@20: 0.6213
More than ten points below the random-weight baseline...
Yet the result is the same as training everything jointly, which feels wrong; test on the training set to check for overfitting.
Verified that indeed only the VLAD part was trained, yet for some reason performance still dropped below the random baseline; the same problem has shown up repeatedly in earlier runs.
Detailed training results are in the photo album.
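Training only the VLAD head, as verified in the run above, comes down to disabling gradient tracking on the backbone. A minimal PyTorch sketch with hypothetical stand-in modules (`backbone`, `vlad_head` are not the real module names):

```python
import torch
import torch.nn as nn

# stand-ins for the real ViT encoder and VLAD pooling head
backbone = nn.Linear(8, 8)
vlad_head = nn.Linear(8, 4)

for p in backbone.parameters():      # freeze the encoder part
    p.requires_grad = False

# give the optimizer only the trainable parameters
optimizer = torch.optim.SGD(
    [p for p in vlad_head.parameters() if p.requires_grad], lr=1e-6)

x = torch.randn(2, 8)
loss = vlad_head(backbone(x)).sum()
loss.backward()
print(backbone.weight.grad is None, vlad_head.weight.grad is not None)
# → True True
```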
-------------------------------------------------------
2022/5/25
May 25-16-37-12
SGD, lr 0.000001
Resumed from epoch 33, dataset pitts30k, encoder not supervised.
Decreasing.
May 25-17-01-08
SGD, lr 0.000001
Resumed from epoch 33, dataset pitts30k, encoder supervised.
Decreasing.
3. Network tuning
1) Adam + lr 0.0001; the learning rate is probably too large, the loss keeps oscillating.
epoch1:
====> Recall@1: 0.0754
====> Recall@5: 0.1884
====> Recall@10: 0.2748
====> Recall@20: 0.3895
Unreasonable.
2) Adam + lr 0.00001, distill_loss not scaled up:
====> Recall@1: 0.2186
====> Recall@5: 0.4644
====> Recall@10: 0.5878
====> Recall@20: 0.7140
Compared with the earlier epoch-1 results, this is fairly promising.
3) Adam + lr 0.000001, distill unchanged:
====> Recall@1: 0.3759
====> Recall@5: 0.6180
====> Recall@10: 0.7203
====> Recall@20: 0.8174
But the distill loss inexplicably goes negative; this needs further observation.
Continuing training from here:
epoch2
====> Recall@1: 0.3586
====> Recall@5: 0.6049
====> Recall@10: 0.7221
====> Recall@20: 0.8181
epoch3
====> Recall@1: 0.3571
====> Recall@5: 0.6175
====> Recall@10: 0.7307
====> Recall@20: 0.8272
epoch4
====> Recall@1: 0.3601
====> Recall@5: 0.6208
====> Recall@10: 0.7353
====> Recall@20: 0.8299
epoch5
====> Recall@1: 0.3697
====> Recall@5: 0.6346
====> Recall@10: 0.7434
====> Recall@20: 0.8344
epoch6:
====> Recall@1: 0.3623
====> Recall@5: 0.6205
====> Recall@10: 0.7295
====> Recall@20: 0.8294
epoch7
====> Recall@1: 0.3625
====> Recall@5: 0.6297
====> Recall@10: 0.7391
====> Recall@20: 0.8343
epoch8
====> Recall@1: 0.3788
====> Recall@5: 0.6530
====> Recall@10: 0.7610
====> Recall@20: 0.8503
4) Adam + lr 0.0000001, distill unchanged:
====> Recall@1: 0.3845
====> Recall@5: 0.6241
====> Recall@10: 0.7309
====> Recall@20: 0.8223
TODO LIST:
1. Consider training with both the original method and the distillation method.
2. PCA and descriptor-dimension issues.
3. L2-norm issue: the encoder was originally followed by an L2NORM layer, which I temporarily removed.
4. Regarding distill_loss: my value differs from base_loss by about 3 orders of magnitude; see https://github.com/facebookresearch/deit/issues/134. Perhaps try hard_loss.
More on the soft vs. hard loss question: https://github.com/facebookresearch/deit/issues/56
5. Issues that may come up during training:
loss NaN: forgot to record the issue link
weight decay: https://github.com/facebookresearch/deit/issues/68
possible kd-loss modifications: https://github.com/facebookresearch/deit/issues/61
6. Improve the console loss output: add timestamps and the size of each epoch.
7\
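The soft vs. hard distillation options raised in item 4 can be sketched as follows. This is a generic knowledge-distillation formulation (temperatures and weighting are assumptions, not necessarily what the DeiT code does):

```python
import torch
import torch.nn.functional as F

def soft_distill(student_logits, teacher_logits, T=3.0):
    """KL divergence between temperature-softened distributions, scaled by T^2."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean") * (T * T)

def hard_distill(student_logits, teacher_logits):
    """Cross-entropy against the teacher's argmax pseudo-labels."""
    return F.cross_entropy(student_logits, teacher_logits.argmax(dim=-1))

s = torch.randn(4, 10)
t = torch.randn(4, 10)
print(soft_distill(s, t).item() >= 0, hard_distill(s, t).item() >= 0)
# → True True
```

Note that KL divergence is non-negative by definition, so the negative distill values observed in the runs above would point to a sign or reduction bug rather than a property of the loss.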
Change details:
1. Regarding 250k
Every change is prefixed with ##change:250k
3. Experimental analysis:
Recent SOTA:
Images appear to be resized.
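The match_time / extract_time figures below could be collected with a simple wall-clock timer like the hypothetical helper here (units here are milliseconds; the notes do not record units consistently, and for GPU code a `torch.cuda.synchronize()` would be needed before each clock read):

```python
import time
import contextlib

@contextlib.contextmanager
def timer(times):
    """Append the elapsed wall time of the with-block, in ms, to `times`."""
    # for GPU code, call torch.cuda.synchronize() here and after the yield
    start = time.perf_counter()
    yield
    times.append((time.perf_counter() - start) * 1000.0)

extract_times = []
with timer(extract_times):
    sum(i * i for i in range(100000))   # stand-in for feature extraction
print(len(extract_times) == 1 and extract_times[0] > 0)  # → True
```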
1. PITTS30K - VAL
1)VLAD
32768
====> Recall@1: 0.8190
====> Recall@5: 0.9123
====> Recall@10: 0.9366
====> Recall@20: 0.9575
match_time:8147.40478515625
2)PATCH-NETVLAD
512
====> Recall@1: 0.7742
====> Recall@5: 0.9048
====> Recall@10: 0.9403
====> Recall@20: 0.9671
match_time
160.3480987548828
3)Our
256
calculate_time
15.406815528869629
16.089759826660156
15.35318374633789
15.888575553894043
15.57817554473877
15.533535957336426
15.269087791442871
15.363360404968262
15.466752052307129
15.417375564575195
15.604928016662598
match_time
136.85328674316406
4)DELG
512
====> Recall@1: 0.5798
====> Recall@5: 0.7693
====> Recall@10: 0.8268
====> Recall@20: 0.8633
match_time: 835.2278442382812
extract time
30.439456939697266
30.195775985717773
30.43881607055664
30.27030372619629
30.372512817382812
30.267072677612305
30.44816017150879
29.61907196044922
30.250240325927734
30.172447204589844
30.360992431640625
30.19660758972168
30.330944061279297
30.242048263549805
30.271839141845703
5)SFRS
Images appear to be resized; this was changed later, so these results are not used.
Without PCA:
match_time:8896.9111328125
====> Recall@1: 0.4026
====> Recall@5: 0.6126
====> Recall@10: 0.7014
====> Recall@20: 0.7847
Something is wrong with the data; many layers are missing.
extract_time (excluding PCA):
60.04828643798828
59.72060775756836
60.447200775146484
60.522335052490234
60.50899124145508
60.42073440551758
60.49766540527344
60.42166519165039
60.51308822631836
60.844417572021484
60.53388977050781
6)Multires-vlad
match_time:
1330.575927734375
====> Recall@1: 0.8938
====> Recall@5: 0.9660
====> Recall@10: 0.9817
====> Recall@20: 0.9892
extract_time:
111.94406127929688
112.33484649658203
111.87155151367188
112.25910186767578
112.72732543945312
112.26844787597656
112.12531280517578
112.54713439941406
112.44147491455078
112.49273681640625
112.44268798828125
112.41622161865234
This paper's code originally includes its own timing function, though the method differs.
7) DBoW
extract_time (in seconds):
0.013574361801147461
0.012018680572509766
0.01383066177368164
0.020437240600585938
0.021411657333374023
0.05488920211791992
0.021865367889404297
0.02072429656982422
0.01495504379272461
0.016642093658447266
0.024945497512817383
0.022006988525390625
0.012542963027954102
0.008678913116455078
0.008888006210327148
0.01682734489440918
0.017751693725585938
0.019913196563720703
0.0222165584564209
0.021813631057739258
0.07405996322631836
0.02047872543334961
0.026796340942382812
0.021975278854370117
0.01502370834350586
0.008692026138305664
0.011438131332397461
0.019820213317871094
0.02095174789428711
0.020492076873779297
0.02255845069885254
0.01819467544555664
0.020274877548217773
0.022342920303344727
0.09872317314147949
0.02173924446105957
0.00989842414855957
0.00745081901550293
0.00677180290222168
0.011330842971801758
0.01789116859436035
0.02043914794921875
0.02269911766052246
0.021180152893066406
0.02588033676147461
0.027025938034057617
0.024840831756591797
0.020226240158081055
0.01088571548461914
0.010780811309814453
0.014653444290161133
0.010091304779052734
0.11080074310302734
0.022695541381835938
0.023570537567138672
0.023064851760864258
0.022131681442260742
0.023566246032714844
0.0262906551361084
0.015903711318969727
0.008333444595336914
0.007961511611938477
0.010892629623413086
0.010495662689208984
0.011915922164916992
0.020488500595092773
8) GeM
without whitening PCA:
====> Recall@1: 0.7373
====> Recall@5: 0.8729
====> Recall@10: 0.9131
====> Recall@20: 0.9460
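For reference, GeM (generalized-mean) pooling collapses a feature map into a single vector via a learnable p-norm mean; a minimal sketch, not the tested implementation:

```python
import torch
import torch.nn as nn

class GeM(nn.Module):
    """Generalized-mean pooling: average of x^p over space, then the p-th root."""
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))  # p is learned with the network
        self.eps = eps

    def forward(self, x):                       # x: (B, C, H, W)
        x = x.clamp(min=self.eps).pow(self.p)
        return x.mean(dim=(-2, -1)).pow(1.0 / self.p)   # (B, C)

fmap = torch.randn(2, 512, 7, 7)
print(GeM()(fmap).shape)                        # → torch.Size([2, 512])
```

With p = 1 this reduces to average pooling and as p grows it approaches max pooling, which is why p is worth learning.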
2. PITTS250K - VAL
1)VLAD
calculate_time
61.26380920410156
61.38684844970703
61.17171096801758
61.360958099365234
61.534080505371094
61.08665466308594
61.02115249633789
61.401729583740234
61.36284637451172
61.369598388671875
61.3520622253418
61.36713409423828
61.53398513793945
2)PATCH-NETVLAD
====> Recall@1: 0.7458
====> Recall@5: 0.8705
====> Recall@10: 0.9017
====> Recall@20: 0.9271
match_time
1811.72021484375
calculate_time
15.583264350891113
14.759072303771973
15.042176246643066
15.01699161529541
14.90236759185791
15.883071899414062
15.401472091674805
15.909536361694336
15.60086441040039
14.560511589050293
15.225855827331543
14.903519630432129
15.892895698547363
15.987263679504395
3) Our
match_time:
903.8621826171875 ms
4)DELG
====> Recall@1: 0.5300
====> Recall@5: 0.6918
====> Recall@10: 0.7412
====> Recall@20: 0.7827
match_time:6062.5439453125