pytorch_SSD300
SSD network architecture diagram
The original paper uses VGG16 as the base network; other backbones, such as ResNet50, can be used instead.
- '300': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',512, 512, 512],
VGG network structure
- input image = (300, 300, 3)
- conv1_1 (3, 64) + ReLU feature map = (300, 300, 64)
- conv1_2 (64, 64) + ReLU feature map = (300, 300, 64)
- MaxPool_1 feature map = (150, 150, 64)
- conv2_1 (64, 128) + ReLU feature map = (150, 150, 128)
- conv2_2 (128, 128) + ReLU feature map = (150, 150, 128)
- MaxPool_2 feature map = (75, 75, 128)
- conv3_1 (128, 256) + ReLU feature map = (75, 75, 256)
- conv3_2 (256, 256) + ReLU feature map = (75, 75, 256)
- conv3_3 (256, 256) + ReLU feature map = (75, 75, 256)
- MaxPool_3 feature map = (38, 38, 256) ceil_mode=True
- conv4_1 (256, 512) + ReLU feature map = (38, 38, 512)
- conv4_2 (512, 512) + ReLU feature map = (38, 38, 512)
- conv4_3 (512, 512) + ReLU feature map = (38, 38, 512) ------>>>>>>> Detect
- MaxPool_4 feature map = (19, 19, 512)
- conv5_1 (512, 512) + ReLU feature map = (19, 19, 512)
- conv5_2 (512, 512) + ReLU feature map = (19, 19, 512)
- conv5_3 (512, 512) + ReLU feature map = (19, 19, 512)
- MaxPool_5 feature map = (19, 19, 512) size unchanged
- conv6 (512, 1024) + ReLU feature map = (19, 19, 1024)
- conv7 (1024, 1024) + ReLU feature map = (19, 19, 1024) ------>>>>>>> Detect
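The spatial sizes in the list above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the original implementation: it walks the '300' cfg and tracks only the feature-map size, assuming every 3x3 convolution uses padding 1 (so the size is preserved), 'M' is a 2x2 max-pool with stride 2 that rounds down, and 'C' is the same pool with ceil_mode=True. The function name vgg_feature_sizes is made up for illustration.

```python
import math

# '300' cfg from above: numbers are conv output channels,
# 'M' = 2x2 max-pool (rounds down), 'C' = 2x2 max-pool with ceil_mode=True.
CFG_300 = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C',
           512, 512, 512, 'M', 512, 512, 512]


def vgg_feature_sizes(cfg, size=300):
    """Return (kind, spatial_size, channels) after each cfg entry.

    Assumption: 3x3 convs with padding 1, so convs keep the spatial size.
    """
    trace = []
    channels = 3  # RGB input
    for v in cfg:
        if v == 'M':
            size = size // 2            # floor, the default pooling mode
            trace.append(('pool', size, channels))
        elif v == 'C':
            size = math.ceil(size / 2)  # ceil_mode=True
            trace.append(('pool_ceil', size, channels))
        else:
            channels = v
            trace.append(('conv', size, channels))
    return trace


trace = vgg_feature_sizes(CFG_300)
pool_sizes = [s for kind, s, _ in trace if kind.startswith('pool')]
print(pool_sizes)  # [150, 75, 38, 19]
```

The third pool is the only place where rounding matters: 75 / 2 = 37.5, which ceil_mode=True rounds up to 38 instead of down to 37.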
Extras network structure
- cfg --> '300': [256, 'S', 512, 128, 'S', 256, 128, 256, 128, 256]
- conv8_1 (1024, 256) feature map = (19, 19, 256)
- conv8_2 (256, 512) feature map = (10, 10, 512) ------>>>>>>> Detect
- conv9_1 (512, 128) feature map = (10, 10, 128)
- conv9_2 (128, 256) feature map = (5, 5, 256) ------>>>>>>> Detect
- conv10_1 (256, 128) feature map = (5, 5, 128)
- conv10_2 (128, 256) feature map = (3, 3, 256) ------>>>>>>> Detect
- conv11_1 (256, 128) feature map = (3, 3, 128)
- conv11_2 (128, 256) feature map = (1, 1, 256) ------>>>>>>> Detect
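The extras cfg can be expanded into concrete layers in the same way. The sketch below is an assumption about how the cfg is read, modeled on a convention common in PyTorch SSD code: kernels alternate between 1x1 and 3x3, and 'S' marks a 3x3 convolution with stride 2 and padding 1 whose output width is the next cfg entry. The helper names conv_out and extras_layers are hypothetical, and only shapes are computed, no real layers are built.

```python
EXTRAS_300 = [256, 'S', 512, 128, 'S', 256, 128, 256, 128, 256]


def conv_out(size, kernel, stride=1, padding=0):
    """Standard convolution output-size formula (rounds down)."""
    return (size + 2 * padding - kernel) // stride + 1


def extras_layers(cfg, in_channels=1024, size=19):
    """Return (in_ch, out_ch, kernel, stride, padding, out_size) per conv."""
    layers = []
    use_3x3 = False  # kernels alternate 1x1, 3x3, 1x1, ...
    for k, v in enumerate(cfg):
        if in_channels != 'S':
            kernel = 3 if use_3x3 else 1
            if v == 'S':
                # 'S' = stride-2 conv; its width is the next cfg entry
                out_ch, stride, padding = cfg[k + 1], 2, 1
            else:
                out_ch, stride, padding = v, 1, 0
            size = conv_out(size, kernel, stride, padding)
            layers.append((in_channels, out_ch, kernel, stride, padding, size))
            use_3x3 = not use_3x3
        in_channels = v
    return layers


layers = extras_layers(EXTRAS_300)
print([layer[-1] for layer in layers])  # [19, 10, 10, 5, 5, 3, 3, 1]
```

Under these assumptions the first two 3x3 convolutions halve the map with stride 2 (19 to 10, 10 to 5), while the last two shrink it only because they have no padding (5 to 3, 3 to 1).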
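Prior box computation
At this point the sizes of the two square boxes are fixed; the other two rectangular boxes are chosen as follows.
How the center points are chosen: the centers are computed as normalized coordinates, which are finally multiplied by the input image size (300) to get positions in the original image.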
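The box sizes and center points can be put together in a short generator. This is a sketch under stated assumptions, not the code from the post: the step, min-size, max-size, and aspect-ratio values below are the settings commonly used for SSD300 on VOC and are not given in the text above, and the function name prior_boxes is made up. For each cell it emits two squares (side s_k and sqrt(s_k * s_{k+1})) and a pair of rectangles per aspect ratio, with the center at the middle of the cell.

```python
from itertools import product
from math import sqrt

# Assumed SSD300/VOC settings (not stated in the text above).
IMAGE_SIZE = 300
FEATURE_MAPS = [38, 19, 10, 5, 3, 1]
STEPS = [8, 16, 32, 64, 100, 300]
MIN_SIZES = [30, 60, 111, 162, 213, 264]
MAX_SIZES = [60, 111, 162, 213, 264, 315]
ASPECT_RATIOS = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]


def prior_boxes():
    """Return a list of (cx, cy, w, h), normalized by the image size."""
    boxes = []
    for k, f in enumerate(FEATURE_MAPS):
        f_k = IMAGE_SIZE / STEPS[k]            # effective cells per side
        s_k = MIN_SIZES[k] / IMAGE_SIZE        # small square side
        s_k_prime = sqrt(s_k * MAX_SIZES[k] / IMAGE_SIZE)  # large square
        for i, j in product(range(f), repeat=2):
            cx = (j + 0.5) / f_k               # center of cell (i, j)
            cy = (i + 0.5) / f_k
            boxes.append((cx, cy, s_k, s_k))
            boxes.append((cx, cy, s_k_prime, s_k_prime))
            for ar in ASPECT_RATIOS[k]:
                r = sqrt(ar)
                boxes.append((cx, cy, s_k * r, s_k / r))   # wide box
                boxes.append((cx, cy, s_k / r, s_k * r))   # tall box
    return boxes


boxes = prior_boxes()
print(len(boxes))  # 8732
cx, cy, w, h = boxes[0]
print(cx * IMAGE_SIZE, w * IMAGE_SIZE)  # center and width in pixels
```

With these settings the layers with a single aspect ratio get 4 boxes per cell and the others get 6, which adds up to 8732 boxes. Multiplying a normalized value by 300 gives pixels: the first box is centered about 4 px from the corner and is about 30 px wide.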
Loss function: classification loss and localization loss
Smooth L1 loss
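- In the formula in the figure below, g and l both stand for offsets that have already been encoded. (g is the offset between a matched ground-truth box and its default box, loc_t in the code; l is the offset between the predicted box and the default box, loc_p in the code.)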
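To make the roles of loc_t and loc_p concrete, here is a minimal pure-Python sketch of the encoding and of Smooth L1. It is an illustration, not the post's code: boxes are assumed to be (cx, cy, w, h) tuples, the variances (0.1, 0.2) are an assumed scaling convention (pass (1.0, 1.0) for the plain offsets), and the function names are hypothetical.

```python
from math import log


def encode(gt, prior, variances=(0.1, 0.2)):
    """Encode a ground-truth box relative to its matched default box.

    Both boxes are (cx, cy, w, h). The variance scaling is an assumption.
    """
    g_cx = (gt[0] - prior[0]) / (variances[0] * prior[2])
    g_cy = (gt[1] - prior[1]) / (variances[0] * prior[3])
    g_w = log(gt[2] / prior[2]) / variances[1]
    g_h = log(gt[3] / prior[3]) / variances[1]
    return [g_cx, g_cy, g_w, g_h]


def smooth_l1(x):
    """0.5 * x^2 if |x| < 1, else |x| - 0.5."""
    ax = abs(x)
    return 0.5 * ax * ax if ax < 1.0 else ax - 0.5


def loc_loss(loc_p, loc_t):
    """Sum of Smooth L1 over the four offsets of one matched pair."""
    return sum(smooth_l1(p - t) for p, t in zip(loc_p, loc_t))


prior = (0.5, 0.5, 0.2, 0.2)
gt = (0.5, 0.5, 0.2, 0.2)
loc_t = encode(gt, prior)          # target offsets (g)
loc_p = [0.5, 0.0, 0.0, 2.0]       # made-up network prediction (l)
print(loc_t, loc_loss(loc_p, loc_t))  # [0.0, 0.0, 0.0, 0.0] 1.625
```

The loss is quadratic for small errors (0.5 contributes 0.125) and linear for large ones (2.0 contributes 1.5), which keeps outliers from dominating the gradient.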
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.01, 2, 100)
y = np.log10(x)  # base-10 logarithm
y2 = x-1
# y = np.log(x)  # natural logarithm instead
# Move the axes; ax is the whole figure area
ax = plt.gca()  # get the current axes
ax.spines['right'].set_color('none')  # hide the right spine
ax.spines['top'].set_color('none')  # hide the top spine
ax.xaxis.set_ticks_position('bottom')
# Place the x axis at y = 0 and the y axis at x = 0
ax.spines['bottom'].set_position(('data',0))
ax.spines['left'].set_position(('data',0))
plt.plot(x,y)
# plt.plot(x,y2)
plt.title('log10(x)')
plt.show()
Hard Negative Mining
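As a rough illustration of the idea: most default boxes match no object, so only the negatives with the highest classification loss are kept, at most three per positive as in the SSD paper. The sketch below works on plain Python lists for a single image; the function name and the sample numbers are made up.

```python
def hard_negative_mining(conf_loss, is_positive, neg_pos_ratio=3):
    """Return indices of the negatives kept for the classification loss.

    conf_loss:   per-default-box classification loss (list of floats)
    is_positive: per-default-box flag, True if matched to a ground truth
    """
    num_pos = sum(is_positive)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    num_neg = min(neg_pos_ratio * num_pos, len(negatives))
    # Hardest negatives = background boxes with the largest loss.
    negatives.sort(key=lambda i: conf_loss[i], reverse=True)
    return sorted(negatives[:num_neg])


conf_loss = [0.1, 2.3, 0.05, 1.7, 0.9, 0.2, 3.1, 0.4]
is_positive = [False, True, False, False, False, False, False, False]
kept = hard_negative_mining(conf_loss, is_positive)
print(kept)  # [3, 4, 6]
```

With one positive, three negatives are kept: boxes 6, 3, and 4 have the largest losses among the background boxes. The classification loss is then summed over the positives plus these kept negatives only.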
Training process
- Trained on the VOC dataset: 16,551 images, 10,000 iterations, about 20 epochs
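The relation between iterations and epochs is simple arithmetic. The batch size for this run is not stated, so the values below are assumptions used only to show the calculation; the function name epochs is hypothetical.

```python
def epochs(iterations, batch_size, num_images):
    """Number of passes over the dataset after the given iterations."""
    return iterations * batch_size / num_images


NUM_IMAGES = 16551
ITERATIONS = 10000
for batch_size in (32, 64):  # assumed values, not given for this run
    print(batch_size, round(epochs(ITERATIONS, batch_size, NUM_IMAGES), 1))
# 32 19.3
# 64 38.7
```

A batch size of 32 would give roughly 19.3 epochs, which is consistent with the "about 20" figure quoted above.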
Test accuracy: mAP on VOC
Accuracy improved somewhat after switching to batch_size=64.
posted on 2020-07-17 16:05 wangxiaobei2019