PyTorch Implementation of AlexNet

1. Original Paper

ImageNet Classification with Deep Convolutional Neural Networks

2. Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images of the ImageNet LSVRC-2010 contest into 1000 different classes. On the test data, the network achieved top-1 and top-5 error rates of 37.5% and 17.0%, considerably better than the previous state of the art. The network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers ending in a 1000-way softmax. To make training faster, we used non-saturating neurons (such as ReLU, whose output is not confined to a fixed range) and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully connected layers, we employed a recently developed regularization method called dropout, which proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and won with a top-5 test error rate of 15.3%, compared to 26.2% for the second-best entry.
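
The two ingredients the abstract highlights, the non-saturating ReLU activation and dropout regularization, are both available as standard PyTorch modules. A minimal standalone sketch (not taken from the paper's code) of how they behave:

import torch
import torch.nn as nn

relu = nn.ReLU()
drop = nn.Dropout(p=0.5)   # zeroes each element with probability 0.5 during training

x = torch.randn(4)
print(relu(x))             # negatives clamped to 0, positives pass through unbounded

drop.train()
print(drop(x))             # roughly half the elements zeroed, survivors scaled by 1/(1-p)
drop.eval()
print(drop(x))             # identity at inference time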

3. Network Architecture
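
As the abstract describes, the network stacks five convolutional layers (some followed by max pooling) on a 3x224x224 input and ends with three fully connected layers and a 1000-way softmax. The spatial sizes annotated in the implementation below follow the usual output-size formula floor((input + 2*padding - kernel)/stride) + 1; a small helper (illustrative only, not part of the implementation) reproduces those numbers:

def conv_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

print(conv_out(224, 11, stride=4, padding=2))   # 55  (conv1)
print(conv_out(55, 3, stride=2))                # 27  (pool1)
print(conv_out(27, 5, padding=2))               # 27  (conv2)
print(conv_out(13, 3, stride=2))                # 6   (pool3) -> 6*6*256 = 9216 features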

4. PyTorch Implementation

import torch.nn as nn
from torchsummary import summary

try:
    from torch.hub import load_state_dict_from_url
except ImportError:
    from torch.utils.model_zoo import load_url as load_state_dict_from_url

model_urls = {
    'alexnet': 'https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth',
}


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        # Five convolutional layers with paper-style channel counts (96/256/384/384/256);
        # the comments give the output spatial size for a 224x224 input.
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2),    # (224+2*2-11)/4+1=55
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (55-3)/2+1=27
            nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2),   # (27+2*2-5)/1+1=27
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (27-3)/2+1=13
            nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (13-3)/2+1=6
        )   # 6*6*256=9216 features feed the classifier

        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        # Three fully connected layers; dropout regularizes the first two.
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)   # flatten to (batch, 256*6*6)
        x = self.classifier(x)
        return x


def alexnet(pretrained=False, progress=True, **kwargs):
    r"""
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet.
            Note that the checkpoint in ``model_urls`` is torchvision's
            64/192/384/256/256-channel variant, so it will not load into this
            paper-style 96/256/384/384/256 configuration without adapting the layers.
        progress (bool): If True, displays a progress bar of the download to stderr.
    """
    model = AlexNet(**kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls['alexnet'],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


if __name__ == "__main__":
    model = alexnet()
    print(summary(model, (3, 224, 224)))
Output:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 96, 55, 55]          34,944
              ReLU-2           [-1, 96, 55, 55]               0
         MaxPool2d-3           [-1, 96, 27, 27]               0
            Conv2d-4          [-1, 256, 27, 27]         614,656
              ReLU-5          [-1, 256, 27, 27]               0
         MaxPool2d-6          [-1, 256, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         885,120
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 384, 13, 13]       1,327,488
             ReLU-10          [-1, 384, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         884,992
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 256, 6, 6]               0
          Dropout-15                 [-1, 9216]               0
           Linear-16                 [-1, 4096]      37,752,832
             ReLU-17                 [-1, 4096]               0
          Dropout-18                 [-1, 4096]               0
           Linear-19                 [-1, 4096]      16,781,312
             ReLU-20                 [-1, 4096]               0
           Linear-21                 [-1, 1000]       4,097,000
================================================================
Total params: 62,378,344
Trainable params: 62,378,344
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 11.16
Params size (MB): 237.95
Estimated Total Size (MB): 249.69
----------------------------------------------------------------
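
A quick way to sanity-check the model without torchsummary (a usage sketch; it assumes the code above is saved as alexnet.py):

import torch
from alexnet import alexnet   # hypothetical module name for the script above

model = alexnet(num_classes=1000)
print(sum(p.numel() for p in model.parameters()))   # 62378344, matching "Total params" above

model.eval()                                        # disable dropout for inference
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))     # one dummy 3x224x224 image
print(logits.shape)                                 # torch.Size([1, 1000])

The 237.95 MB parameter size reported above is simply 62,378,344 float32 parameters at 4 bytes each.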

References

https://github.com/pytorch/vision/tree/master/torchvision/models
