A PyTorch Implementation of AlexNet
1. Original Paper
ImageNet Classification with Deep Convolutional Neural Networks
2. Abstract
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images of the ImageNet LSVRC-2010 contest into 1000 different classes. On the test data, the network achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers ending in a 1000-way softmax. To make training faster, we used non-saturating neurons (such as ReLU, whose activation output is not confined to a fixed range) and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers, we employed a recently developed regularization method called dropout, which proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and won with a top-5 test error of 15.3%, compared to 26.2% for the second-best entry.
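The top-1 and top-5 errors quoted above count a test image as wrong when its true label is missing from the model's single best (respectively five best) predictions. Below is a minimal sketch of how such a metric can be computed from raw logits; the function name and the random toy data are illustrative, not from the paper:

import torch

def topk_error(logits, targets, k=5):
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    topk = logits.topk(k, dim=1).indices              # (N, k) predicted class ids
    hit = (topk == targets.unsqueeze(1)).any(dim=1)   # (N,) True if the label appears in the top k
    return 1.0 - hit.float().mean().item()

# toy check on random data (4 samples, 1000 classes)
logits = torch.randn(4, 1000)
targets = torch.randint(0, 1000, (4,))
print(topk_error(logits, targets, k=1), topk_error(logits, targets, k=5))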
3. Network Architecture
As described in the paper, the network stacks five convolutional layers (the first, second, and fifth each followed by a 3x3 max-pooling layer) and three fully-connected layers ending in a 1000-way softmax. The PyTorch code below mirrors this layout.
4. PyTorch Implementation
import torch.nn as nn
from torchsummary import summary

try:
    from torch.hub import load_state_dict_from_url
except ImportError:
    from torch.utils.model_zoo import load_url as load_state_dict_from_url

model_urls = {
    'alexnet': 'https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth',
}


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2),    # (224+2*2-11)/4+1=55
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (55-3)/2+1=27
            nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2),   # (27+2*2-5)/1+1=27
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (27-3)/2+1=13
            nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (13-3)/2+1=6
        )                                                             # 6*6*256=9216

        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)    # flatten to (N, 9216) before the classifier
        x = self.classifier(x)
        return x


def alexnet(pretrained=False, progress=True, **kwargs):
    r"""
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    model = AlexNet(**kwargs)
    if pretrained:
        # NOTE: the torchvision checkpoint above ("alexnet-owt") uses the
        # 64/192/384/256/256-channel "one weird trick" variant, so its weights
        # will not load into this paper-style 96/256/384/384/256 network
        # without adapting the layers.
        state_dict = load_state_dict_from_url(model_urls['alexnet'],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


if __name__ == "__main__":
    model = alexnet()
    summary(model, (3, 224, 224))
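Before training, a quick way to confirm that the shapes line up end to end is to push a random batch through the model. The sketch below assumes the alexnet() factory defined above and uses dummy data rather than real images:

import torch

model = alexnet(num_classes=1000)
model.eval()                          # disable dropout for a deterministic forward pass
with torch.no_grad():
    x = torch.randn(2, 3, 224, 224)   # batch of 2 fake RGB images at ImageNet resolution
    logits = model(x)
print(logits.shape)                   # torch.Size([2, 1000])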
Output:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 96, 55, 55]          34,944
              ReLU-2           [-1, 96, 55, 55]               0
         MaxPool2d-3           [-1, 96, 27, 27]               0
            Conv2d-4          [-1, 256, 27, 27]         614,656
              ReLU-5          [-1, 256, 27, 27]               0
         MaxPool2d-6          [-1, 256, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         885,120
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 384, 13, 13]       1,327,488
             ReLU-10          [-1, 384, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         884,992
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 256, 6, 6]               0
          Dropout-15                 [-1, 9216]               0
           Linear-16                 [-1, 4096]      37,752,832
             ReLU-17                 [-1, 4096]               0
          Dropout-18                 [-1, 4096]               0
           Linear-19                 [-1, 4096]      16,781,312
             ReLU-20                 [-1, 4096]               0
           Linear-21                 [-1, 1000]       4,097,000
================================================================
Total params: 62,378,344
Trainable params: 62,378,344
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 11.16
Params size (MB): 237.95
Estimated Total Size (MB): 249.69
----------------------------------------------------------------
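The "Total params" figure can also be cross-checked without torchsummary by summing the element counts of the model's parameter tensors (a minimal sketch reusing the model instance built above):

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, trainable)   # 62378344 62378344, matching "Total params" above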
References
https://github.com/pytorch/vision/tree/master/torchvision/models