J1周：ResNet-50算法实战与解析

🍨 本文为🔗365天深度学习训练营中的学习记录博客

🍖 原作者：K同学啊|接辅导、项目定制

本周任务

根据本文 TensorFlow 代码，编写出相应的 Pytorch 代码
了解残差结构
是否可以将残差模块融入到C3当中（自由探索）

一、知识储备

深度残差网络ResNet（deep residual network）在2015年由何凯明等提出，因为它简单与实用并存，随后很多研究都是建立在ResNet-50或者ResNet-101基础上完成的。
ResNet主要解决深度卷积网络在深度加深时候的“退化”问题。在一般的卷积神经网络中，增大网络深度后带来的第一个问题就是梯度消失、爆炸，这个问题在Szegedy提出BN后被顺利解决。BN层能对各层的输出做归一化，这样梯度在反向层层传递后仍能保持大小稳定，不会出现过小或过大的情况。但是作者发现加了BN后，再加大深度仍然不容易收敛，其提到了第二个问题——准确率下降问题：层级大到一定程度时，准确率就会饱和，然后迅速下降。这种下降既不是梯度消失引起的，也不是过拟合造成的，而是由于网络过于复杂，以至于光靠不加约束的放养式的训练很难达到理想的错误率。准确率下降问题不是网络结构本身的问题，而是现有的训练方式不够理想造成的。当前广泛使用的训练方法，无论是SGD，还是RMSProp，或是Adam，都无法在网络深度变大后达到理论上最优的收敛结果。还可以证明只要有理想的训练方式，更深的网络肯定会比较浅的网络效果要好。证明过程也很简单：假设在一种网络A的后面添加几层形成新的网络B，如果增加的层级只是对A的输出做了个恒等映射（identity mapping），即A的输出经过新增的层级变成B的输出后没有发生变化，这样网络A和网络B的错误率就是相等的，也就证明了加深后的网络不会比加深前的网络效果差。

何凯明提出了一种残差结构来实现上述恒等映射：整个模块除了正常的卷积层输出外，还有一个分支把输入直接连到输出上，该分支输出和卷积的输出做算数相加得到最终的输出，用公式表达就是 H ( x ) = F ( x ) + x H(x) = F(x) + xH(x)=F(x)+x，x是输入，F ( x ) F(x)F(x)是卷积分支的输出，H ( x ) H(x)H(x)是整个结构的输出。可以证明如果F ( x ) F(x)F(x)分支中所有参数都是0，H ( x ) H(x)H(x)就是个恒等映射。残差结构人为制造了恒等映射，就能让整个结构朝着恒等映射的方向去收敛，确保最终的错误率不会因为深度的变大而越来越差。如果一个网络通过简单的手工设置参数就能达到想要的结果，那这种结构就很容易通过训练来收敛到该结果，这是一条设计复杂的网络时通用的规则。

上图左边的单元为ResNet两层的残差单元，两层的残差单元包含两个相同输出的通道数的3x3卷积，只是用于较浅的ResNet网络，对较深的网络主要使用三层的残差单元。三层的残差单元又称为bottleneck结构，先用一个1x1卷积进行降维，然后3x3卷积，最后用1x1升维恢复原有的维度。另外，如果有输入输出维度不同的情况，可以对输入做一个线性映射变换维度，再连接后面的层，三层的残差单元对于相同数量的层又减少了参数量，因此可以拓展更深的模型，通过残差单元的组合有经典的ResNet-50，ResNet-101等网络结构。

二、前期工作

1、设置GPU

import torch
import torchvision

if __name__=='__main__':
    ''' 设置GPU '''
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print("Using {} device\n".format(device))

2、导入数据

root = './jupyternotebook/data'
output = 'output'
data_dir = os.path.join(root, 'bird_photos')

3、查看数据

''' 读取本地数据集并划分训练集与测试集 '''
def localDataset(data_dir):
    data_dir = pathlib.Path(data_dir)
    
    # 读取本地数据集
    data_paths = list(data_dir.glob('*'))
    classeNames = [str(path).split("\\")[-1] for path in data_paths]
    
    # 关于transforms.Compose的更多介绍可以参考：https://blog.csdn.net/qq_38251616/article/details/124878863
    train_transforms = torchvision.transforms.Compose([
        torchvision.transforms.Resize([224, 224]),  # 将输入图片resize成统一尺寸
        # torchvision.transforms.RandomHorizontalFlip(), # 随机水平翻转
        torchvision.transforms.ToTensor(),          # 将PIL Image或numpy.ndarray转换为tensor，并归一化到[0,1]之间
        torchvision.transforms.Normalize(           # 标准化处理-->转换为标准正太分布（高斯分布），使模型更容易收敛
            mean=[0.485, 0.456, 0.406], 
            std=[0.229, 0.224, 0.225])  # 其中 mean=[0.485,0.456,0.406]与std=[0.229,0.224,0.225] 从数据集中随机抽样计算得到的。
    ])
    
    total_dataset = torchvision.datasets.ImageFolder(data_dir, transform=train_transforms)
    print(total_dataset, '\n')
    print(total_dataset.class_to_idx, '\n')
    
    # 划分训练集与测试集
    train_size = int(0.8 * len(total_dataset))
    test_size  = len(total_dataset) - train_size
    print('train_size', train_size, 'test_size', test_size, '\n')
    train_dataset, test_dataset = torch.utils.data.random_split(total_dataset, [train_size, test_size])
    
    return classeNames, train_dataset, test_dataset

classeNames, train_ds, test_ds = localDataset(data_dir)
num_classes = len(classeNames)
print('num_classes', num_classes)

三、数据预处理

1、加载数据

''' 加载数据，并设置batch_size '''
def loadData(train_ds, test_ds, batch_size=32, root='', show_flag=False):
    # 从 train_ds 加载训练集
    train_dl = torch.utils.data.DataLoader(train_ds,
                                           batch_size=batch_size,
                                           shuffle=True,
                                           num_workers=1)
    # 从 test_ds 加载测试集
    test_dl  = torch.utils.data.DataLoader(test_ds,
                                           batch_size=batch_size,
                                           shuffle=True,
                                           num_workers=1)
    
    # 取一个批次查看数据格式
    # 数据的shape为：[batch_size, channel, height, weight]
    # 其中batch_size为自己设定，channel，height和weight分别是图片的通道数，高度和宽度。
    for X, y in test_dl:
        print('Shape of X [N, C, H, W]: ', X.shape)
        print('Shape of y: ', y.shape, y.dtype, '\n')
        break
    
    imgs, labels = next(iter(train_dl))
    print('Image shape: ', imgs.shape, '\n')
    # torch.Size([32, 3, 224, 224])  # 所有数据集中的图像都是224*224的RGB图
    displayData(imgs, root, show_flag)
    return train_dl, test_dl

batch_size = 8
train_dl, test_dl = loadData(train_ds, test_ds, batch_size, root, True)

2、可视化

''' 数据可视化 '''
def displayData(imgs, root='', flag=False):
    # 指定图片大小，图像大小为20宽、5高的绘图(单位为英寸inch)
    plt.figure('Data Visualization', figsize=(10, 5)) 
    for i, imgs in enumerate(imgs[:8]):
        # 维度顺序调整 [3, 224, 224]->[224, 224, 3]
        npimg = imgs.numpy().transpose((1, 2, 0))
        # 将整个figure分成2行10列，绘制第i+1个子图。
        plt.subplot(2, 4, i+1)
        plt.imshow(npimg)  # cmap=plt.cm.binary
        plt.title(list(classeNames)[labels[i]])
        plt.axis('off')
    plt.savefig(os.path.join(root, 'DatasetDisplay.png'))
    if flag:
        plt.show()
    else:
        plt.close('all')

四、REsNet介绍

1、ResNet解决了什么

残差网络是为了解决神经网络隐藏层过多时，而引起的网络退化问题。退化（degradation）问题是指：当网络隐藏层变多时，网络的准确度达到饱和然后急剧退化，而且这个退化不是由于过拟合引起的。

2、ResNet 50

五、构建ResNet50网络模型

''' Same Padding '''
def autopad(k, p=None):  # kernel, padding
    # Pad to 'same'
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p


''' Identity Block '''
class IdentityBlock(nn.Module):
    def __init__(self, in_channel, kernel_size, filters):
        super(IdentityBlock, self).__init__()
        filters1, filters2, filters3 = filters
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channel, filters1, 1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(filters1),
            nn.ReLU(True)
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(filters1, filters2, kernel_size, stride=1, padding=autopad(kernel_size), bias=False),
            nn.BatchNorm2d(filters2),
            nn.ReLU(True)
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(filters2, filters3, 1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(filters3)
        )
        self.relu = nn.ReLU(True)
    
    def forward(self, x):
        x1 = self.conv1(x)
        x1 = self.conv2(x1)
        x1 = self.conv3(x1)
        x = x1 + x
        self.relu(x)
        return x


''' Conv Block '''
class ConvBlock(nn.Module):
    def __init__(self, in_channel, kernel_size, filters, stride=2):
        super(ConvBlock, self).__init__()
        filters1, filters2, filters3 = filters
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channel, filters1, 1, stride=stride, padding=0, bias=False),
            nn.BatchNorm2d(filters1),
            nn.ReLU(True)
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(filters1, filters2, kernel_size, stride=1, padding=autopad(kernel_size), bias=False),
            nn.BatchNorm2d(filters2),
            nn.ReLU(True)
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(filters2, filters3, 1, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(filters3)
        )
        self.conv4 = nn.Sequential(
            nn.Conv2d(in_channel, filters3, 1, stride=stride, padding=0, bias=False),
            nn.BatchNorm2d(filters3)
        )
        self.relu = nn.ReLU(True)
    
    def forward(self, x):
        x1 = self.conv1(x)
        x1 = self.conv2(x1)
        x1 = self.conv3(x1)
        x2 = self.conv4(x)
        x = x1 + x2
        self.relu(x)
        return x


''' 构建ResNet-50 '''
class ResNet50(nn.Module):
    def __init__(self, classes=1000):
        super(ResNet50, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3, bias=False, padding_mode='zeros'),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=0)
        )
        self.conv2 = nn.Sequential(
            ConvBlock(64, 3, [64, 64, 256], stride=1),
            IdentityBlock(256, 3, [64, 64, 256]),
            IdentityBlock(256, 3, [64, 64, 256])
        )
        self.conv3 = nn.Sequential(
            ConvBlock(256, 3, [128, 128, 512]),
            IdentityBlock(512, 3, [128, 128, 512]),
            IdentityBlock(512, 3, [128, 128, 512]),
            IdentityBlock(512, 3, [128, 128, 512])
        )
        self.conv4 = nn.Sequential(
            ConvBlock(512, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024]),
            IdentityBlock(1024, 3, [256, 256, 1024])
        )
        self.conv5 = nn.Sequential(
            ConvBlock(1024, 3, [512, 512, 2048]),
            IdentityBlock(2048, 3, [512, 512, 2048]),
            IdentityBlock(2048, 3, [512, 512, 2048])
        )
        self.pool = nn.AvgPool2d(kernel_size=7, stride=7, padding=0)
        self.fc = nn.Linear(2048, n_class)
    
    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.pool(x)
        x = torch.flatten(x, start_dim=1)
        x = self.fc(x)
        return x


model = ResNet50().to(device)
''' 显示网络结构 '''
torchsummary.summary(model, (3, 224, 224))
#torchinfo.summary(model)
print(model)

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 55, 55]               0
            Conv2d-5           [-1, 64, 55, 55]           4,096
       BatchNorm2d-6           [-1, 64, 55, 55]             128
              ReLU-7           [-1, 64, 55, 55]               0
            Conv2d-8           [-1, 64, 55, 55]          36,864
       BatchNorm2d-9           [-1, 64, 55, 55]             128
             ReLU-10           [-1, 64, 55, 55]               0
           Conv2d-11          [-1, 256, 55, 55]          16,384
      BatchNorm2d-12          [-1, 256, 55, 55]             512
           Conv2d-13          [-1, 256, 55, 55]          16,384
      BatchNorm2d-14          [-1, 256, 55, 55]             512
             ReLU-15          [-1, 256, 55, 55]               0
        ConvBlock-16          [-1, 256, 55, 55]               0
           Conv2d-17           [-1, 64, 55, 55]          16,384
      BatchNorm2d-18           [-1, 64, 55, 55]             128
             ReLU-19           [-1, 64, 55, 55]               0
           Conv2d-20           [-1, 64, 55, 55]          36,864
      BatchNorm2d-21           [-1, 64, 55, 55]             128
             ReLU-22           [-1, 64, 55, 55]               0
           Conv2d-23          [-1, 256, 55, 55]          16,384
      BatchNorm2d-24          [-1, 256, 55, 55]             512
             ReLU-25          [-1, 256, 55, 55]               0
    IdentityBlock-26          [-1, 256, 55, 55]               0
           Conv2d-27           [-1, 64, 55, 55]          16,384
      BatchNorm2d-28           [-1, 64, 55, 55]             128
             ReLU-29           [-1, 64, 55, 55]               0
           Conv2d-30           [-1, 64, 55, 55]          36,864
      BatchNorm2d-31           [-1, 64, 55, 55]             128
             ReLU-32           [-1, 64, 55, 55]               0
           Conv2d-33          [-1, 256, 55, 55]          16,384
      BatchNorm2d-34          [-1, 256, 55, 55]             512
             ReLU-35          [-1, 256, 55, 55]               0
    IdentityBlock-36          [-1, 256, 55, 55]               0
           Conv2d-37          [-1, 128, 28, 28]          32,768
      BatchNorm2d-38          [-1, 128, 28, 28]             256
             ReLU-39          [-1, 128, 28, 28]               0
           Conv2d-40          [-1, 128, 28, 28]         147,456
      BatchNorm2d-41          [-1, 128, 28, 28]             256
             ReLU-42          [-1, 128, 28, 28]               0
           Conv2d-43          [-1, 512, 28, 28]          65,536
      BatchNorm2d-44          [-1, 512, 28, 28]           1,024
           Conv2d-45          [-1, 512, 28, 28]         131,072
      BatchNorm2d-46          [-1, 512, 28, 28]           1,024
             ReLU-47          [-1, 512, 28, 28]               0
        ConvBlock-48          [-1, 512, 28, 28]               0
           Conv2d-49          [-1, 128, 28, 28]          65,536
      BatchNorm2d-50          [-1, 128, 28, 28]             256
             ReLU-51          [-1, 128, 28, 28]               0
           Conv2d-52          [-1, 128, 28, 28]         147,456
      BatchNorm2d-53          [-1, 128, 28, 28]             256
             ReLU-54          [-1, 128, 28, 28]               0
           Conv2d-55          [-1, 512, 28, 28]          65,536
      BatchNorm2d-56          [-1, 512, 28, 28]           1,024
             ReLU-57          [-1, 512, 28, 28]               0
    IdentityBlock-58          [-1, 512, 28, 28]               0
           Conv2d-59          [-1, 128, 28, 28]          65,536
      BatchNorm2d-60          [-1, 128, 28, 28]             256
             ReLU-61          [-1, 128, 28, 28]               0
           Conv2d-62          [-1, 128, 28, 28]         147,456
      BatchNorm2d-63          [-1, 128, 28, 28]             256
             ReLU-64          [-1, 128, 28, 28]               0
           Conv2d-65          [-1, 512, 28, 28]          65,536
      BatchNorm2d-66          [-1, 512, 28, 28]           1,024
             ReLU-67          [-1, 512, 28, 28]               0
    IdentityBlock-68          [-1, 512, 28, 28]               0
           Conv2d-69          [-1, 128, 28, 28]          65,536
      BatchNorm2d-70          [-1, 128, 28, 28]             256
             ReLU-71          [-1, 128, 28, 28]               0
           Conv2d-72          [-1, 128, 28, 28]         147,456
      BatchNorm2d-73          [-1, 128, 28, 28]             256
             ReLU-74          [-1, 128, 28, 28]               0
           Conv2d-75          [-1, 512, 28, 28]          65,536
      BatchNorm2d-76          [-1, 512, 28, 28]           1,024
             ReLU-77          [-1, 512, 28, 28]               0
    IdentityBlock-78          [-1, 512, 28, 28]               0
           Conv2d-79          [-1, 256, 14, 14]         131,072
      BatchNorm2d-80          [-1, 256, 14, 14]             512
             ReLU-81          [-1, 256, 14, 14]               0
           Conv2d-82          [-1, 256, 14, 14]         589,824
      BatchNorm2d-83          [-1, 256, 14, 14]             512
             ReLU-84          [-1, 256, 14, 14]               0
           Conv2d-85         [-1, 1024, 14, 14]         262,144
      BatchNorm2d-86         [-1, 1024, 14, 14]           2,048
           Conv2d-87         [-1, 1024, 14, 14]         524,288
      BatchNorm2d-88         [-1, 1024, 14, 14]           2,048
             ReLU-89         [-1, 1024, 14, 14]               0
        ConvBlock-90         [-1, 1024, 14, 14]               0
           Conv2d-91          [-1, 256, 14, 14]         262,144
      BatchNorm2d-92          [-1, 256, 14, 14]             512
             ReLU-93          [-1, 256, 14, 14]               0
           Conv2d-94          [-1, 256, 14, 14]         589,824
      BatchNorm2d-95          [-1, 256, 14, 14]             512
             ReLU-96          [-1, 256, 14, 14]               0
           Conv2d-97         [-1, 1024, 14, 14]         262,144
      BatchNorm2d-98         [-1, 1024, 14, 14]           2,048
             ReLU-99         [-1, 1024, 14, 14]               0
   IdentityBlock-100         [-1, 1024, 14, 14]               0
          Conv2d-101          [-1, 256, 14, 14]         262,144
     BatchNorm2d-102          [-1, 256, 14, 14]             512
            ReLU-103          [-1, 256, 14, 14]               0
          Conv2d-104          [-1, 256, 14, 14]         589,824
     BatchNorm2d-105          [-1, 256, 14, 14]             512
            ReLU-106          [-1, 256, 14, 14]               0
          Conv2d-107         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-108         [-1, 1024, 14, 14]           2,048
            ReLU-109         [-1, 1024, 14, 14]               0
   IdentityBlock-110         [-1, 1024, 14, 14]               0
          Conv2d-111          [-1, 256, 14, 14]         262,144
     BatchNorm2d-112          [-1, 256, 14, 14]             512
            ReLU-113          [-1, 256, 14, 14]               0
          Conv2d-114          [-1, 256, 14, 14]         589,824
     BatchNorm2d-115          [-1, 256, 14, 14]             512
            ReLU-116          [-1, 256, 14, 14]               0
          Conv2d-117         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-118         [-1, 1024, 14, 14]           2,048
            ReLU-119         [-1, 1024, 14, 14]               0
   IdentityBlock-120         [-1, 1024, 14, 14]               0
          Conv2d-121          [-1, 256, 14, 14]         262,144
     BatchNorm2d-122          [-1, 256, 14, 14]             512
            ReLU-123          [-1, 256, 14, 14]               0
          Conv2d-124          [-1, 256, 14, 14]         589,824
     BatchNorm2d-125          [-1, 256, 14, 14]             512
            ReLU-126          [-1, 256, 14, 14]               0
          Conv2d-127         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-128         [-1, 1024, 14, 14]           2,048
            ReLU-129         [-1, 1024, 14, 14]               0
   IdentityBlock-130         [-1, 1024, 14, 14]               0
          Conv2d-131          [-1, 256, 14, 14]         262,144
     BatchNorm2d-132          [-1, 256, 14, 14]             512
            ReLU-133          [-1, 256, 14, 14]               0
          Conv2d-134          [-1, 256, 14, 14]         589,824
     BatchNorm2d-135          [-1, 256, 14, 14]             512
            ReLU-136          [-1, 256, 14, 14]               0
          Conv2d-137         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-138         [-1, 1024, 14, 14]           2,048
            ReLU-139         [-1, 1024, 14, 14]               0
   IdentityBlock-140         [-1, 1024, 14, 14]               0
          Conv2d-141            [-1, 512, 7, 7]         524,288
     BatchNorm2d-142            [-1, 512, 7, 7]           1,024
            ReLU-143            [-1, 512, 7, 7]               0
          Conv2d-144            [-1, 512, 7, 7]       2,359,296
     BatchNorm2d-145            [-1, 512, 7, 7]           1,024
            ReLU-146            [-1, 512, 7, 7]               0
          Conv2d-147           [-1, 2048, 7, 7]       1,048,576
     BatchNorm2d-148           [-1, 2048, 7, 7]           4,096
          Conv2d-149           [-1, 2048, 7, 7]       2,097,152
     BatchNorm2d-150           [-1, 2048, 7, 7]           4,096
            ReLU-151           [-1, 2048, 7, 7]               0
       ConvBlock-152           [-1, 2048, 7, 7]               0
          Conv2d-153            [-1, 512, 7, 7]       1,048,576
     BatchNorm2d-154            [-1, 512, 7, 7]           1,024
            ReLU-155            [-1, 512, 7, 7]               0
          Conv2d-156            [-1, 512, 7, 7]       2,359,296
     BatchNorm2d-157            [-1, 512, 7, 7]           1,024
            ReLU-158            [-1, 512, 7, 7]               0
          Conv2d-159           [-1, 2048, 7, 7]       1,048,576
     BatchNorm2d-160           [-1, 2048, 7, 7]           4,096
            ReLU-161           [-1, 2048, 7, 7]               0
   IdentityBlock-162           [-1, 2048, 7, 7]               0
          Conv2d-163            [-1, 512, 7, 7]       1,048,576
     BatchNorm2d-164            [-1, 512, 7, 7]           1,024
            ReLU-165            [-1, 512, 7, 7]               0
          Conv2d-166            [-1, 512, 7, 7]       2,359,296
     BatchNorm2d-167            [-1, 512, 7, 7]           1,024
            ReLU-168            [-1, 512, 7, 7]               0
          Conv2d-169           [-1, 2048, 7, 7]       1,048,576
     BatchNorm2d-170           [-1, 2048, 7, 7]           4,096
            ReLU-171           [-1, 2048, 7, 7]               0
   IdentityBlock-172           [-1, 2048, 7, 7]               0
       AvgPool2d-173           [-1, 2048, 1, 1]               0
          Linear-174                    [-1, 4]           8,196
================================================================
Total params: 23,516,228
Trainable params: 23,516,228
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 270.43
Params size (MB): 89.71
Estimated Total Size (MB): 360.71
----------------------------------------------------------------

ResNet50(
  (conv1): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv2): Sequential(
    (0): ConvBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv4): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (1): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (2): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
  )
  (conv3): Sequential(
    (0): ConvBlock(
      (conv1): Sequential(
        (0): Conv2d(256, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv4): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (1): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (2): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (3): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
  )
  (conv4): Sequential(
    (0): ConvBlock(
      (conv1): Sequential(
        (0): Conv2d(512, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv4): Sequential(
        (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (1): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (2): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (3): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (4): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (5): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
  )
  (conv5): Sequential(
    (0): ConvBlock(
      (conv1): Sequential(
        (0): Conv2d(1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (conv4): Sequential(
        (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (1): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
    (2): IdentityBlock(
      (conv1): Sequential(
        (0): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv2): Sequential(
        (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (conv3): Sequential(
        (0): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (relu): ReLU(inplace=True)
    )
  )
  (pool): AvgPool2d(kernel_size=7, stride=7, padding=0)
  (fc): Linear(in_features=2048, out_features=4, bias=True)
)

六、设置超参数

''' 设置超参数 '''
start_epoch = 0
epochs      = 10
learn_rate  = 1e-7 # 初始学习率
loss_fn     = nn.CrossEntropyLoss()  # 创建损失函数
optimizer   = torch.optim.Adam(model.parameters(),lr=learn_rate)
train_loss  = []
train_acc   = []
test_loss   = []
test_acc    = []
epoch_best_acc = 0

七、训练模型

''' 训练循环 '''
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)  # 训练集的大小
    num_batches = len(dataloader)   # 批次数目

    train_loss, train_acc = 0, 0  # 初始化训练损失和正确率
    
    for X, y in dataloader:  # 获取图片及其标签
        X, y = X.to(device), y.to(device)
        
        # 计算预测误差
        pred = model(X)          # 网络输出
        loss = loss_fn(pred, y)  # 计算网络输出和真实值之间的差距，targets为真实值，计算二者差值即为损失
        
        # 反向传播
        optimizer.zero_grad()  # grad属性归零
        loss.backward()        # 反向传播
        optimizer.step()       # 每一步自动更新
        
        # 记录acc与loss
        train_acc  += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
            
    train_acc  /= size
    train_loss /= num_batches

    return train_acc, train_loss


''' 测试函数 '''
def test(dataloader, model, loss_fn):
    size        = len(dataloader.dataset)  # 测试集的大小
    num_batches = len(dataloader)          # 批次数目
    test_loss, test_acc = 0, 0
    
    # 当不进行训练时，停止梯度更新，节省计算内存消耗
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)
            
            # 计算loss
            target_pred = model(imgs)
            loss        = loss_fn(target_pred, target)
            
            test_loss += loss.item()
            test_acc  += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc  /= size
    test_loss /= num_batches

    return test_acc, test_loss


''' 加载之前保存的模型 '''
if not os.path.exists(output) or not os.path.isdir(output):
    os.makedirs(output)
if start_epoch > 0:
    resumeFile = os.path.join(output, 'epoch'+str(start_epoch)+'.pkl')
    if not os.path.exists(resumeFile) or not os.path.isfile(resumeFile):
        start_epoch = 0
    else:
        model.load_state_dict(torch.load(resumeFile))  # 加载模型参数
''' 开始训练模型 '''
print('\nStart training...')
best_model = None
for epoch in range(start_epoch, epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    
    # 获取当前的学习率
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    
    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.2E}')
    print(time.strftime('[%Y-%m-%d %H:%M:%S]'), template.format(epoch+1, epoch_train_acc*100, epoch_train_loss, epoch_test_acc*100, epoch_test_loss, lr))
    
    # 保存最佳模型
    if epoch_test_acc>epoch_best_acc:
        ''' 保存最优模型参数 '''
        epoch_best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)
        print(('acc = {:.1f}%, saving model to best.pkl').format(epoch_best_acc*100))
        saveFile = os.path.join(output, 'best.pkl')
        torch.save(best_model.state_dict(), saveFile)
    if epoch_test_acc==1 and epoch_train_acc==1:
        saveFile = os.path.join(output, 'epoch'+str(epoch+1)+'.pkl')
        torch.save(model.state_dict(), saveFile)
print('Done\n')

''' 保存模型参数 '''
saveFile = os.path.join(output, 'epoch'+str(epochs)+'.pkl')
torch.save(model.state_dict(), saveFile)

Start training...
[2023-02-09 14:40:46] Epoch: 1, Train_acc:20.6%, Train_loss:1.645, Test_acc:16.8%, Test_loss:1.576, Lr:1.00E-07
acc = 16.8%, saving model to best.pkl
[2023-02-09 14:41:35] Epoch: 2, Train_acc:23.7%, Train_loss:1.624, Test_acc:20.4%, Test_loss:1.721, Lr:1.00E-07
acc = 20.4%, saving model to best.pkl
[2023-02-09 14:41:46] Epoch: 3, Train_acc:23.2%, Train_loss:1.623, Test_acc:15.9%, Test_loss:1.610, Lr:1.00E-07
[2023-02-09 14:41:57] Epoch: 4, Train_acc:22.8%, Train_loss:1.617, Test_acc:18.6%, Test_loss:1.676, Lr:1.00E-07
[2023-02-09 14:42:08] Epoch: 5, Train_acc:22.1%, Train_loss:1.607, Test_acc:20.4%, Test_loss:1.661, Lr:1.00E-07
[2023-02-09 14:42:19] Epoch: 6, Train_acc:23.2%, Train_loss:1.606, Test_acc:15.9%, Test_loss:1.599, Lr:1.00E-07
[2023-02-09 14:42:30] Epoch: 7, Train_acc:25.2%, Train_loss:1.592, Test_acc:19.5%, Test_loss:1.655, Lr:1.00E-07
[2023-02-09 14:42:40] Epoch: 8, Train_acc:22.6%, Train_loss:1.595, Test_acc:15.0%, Test_loss:1.626, Lr:1.00E-07
[2023-02-09 14:42:50] Epoch: 9, Train_acc:23.9%, Train_loss:1.588, Test_acc:18.6%, Test_loss:1.614, Lr:1.00E-07
[2023-02-09 14:43:01] Epoch:10, Train_acc:24.6%, Train_loss:1.577, Test_acc:22.1%, Test_loss:1.558, Lr:1.00E-07
acc = 22.1%, saving model to best.pkl
Done

八、模型评估

''' 结果可视化 '''
def displayResult(train_acc, test_acc, train_loss, test_loss, start_epoch, epochs, output=''):
    # 隐藏警告
    warnings.filterwarnings("ignore")                # 忽略警告信息
    plt.rcParams['font.sans-serif']    = ['SimHei']  # 用来正常显示中文标签
    plt.rcParams['axes.unicode_minus'] = False       # 用来正常显示负号
    plt.rcParams['figure.dpi']         = 100         # 分辨率
    
    epochs_range = range(start_epoch, epochs)
    
    plt.figure('Result Visualization', figsize=(12, 3))
    plt.subplot(1, 2, 1)
    
    plt.plot(epochs_range, train_acc, label='Training Accuracy')
    plt.plot(epochs_range, test_acc, label='Test Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')
    
    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, train_loss, label='Training Loss')
    plt.plot(epochs_range, test_loss, label='Test Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.savefig(os.path.join(output, 'AccuracyLoss.png'))
    plt.show()


''' 绘制准确率&损失率曲线图 '''
displayResult(train_acc, test_acc, train_loss, test_loss, start_epoch, epochs, output)
''' 模型评估 '''
best_model.eval()
epoch_test_acc, epoch_test_loss = test(test_dl, best_model, loss_fn)
print("EVAL {:.5f}, {:.5f}".format(epoch_test_acc, epoch_test_loss))

九、预测

''' 预测函数 '''
def predict(model, img_path):
    img = Image.open(img_path)
    train_transforms = torchvision.transforms.Compose([
        torchvision.transforms.Resize([224, 224]),  # 将输入图片resize成统一尺寸
        torchvision.transforms.ToTensor(),          # 将PIL Image或numpy.ndarray转换为tensor，并归一化到[0,1]之间
        torchvision.transforms.Normalize(           # 标准化处理-->转换为标准正太分布（高斯分布），使模型更容易收敛
            mean=[0.485, 0.456, 0.406], 
            std=[0.229, 0.224, 0.225])  # 其中 mean=[0.485,0.456,0.406]与std=[0.229,0.224,0.225] 从数据集中随机抽样计算得到的。
    ])
    img = train_transforms(img)
    img = torch.reshape(img, (1, 3, 224, 224))
    output = model(img.cuda())
    #print(output.argmax(1))
    
    _, indices = torch.max(output, 1)
    percentage = torch.nn.functional.softmax(output, dim=1)[0] * 100
    perc = percentage[int(indices)].item()
    result = classeNames[indices]
    print('predicted:', result, perc)

if __name__=='__main__':
    classeNames = ['Bananaquit', 'Black Throated Bushtiti', 'Black skimmer', 'Cockatoo']
    num_classes = len(classeNames)
    
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print("Using {} device\n".format(device))
    
    model = ResNet50(num_classes).to(device)
    model.load_state_dict(torch.load(os.path.join('output', 'best.pkl')))
    model.eval()
    
    img_path = './data/bird_photos/Bananaquit/009.jpg'
    #img_path = './data/bird_photos/Cockatoo/016.jpg'
    predict(model, img_path)