TensorFlow 2.0 Study Notes -- Chapter 5: Convolutional Computation in Neural Networks

Chapter 5: Convolutional Computation in Neural Networks

5.1 The Convolution Computation Process

  • Fully connected NN: every neuron is connected to every neuron of the adjacent preceding layer; the inputs are features and the output is the predicted result.
    \(\text{Number of parameters} = \sum_{\text{layers}}(\text{previous} \times \text{next} + \text{next})\)

    [Figure 5-1]

    Because images in real projects are high-resolution color images, the number of parameters would be huge, which can easily lead to overfitting.
    So in practice, features are first extracted from the raw image and the extracted features are then fed into the fully connected network. (A quick numeric check of the parameter formula follows.)
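
    As a minimal sketch of the formula above (the layer sizes 784, 128 and 10 are chosen only for illustration), the manual parameter count can be compared with Keras' own count:

    import tensorflow as tf

    # A small fully connected network: 784 -> 128 -> 10
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # Parameters by the formula: sum over layers of (previous * next + next)
    manual = (784 * 128 + 128) + (128 * 10 + 10)
    print(manual)                # 101770
    print(model.count_params())  # 101770, matches the formula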

  • Convolution (Convolutional)

    Convolution can be regarded as an effective way of extracting image features.
    A convolution kernel slides over the input feature map and visits every pixel. At each step the kernel overlaps a region of the input feature map; the corresponding elements of the overlapping region are multiplied, summed, and a bias term is added, producing one pixel of the output feature map.

    The number of channels of the convolution kernel must match the number of channels of the input feature map.

    The depth (number of channels) of the input feature map determines the depth of the current layer's convolution kernels;
    the number of convolution kernels in the current layer determines the depth of the current layer's output feature map.

    [Figure 5-2]

    • Single channel

    [Figure 5-3]

    • Three channels

    [Figure 5-4]

    [Figure 5-5]

    When there are n convolution kernels, n output feature maps are produced and stacked along the depth of the output feature map. (A minimal single-channel example follows.)
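
    A minimal sketch of the sliding-window computation on a single-channel input (the 5x5 input, 3x3 kernel and bias value below are arbitrary illustration values), checked against tf.nn.conv2d:

    import numpy as np
    import tensorflow as tf

    x = np.arange(25, dtype=np.float32).reshape(1, 5, 5, 1)            # input, NHWC layout
    k = np.array([[1, 0, -1],
                  [1, 0, -1],
                  [1, 0, -1]], dtype=np.float32).reshape(3, 3, 1, 1)   # kernel, HWIO layout
    b = 0.5                                                            # bias term

    # Manual computation: multiply the overlapping region element-wise, sum, add the bias
    out_manual = np.zeros((3, 3), dtype=np.float32)
    for i in range(3):
        for j in range(3):
            out_manual[i, j] = np.sum(x[0, i:i+3, j:j+3, 0] * k[:, :, 0, 0]) + b

    # The same computation with TensorFlow
    out_tf = tf.nn.conv2d(x, k, strides=1, padding='VALID') + b
    print(np.allclose(out_manual, out_tf[0, :, :, 0].numpy()))         # True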

5.2 Receptive Field

  • Receptive field: the size of the region of the original input image onto which each pixel of an output feature map of the convolutional neural network is mapped.

    [Figure 5-6]

    The two configurations above (two stacked 3x3 convolutions versus a single 5x5 convolution) have the same receptive field, so their feature-extraction capability is the same.

  • To choose between them, the amount of computation must be compared (a small comparison script follows the formulas):

    [Figure 5-7]

    \[9(x-2)(x-2) + 9(x-2-2)(x-2-2)=18x^2-108x+180\\ 25(x-4)(x-4) = 25x^2-200x+400 \]
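
    A minimal script comparing the two options by the formulas above (assuming an x-by-x input feature map and counting one multiplication per kernel weight per output pixel):

    # Two stacked 3x3 convolution layers vs. one 5x5 convolution layer
    def ops_two_3x3(x):
        return 9 * (x - 2) ** 2 + 9 * (x - 4) ** 2

    def ops_one_5x5(x):
        return 25 * (x - 4) ** 2

    for x in (6, 10, 32):
        print(x, ops_two_3x3(x), ops_one_5x5(x))
    # The two counts are equal at x = 10; for larger feature maps the two stacked
    # 3x3 layers need fewer multiplications while giving the same receptive field.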

5.3 Zero Padding (Padding)

[Figure 5-8]

  • Formula for the dimensions of the output feature map of a convolution

    [Figure 5-9]

    TensorFlow uses the parameter padding='SAME' or padding='VALID' to indicate whether zero padding is applied.
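
    A minimal check of the two modes (relying on the usual TensorFlow rule that 'same' gives an output size of ceil(input / stride) and 'valid' gives ceil((input - kernel + 1) / stride)):

    import tensorflow as tf

    x = tf.random.normal([1, 5, 5, 1])  # a 5x5 single-channel input

    same = tf.keras.layers.Conv2D(1, kernel_size=3, strides=1, padding='same')(x)
    valid = tf.keras.layers.Conv2D(1, kernel_size=3, strides=1, padding='valid')(x)

    print(same.shape)   # (1, 5, 5, 1)  -> ceil(5 / 1) = 5
    print(valid.shape)  # (1, 3, 3, 1)  -> ceil((5 - 3 + 1) / 1) = 3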

5.4 Describing the Convolution Layer in TensorFlow

tf.keras.layers.Conv2D(
    filters=number_of_kernels,
    kernel_size=kernel_size,  # an integer for a square kernel, or (kernel height h, kernel width w)
    strides=stride,  # an integer if equal in both directions, or (vertical stride h, horizontal stride w); default 1
    padding='same' or 'valid',  # 'same' for zero padding, 'valid' for none (default)
    activation='relu' or 'sigmoid' or 'tanh' or 'softmax' etc.,  # omit here if a BN (batch normalization) layer follows
    input_shape=(height, width, channels)  # dimensions of the input feature map; may be omitted
)
model = tf.keras.models.Sequential([
    Conv2D(6, 5, padding='valid', activation='sigmoid'),
    MaxPool2D(2, 2),
    Conv2D(6, (5, 5), padding='valid', activation='sigmoid'),
    MaxPool2D(2, (2, 2)),
    Conv2D(filters=6, kernel_size=(5, 5), padding='valid', activation='sigmoid'),
    MaxPool2D(pool_size=(2, 2), strides=2),
    Flatten(),
    Dense(10, activation='softmax')
])

5.5 Batch Normalization (BN)

Standardization: transform the data so that it has zero mean and a standard deviation of 1.
Batch normalization: apply standardization to a small batch of data.

After batch normalization, the i-th pixel of the output feature map produced by the k-th convolution kernel is

\[{H'}_{i}^{k}=\frac{H_i^k-\mu_{batch}^k}{\sigma_{batch}^k} \]

[Figure 5-10]

\(H_i^k\): pixel i of the output feature map produced by the k-th convolution kernel, before batch normalization

\(\mu_{batch}^k\): mean of all pixels in the batch of output feature maps produced by the k-th convolution kernel, before batch normalization

\(\sigma_{batch}^k\): standard deviation of all pixels in the batch of output feature maps produced by the k-th convolution kernel, before batch normalization

\(\mu_{batch}^k=\frac{1}{m}\sum_{i=1}^m H_i^k\) \(\qquad \sigma_{batch}^k =\sqrt{\epsilon+\frac{1}{m}\sum_{i=1}^m (H_i^k-\mu_{batch}^k)^2}\), where \(\epsilon\) is a small constant that prevents division by zero.
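
A minimal numpy sketch of these formulas (the 4x5x5 batch of feature-map values below is random illustration data):

    import numpy as np

    # Feature maps produced by one convolution kernel over a batch of 4 images: (batch, H, W)
    H = np.random.randn(4, 5, 5).astype(np.float32)
    eps = 1e-3  # small constant inside the square root, avoids division by zero

    mu = H.mean()                                  # mean over all pixels in the batch
    sigma = np.sqrt(eps + ((H - mu) ** 2).mean())  # standard deviation
    H_norm = (H - mu) / sigma

    print(H_norm.mean(), H_norm.std())  # roughly 0 and 1 after normalization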

  • A pair of trainable parameters, \(\gamma\) (scale) and \(\beta\) (offset), is introduced for each convolution kernel to adjust the strength of the batch normalization.

    [Figure 5-11]

  • The BN layer sits after the convolution layer and before the activation layer.

    Convolution (Convolutional) → Batch Normalization (BN) → Activation (Activation)

  • Describing batch normalization in TensorFlow

    tf.keras.layers.BatchNormalization()

    model = tf.keras.models.Sequential([
        Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # convolution layer
        BatchNormalization(),  # BN layer
        Activation('relu'),  # activation layer
        MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
        Dropout(0.2)  # Dropout layer
    ])
    

5.6 Pooling

Pooling is used to reduce the amount of feature data.
Max pooling extracts image texture; average pooling preserves background features. (A small comparison of the two follows.)

[Figure 5-12]
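
A minimal comparison of the two pooling types (the 4x4 input values 1..16 are chosen only for illustration):

    import tensorflow as tf

    x = tf.reshape(tf.range(1., 17.), [1, 4, 4, 1])  # values 1..16 in a 4x4 map

    max_out = tf.keras.layers.MaxPool2D(pool_size=2, strides=2)(x)
    avg_out = tf.keras.layers.AveragePooling2D(pool_size=2, strides=2)(x)

    print(tf.squeeze(max_out).numpy())  # [[ 6.  8.] [14. 16.]]   keeps the strongest response
    print(tf.squeeze(avg_out).numpy())  # [[ 3.5  5.5] [11.5 13.5]]  keeps the average level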

  • Describing pooling in TensorFlow

    # Max pooling
    tf.keras.layers.MaxPool2D(
        pool_size=pool_window_size,  # an integer for a square window, or (window height h, window width w)
        strides=pool_stride,  # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
        padding='valid' or 'same'  # 'same' for zero padding, 'valid' for none (default)
    )

    # Average pooling
    tf.keras.layers.AveragePooling2D(
        pool_size=pool_window_size,  # an integer for a square window, or (window height h, window width w)
        strides=pool_stride,  # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
        padding='valid' or 'same'  # 'same' for zero padding, 'valid' for none (default)
    )
    
    model = tf.keras.models.Sequential([
        Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # convolution layer
        BatchNormalization(),  # BN layer
        Activation('relu'),  # activation layer
        MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
        Dropout(0.2)  # Dropout layer
    ])
    

5.7 Dropout

During training, a fraction of the neurons is temporarily dropped from the network with a given probability (to alleviate overfitting). When the network is used for inference, the dropped neurons are reconnected.

[Figure 5-13]

tf.keras.layers.Dropout(drop_probability)

model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # convolution layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
    Dropout(0.2)  # Dropout layer: randomly drops 20% of the neurons
])
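
A minimal sketch of this behaviour (it relies on Keras' Dropout scaling the kept activations by 1/(1 - rate) during training):

    import tensorflow as tf

    drop = tf.keras.layers.Dropout(0.2)  # drop 20% of the neurons
    x = tf.ones([1, 10])

    print(drop(x, training=True).numpy())   # about 20% zeros, kept values scaled to 1 / 0.8 = 1.25
    print(drop(x, training=False).numpy())  # all ones: dropped neurons are restored at inference time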

5.8 Summary of Convolutional Neural Networks

Convolutional neural network: extract features with convolution kernels, then feed them into a fully connected network.

  • Main modules of a convolutional neural network

    Convolution (C) → Batch Normalization (B) → Activation (A) → Pooling (P) → Fully Connected (FC)

    The first four perform feature extraction on the input.

  • What is convolution? Convolution is a feature extractor, i.e. CBAPD.

    model = tf.keras.models.Sequential([
        Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # C
        BatchNormalization(),  # B
        Activation('relu'),  # A
        MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # P
        Dropout(0.2)  # D
    ])
    

5.9 The CIFAR10 Dataset

The dataset provides 50,000 32*32-pixel color images with labels from 10 classes for training, and 10,000 32*32-pixel color images with labels from 10 classes for testing.

[Figure 5-14]

  • Importing the cifar10 dataset:

    import tensorflow as tf
    from matplotlib import pyplot as plt

    cifar10 = tf.keras.datasets.cifar10
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

    plt.imshow(x_train[0])  # display the image
    plt.show()

    print('x_train[0]:\n', x_train[0])
    print('y_train[0]:', y_train[0])
    print('x_test.shape:', x_test.shape)
    

5.10 Example: Building a Convolutional Neural Network

C (kernels: 6, size 5*5; stride: 1; padding: same)
B (yes)
A (relu)
P (max; window: 2*2; stride: 2; padding: same)
D (0.2)

Flatten
Dense (units: 128, activation: relu, Dropout: 0.2)
Dense (units: 10, activation: softmax)
# Other network architectures differ only in the structure defined in this class
class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)
        
        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')
    
    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)
        
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)
        return y
  • Complete cifar10 code

    import tensorflow as tf
    import os
    import numpy as np
    from matplotlib import pyplot as plt
    from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
    from tensorflow.keras import Model
    
    np.set_printoptions(threshold=np.inf)
    
    cifar10 = tf.keras.datasets.cifar10
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0
    
    
    class Baseline(Model):
        def __init__(self):
            super(Baseline, self).__init__()
            self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')  # convolution layer
            self.b1 = BatchNormalization()  # BN layer
            self.a1 = Activation('relu')  # activation layer
            self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')  # pooling layer
            self.d1 = Dropout(0.2)  # dropout layer
    
            self.flatten = Flatten()
            self.f1 = Dense(128, activation='relu')
            self.d2 = Dropout(0.2)
            self.f2 = Dense(10, activation='softmax')
    
        def call(self, x):
            x = self.c1(x)
            x = self.b1(x)
            x = self.a1(x)
            x = self.p1(x)
            x = self.d1(x)
    
            x = self.flatten(x)
            x = self.f1(x)
            x = self.d2(x)
            y = self.f2(x)
            return y
    
    
    model = Baseline()
    
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=['sparse_categorical_accuracy'])
    
    checkpoint_save_path = "./checkpoint/Baseline.ckpt"
    if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)
    
    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                     save_weights_only=True,
                                                     save_best_only=True)
    
    history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                        callbacks=[cp_callback])
    model.summary()
    
    # print(model.trainable_variables)
    file = open('./weights.txt', 'w')
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')
    file.close()
    
    ###############################################    show   ###############################################
    
    # Plot the acc and loss curves for the training and validation sets
    acc = history.history['sparse_categorical_accuracy']
    val_acc = history.history['val_sparse_categorical_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    
    plt.subplot(1, 2, 1)
    plt.plot(acc, label='Training Accuracy')
    plt.plot(val_acc, label='Validation Accuracy')
    plt.title('Training and Validation Accuracy')
    plt.legend()
    
    plt.subplot(1, 2, 2)
    plt.plot(loss, label='Training Loss')
    plt.plot(val_loss, label='Validation Loss')
    plt.title('Training and Validation Loss')
    plt.legend()
    plt.show()
    
    
    # Run results
    # Epoch 4/5
    # 1563/1563 [==============================] - 4s 3ms/step - loss: 1.1299 - sparse_categorical_accuracy: 0.5955 - val_loss: 1.4539 - val_sparse_categorical_accuracy: 0.5108
    # Epoch 5/5
    # 1563/1563 [==============================] - 4s 3ms/step - loss: 1.1136 - sparse_categorical_accuracy: 0.6032 - val_loss: 1.1590 - val_sparse_categorical_accuracy: 0.5951
    # Model: "baseline"
    # _________________________________________________________________
    # Layer (type)                 Output Shape              Param #   
    # =================================================================
    # conv2d (Conv2D)              multiple                  456       
    # _________________________________________________________________
    # batch_normalization (BatchNo multiple                  24        
    # _________________________________________________________________
    # activation (Activation)      multiple                  0         
    # _________________________________________________________________
    # max_pooling2d (MaxPooling2D) multiple                  0         
    # _________________________________________________________________
    # dropout (Dropout)            multiple                  0         
    # _________________________________________________________________
    # flatten (Flatten)            multiple                  0         
    # _________________________________________________________________
    # dense (Dense)                multiple                  196736    
    # _________________________________________________________________
    # dropout_1 (Dropout)          multiple                  0         
    # _________________________________________________________________
    # dense_1 (Dense)              multiple                  1290      
    # =================================================================
    # Total params: 198,506
    # Trainable params: 198,494
    # Non-trainable params: 12
    # _________________________________________________________________
    # 
    # Process finished with exit code 0
    

5.11 Classic Convolutional Networks -- LeNet

LeNet was proposed by Yann LeCun in 1998 and is the pioneering work of convolutional networks.

Yann LeCun, Leon Bottou, Y. Bengio, Patrick Haffner. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 1998

By sharing convolution kernels, it reduces the number of network parameters.

[Figure 5-15]

"None" in the structure table indicates operations that did not yet exist when LeNet was proposed.

  • LeNet network structure

    class LeNet5(Model):
        def __init__(self):
            super(LeNet5, self).__init__()
            self.c1 = Conv2D(filters=6, kernel_size=(5, 5), activation='sigmoid')
            self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)
            self.c2 = Conv2D(filters=16, kernel_size=(5, 5), activation='sigmoid')
            self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)
            
            self.flatten = Flatten()
            self.f1 = Dense(120, activation='sigmoid')
            self.f2 = Dense(84, activation='sigmoid')
            self.f3 = Dense(10, activation='softmax')
    
        def call(self, x):
            x = self.c1(x)
            x = self.p1(x)
    
            x = self.c2(x)
            x = self.p2(x)
    
            x = self.flatten(x)
            x = self.f1(x)
            x = self.f2(x)
            y = self.f3(x)
            return y
    

5.12 Classic Convolutional Networks -- AlexNet

AlexNet was introduced in 2012 and won that year's ImageNet competition with a Top-5 error rate of 16.4%.

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS, 2012

[Figure 5-16]

  • AlexNet class code

    class AlexNet8(Model):
        def __init__(self):
            super(AlexNet8, self).__init__()
            self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
            self.b1 = BatchNormalization()
            self.a1 = Activation('relu')
            self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)
            
            self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
            self.b2 = BatchNormalization()
            self.a2 = Activation('relu')
            self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)
            
            self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
                             
            self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
                             
            self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')
            self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)
    
            self.flatten = Flatten()
            self.f1 = Dense(2048, activation='relu')
            self.d1 = Dropout(0.5)
            self.f2 = Dense(2048, activation='relu')
            self.d2 = Dropout(0.5)
            self.f3 = Dense(10, activation='softmax')
    
        def call(self, x):
            x = self.c1(x)
            x = self.b1(x)
            x = self.a1(x)
            x = self.p1(x)
    
            x = self.c2(x)
            x = self.b2(x)
            x = self.a2(x)
            x = self.p2(x)
    
            x = self.c3(x)
    
            x = self.c4(x)
    
            x = self.c5(x)
            x = self.p3(x)
    
            x = self.flatten(x)
            x = self.f1(x)
            x = self.d1(x)
            x = self.f2(x)
            x = self.d2(x)
            y = self.f3(x)
            return y
    

5.13 Classic Convolutional Networks -- VGGNet

VGGNet was introduced in 2014 as the runner-up of that year's ImageNet competition, reducing the Top-5 error rate to 7.3%.

K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. In ICLR, 2015.

[Figure 5-17]

  • VGGNet class code

    class VGG16(Model):
        def __init__(self):
            super(VGG16, self).__init__()
            self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')  # conv layer 1
            self.b1 = BatchNormalization()  # BN layer 1
            self.a1 = Activation('relu')  # activation layer 1
            
            self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
            self.b2 = BatchNormalization()  # BN layer
            self.a2 = Activation('relu')  # activation layer
            self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
            self.d1 = Dropout(0.2)  # dropout layer
    
            self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
            self.b3 = BatchNormalization()  # BN layer
            self.a3 = Activation('relu')  # activation layer
            
            self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
            self.b4 = BatchNormalization()  # BN layer
            self.a4 = Activation('relu')  # activation layer
            self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
            self.d2 = Dropout(0.2)  # dropout layer
    
            self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
            self.b5 = BatchNormalization()  # BN layer
            self.a5 = Activation('relu')  # activation layer
            
            self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
            self.b6 = BatchNormalization()  # BN layer
            self.a6 = Activation('relu')  # activation layer
            
            self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
            self.b7 = BatchNormalization()
            self.a7 = Activation('relu')
            self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
            self.d3 = Dropout(0.2)
    
            self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
            self.b8 = BatchNormalization()  # BN layer
            self.a8 = Activation('relu')  # activation layer
            
            self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
            self.b9 = BatchNormalization()  # BN layer
            self.a9 = Activation('relu')  # activation layer
            
            self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
            self.b10 = BatchNormalization()
            self.a10 = Activation('relu')
            self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
            self.d4 = Dropout(0.2)
    
            self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
            self.b11 = BatchNormalization()  # BN layer
            self.a11 = Activation('relu')  # activation layer
            
            self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
            self.b12 = BatchNormalization()  # BN layer
            self.a12 = Activation('relu')  # activation layer
            
            self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
            self.b13 = BatchNormalization()
            self.a13 = Activation('relu')
            self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
            self.d5 = Dropout(0.2)
    
            self.flatten = Flatten()  # not counted in the layer count
            self.f1 = Dense(512, activation='relu')
            self.d6 = Dropout(0.2)
            self.f2 = Dense(512, activation='relu')
            self.d7 = Dropout(0.2)
            self.f3 = Dense(10, activation='softmax')
    
        def call(self, x):
            x = self.c1(x)
            x = self.b1(x)
            x = self.a1(x)
            x = self.c2(x)
            x = self.b2(x)
            x = self.a2(x)
            x = self.p1(x)
            x = self.d1(x)
    
            x = self.c3(x)
            x = self.b3(x)
            x = self.a3(x)
            x = self.c4(x)
            x = self.b4(x)
            x = self.a4(x)
            x = self.p2(x)
            x = self.d2(x)
    
            x = self.c5(x)
            x = self.b5(x)
            x = self.a5(x)
            x = self.c6(x)
            x = self.b6(x)
            x = self.a6(x)
            x = self.c7(x)
            x = self.b7(x)
            x = self.a7(x)
            x = self.p3(x)
            x = self.d3(x)
    
            x = self.c8(x)
            x = self.b8(x)
            x = self.a8(x)
            x = self.c9(x)
            x = self.b9(x)
            x = self.a9(x)
            x = self.c10(x)
            x = self.b10(x)
            x = self.a10(x)
            x = self.p4(x)
            x = self.d4(x)
    
            x = self.c11(x)
            x = self.b11(x)
            x = self.a11(x)
            x = self.c12(x)
            x = self.b12(x)
            x = self.a12(x)
            x = self.c13(x)
            x = self.b13(x)
            x = self.a13(x)
            x = self.p5(x)
            x = self.d5(x)
    
            x = self.flatten(x)
            x = self.f1(x)
            x = self.d6(x)
            x = self.f2(x)
            x = self.d7(x)
            y = self.f3(x)
            return y
    

5.14 Classic Convolutional Networks -- InceptionNet

InceptionNet was introduced in 2014 and won that year's ImageNet competition with a Top-5 error rate of 6.67%.

Szegedy C, Liu W, Jia Y, et al. Going Deeper with Convolutions. In CVPR, 2015.

[Figure 5-18]

It uses convolution kernels of different sizes within the same layer to improve the model's ability to perceive features at different scales, and uses batch normalization to alleviate vanishing gradients.

  • InceptionNet class code

    class ConvBNRelu(Model):
        def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
            super(ConvBNRelu, self).__init__()
            self.model = tf.keras.models.Sequential([
                Conv2D(ch, kernelsz, strides=strides, padding=padding),
                BatchNormalization(),
                Activation('relu')
            ])
    
        def call(self, x):
            # With training=False, BN normalizes using the moving mean and variance accumulated during
            # training; with training=True it uses the mean and variance of the current batch.
            # training=False usually works better at inference time.
            x = self.model(x, training=False)
            return x
        
        
    class InceptionBlk(Model):
        def __init__(self, ch, strides=1):
            super(InceptionBlk, self).__init__()
            self.ch = ch
            self.strides = strides
            self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
            self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
            self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
            self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
            self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
            self.p4_1 = MaxPool2D(3, strides=1, padding='same')
            self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)
    
        def call(self, x):
            x1 = self.c1(x)
            x2_1 = self.c2_1(x)
            x2_2 = self.c2_2(x2_1)
            x3_1 = self.c3_1(x)
            x3_2 = self.c3_2(x3_1)
            x4_1 = self.p4_1(x)
            x4_2 = self.c4_2(x4_1)
            # concat along axis=channel
            x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
            return x
    
    class Inception10(Model):
        def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
            super(Inception10, self).__init__(**kwargs)
            self.in_channels = init_ch
            self.out_channels = init_ch
            self.num_blocks = num_blocks
            self.init_ch = init_ch
            self.c1 = ConvBNRelu(init_ch)
            self.blocks = tf.keras.models.Sequential()
            for block_id in range(num_blocks):
                for layer_id in range(2):
                    if layer_id == 0:
                        block = InceptionBlk(self.out_channels, strides=2)
                    else:
                        block = InceptionBlk(self.out_channels, strides=1)
                    self.blocks.add(block)
                # enlarge out_channels per block
                self.out_channels *= 2
            self.p1 = GlobalAveragePooling2D()
            self.f1 = Dense(num_classes, activation='softmax')
    
        def call(self, x):
            x = self.c1(x)
            x = self.blocks(x)
            x = self.p1(x)
            y = self.f1(x)
            return y
        
    model = Inception10(num_blocks=2, num_classes=10)
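
    A quick shape check of the model defined above (a minimal sketch; it assumes tensorflow is imported as tf and the Keras layer imports from the complete example in 5.10 are in scope, and uses a random CIFAR10-sized batch):

    dummy = tf.random.normal([2, 32, 32, 3])
    print(model(dummy).shape)  # (2, 10): one class-probability vector per image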
    

5.15 Classic Convolutional Networks -- ResNet

ResNet was introduced in 2015 and won that year's ImageNet competition with a Top-5 error rate of 3.57%.

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. In CVPR, 2016

  • Increasing network depth improves recognition accuracy

    Model            Layers
    LeNet            5
    AlexNet          8
    VGGNet           16/19
    InceptionNet V1  22

    [Figure 5-19]

ResNet introduced residual skip connections between layers, which feed forward information from earlier layers, alleviate vanishing gradients, and make it feasible to increase the number of network layers.

  • Residual skip connection

    [Figure 5-20]

    The "+" in an Inception block stacks feature maps along the depth (channel) axis, like adding layers to a layered cake;
    the "+" in a ResNet block adds the corresponding elements of two feature maps (element-wise matrix addition). A small sketch of the difference follows.
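
    A minimal sketch of the difference (the tensor shapes below are arbitrary illustration values):

    import tensorflow as tf

    a = tf.ones([1, 4, 4, 16])
    b = tf.ones([1, 4, 4, 16])

    concat = tf.concat([a, b], axis=3)  # Inception-style "+": stack along the channel axis
    add = a + b                         # ResNet-style "+": element-wise addition

    print(concat.shape)  # (1, 4, 4, 32)  channel depth doubles
    print(add.shape)     # (1, 4, 4, 16)  shape unchanged, values added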

  • Connection methods

    [Figure 5-21]

    Two variants of the block (used because the dimensions before and after the block may differ):

    [Figure 5-22]

    class ResnetBlock(Model):
    
        def __init__(self, filters, strides=1, residual_path=False):
            super(ResnetBlock, self).__init__()
            self.filters = filters
            self.strides = strides
            self.residual_path = residual_path
    
            self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
            self.b1 = BatchNormalization()
            self.a1 = Activation('relu')
    
            self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
            self.b2 = BatchNormalization()
    
            # When residual_path is True, down-sample the input with a 1x1 convolution so that x has the same dimensions as F(x) and the two can be added
            if residual_path:
                self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
                self.down_b1 = BatchNormalization()
            
            self.a2 = Activation('relu')
    
        def call(self, inputs):
            residual = inputs  # residual is the input itself, i.e. residual = x
            # Pass the input through the conv, BN, and activation layers to compute F(x)
            x = self.c1(inputs)
            x = self.b1(x)
            x = self.a1(x)
    
            x = self.c2(x)
            y = self.b2(x)
    
            if self.residual_path:
                residual = self.down_c1(inputs)
                residual = self.down_b1(residual)
    
            out = self.a2(y + residual)  # the final output is the sum of the two parts, i.e. F(x)+x or F(x)+Wx, passed through the activation function
            return out
    
  • ResNet network structure

    [Figure 5-23]

    class ResNet18(Model):
    
        def __init__(self, block_list, initial_filters=64):  # block_list specifies how many residual blocks each block group contains
            super(ResNet18, self).__init__()
            self.num_blocks = len(block_list)  # number of block groups
            self.block_list = block_list
            self.out_filters = initial_filters
            self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
            self.b1 = BatchNormalization()
            self.a1 = Activation('relu')
            self.blocks = tf.keras.models.Sequential()
            # Build the ResNet network structure
            for block_id in range(len(block_list)):  # which resnet block group
                for layer_id in range(block_list[block_id]):  # which residual block within the group
    
                    if block_id != 0 and layer_id == 0:  # down-sample the input of every block group except the first
                        block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                    else:
                        block = ResnetBlock(self.out_filters, residual_path=False)
                    self.blocks.add(block)  # add the constructed block to the network
                self.out_filters *= 2  # the next block group uses twice as many kernels as the previous one
            self.p1 = tf.keras.layers.GlobalAveragePooling2D()
            self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())
    
        def call(self, inputs):
            x = self.c1(inputs)
            x = self.b1(x)
            x = self.a1(x)
            x = self.blocks(x)
            x = self.p1(x)
            y = self.f1(x)
            return y
    model = ResNet18([2, 2, 2, 2])
    

5.16 Summary of Classic Convolutional Networks

  • LeNet

    The pioneering work of convolutional networks; shares convolution kernels to reduce the number of network parameters.

  • AlexNet

    Uses the relu activation function to speed up training;
    uses Dropout to alleviate overfitting.

  • VGGNet

    Uses small convolution kernels to reduce parameters; the network structure is regular and well suited to parallel acceleration.

  • InceptionNet

    Uses convolution kernels of different sizes within one layer to improve the model's perception; uses BN (batch normalization) to alleviate vanishing gradients.

  • ResNet

    Uses residual skip connections between layers to feed forward earlier information, alleviating model degradation and making deeper networks possible.
