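A Simple Visualization of Convolutional Neural Networks

This post walks through a simple visualization of the weights in a convolutional neural network.

In the first half of the tutorial, we define an extremely simple CNN and assign hand-picked filter weights as its convolution kernels.

In the second half, we use the FashionMNIST dataset: we define a CNN with two convolutional layers, train it to above 85% accuracy, and then visualize its convolution kernels.

1. Visualizing a simple convolutional network

1.1 Visualizing a convolutional layer with hand-specified filters

In the exercise below, we hand-define a few filters resembling the Sobel operator and assign them to an extremely simple convolutional neural network. We then visualize the outputs (the feature maps) of the convolutional layer's 4 filters.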

Load the target image

import cv2
import matplotlib.pyplot as plt
%matplotlib inline

img_path = 'images/udacity_sdc.png'
bgr_img = cv2.imread(img_path)

gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)
gray_img = gray_img.astype("float32")/255

plt.imshow(gray_img, cmap='gray')
plt.show()

Define the filters by hand

import numpy as np

filter_vals = np.array([[-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1]])

# vary the base filter (negate, transpose) to get a richer set of filters
filter_1 = filter_vals
filter_2 = -filter_1
filter_3 = filter_1.T
filter_4 = -filter_3
filters = np.array([filter_1, filter_2, filter_3, filter_4])

fig = plt.figure(figsize=(10, 5))
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    width, height = filters[i].shape
    for x in range(width):
        for y in range(height):
            ax.annotate(str(filters[i][x][y]), xy=(y,x),
                       horizontalalignment='center',
                       verticalalignment='center', 
                       color='white' if filters[i][x][y] < 0 else 'black')
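
To see why these filters behave like edge detectors, it can help to apply one by hand. The sketch below is an illustration and not part of the original exercise: the helper correlate2d_valid and the toy image are names introduced here, and the loop computes a plain sliding-window sum of products (cross-correlation, which is also what nn.Conv2d computes).

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without padding and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# same values as filter_1 above: negative on the left, positive on the right
filter_1 = np.array([[-1, -1, 1, 1]] * 4, dtype=np.float32)

# toy image (an assumption for this sketch): dark left half, bright right half
toy = np.zeros((6, 8), dtype=np.float32)
toy[:, 4:] = 1.0

response = correlate2d_valid(toy, filter_1)
print(response[0])  # [0. 4. 8. 4. 0.]
```

The response peaks where the window is centered on the dark-to-bright edge and is zero over flat regions. With the negated filter (filter_2) the same edge gives negative values, which the ReLU in the model later sets to zero.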

Define a simple convolutional neural network

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, weight):
        super(Net, self).__init__()
        k_height, k_width = weight.shape[2:]
        self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)
        self.conv.weight = torch.nn.Parameter(weight)
        self.pool = nn.MaxPool2d(4,4)
        
    def forward(self, x):
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        pooled_x = self.pool(activated_x)
        
        return conv_x, activated_x, pooled_x
    
# filters has shape (4, 4, 4)
# weight is expanded to shape (4, 1, 4, 4); the size-1 dimension is the single input channel
weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
model = Net(weight)

print('Filters shape: ', filters.shape)
print('weights shape: ', weight.shape)
print(model)

Filters shape:  (4, 4, 4)
weights shape:  torch.Size([4, 1, 4, 4])
Net(
  (conv): Conv2d(1, 4, kernel_size=(4, 4), stride=(1, 1), bias=False)
  (pool): MaxPool2d(kernel_size=4, stride=4, padding=0, dilation=1, ceil_mode=False)
)
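
The shape change above follows the layout nn.Conv2d expects for its weight: (out_channels, in_channels, kernel_height, kernel_width). A minimal NumPy sketch of the same reshaping, using a stand-in array instead of the real filters (the name stand_in_filters is made up for this illustration):

```python
import numpy as np

# stand-in for the 4 hand-made 4x4 filters; the values do not matter here
stand_in_filters = np.zeros((4, 4, 4), dtype=np.float32)

# insert a size-1 axis at position 1, mirroring tensor.unsqueeze(1)
weight_like = np.expand_dims(stand_in_filters, axis=1)

print(stand_in_filters.shape)  # (4, 4, 4)
print(weight_like.shape)       # (4, 1, 4, 4)
```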

Visualize the convolution output

Define a function viz_layer that visualizes the output of a given layer.

def viz_layer(layer, n_filters=4):
    fig = plt.figure(figsize=(20, 20))
    
    for i in range(n_filters):
        ax = fig.add_subplot(1, n_filters, i+1, xticks=[], yticks=[])
        ax.imshow(np.squeeze(layer[0,i].data.numpy()), cmap='gray')
        ax.set_title('Output %s' % str(i+1))

# show the original image
plt.imshow(gray_img, cmap='gray')
# display the filters (convolution kernels)
fig = plt.figure(figsize=(12, 6))
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    
# add a batch dimension and a channel dimension to gray_img, and convert it to a tensor
gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)
print(gray_img.shape)
print(gray_img_tensor.shape)

# pass the input image through the model to get the outputs
conv_layer, activated_layer, pooled_layer = model(gray_img_tensor)

# visualize the convolution output
viz_layer(conv_layer)

(213, 320)
torch.Size([1, 1, 213, 320])
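
From the input shape printed above, the sizes of the feature maps can be worked out by hand. The helper below (out_size is a name chosen for this sketch) uses the usual output-size formula for convolution and pooling without dilation, floor((size + 2*padding - kernel) / stride) + 1, and applies it to the 4x4 convolution and the MaxPool2d(4, 4) defined earlier.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one axis for a conv/pool layer without dilation."""
    return (size + 2 * padding - kernel) // stride + 1

height, width = 213, 320                       # shape of gray_img printed above

conv_h = out_size(height, kernel=4)            # 4x4 kernel, stride 1, no padding
conv_w = out_size(width, kernel=4)
pool_h = out_size(conv_h, kernel=4, stride=4)  # MaxPool2d(4, 4)
pool_w = out_size(conv_w, kernel=4, stride=4)

print((conv_h, conv_w))  # (210, 317)
print((pool_h, pool_w))  # (52, 79)
```

Each of the 4 feature maps is therefore 210 x 317 after the convolution and 52 x 79 after pooling.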

# visualize the output after the activation function
viz_layer(activated_layer)

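1.2 Visualizing a pooling layer with hand-specified filters

Next we visualize the output of the pooling layer.

# visualize the output after the pooling layer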
viz_layer(pooled_layer)
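
What the activation and pooling steps do to a feature map is easier to see on a tiny example. The NumPy sketch below is an illustration under stated assumptions: relu and max_pool2d are hand-written stand-ins for F.relu and nn.MaxPool2d, the 4x4 feature map is made up, and the pooling window is 2x2 instead of the 4x4 used in the model so that the numbers stay readable.

```python
import numpy as np

def relu(x):
    """Set negative responses to zero."""
    return np.maximum(x, 0)

def max_pool2d(x, k):
    """Non-overlapping k x k max pooling; trailing rows/columns that do not fit are dropped."""
    h2, w2 = x.shape[0] // k, x.shape[1] // k
    x = x[:h2 * k, :w2 * k]
    return x.reshape(h2, k, w2, k).max(axis=(1, 3))

# made-up feature map with both positive and negative responses
fmap = np.array([[ 1, -2,  3,  0],
                 [-1,  5, -3,  2],
                 [ 0, -4, -1, -6],
                 [ 7, -8, -2, -5]], dtype=np.float32)

activated = relu(fmap)
pooled = max_pool2d(activated, 2)
print(pooled)
# [[5. 3.]
#  [7. 0.]]
```

Pooling keeps only the strongest response in each window, which is why the pooled outputs look like coarser versions of the activated feature maps.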

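2. Visualizing a multi-layer convolutional network

In the exercise below, we define a somewhat more complex neural network, train it on the FashionMNIST dataset to above 85% accuracy, and then visualize and analyze the network.

2.1 Loading the FashionMNIST dataset

FashionMNIST can be seen as an upgrade of the MNIST dataset. The digit patterns in MNIST are by now fairly simple, arguably too simple as a target dataset for deep neural networks. FashionMNIST replaces the image content with fashion items while keeping the image format unchanged, so it is used almost exactly like MNIST but is a better test of how well a model learns patterns in the data.

The FashionMNIST classes:

0: T-shirt/top
1: Trouser
2: Pullover
3: Dress
4: Coat
5: Sandal
6: Shirt
7: Sneaker
8: Bag
9: Ankle boot

Load the FashionMNIST dataset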

import torch
import torchvision

from torchvision.datasets import FashionMNIST
from torch.utils.data import DataLoader
from torchvision import transforms

data_transform = transforms.ToTensor()

train_data = FashionMNIST(root='./data', train=True,
                         download=False, transform=data_transform)
test_data = FashionMNIST(root='./data', train=False,
                         download=False, transform=data_transform)

# Print out some stats about the training and test data
print('Train data, number of images: ', len(train_data))
print('Test data, number of images: ', len(test_data))

Train data, number of images:  60000
Test data, number of images:  10000

Create the data loaders

batch_size = 20

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)

# specify the image classes
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
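
With these settings the number of batches per epoch follows directly from the dataset sizes printed above. The short sketch below only does that arithmetic (the variable names are chosen for this illustration); it also counts how often the training loop further down, which logs when batch_i % 1000 == 999, prints per epoch.

```python
import math

train_size, test_size = 60000, 10000   # dataset sizes printed above
batch_size = 20

train_batches = math.ceil(train_size / batch_size)
test_batches = math.ceil(test_size / batch_size)

# the training loop logs when batch_i % 1000 == 999
log_lines_per_epoch = sum(1 for batch_i in range(train_batches) if batch_i % 1000 == 999)

print(train_batches, test_batches, log_lines_per_epoch)  # 3000 500 3
```

Because both sizes are exact multiples of 20, every batch is full, which the per-class accuracy loop later relies on when it iterates over range(batch_size).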

Visualize some of the data

import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

dataiter = iter(train_loader)
images, labels = next(dataiter)
images = images.numpy()

# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(batch_size):
    ax = fig.add_subplot(2, batch_size//2, idx+1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    ax.set_title(classes[labels[idx]])

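2.2 Training the multi-layer convolutional model

Define the model

Below we define a model with two convolutional layers. The added dropout helps, to some extent, to prevent overfitting.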

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.activation_l = nn.ReLU()
        
        self.fc = nn.Linear(32 * 7 * 7, 24)
        self.out = nn.Linear(24, 10)
        self.dropout = nn.Dropout(p=0.5)
        self.activation_out = nn.Softmax(dim=1)
        
    def forward(self, x):
        x = self.activation_l(self.conv1(x))
        x = self.pool1(x)
        x = self.activation_l(self.conv2(x))
        x = self.pool2(x)
        
        x = x.view(x.size(0), -1)
        x = self.activation_l(self.fc(x))
        x = self.dropout(x)
        x = self.activation_out(self.out(x))
        
        return x
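
The 32 * 7 * 7 input size of the first fully connected layer can be checked by tracing a 28x28 FashionMNIST image through the layers. The sketch below does this with the standard output-size formula and also counts the trainable parameters; out_size, conv_params and linear_params are helper names introduced for this illustration only.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one axis for a conv/pool layer without dilation."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                   # FashionMNIST images are 28x28
size = out_size(size, kernel=3, padding=1)  # conv1 -> 28
size = out_size(size, kernel=2, stride=2)   # pool1 -> 14
size = out_size(size, kernel=3, padding=1)  # conv2 -> 14
size = out_size(size, kernel=2, stride=2)   # pool2 -> 7
flat_features = 32 * size * size            # 32 channels after conv2

def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k + c_out     # weights + biases

def linear_params(n_in, n_out):
    return n_in * n_out + n_out             # weights + biases

total = (conv_params(1, 16, 3) + conv_params(16, 32, 3)
         + linear_params(flat_features, 24) + linear_params(24, 10))

print(size, flat_features, total)  # 7 1568 42706
```

Most of the parameters sit in the first fully connected layer (1568 * 24 + 24 = 37656); the two convolutional layers together account for only 4800.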

Train the model

import torch.optim as optim

# instantiate the network defined above before building the optimizer
net = Net()

criterion = nn.CrossEntropyLoss()

optimizer = optim.Adam(net.parameters())

def train(n_epochs):
    for epoch in range(n_epochs):
        running_loss = 0.0
        for batch_i, data in enumerate(train_loader):
            inputs, labels = data
            optimizer.zero_grad()
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
            
            if batch_i % 1000 == 999:
                print('Epoch: {}, Batch: {}, Avg. Loss: {}'.format(epoch + 1, batch_i+1, running_loss/1000))
                running_loss = 0.0
                
    print('Finished Training')
    
n_epochs = 10

train(n_epochs)

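import os

model_dir = 'saved_models/'
model_name = 'model_best.pt'

# create the output directory if it does not exist yet
os.makedirs(model_dir, exist_ok=True)
torch.save(net.state_dict(), model_dir+model_name)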
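
A side note on the loss used here: nn.CrossEntropyLoss applies log-softmax to its input itself, so when the network's forward already ends in Softmax, as it does in this model, softmax is effectively applied twice. Training still works, as the results below show, but the loss can no longer approach zero. The pure-Python sketch below illustrates the effect for 10 classes; the helper names are made up for this illustration and it does not call PyTorch.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, target):
    """-log(softmax(logits)[target]): the per-sample value CrossEntropyLoss computes."""
    return -math.log(softmax(logits)[target])

# a very confident, correct prediction for class 0 out of 10 classes
logits = [20.0] + [0.0] * 9

loss_on_logits = cross_entropy_from_logits(logits, target=0)          # close to 0
loss_on_probs = cross_entropy_from_logits(softmax(logits), target=0)  # softmax applied twice

# smallest loss reachable when the inputs are probabilities in [0, 1]
floor = math.log(1 + 9 / math.e)

print(loss_on_logits, loss_on_probs, floor)
```

With probabilities as input, even a perfect prediction is stuck near log(1 + 9/e), roughly 1.46. A common alternative, not used in the original exercise, is to return raw scores from forward and apply softmax only when probabilities are needed.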

Load the trained model

net = Net()

net.load_state_dict(torch.load('saved_models/model_best.pt'))

print(net)

Net(
  (conv1): Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (activation_l): ReLU()
  (fc): Linear(in_features=1568, out_features=24, bias=True)
  (out): Linear(in_features=24, out_features=10, bias=True)
  (dropout): Dropout(p=0.5)
  (activation_out): Softmax()
)

Test the model on the test dataset

test_loss = torch.zeros(1)
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

print(class_correct)
print(test_loss)

[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
tensor([ 0.])

net.eval()

criterion = torch.nn.CrossEntropyLoss()

for batch_i, data in enumerate(test_loader):
    inputs, labels = data
    output = net(inputs)
    loss = criterion(output, labels)
    
    # update average test loss 
    test_loss = test_loss + ( (torch.ones(1) / (batch_i+1)) * (loss.data - test_loss) )
    
    _, predicted = torch.max(output.data, 1)
    
    correct = np.squeeze(predicted.eq(labels.data.view_as(predicted)))
    
    for i in range(batch_size):
        label = labels.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1
        
print('Test Loss: {:.6f}\n'.format(test_loss.numpy()[0]))

for i in range(10):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            classes[i], 100 * class_correct[i] / class_total[i],
            np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))

        
print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))

Test Loss: 2.362950

Test Accuracy of T-shirt/top: 85% (850/1000)
Test Accuracy of Trouser: 96% (963/1000)
Test Accuracy of Pullover: 84% (842/1000)
Test Accuracy of Dress: 91% (911/1000)
Test Accuracy of  Coat: 85% (856/1000)
Test Accuracy of Sandal: 98% (989/1000)
Test Accuracy of Shirt: 49% (495/1000)
Test Accuracy of Sneaker: 94% (948/1000)
Test Accuracy of   Bag: 97% (978/1000)
Test Accuracy of Ankle boot: 93% (930/1000)

Test Accuracy (Overall): 87% (8762/10000)
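
The test_loss update in the loop above is an incremental mean: after batch number n, the running value moves toward the new batch loss by 1/n of the difference. The sketch below checks on a few made-up loss values that this gives the same result as the ordinary average; running_mean is a helper name introduced here.

```python
def running_mean(values):
    """Incremental mean: mean_n = mean_(n-1) + (v_n - mean_(n-1)) / n."""
    mean = 0.0
    for i, v in enumerate(values):
        mean = mean + (v - mean) / (i + 1)
    return mean

losses = [2.0, 1.0, 4.0, 3.0]   # made-up batch losses
print(running_mean(losses))      # 2.5
print(sum(losses) / len(losses)) # 2.5
```

The incremental form avoids storing every batch loss, which is convenient inside an evaluation loop.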

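2.3 Feature visualization

The model is now trained and reaches 87% accuracy on the test data, so let's move on to visualization.

The strategy is to extract the weights of each convolutional layer from the model, treat each kernel as a standalone filter, and apply it with OpenCV's filter2D function to an image sampled from the test set. We then observe what each filter does to the image and try to interpret what kind of filtering it performs on the original.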
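
One detail worth keeping in mind with this strategy: cv2.filter2D computes a correlation and returns an output of the same size as the input, and by default it extrapolates the border by reflection (BORDER_REFLECT_101), whereas conv1 was defined with zero padding. The NumPy sketch below is not OpenCV's implementation; correlate_same is a hypothetical helper that imitates both border treatments on random stand-in data, to show that they agree everywhere except on the outermost pixels.

```python
import numpy as np

def correlate_same(image, kernel, pad_mode):
    """Same-size correlation; pad_mode is passed to np.pad ('constant' pads with zeros)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode=pad_mode)
    out = np.zeros_like(image, dtype=np.float32)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out

rng = np.random.default_rng(0)
img = rng.random((28, 28)).astype(np.float32)            # stand-in for a test image
kernel = rng.standard_normal((3, 3)).astype(np.float32)  # stand-in for one 3x3 kernel

zero_padded = correlate_same(img, kernel, "constant")  # like Conv2d(padding=1)
reflected = correlate_same(img, kernel, "reflect")     # like a reflected border

interior_match = np.allclose(zero_padded[1:-1, 1:-1], reflected[1:-1, 1:-1])
border_differs = not np.allclose(zero_padded, reflected)
print(interior_match, border_differs)  # True True
```

So the filtered images shown below should match what the layer computes before its bias and activation, except in a one-pixel frame around the edge.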

Pick a single image from the dataset

dataiter = iter(test_loader)
images, labels = next(dataiter)
images = images.numpy()

idx = 15
img = np.squeeze(images[idx])

import cv2
plt.imshow(img, cmap='gray')

Visualize the first-layer convolution kernels

weights = net.conv1.weight.data
w = weights.numpy()
print(w.shape)

fig = plt.figure(figsize=(30, 10))
columns = 4 * 2
row = 4
for i in range(0, columns * row):
    fig.add_subplot(row, columns, i+1)
    if ((i%2)==0):
        plt.imshow(w[int(i/2)][0], cmap='gray')
    else:
        c = cv2.filter2D(img, -1, w[int((i-1)/2)][0])
        plt.imshow(c, cmap='gray')
plt.show()

(16, 1, 3, 3)

Visualize the second-layer convolution kernels

weights = net.conv2.weight.data
w = weights.numpy()
print(w.shape)

fig = plt.figure(figsize=(30, 20))
columns = 4 * 2
row = 8
for i in range(0, columns * row):
    fig.add_subplot(row, columns, i+1)
    if ((i%2)==0):
        plt.imshow(w[int(i/2)][0], cmap='gray')
    else:
        c = cv2.filter2D(img, -1, w[int((i-1)/2)][0])
        plt.imshow(c, cmap='gray')
plt.show()

(32, 16, 3, 3)

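We can see that some kernels act as edge detectors, and that different kernels are sensitive to different orientations, different textures, or, more generally, different image content.

Interpreting convolutions by eye like this is subjective and feels somewhat incomplete; it is perhaps best regarded as a basic way of visualizing a neural network. Besides the convolution kernels, the fully connected layers can also be visualized.

As for the fully connected layers, some tutorials visualize the distances between the "embedding vectors" of individual samples from similar classes; the embeddings produced by the fully connected layer may first need to be reduced in dimension with t-SNE before plotting. If I come across related material later, I will add it to this post.

Postscript

This post is based on an exercise from the Udacity Computer Vision Nanodegree. Link to the official source code: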

https://github.com/udacity/CVND_Exercises/tree/master/1_5_CNN_Layers

Posted 2019-08-16 23:04 by Alex777
