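A simple fully connected neural network in PyTorch for recognizing MNIST digits

Environment:

python 3.11.10

pytorch 2.3.0

This article uses the PyTorch framework to build, train, and evaluate a simple fully connected neural network, with the goal of understanding the basic structure of a neural network and gaining first-hand experience by actually running one. The task is digit recognition on the classic MNIST handwritten digit dataset, a standard first experiment when learning deep learning.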

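1. PyTorch overview

PyTorch is an open-source machine learning library widely used in areas such as computer vision and natural language processing. It was developed by Facebook's AI research team and is popular with both researchers and industry for its flexibility and speed.

2. Building the network

The network structure is fairly basic: two hidden layers with 128 neurons each, followed by an output layer for the 10-class classification task. Each hidden layer is followed by a ReLU activation function, which adds non-linearity so that the network can learn complex patterns in the data.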

import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128) 
        self.fc2 = nn.Linear(128, 128)  
        self.output = nn.Linear(128, 10)  

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.output(x)
        return x

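This code creates a simple fully connected neural network for a classification task with 784 input features (for example, a flattened 28x28 image), mapped through two hidden layers to 10 output nodes (one per class). ReLU after each hidden layer introduces non-linearity and increases the model's expressive power. The code is explained in detail below:

(1) Import the required libraries:

import torch
import torch.nn as nn

These two lines import PyTorch (torch) and its neural network module (torch.nn). torch.nn contains the building blocks needed to construct a neural network, such as the various layer types and activation functions.

(2) Define the neural network class:

class SimpleNN(nn.Module):

This defines a new class named SimpleNN that inherits from nn.Module. In PyTorch, neural network models should inherit from nn.Module so that they can use the functionality the base class provides, such as parameter management and modular composition.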

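(3) The initializer:

def __init__(self):
    super(SimpleNN, self).__init__()

The initializer first calls super() to initialize the nn.Module base class. This is the standard pattern and ensures that the base functionality of the network is set up correctly.

(4) Define the layers:

self.fc1 = nn.Linear(784, 128)
self.fc2 = nn.Linear(128, 128)
self.output = nn.Linear(128, 10)

This part defines three fully connected layers (also called linear layers):

self.fc1 is the first fully connected layer, with 784 input features (a flattened 28x28-pixel MNIST image) and 128 output features.
self.fc2 is the second fully connected layer; it takes the 128 features output by the previous layer and outputs 128 features.
self.output is the output layer; it takes the 128 features from the previous layer and maps them to 10 outputs, one per class in the classification task.

Parameter count of a Linear layer:

A Linear layer with in_features inputs and out_features outputs has in_features * out_features weights plus out_features biases, or just in_features * out_features if bias=False. The first layer therefore has 784*128 + 128 = 100480 parameters, which matches what summary(model) from torchinfo prints: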

=================================================================
Layer (type:depth-idx)                   Param #
=================================================================
SimpleNN                                 --
├─Linear: 1-1                            100,480
├─Linear: 1-2                            16,512
├─Linear: 1-3                            1,290
=================================================================
Total params: 118,282
Trainable params: 118,282
Non-trainable params: 0
=================================================================
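
The counts in the table can also be reproduced without torchinfo. The following is a minimal sketch that rebuilds the three layers standalone (the names layers and count_params are only for illustration) and sums the number of elements in each layer's parameters:

```python
import torch.nn as nn

# The three layers of SimpleNN, built standalone just for counting
layers = {
    "fc1": nn.Linear(784, 128),
    "fc2": nn.Linear(128, 128),
    "output": nn.Linear(128, 10),
}

def count_params(layer):
    """Number of trainable parameters in a layer (weights plus biases)."""
    return sum(p.numel() for p in layer.parameters() if p.requires_grad)

counts = {name: count_params(layer) for name, layer in layers.items()}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total: {total:,}")
```

Each count is in_features * out_features + out_features, so the total agrees with the "Total params" line above.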

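(5) Define the forward pass:

def forward(self, x):
    x = torch.relu(self.fc1(x))
    x = torch.relu(self.fc2(x))
    x = self.output(x)
    return x

The forward method defines how data flows through the network, that is, the forward pass.

First, the input x is passed through the first fully connected layer fc1 and then through the ReLU activation function for a non-linear transformation.

The result then goes through the second fully connected layer fc2 and another ReLU activation.

Finally, the output layer output produces the final result: one raw score (logit) per class.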
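
A quick way to check the forward pass is to push a batch of random inputs through the network and look at the output shape. This is a minimal sketch; the batch of 4 random vectors (dummy_batch) is made up for illustration and stands in for flattened images:

```python
import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 128)
        self.output = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.output(x)

model = SimpleNN()
dummy_batch = torch.randn(4, 784)  # 4 fake flattened 28x28 images
logits = model(dummy_batch)        # calling the model runs forward()
print(logits.shape)                # one row per image, one column per class
```

The output has one row per input and 10 columns, one score for each digit class.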

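3. Loading and preprocessing the data

To load and preprocess the data, we use the torchvision library to load the MNIST dataset and apply transforms to normalize it, which makes training the network more efficient.

MNIST contains 70,000 images of handwritten digits: 60,000 for training and 10,000 for testing. The images are grayscale, 28x28 pixels, and centered, which reduces the preprocessing needed and speeds things up.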

from torchvision import transforms, datasets
from torch.utils.data import DataLoader


# Data loading and preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),  # convert the image to a Tensor
    transforms.Normalize((0.5,), (0.5,)),  # normalize the data
])

# Load the training set
train_set = datasets.MNIST(root='../data/mnist', train=True, download=True, transform=transform)
# Wrap the dataset in a DataLoader. batch_size is the batch size; shuffle=True randomly reorders the samples
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

# Load the test set
test_set = datasets.MNIST(root='../data/mnist', train=False, transform=transform)
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

# Inspect some of the images
# Take one batch to look at the data format
# The data has shape [batch_size, channel, height, width],
# where batch_size is whatever you set, and channel, height and width are the image's number of channels, height and width.
(example_data, example_targets) = next(iter(test_loader))
print('example_targets:', example_targets.shape)
print('example_data:', example_data.shape)

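This code shows how to use the tools in torchvision and torch.utils.data to load and preprocess the MNIST dataset for machine learning or deep learning tasks. It makes it easy to fetch data in batches during training and testing while applying the necessary preprocessing. The code is explained in detail below:

(1) Import the required libraries:

from torchvision import transforms, datasets
from torch.utils.data import DataLoader

torchvision is a library for working with image data; it provides common datasets and image transforms.
transforms is used for data preprocessing and augmentation.
datasets is used to load datasets.
DataLoader is an important PyTorch class that wraps a dataset in an iterable, which makes batching and shuffling convenient.

(2) Define the preprocessing:

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

This defines the transforms to apply to the data; Compose chains them into a sequence.
ToTensor() converts a PIL image or NumPy ndarray to a FloatTensor and scales the pixel intensities to the range [0., 1.].
Normalize((0.5,), (0.5,)) standardizes a tensor image channel by channel using the given mean and standard deviation, computing (x - mean) / std. With both set to 0.5, inputs in [0, 1] are mapped to [-1, 1]. A grayscale image such as MNIST has a single channel, so only one mean and one standard deviation are given.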
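
The arithmetic behind Normalize((0.5,), (0.5,)) can be checked by hand. The sketch below applies (x - mean) / std directly to a few sample pixel values with plain tensor operations rather than calling torchvision; the sample values are made up for illustration:

```python
import torch

mean, std = 0.5, 0.5

# Pixel intensities in [0, 1], as ToTensor() would produce them
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

# The per-channel formula that Normalize applies
normalized = (pixels - mean) / std
print(normalized)
```

Black (0.0) maps to -1, mid-gray (0.5) to 0, and white (1.0) to 1, so the data ends up centered around zero.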

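(3) Load the training set:

train_set = datasets.MNIST(root='../data/mnist', train=True, download=True, transform=transform)

datasets.MNIST loads the MNIST training set. root specifies where the data is stored, train=True selects the training set (train=False selects the test set), download=True downloads the data automatically if it is not already at that path, and transform=transform applies the preprocessing defined above.

(4) Create the training data loader:

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

DataLoader wraps train_set. batch_size=64 sets the size of each batch, and shuffle=True reshuffles the data at the start of every epoch, which helps the model generalize.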
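
To see batching and shuffling without downloading MNIST, the same DataLoader can be tried on a tiny in-memory dataset. This is a minimal sketch; the 10-sample toy_set is made up for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A toy dataset of 10 samples; sample i has feature [i] and target i
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)  # shape [10, 1]
targets = torch.arange(10)
toy_set = TensorDataset(features, targets)

toy_loader = DataLoader(toy_set, batch_size=4, shuffle=True)

batch_sizes = []
seen = []
for x, y in toy_loader:  # one pass over the loader is one epoch
    batch_sizes.append(x.shape[0])
    seen.extend(y.tolist())

print(batch_sizes)   # the last batch is smaller because 10 is not a multiple of 4
print(sorted(seen))  # every sample appears exactly once per epoch
```

Shuffling changes the order in which samples arrive, not which samples are seen: each epoch still covers the whole dataset once.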

(5) Load the test set:

test_set = datasets.MNIST(root='../data/mnist', train=False, transform=transform)

This works the same way as for the training set; train=False means the test set is loaded.

(6) Create the test data loader:

test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

This creates a DataLoader for the test set. shuffle=False is the usual choice for test data, because there is no need to shuffle during testing.

(example_data, example_targets) = next(iter(test_loader))
print('example_targets:', example_targets.shape)
print('example_data:', example_data.shape)

The printed shapes are:

example_targets: torch.Size([64])
example_data: torch.Size([64, 1, 28, 28])

This means we have 64 examples of 28x28-pixel grayscale images (a single channel rather than three RGB channels).

Plot a few images from the dataset to see what they look like:

import matplotlib.pyplot as plt

fig = plt.figure()
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.tight_layout()
    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
    plt.title("Ground Truth: {}".format(example_targets[i]))
    plt.xticks([])
    plt.yticks([])
plt.show()

4. Training

With the model and the data loaders defined, we write a training loop to train the model. We use CrossEntropyLoss as the loss function and Adam as the optimizer.

model = SimpleNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):  # train for 5 epochs
    for images, labels in train_loader:
        images = images.view(-1, 28*28)  # flatten each image to a 784-dimensional vector
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

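This code trains the network by repeating, over a set number of epochs, the forward pass, loss computation, backward pass, and parameter update, improving the model's performance on the given data. The code is explained in detail below:

(1) Set up the optimizer:

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

This uses the Adam optimizer, an adaptive learning rate algorithm commonly used to train deep learning models.
model.parameters() returns all the model parameters to be trained.
lr=0.001 sets the learning rate to 0.001. The learning rate is a hyperparameter that controls the step size of parameter updates during optimization.

(2) Define the loss function:

criterion = nn.CrossEntropyLoss()

This uses the cross-entropy loss, one of the most common loss functions for classification. nn.CrossEntropyLoss takes the raw, unnormalized scores (logits) from the model together with class-index labels; internally it applies log-softmax and then takes the negative log-likelihood of the correct class, which is numerically more stable than computing a softmax and taking its logarithm separately.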
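
What nn.CrossEntropyLoss computes can be verified on a small example. The sketch below compares it with an explicit log-softmax followed by the negative log-likelihood loss; the random logits and the labels are made up for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)               # raw scores: 5 samples, 10 classes
labels = torch.tensor([3, 0, 9, 1, 7])    # correct class index per sample

# The built-in loss works directly on raw logits
loss_ce = nn.CrossEntropyLoss()(logits, labels)

# The same quantity in two explicit steps
log_probs = F.log_softmax(logits, dim=1)
loss_manual = F.nll_loss(log_probs, labels)

print(loss_ce.item(), loss_manual.item())
```

Because the log-softmax is built into the loss, a model trained with CrossEntropyLoss should return raw logits and does not need a softmax layer at the end.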

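(3) The training loop:

for epoch in range(5):  # train for 5 epochs
    for images, labels in train_loader:

The outer loop runs for 5 training epochs, where each epoch is one full pass over the training set.

The inner loop iterates over the training data through train_loader, which splits the dataset into batches; each batch contains images and their corresponding labels.

(4) Reshape the input:

images = images.view(-1, 28*28)

This flattens each image in the batch from a 1x28x28 tensor into a 784-dimensional vector, because our model (SimpleNN) is fully connected. The -1 tells PyTorch to infer the size of that dimension automatically, here the batch size, so the whole batch is reshaped correctly. Inputs with a different shape need a correspondingly different reshape.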
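
The effect of view(-1, 28*28) is easy to check on a tensor with the same shape as one MNIST batch. This is a minimal sketch using a zero-filled stand-in batch; flatten(start_dim=1) is shown as an equivalent alternative:

```python
import torch

batch = torch.zeros(64, 1, 28, 28)     # same shape as one batch from the loader

flat = batch.view(-1, 28 * 28)         # -1 is inferred as the batch size, 64
flat_alt = batch.flatten(start_dim=1)  # keeps dim 0 and flattens everything after it

print(flat.shape, flat_alt.shape)
```

Both calls give a [64, 784] tensor, which matches the 784 input features of fc1.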

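(5) Zero the gradients:

optimizer.zero_grad()

The gradients must be reset before each new gradient computation; otherwise the new gradients are added to the existing ones, because PyTorch accumulates gradients by default when .backward() is called.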
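
The accumulation behavior can be seen with a single scalar parameter. In this minimal sketch (the variable names are only for illustration), the gradient of 3 * w with respect to w is always 3, yet a second backward() call without a reset doubles the stored value:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(3 * w).backward()
first = w.grad.item()        # gradient after one backward pass

(3 * w).backward()
accumulated = w.grad.item()  # the second pass was added to the first

w.grad.zero_()               # what optimizer.zero_grad() does for every parameter
(3 * w).backward()
after_reset = w.grad.item()  # back to the gradient of a single pass

print(first, accumulated, after_reset)
```

Without the reset, each optimizer step would use the sum of the gradients from all previous batches instead of only the current one.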

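(6) Forward pass:

output = model(images)

This runs the model's forward pass and outputs a score for each class.

(7) Compute the loss:

loss = criterion(output, labels)

This computes the loss from the model output and the true labels.

(8) Backward pass:

loss.backward()

This runs backpropagation, computing the gradient of the loss with respect to every model parameter.

(9) Update the parameters:

optimizer.step()

This updates the model parameters using the computed gradients so as to minimize the loss.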
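
The loop above does not report progress. The following sketch shows one way to track the average loss per epoch using the same five steps; to stay self-contained it trains a tiny linear classifier on synthetic data (toy_model, X and y are made up for illustration) rather than SimpleNN on MNIST:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic two-class problem: the label depends only on the sign of feature 0
X = torch.randn(256, 20)
y = (X[:, 0] > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

toy_model = nn.Linear(20, 2)
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

epoch_losses = []
for epoch in range(20):
    running, seen = 0.0, 0
    for xb, yb in loader:
        optimizer.zero_grad()               # reset gradients
        loss = criterion(toy_model(xb), yb) # forward pass and loss
        loss.backward()                     # backward pass
        optimizer.step()                    # parameter update
        running += loss.item() * xb.size(0) # weight by batch size
        seen += xb.size(0)
    epoch_losses.append(running / seen)     # average loss over the epoch
    print(f"epoch {epoch + 1}: loss {epoch_losses[-1]:.4f}")
```

A loss that falls from epoch to epoch is a simple sign that training is working; the same bookkeeping can be added to the MNIST loop unchanged.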

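5. Evaluation and analysis

After the model has been trained, we evaluate its performance to see how it does on data it has not seen before. We can also analyze which classes the model handles well and which still leave room for improvement.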

def evaluate_model(model, data_loader):
    model.eval()
    total_correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in data_loader:
            images = images.view(-1, 28*28)
            output = model(images)
            _, predicted = torch.max(output.data, 1)
            total += labels.size(0)
            total_correct += (predicted == labels).sum().item()
    print(f'Accuracy: {100 * total_correct / total:.2f}%')

evaluate_model(model, test_loader)

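This code checks the model's performance after training and helps gauge how well it generalizes to unseen data. It gives a direct view of the model's accuracy as a number. The code is explained in detail below:

(1) Function definition:

def evaluate_model(model, data_loader):

This defines a function named evaluate_model that takes two arguments: model, the trained model, and data_loader, the data loader used to evaluate it.

(2) Evaluation mode:

model.eval()

Calling .eval() puts the model in evaluation mode. This step is needed because some layers (such as Dropout and BatchNorm) behave differently during training and evaluation.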
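
SimpleNN itself has no such layers, but the difference is easy to see on a standalone Dropout layer. This is a minimal sketch with an all-ones input chosen for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()         # training mode: elements are randomly zeroed,
train_out = drop(x)  # and the survivors are scaled by 1 / (1 - p) = 2

drop.eval()          # evaluation mode: Dropout passes the input through unchanged
eval_out = drop(x)

print(train_out.unique(), eval_out.unique())
```

Forgetting model.eval() on a network that contains Dropout would make its predictions random from one call to the next.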

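(3) Initialize the counters:

total_correct = 0
total = 0

total_correct records the number of correctly predicted samples, and total records the total number of samples.

(4) Disable gradient computation:

with torch.no_grad():

The torch.no_grad() context manager disables gradient computation while the model is evaluated. This improves speed and saves memory, because no backward pass is needed during evaluation.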
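
The effect of torch.no_grad() shows up in the requires_grad flag of the output. This is a minimal sketch with a small stand-in layer made up for illustration:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
x = torch.randn(3, 4)

tracked = layer(x)         # autograd records the operations for a later backward()

with torch.no_grad():
    untracked = layer(x)   # same values, but nothing is recorded

print(tracked.requires_grad, untracked.requires_grad)
```

The numbers are the same either way; only the bookkeeping needed for backpropagation is skipped.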

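(5) Load and reshape the data:

for images, labels in data_loader:
    images = images.view(-1, 28*28)

This iterates over the data loader, which returns a batch of images and the corresponding labels each time.

Each batch of images is reshaped with view into a 2D tensor of shape [batch_size, 784] to match the model's input format (the model expects each image as a vector).

(6) Predict:

output = model(images)
_, predicted = torch.max(output.data, 1)

The model runs a forward pass on the input images to produce its predictions.

torch.max(output.data, 1) returns both the maximum value of each row and its index; the values are discarded with _, and the indices are the predicted class labels.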
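
The two return values of torch.max along a dimension can be seen on a small score matrix. The values below are made up for illustration; torch.argmax is shown as a shorter way to get only the indices:

```python
import torch

# Two samples, three classes
scores = torch.tensor([[0.1, 2.0, -1.0],
                       [3.0, 0.5, 0.2]])

values, indices = torch.max(scores, 1)  # maximum of each row and where it occurs
same_indices = torch.argmax(scores, dim=1)

print(values, indices, same_indices)
```

Here the first sample is assigned class 1 and the second class 0.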

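(7) Count the correct predictions and the total:

total += labels.size(0)
total_correct += (predicted == labels).sum().item()

The first line updates the total number of samples; labels.size(0) is the number of samples in the current batch.

The second line updates the number of correct predictions. Comparing predicted with labels gives a boolean tensor, and .sum().item() turns it into the number of correct predictions.

(8) Print the accuracy:

print(f'Accuracy: {100 * total_correct / total:.2f}%')

Finally the accuracy is printed: the number of correct predictions divided by the total number of samples, formatted as a percentage.

(9) Call the function to evaluate the model:

evaluate_model(model, test_loader)

This calls the evaluate_model function defined above with model (the trained model) and test_loader (the test data loader) to evaluate the model's performance.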
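
evaluate_model reports a single overall number. To see which classes do well and which do not, the same comparison can be broken down per class. The following is a minimal sketch: per_class_accuracy is a hypothetical helper, not part of the original code, and the predictions and labels are made up for illustration:

```python
import torch

def per_class_accuracy(predicted, labels, num_classes=10):
    """Fraction of correctly predicted samples for each class."""
    # Count how often each class was predicted correctly
    correct = torch.bincount(labels[predicted == labels], minlength=num_classes)
    # Count how often each class occurs at all
    total = torch.bincount(labels, minlength=num_classes)
    # A class that never occurs gives 0 / 0 = NaN rather than an error
    return correct.float() / total.float()

labels = torch.tensor([0, 0, 1, 1, 2])
predicted = torch.tensor([0, 1, 1, 1, 0])

acc = per_class_accuracy(predicted, labels, num_classes=3)
print(acc)
```

In an evaluation loop, the predicted and labels tensors of all batches would be concatenated first and then passed to the helper once.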

6. Full code and results

The full script:

import os
import time

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
from PIL import Image
from torch.utils.data import DataLoader
from torchinfo import summary
from torchvision import transforms, datasets

# Training start time
train_start = time.time()
# Use CUDA if it is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Data loading and preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),  # convert the image to a Tensor
    transforms.Normalize((0.5,), (0.5,)),  # normalize the data
])
# Load the training set
train_set = datasets.MNIST(root='../data/mnist', train=True, download=True, transform=transform)
# Wrap the dataset in a DataLoader. batch_size is the batch size; shuffle=True randomly reorders the samples
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

# Load the test set
test_set = datasets.MNIST(root='../data/mnist', train=False, transform=transform)
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

# Inspect some of the images
# Take one batch to look at the data format
# The data has shape [batch_size, channel, height, width],
# where batch_size is whatever you set, and channel, height and width are the image's number of channels, height and width.
(example_data, example_targets) = next(iter(test_loader))
print('example_targets:', example_targets.shape)
print('example_data:', example_data.shape)

fig = plt.figure()
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.tight_layout()
    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
    plt.title("Ground Truth: {}".format(example_targets[i]))
    plt.xticks([])
    plt.yticks([])
plt.show()

# Network definition
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        # First fully connected layer: 784 input features, 128 output features.
        # Parameter count is in * out + out (in * out if bias=False), so 784*128 + 128 = 100480
        self.fc1 = nn.Linear(784, 128)
        # Second fully connected layer: 128 input features, 128 output features
        self.fc2 = nn.Linear(128, 128)
        # Output layer: 128 input features, 10 output features (one per digit class)
        self.output = nn.Linear(128, 10)

    def forward(self, x):
        # Apply ReLU to the output of the first layer
        x = torch.relu(self.fc1(x))
        # Apply ReLU to the output of the second layer as well
        x = torch.relu(self.fc2(x))
        # The output layer produces the final class scores
        x = self.output(x)
        return x

# Evaluation function
def evaluate_model(model, data_loader):
    model.eval()  # put the model in evaluation mode
    total_correct = 0
    total = 0
    with torch.no_grad():  # disable gradient computation
        for images, labels in data_loader:
            # Flatten the images and move images and labels to the same device
            images, labels = images.view(-1, 28*28).to(device), labels.to(device)
            output = model(images)  # forward pass to get the predictions
            _, predicted = torch.max(output.data, 1)  # predicted class per sample
            total += labels.size(0)
            total_correct += (predicted == labels).sum().item()  # count correct predictions
    # Print the accuracy
    print(f'Accuracy: {100 * total_correct / total:.2f}%')

# Training
def train():
    # Choose the optimizer
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    # Loss function
    criterion = nn.CrossEntropyLoss()
    for epoch in range(5):  # train for 5 epochs
        for images, labels in train_loader:
            # Flatten the images and move images and labels to the same device
            images, labels = images.view(-1, 28*28).to(device), labels.to(device)
            optimizer.zero_grad()  # clear the old gradients
            output = model(images)  # forward pass
            loss = criterion(output, labels)  # compute the loss
            loss.backward()  # backward pass to compute gradients
            optimizer.step()  # update the model parameters

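path = '../model/mymnist.pth'
if not os.path.exists(path):
    # Instantiate the model and move it to the chosen device (CPU or GPU)
    model = SimpleNN().to(device)
    train()
    # Evaluate the model on the test set
    evaluate_model(model, test_loader)
    train_end = time.time()  # training end time
    print("Training time: {:.2f}s".format(train_end - train_start))
    # Make sure the target directory exists before saving
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save(model, path)
else:
    # The file holds the whole pickled model, so weights_only=False is required on newer
    # PyTorch versions; only load files you trust. map_location handles a model saved on another device.
    model = torch.load(path, map_location=device, weights_only=False)

summary(model)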

# Preprocess an image file for prediction
def preprocess_image(image_path):
    img = Image.open(image_path)
    img = img.convert('L')  # convert to grayscale
    img = img.resize((28, 28))
    # Invert the colors: MNIST digits are white on black, so this assumes a dark digit on a light background
    img = 255.0 - np.array(img).astype(np.float32)
    # Apply the same scaling as in training: ToTensor() gives [0, 1], Normalize((0.5,), (0.5,)) gives [-1, 1]
    img = (img / 255.0 - 0.5) / 0.5
    img = img.reshape(1, 28*28)  # shape becomes [1, 784]
    img = torch.from_numpy(img)
    img = img.to(device)
    return img

# Prediction
def practice():
    # Run the model on a few images
    model.eval()  # make sure the model is in evaluation mode
    plt.figure()
    labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    # Replace these with the paths to your own images
    image_paths = ['../test/2.png', '../test/3.png', '../test/4.png', '../test/0.png', '../test/9.png']
    for i, image_path in enumerate(image_paths):
        # Preprocess the image
        img = preprocess_image(image_path)
        if img is not None:
            with torch.no_grad():  # no gradients are needed for prediction, which speeds things up
                output = model(img)  # feed the image to the model
            probs = torch.softmax(output, dim=1)  # turn the model output into class probabilities
            # Probability of each class
            probability = probs.cpu().numpy()
            # Index of the highest probability
            predicted = torch.argmax(probs, dim=1)
            index = predicted.cpu().numpy()[0]
            # Predicted label
            pred = labels[index]
            print('Prediction:', predicted, pred, probability)
            # Show the image with its prediction
            plt.subplot(1, len(image_paths), i + 1)
            plt.imshow(img.cpu().reshape(28, 28), cmap='gray')  # move to the CPU before plotting
            plt.axis('off')
            plt.title('value:' + str(pred))
    plt.show()

practice()
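
The script above saves the whole model object with torch.save(model, path). A common alternative is to save only the weights through state_dict() and load them into a freshly built model. The following is a minimal sketch of that approach, not the method used above: the nn.Sequential stand-in for SimpleNN and the file name weights.pth are made up for illustration, and a temporary directory is used so nothing is left on disk:

```python
import os
import tempfile

import torch
import torch.nn as nn

def build_net():
    # Stand-in with layer sizes similar to SimpleNN
    return nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

net = build_net()

with tempfile.TemporaryDirectory() as tmp:
    weights_path = os.path.join(tmp, "weights.pth")  # hypothetical file name
    torch.save(net.state_dict(), weights_path)       # save only the parameters

    restored = build_net()                           # same architecture, new random weights
    state = torch.load(weights_path, map_location="cpu", weights_only=True)
    restored.load_state_dict(state)                  # copy the saved weights in

net.eval()
restored.eval()
x = torch.randn(2, 784)
with torch.no_grad():
    same = torch.allclose(net(x), restored(x))
print(same)
```

Saving only the state dict keeps the file independent of how the class is pickled, at the cost of having to construct the model in code before loading.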

A convolutional model can also be used for recognition; the main changes are to the model and to how the data is shaped:

import os
import time

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image
from torch.utils.data import DataLoader
from torchinfo import summary
from torchvision import transforms, datasets

# Training start time
train_start = time.time()
# Use CUDA if it is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Data loading and preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),  # convert the image to a Tensor
    transforms.Normalize((0.5,), (0.5,)),  # normalize the data
])
# Load the training set
train_set = datasets.MNIST(root='../data/mnist', train=True, download=True, transform=transform)
# Wrap the dataset in a DataLoader. batch_size is the batch size; shuffle=True randomly reorders the samples
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

# Load the test set
test_set = datasets.MNIST(root='../data/mnist', train=False, transform=transform)
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

# Inspect some of the images
# Take one batch to look at the data format
# The data has shape [batch_size, channel, height, width],
# where batch_size is whatever you set, and channel, height and width are the image's number of channels, height and width.
(example_data, example_targets) = next(iter(test_loader))
print('example_targets:', example_targets.shape)
print('example_data:', example_data.shape)

fig = plt.figure()
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.tight_layout()
    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
    plt.title("Ground Truth: {}".format(example_targets[i]))
    plt.xticks([])
    plt.yticks([])
plt.show()

# Network definition
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)  # 20 channels * 4 * 4 after the two conv + pool stages
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)  # log-probabilities, not raw logits

# Evaluation function
def evaluate_model(model, data_loader):
    model.eval()  # put the model in evaluation mode
    total_correct = 0
    total = 0
    with torch.no_grad():  # disable gradient computation
        for images, labels in data_loader:
            # Move images and labels to the same device
            images, labels = images.to(device), labels.to(device)
            output = model(images)  # forward pass to get the predictions
            _, predicted = torch.max(output.data, 1)  # predicted class per sample
            total += labels.size(0)
            total_correct += (predicted == labels).sum().item()  # count correct predictions
    # Print the accuracy
    print(f'Accuracy: {100 * total_correct / total:.2f}%')

# Training
def train():
    model.train()  # enable Dropout during training
    # Choose the optimizer
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    # Loss function. The model already outputs log-probabilities, so NLLLoss is used here;
    # log_softmax followed by NLLLoss is equivalent to CrossEntropyLoss on raw logits
    criterion = nn.NLLLoss()
    for epoch in range(5):  # train for 5 epochs
        for images, labels in train_loader:
            # Move images and labels to the same device (the CNN takes [N, 1, 28, 28], so no flattening)
            images, labels = images.to(device), labels.to(device)
            # print(images.shape, images, labels.shape, labels)
            optimizer.zero_grad()  # clear the old gradients
            output = model(images)  # forward pass
            loss = criterion(output, labels)  # compute the loss
            loss.backward()  # backward pass to compute gradients
            optimizer.step()  # update the model parameters

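# A separate file name from the fully connected model, so that a saved SimpleNN is not loaded here by mistake
path = '../model/mymnist_cnn.pth'
if not os.path.exists(path):
    # Instantiate the model and move it to the chosen device (CPU or GPU)
    model = Net().to(device)
    train()
    # Evaluate the model on the test set
    evaluate_model(model, test_loader)
    train_end = time.time()  # training end time
    print("Training time: {:.2f}s".format(train_end - train_start))
    # Make sure the target directory exists before saving
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save(model, path)
else:
    # The file holds the whole pickled model, so weights_only=False is required on newer
    # PyTorch versions; only load files you trust. map_location handles a model saved on another device.
    model = torch.load(path, map_location=device, weights_only=False)

summary(model)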

# Preprocess an image file for prediction
def preprocess_image(image_path):
    img = Image.open(image_path)
    img = img.convert('L')  # convert to grayscale
    img = img.resize((28, 28))
    # Invert the colors: MNIST digits are white on black, so this assumes a dark digit on a light background
    img = 255.0 - np.array(img).astype(np.float32)
    # Apply the same scaling as in training: ToTensor() gives [0, 1], Normalize((0.5,), (0.5,)) gives [-1, 1]
    img = (img / 255.0 - 0.5) / 0.5
    img = np.expand_dims(img, 0)
    img = np.expand_dims(img, 0)  # after expanding, the shape is [1, 1, 28, 28]
    img = torch.from_numpy(img)
    img = img.to(device)
    return img

# Prediction
def practice():
    # Run the model on a few images
    model.eval()  # Dropout must be switched off for prediction
    plt.figure()
    labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    # Replace these with the paths to your own images
    image_paths = ['../test/2.png', '../test/3.png', '../test/4.png', '../test/0.png', '../test/9.png']
    for i, image_path in enumerate(image_paths):
        # Preprocess the image
        img = preprocess_image(image_path)
        if img is not None:
            with torch.no_grad():  # no gradients are needed for prediction, which speeds things up
                output = model(img)  # feed the image to the model
            probs = torch.softmax(output, dim=1)  # turn the model output into class probabilities
            # Probability of each class
            probability = probs.cpu().numpy()
            # Index of the highest probability
            predicted = torch.argmax(probs, dim=1)
            index = predicted.cpu().numpy()[0]
            # Predicted label
            pred = labels[index]
            print('Prediction:', predicted, pred, probability)
            # Show the image with its prediction
            plt.subplot(1, len(image_paths), i + 1)
            plt.imshow(img.cpu().reshape(28, 28), cmap='gray')  # move to the CPU before plotting
            plt.axis('off')
            plt.title('value:' + str(pred))
    plt.show()

practice()
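
The 320 in nn.Linear(320, 50) and x.view(-1, 320) comes from the shape of the feature map after the two convolution and pooling stages. The sketch below traces those shapes on a zero-filled stand-in image, using the same layer settings as Net:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 10, kernel_size=5)
conv2 = nn.Conv2d(10, 20, kernel_size=5)

x = torch.zeros(1, 1, 28, 28)   # one grayscale 28x28 image
shapes = [tuple(x.shape)]

# 5x5 convolution without padding: 28 -> 24, then 2x2 max pooling: 24 -> 12
x = F.max_pool2d(conv1(x), 2)
shapes.append(tuple(x.shape))

# 5x5 convolution: 12 -> 8, then 2x2 max pooling: 8 -> 4
x = F.max_pool2d(conv2(x), 2)
shapes.append(tuple(x.shape))

flat_features = x[0].numel()    # 20 channels * 4 * 4
print(shapes, flat_features)
```

If the kernel size, the number of channels, or the input resolution changes, this number changes too and the first Linear layer has to be adjusted to match.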

The image can also be inverted as follows:

# Preprocess an image file for prediction
def preprocess_image(image_path):
    img = Image.open(image_path)
    img = img.convert('L')  # convert to grayscale
    # Invert the colors
    img = img.point(lambda p: 255 - p)

    # Resize, convert to a tensor, and normalize the same way as the training data
    prac_img = transforms.Compose([
        transforms.Resize((28, 28)),
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,)),
    ])
    img = prac_img(img)
    img = torch.reshape(img, (1, 1, 28, 28))
    img = img.to(device)
    return img

Or:

from PIL import Image, ImageChops
img = ImageChops.invert(img)
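
The two inversion methods give the same result for 8-bit grayscale images. This is a minimal sketch on a 2x2 in-memory image whose pixel values are made up for illustration:

```python
from PIL import Image, ImageChops

# A tiny grayscale image with four known pixel values
img = Image.new("L", (2, 2))
img.putdata([0, 64, 200, 255])

inverted_point = img.point(lambda p: 255 - p)  # per-pixel lookup
inverted_chops = ImageChops.invert(img)        # built-in inversion

point_values = list(inverted_point.tobytes())  # one byte per pixel in mode "L"
chops_values = list(inverted_chops.tobytes())
print(point_values, chops_values)
```

Either way each pixel value p becomes 255 - p, which turns a dark digit on a light background into the light-on-dark style of MNIST.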

Reposted from:

https://zhuanlan.zhihu.com/p/696017829

https://juejin.cn/post/7238938488598626364

posted on 2024-11-11 19:35 xuanm
