Tudui PyTorch Tutorial Notes
Environment Setup
- Install Anaconda (tick the "add to PATH" option during installation). In the base environment of cmd, create a virtual environment, then install PyTorch inside it: on the official website, pick your configuration to generate the install command (confirm that your GPU supports the required CUDA version and that the GPU driver is recent enough; check with the nvidia-smi command). Copy that command and run it inside the newly created virtual environment. Then run python to open the interactive interpreter of that environment: if import torch raises no error, the installation succeeded; if torch.cuda.is_available() returns True, CUDA acceleration is usable, otherwise the GPU driver version is probably wrong.
- PyCharm: set the project interpreter, choose an existing environment, conda environment, and pick the virtual environment you created. If it is not listed automatically, locate python.exe under the corresponding environment in the envs directory of the Anaconda installation. The Python Console at the bottom is handy for quick tests and for inspecting variable values.
- Jupyter: runs code cell by cell. Install it into base or a custom virtual environment from the Anaconda GUI. Start it by typing jupyter notebook in the corresponding environment in cmd, or launch it directly from the Anaconda GUI. Shift + Enter runs the current cell.
Two Essential Helper Functions
- dir(): open it up and see what is inside
- help(): the instruction manual
If you picture pytorch as a box with compartments 1, 2, 3 and 4, each holding different tools, and compartment 3 holds tools a, b and c, then:
dir(pytorch): lists compartments 1, 2, 3 and 4
dir(pytorch.3): lists tools a, b and c
help(pytorch.3.a): prints the manual for tool a
Test: in the Python console, import torch and then call dir(torch); it lists many compartments, cuda among them. dir(torch.cuda) likewise lists many entries, including is_available, and help(torch.cuda.is_available) prints the documentation of the is_available() method, as in the transcript below.
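A transcript of that check, runnable in any Python console where torch is installed:
import torch
print(dir(torch)) # the "compartments" inside the torch package, cuda among them
print(dir(torch.cuda)) # the entries inside torch.cuda, is_available among them
help(torch.cuda.is_available) # prints the documentation of is_available()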
- Relative paths: if A.py sits in the train folder of a Project and B.jpg sits in the image folder of the same Project, the relative path to B used inside A is ../image/B.jpg
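A minimal sketch of opening B.jpg from A.py with that relative path (the layout is the hypothetical one above, and it assumes A.py is run with the train folder as the working directory, since relative paths resolve against the current working directory rather than the script's location):
from PIL import Image
img = Image.open("../image/B.jpg") # ".." climbs from train/ up to Project/, then down into image/
img.show()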
TensorBoard Visualization
- SummaryWriter class (torch.utils.tensorboard.SummaryWriter):
from torch.utils.tensorboard import SummaryWriter
from PIL import Image
import numpy as np
writer = SummaryWriter("logs") # directory where the event files are written
img_path = "data/train/ants_image/0013035.jpg"
img_PIL = Image.open(img_path)
img = np.array(img_PIL)
print(type(img))
writer.add_image(tag="test", img_tensor=img, global_step=1, dataformats="HWC") # the image was converted from PIL to numpy, so dataformats must describe the channel layout as required
for i in range(100):
    writer.add_scalar(tag="y=2x", scalar_value=2 * i, global_step=i)
writer.flush()
writer.close()
# After the run, open the log files from the Terminal (absolute path used here): tensorboard --logdir="F:\project\dl\Tudui-Pytorch\class2-Tensorboard\logs", then click the port link it prints
Two ways to read an image:
img = PIL.Image.open(img_path) # img is a PIL image
cv_img = cv2.imread(img_path) # cv_img is a numpy ndarray
Either way, the image usually needs to be converted to a tensor for later use.
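For illustration, ToTensor handles both kinds of input; a minimal sketch using a dummy ndarray in place of a real cv2 image:
import numpy as np
from torchvision import transforms
cv_img = np.zeros((32, 32, 3), dtype=np.uint8) # stand-in for cv2.imread output: (H, W, C), uint8
tensor_img = transforms.ToTensor()(cv_img) # ndarray -> tensor of shape (C, H, W), values scaled to [0, 1]
print(tensor_img.shape) # torch.Size([3, 32, 32])
# note: cv2 stores channels in BGR order, and ToTensor does not reorder them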
transforms: Image Transformations
- torchvision.transforms.ToTensor()
- torchvision.transforms.Compose([...])
Usage: build a custom pipeline of one or more transforms and apply the pipeline to images. For the commonly used transforms, consult the official docs and pay particular attention to the expected inputs, outputs, and parameters.
from torchvision import transforms
from PIL import Image
import cv2
# absolute path
img = Image.open("F:\\project\\dl\\Tudui-Pytorch\\class1-duqushuju\\dataset\\train\\ants_image\\0013035.jpg")
print(type(img)) # PIL image type
cv_img = cv2.imread("F:\\project\\dl\\Tudui-Pytorch\\class1-duqushuju\\dataset\\train\\ants_image\\0013035.jpg")
print(type(cv_img)) # numpy.ndarray type
# a single transform
pipeline = transforms.ToTensor() # converts a PIL image or numpy.ndarray to a tensor
img_tensor = pipeline(img)
# composing multiple transforms
pipeline = transforms.Compose([
    transforms.CenterCrop(10),
    transforms.ToTensor(),
    transforms.Resize((256, 256))
])
img_tensor = pipeline(img) # apply the composed pipeline to the PIL image (ToTensor needs a PIL image or ndarray, so don't feed it an already-converted tensor)
Note: in absolute Windows paths, write double backslashes by hand; a single backslash only starts an escape sequence.
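Two ways to avoid the doubled backslashes (the path below is only a placeholder):
from PIL import Image
img = Image.open(r"F:\project\dl\demo.jpg") # raw string: backslashes are not treated as escapes
img = Image.open("F:/project/dl/demo.jpg") # forward slashes also work on Windows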
Note: the __call__ method of a class: once the class is instantiated, calling the instance like a function, object(args), invokes __call__, as in the minimal illustration below.
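A minimal illustration with a made-up class (this is exactly how a transforms object such as ToTensor() ends up being used as pipeline(img)):
class Greeter:
    def __call__(self, name):
        return "hello, " + name
g = Greeter()
print(g("pytorch")) # calling the instance like a function invokes __call__ -> hello, pytorch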
Datasets in torchvision
import torchvision
from torch.utils.tensorboard import SummaryWriter
pipeline = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor()
])
train_set = torchvision.datasets.CIFAR10(root="./dataset", train=True, transform=pipeline, download=True)
test_set = torchvision.datasets.CIFAR10(root="./dataset", train=False, transform=pipeline, download=True)
print(train_set.classes)
img, label = train_set[0]
print(img)
print(label) # an integer encoding one of the classes
print(train_set.classes[label]) # the class name that the integer stands for
writer = SummaryWriter("logs")
for i in range(10):
    img, label = train_set[i]
    writer.add_image(tag="test_img", img_tensor=img, global_step=i)
writer.flush()
writer.close()
Loading Data
- Dataset class (torch.utils.data.Dataset): defines a custom way to fetch the data and their labels
- how to fetch each item (one sample)
- how many samples there are in total
from torch.utils.data import Dataset
from PIL import Image
import os
import cv2
# custom dataset class inheriting from Dataset
class MyData(Dataset):
    # constructor, called at instantiation; usually sets up the member variables
    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir
        self.label_dir = label_dir
        self.path = os.path.join(self.root_dir, self.label_dir) # join the path pieces
        self.img_path_list = os.listdir(self.path) # list of all image file names under path
    # override __getitem__: define how one sample is fetched by index
    def __getitem__(self, idx):
        img_name = self.img_path_list[idx]
        img_path = os.path.join(self.root_dir, self.label_dir, img_name)
        img = Image.open(img_path) # img is a PIL image
        # cv_img = cv2.imread(img_path)  # cv_img would be a numpy ndarray
        label = self.label_dir # the directory name serves as the label
        return img, label
    def __len__(self):
        return len(self.img_path_list)
root_dir = "dataset/train"
ants_label_dir = "ants"
bees_label_dir = "bees"
# instantiate the dataset objects
ants_datasets = MyData(root_dir, ants_label_dir)
bees_datasets = MyData(root_dir, bees_label_dir)
# fetch individual samples by index
img, label = ants_datasets[0]
img1, label1 = bees_datasets[2]
img.show()
img1.show()
print(type(img))
train_dataset = ants_datasets + bees_datasets # concatenate the two datasets, preserving their original order
- DataLoader class: packs the fetched data into batches in a configurable way so they can be fed to the model
import torchvision
from torch.utils.data import DataLoader
train_set = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(),
                                         download=True)
# what to unpack a train_set item into depends on the dataset's __getitem__; here you can jump into CIFAR10 and check the return value of its __getitem__ method
img, target = train_set[0]
print(img.shape)
print(target)
# shuffle: whether the data are reshuffled every epoch, i.e. whether each epoch draws the same batches or not
# drop_last: when the total number of samples is not divisible by batch_size, whether to drop the last incomplete batch
train_loader = DataLoader(dataset=train_set, batch_size=32, shuffle=True, drop_last=True)
# iterate over all of train_set batch by batch
# equivalent: for imgs, targets in train_loader:
for data in train_loader:
    # data holds one batch; imgs stacks batch_size images together, and likewise targets
    imgs, targets = data
    print(imgs.shape) # torch.Size([32, 3, 32, 32]): first dim is the batch size, second is the number of channels
    print(targets.shape) # torch.Size([32])
Building Neural Networks: torch.nn
import torch
from torch import nn
# custom model class inheriting from nn.Module
class MyNN(nn.Module):
    def __init__(self):
        super(MyNN, self).__init__()
    def forward(self, x):
        return x + 1
model = MyNN()
x = torch.tensor(1.0)
result = model(x) # nn.Module's __call__ invokes forward, so calling the instance like a function runs forward automatically
print(result) # tensor(2.)
- torch.nn: used by instantiating layer classes as objects
- torch.nn.functional: used by calling functions directly (see the comparison below)
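For example, ReLU exists in both forms; a minimal comparison:
import torch
from torch import nn
import torch.nn.functional as F
x = torch.tensor([[-1.0, 2.0]])
relu_layer = nn.ReLU() # torch.nn: instantiate a layer object first, then call it
print(relu_layer(x)) # tensor([[0., 2.]])
print(F.relu(x)) # torch.nn.functional: call the function directly, same result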
# torch.reshape usage
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)
print(input.shape) # torch.Size([5, 5])
input = torch.reshape(input, (-1, 1, 5, 5))
print(input.shape) # torch.Size([1, 1, 5, 5])
torch.reshape(X, (-1, 3, 30, 30)) # turns X of torch.Size([64, 6, 30, 30]) into torch.Size([xxx, 3, 30, 30]); the dimension given as -1 is computed automatically
Example:
from torch import nn
import torch
from torch.utils.tensorboard import SummaryWriter
class MyNN(nn.Module):
    def __init__(self):
        super(MyNN, self).__init__()
        # self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5, stride=1, padding=2)
        # self.maxpool1 = nn.MaxPool2d(kernel_size=2)
        # self.conv2 = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=5, stride=1, padding=2)
        # self.maxpool2 = nn.MaxPool2d(kernel_size=2)
        # self.conv3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, stride=1, padding=2)
        # self.maxpool3 = nn.MaxPool2d(kernel_size=2)
        # self.flatten = nn.Flatten()
        # self.linear1 = nn.Linear(in_features=1024, out_features=64)
        # self.linear2 = nn.Linear(in_features=64, out_features=10)
        # equivalent, using nn.Sequential:
        self.seq = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        # x = self.conv1(x)
        # x = self.maxpool1(x)
        # x = self.conv2(x)
        # x = self.maxpool2(x)
        # x = self.conv3(x)
        # x = self.maxpool3(x)
        # x = self.flatten(x)
        # x = self.linear1(x)
        # x = self.linear2(x)
        x = self.seq(x)
        return x
my_nn = MyNN()
print(my_nn)
# sanity-check the network
X = torch.ones((64, 3, 32, 32)) # fake input: batch size 64, 3 channels, 32*32 images
Y = my_nn(X)
print(Y.shape) # torch.Size([64, 10])
# visualize the model graph in TensorBoard
writer = SummaryWriter("logs")
writer.add_graph(my_nn, X)
writer.close()
Losses (torch.nn), Gradients, and Optimizers (torch.optim)
Loss function:
- measures the gap between the ground-truth labels and the predicted outputs
- provides the basis for updating the parameters (backpropagation)
- loss.backward(): backpropagation computes the gradients
optimizer: updates the parameters (see the toy sketch below, then the full CIFAR10 example)
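A toy sketch of the loss -> backward -> step cycle before the full example (all numbers are made up):
import torch
from torch import nn
w = torch.tensor([2.0], requires_grad=True) # a single learnable parameter
x = torch.tensor([3.0])
target = torch.tensor([10.0])
optimizer = torch.optim.SGD([w], lr=0.1)
criteria = nn.MSELoss()
loss = criteria(w * x, target) # gap between the prediction (6) and the target (10)
optimizer.zero_grad()
loss.backward() # backpropagation fills w.grad
print(w.grad) # tensor([-24.])
optimizer.step() # the optimizer updates w using the gradient
print(w) # tensor([4.4000], requires_grad=True)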
import torch
import torchvision.datasets
from torch import nn
from torch.utils.data import DataLoader
test_set = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
test_loader = DataLoader(dataset=test_set, batch_size=32, drop_last=True)
class MyNN(nn.Module):
    def __init__(self):
        super(MyNN, self).__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.seq(x)
        return x
model = MyNN()
criteria = nn.CrossEntropyLoss() # loss function
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.01) # optimizer
for epoch in range(20):
    test_loss = 0
    for data in test_loader:
        imgs, targets = data
        outputs = model(imgs)
        optimizer.zero_grad() # zero the gradients
        loss = criteria(outputs, targets) # compute the loss
        test_loss += loss.item() # accumulate the loss over the current epoch
        loss.backward() # backpropagate to compute the gradients
        optimizer.step() # update the parameters
    print(test_loss)
Using Existing Models: torchvision.models
import torchvision
from torch import nn
vgg16 = torchvision.models.vgg16(pretrained=False) # the network structure with randomly initialized parameters
pretrained_vgg16 = torchvision.models.vgg16(pretrained=True) # additionally loads the pretrained parameters
print(pretrained_vgg16) # since it was pretrained on ImageNet, the last layer outputs 1000 values, i.e. a prediction over 1000 classes
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
# CIFAR10 has only 10 classes, so an extra layer can be appended after pretrained_vgg16
pretrained_vgg16.add_module(name="linear", module=nn.Linear(1000, 10))
# it can also be appended at the end of the model's classifier part
pretrained_vgg16.classifier.add_module(name="linear", module=nn.Linear(1000, 10))
print(pretrained_vgg16)
# a layer of vgg16 can also be replaced directly: change Linear(4096, 1000) in the classifier part to Linear(4096, 10)
vgg16.classifier[6] = nn.Linear(4096, 10)
print(vgg16)
Saving and Loading Models
Saving:
import torch
import torchvision
vgg16 = torchvision.models.vgg16(pretrained=False)
# saving method 1 (saves both the model structure and the parameters)
torch.save(vgg16, "vgg16-model.pth")
# saving method 2 (saves only the parameters, as a state dict)
torch.save(vgg16.state_dict(), "vgg16-model2.pth")
Loading:
import torch
import torchvision
# load a model (matches saving method 1). Note: when loading a custom model saved this way, the file defining the model class (or its source) must be imported into the current file
model = torch.load("vgg16-model.pth")
print(model)
# load a model (matches saving method 2)
model_param_dic = torch.load("vgg16-model2.pth")
vgg16 = torchvision.models.vgg16(pretrained=False)
vgg16.load_state_dict(model_param_dic)
print(vgg16)
The Complete Training Recipe
- GPU training: move the model, the loss function, and the data onto the CUDA device, as sketched below
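A minimal sketch of the three .to() moves on toy tensors (layer sizes are made up; in the real script below the same moves are applied to the MyNN model, the CrossEntropyLoss, and every batch drawn from the DataLoader):
import torch
from torch import nn
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2).to(DEVICE) # 1. the model
criteria = nn.CrossEntropyLoss().to(DEVICE) # 2. the loss function
imgs = torch.randn(8, 4).to(DEVICE) # 3. the data (normally moved inside the batch loop)
labels = torch.randint(0, 2, (8,)).to(DEVICE)
print(criteria(model(imgs), labels).item())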
MyModel.py:
import torch
from torch import nn
class MyNN(nn.Module):
    def __init__(self):
        super(MyNN, self).__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, stride=1, padding=2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.seq(x)
        return x
# this part does not run when the file is imported elsewhere
if __name__ == '__main__':
    # quick sanity check of the network
    model = MyNN()
    input = torch.ones((64, 3, 32, 32))
    output = model(input)
    print(output.shape) # torch.Size([64, 10])
train.py:
import torch
import torchvision.datasets
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from MyModel import *
# prepare the datasets
train_set = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_set = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# check the dataset sizes
print(len(train_set))
print(len(test_set))
# batch the datasets
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, drop_last=True)
test_loader = DataLoader(test_set, batch_size=64, shuffle=True, drop_last=True)
# model
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MyNN().to(DEVICE)
# loss, optimizer, and a few hyperparameters
criteria = nn.CrossEntropyLoss().to(DEVICE)
LR = 0.01
EPOCHS = 20
optimizer = torch.optim.SGD(model.parameters(), lr=LR)
writer = SummaryWriter("train-logs") # TensorBoard visualization
batch_num = 0 # running count of batches across all epochs
# training
for epoch in range(EPOCHS):
    print("----- starting epoch {} -----".format(epoch + 1))
    current_epoch_batch_num = 0 # batch counter within the current epoch
    model.train() # training mode
    for data in train_loader:
        imgs, labels = data
        imgs, labels = imgs.to(DEVICE), labels.to(DEVICE)
        output = model(imgs)
        optimizer.zero_grad()
        loss = criteria(output, labels)
        loss.backward()
        optimizer.step()
        current_epoch_batch_num += 1 # batches seen in the current epoch
        batch_num += 1 # total batches seen
        # within the current epoch, only print the loss every 100 batches
        if current_epoch_batch_num % 100 == 0:
            print("epoch {}, batch {}, loss: {}".format(epoch + 1, current_epoch_batch_num, loss.item()))
            writer.add_scalar("train-loss", scalar_value=loss.item(), global_step=batch_num) # TensorBoard visualization
    # evaluation with gradients disabled (strictly speaking this is validation, used to pick hyperparameters)
    test_loss = 0
    test_acc = 0
    model.eval() # evaluation mode
    with torch.no_grad():
        for data in test_loader:
            imgs, labels = data
            imgs, labels = imgs.to(DEVICE), labels.to(DEVICE)
            output = model(imgs)
            loss = criteria(output, labels)
            test_loss += loss.item()
            pred = output.argmax(1) # index of the maximum along dim 1 for each sample (the index is the label id!)
            pred_right_sum = (pred == labels).sum() # compare each prediction with its label; summing the True/False tensor counts the correct ones
            test_acc += pred_right_sum # add this batch's correct-prediction count to the running total
    print("epoch {} test-set loss: {}".format(epoch + 1, test_loss))
    print("epoch {} accuracy on the whole test set: {}".format(epoch + 1, test_acc / len(test_set)))
    writer.add_scalar("test-loss", scalar_value=test_loss, global_step=epoch + 1) # TensorBoard visualization
    writer.add_scalar("test-acc", scalar_value=test_acc / len(test_set), global_step=epoch + 1)
    # save the model after every epoch
    torch.save(model, "NO.{}_model.pth".format(epoch + 1)) # saves the structure and the parameters
    print("model for the current epoch saved")
    # torch.save(model.state_dict(), "NO.{}_model_param_dic.pth".format(epoch + 1))  # saves only the state dict
writer.flush()
writer.close()
Model Testing (Inference) Recipe
import torch
import torchvision.transforms
from PIL import Image
# a picture of a dog downloaded from the web
img = Image.open("../images/dog.png")
img = img.convert("RGB") # png files may carry an alpha channel; keep only the three RGB channels
pipeline = torchvision.transforms.Compose([
    torchvision.transforms.Resize((32, 32)),
    torchvision.transforms.ToTensor()
])
img = pipeline(img)
print(img.shape)
img = torch.reshape(img, (1, 3, 32, 32)) # reshape to add the batch dimension the model expects
print(img.shape)
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load("NO.20_model.pth") # load the model
model.eval()
with torch.no_grad():
    img = img.cuda() # equivalently img = img.to(DEVICE); the model was trained and saved on CUDA, so the input must live on the same device
    output = model(img)
print(output.argmax(1)) # tensor([5], device='cuda:0'), i.e. class 5, which is "dog": correct