Common PyTorch Code Snippets

1. Basic configuration

Importing packages and checking versions

import torch
import torch.nn as nn
import torchvision
print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())
print(torch.cuda.get_device_name(0))

GPU settings

If you only need one GPU:

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

If you need to specify several GPUs, for example GPUs 0 and 1:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'

You can also select the GPUs on the command line when launching the script:

CUDA_VISIBLE_DEVICES=0,1 python train.py

Clearing GPU memory (this releases cached blocks that are no longer in use):

torch.cuda.empty_cache()

You can also reset the GPU from the command line:

nvidia-smi --gpu-reset -i [gpu_id]
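
The device handling above can be sketched as follows. This is a minimal illustration that also runs on a machine without a GPU; the tensor names `x` and `scratch` are made up for the example.

```python
import torch

# Pick the GPU when one is visible to this process, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Tensors (and modules) are placed on the chosen device with .to(device).
x = torch.ones(2, 3).to(device)
print(x.device.type)

if device.type == 'cuda':
    # Number of GPUs visible after CUDA_VISIBLE_DEVICES has been applied.
    print(torch.cuda.device_count())
    scratch = torch.empty(1024, 1024, device=device)
    # Bytes occupied by live tensors vs. bytes held by the caching allocator.
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())
    del scratch
    torch.cuda.empty_cache()  # hand cached, unused blocks back to the driver
```

Note that CUDA_VISIBLE_DEVICES has to be set before CUDA is initialized in the process, which is why setting it on the command line is often the safer choice.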

2. Tensor operations

Basic tensor information

tensor = torch.randn(3,4,5)
print(tensor.type())  # data type
print(tensor.size())  # shape of the tensor, a torch.Size (a tuple subclass)
print(tensor.dim())   # number of dimensions

Named tensors

Naming tensor dimensions is very useful: you can index and operate on dimensions by name, which greatly improves readability and ease of use and helps prevent mistakes.

# Before PyTorch 1.3, dimensions had to be documented in comments
# Tensor[N, C, H, W]
images = torch.randn(32, 3, 56, 56)
images.sum(dim=1)
images.select(dim=1, index=0)

# Since PyTorch 1.3
NCHW = ['N', 'C', 'H', 'W']
images = torch.randn(32, 3, 56, 56, names=NCHW)
images.sum('C')
images.select('C', index=0)
# Names can also be given like this
tensor = torch.rand(3,4,1,2,names=('C', 'N', 'H', 'W'))
# align_to conveniently reorders dimensions by name
tensor = tensor.align_to('N', 'C', 'H', 'W')
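
To see what align_to actually does to the layout, here is a small sketch that prints names and shapes before and after. The variable `aligned` is introduced only for this example, and the warning filter is there because named tensors are still an experimental API.

```python
import warnings
import torch

# Named tensors are experimental and emit a UserWarning on first use.
warnings.filterwarnings('ignore', category=UserWarning)

tensor = torch.rand(3, 4, 1, 2, names=('C', 'N', 'H', 'W'))
aligned = tensor.align_to('N', 'C', 'H', 'W')  # permutes dimensions by name

print(tensor.names, tuple(tensor.shape))
print(aligned.names, tuple(aligned.shape))
```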

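Data type conversion

# Set the default floating-point type; in PyTorch, FloatTensor is much faster than DoubleTensor.
# torch.set_default_tensor_type(torch.FloatTensor) is deprecated in recent releases in favour of:
torch.set_default_dtype(torch.float32)

# Type and device conversion
tensor = tensor.cuda()
tensor = tensor.cpu()
tensor = tensor.float()
tensor = tensor.long()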

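Converting between torch.Tensor and np.ndarray

Except for CharTensor, all tensors on the CPU can be converted to NumPy format and back (this restriction dates from early PyTorch releases; recent versions also convert int8 tensors).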

ndarray = tensor.cpu().numpy()
tensor = torch.from_numpy(ndarray).float()
tensor = torch.from_numpy(ndarray.copy()).float() # If ndarray has negative stride.
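
Two details of this conversion are easy to miss: torch.from_numpy shares memory with the source array, and a view with a negative stride has to be copied first. A minimal sketch, with `shared`, `flipped` and `copied` as illustrative names:

```python
import numpy as np
import torch

ndarray = np.zeros(3, dtype=np.float32)
shared = torch.from_numpy(ndarray)   # no copy: the tensor and the array share memory
ndarray[0] = 1.0                     # the change is visible through `shared`
print(shared)

flipped = ndarray[::-1]              # a view with a negative stride
copied = torch.from_numpy(flipped.copy())  # .copy() yields a contiguous array
print(copied)
```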

Converting between torch.Tensor and PIL.Image

import numpy as np
import PIL.Image

# Image tensors in PyTorch use [N, C, H, W] order by default with values in [0, 1],
# so a single [C, H, W] image must be transposed and rescaled.
# torch.Tensor -> PIL.Image
image = PIL.Image.fromarray(torch.clamp(tensor*255, min=0, max=255).byte().permute(1,2,0).cpu().numpy())
image = torchvision.transforms.functional.to_pil_image(tensor)  # Equivalent way

# PIL.Image -> torch.Tensor
path = r'./figure.jpg'
tensor = torch.from_numpy(np.asarray(PIL.Image.open(path))).permute(2,0,1).float() / 255
tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path))  # Equivalent way

Converting between np.ndarray and PIL.Image

image = PIL.Image.fromarray(ndarray.astype(np.uint8))
 
ndarray = np.asarray(PIL.Image.open(path))

Extracting the value from a single-element tensor

value = torch.rand(1).item()

3. Model definition and operations

A simple two-layer convolutional network example

# convolutional neural network (2 convolutional layers)
class ConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Linear(7*7*32, num_classes)
 
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out
 
 
model = ConvNet(num_classes).to(device)
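
The 7*7*32 input size of the final linear layer implies 28x28 single-channel inputs (two 2x2 poolings reduce 28 to 7). The sketch below restates the network compactly and checks the output shape on a dummy batch; the batch size of 8 and the name `dummy` are assumptions made for the illustration.

```python
import torch
import torch.nn as nn


class ConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Linear(7 * 7 * 32, num_classes)

    def forward(self, x):
        out = self.layer2(self.layer1(x))           # [N, 32, 7, 7] for 28x28 inputs
        return self.fc(out.reshape(out.size(0), -1))


model = ConvNet(num_classes=10)
dummy = torch.randn(8, 1, 28, 28)   # a batch of 8 grayscale 28x28 images
logits = model(dummy)
print(logits.shape)
```

Running a dummy batch through a freshly defined model like this is a cheap way to catch a mismatched flatten size before training starts.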

Bilinear pooling

X = torch.reshape(X, (N, D, H * W))                   # Assume X has shape N*D*H*W
X = torch.bmm(X, torch.transpose(X, 1, 2)) / (H * W)  # Bilinear pooling
assert X.size() == (N, D, D)
X = torch.reshape(X, (N, D * D))
X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)   # Signed-sqrt normalization
X = torch.nn.functional.normalize(X)                  # L2 normalization
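
The same steps can be wrapped in a function and tried on random features. This is a sketch under the assumption that the input is a feature map of shape [N, D, H, W]; the function name `bilinear_pool` is made up for the example.

```python
import torch
import torch.nn.functional as F


def bilinear_pool(features):
    """Bilinear pooling of a [N, D, H, W] feature map into [N, D*D] descriptors."""
    N, D, H, W = features.shape
    X = features.reshape(N, D, H * W)
    X = torch.bmm(X, X.transpose(1, 2)) / (H * W)        # [N, D, D] outer-product average
    X = X.reshape(N, D * D)
    X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)  # signed square root
    return F.normalize(X)                                # L2-normalize each row


pooled = bilinear_pool(torch.randn(2, 4, 3, 3))
print(pooled.shape, pooled.norm(dim=1))
```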

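Multi-GPU synchronized BN (batch normalization)

When code runs on multiple GPUs with torch.nn.DataParallel, PyTorch's BN layers by default compute the mean and standard deviation independently on each GPU's share of the data. Synchronized BN computes the BN statistics over the data on all GPUs together, which mitigates inaccurate estimates of the mean and standard deviation when the batch size is small. It is an effective trick for improving performance in tasks such as object detection. Note that torch.nn.SyncBatchNorm itself is only supported under torch.nn.parallel.DistributedDataParallel with one GPU per process.

sync_bn = torch.nn.SyncBatchNorm(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

Converting all BN layers of an existing network to SyncBN layers

def convertBNtoSyncBN(module, process_group=None):
    '''Recursively replace all BN layers with SyncBN layers.

    Args:
        module (torch.nn.Module): network to convert.
        process_group: process group passed on to torch.nn.SyncBatchNorm.
    '''
    if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
        sync_bn = torch.nn.SyncBatchNorm(module.num_features, module.eps, module.momentum,
                                         module.affine, module.track_running_stats, process_group)
        # Carry over the running statistics (None when track_running_stats=False).
        sync_bn.running_mean = module.running_mean
        sync_bn.running_var = module.running_var
        if module.affine:
            # Copy values in place: assigning a plain tensor to a parameter attribute raises TypeError.
            with torch.no_grad():
                sync_bn.weight.copy_(module.weight)
                sync_bn.bias.copy_(module.bias)
        return sync_bn
    else:
        for name, child_module in module.named_children():
            # Replace each child with its converted version.
            setattr(module, name, convertBNtoSyncBN(child_module, process_group=process_group))
        return module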
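
Recent PyTorch versions also ship a built-in converter, torch.nn.SyncBatchNorm.convert_sync_batchnorm, which does the same recursive replacement. A minimal sketch on a toy network follows (the network itself is an assumption for illustration). Only the conversion is shown, since a forward pass through SyncBatchNorm needs a GPU and an initialized process group.

```python
import torch.nn as nn

# A toy network containing one ordinary BN layer.
net = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.BatchNorm2d(8), nn.ReLU())

# Built-in recursive conversion of every BatchNorm layer to SyncBatchNorm.
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)

num_sync_bn = sum(isinstance(m, nn.SyncBatchNorm) for m in net.modules())
print(net)
```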

Counting the total number of model parameters

num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())
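
It is often useful to separate trainable from frozen parameters when counting. A small sketch on a toy model; the model and the frozen layer are chosen only for illustration.

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 5), nn.Linear(5, 2))
model[0].weight.requires_grad_(False)   # freeze one weight matrix for illustration

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, trainable)
```

Here the first layer contributes 10*5 + 5 parameters and the second 5*2 + 2; freezing the 50-element weight matrix removes it from the trainable count.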

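Inspecting the parameters of a network

You can inspect all current trainable parameters (including those inherited from parent classes) with model.state_dict() or model.named_parameters(). state_dict() additionally contains buffers such as the running statistics of BN layers. The indices 28 to 30 used below are specific to one particular model.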

params = list(model.named_parameters())
(name, param) = params[28]
print(name)
print(param.grad)
print('-------------------------------------------------')
(name2, param2) = params[29]
print(name2)
print(param2.grad)
print('----------------------------------------------------')
(name1, param1) = params[30]
print(name1)
print(param1.grad)
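
Rather than indexing parameters by position, you can iterate over all of them by name. The sketch below uses a toy model (an assumption for the example) and also shows which state_dict() entries are buffers rather than parameters.

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))

# Name, shape and trainability of every parameter.
for name, param in model.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)

param_names = [name for name, _ in model.named_parameters()]
state_names = list(model.state_dict().keys())

# Entries that appear only in state_dict() are buffers, not parameters.
buffer_only = sorted(set(state_names) - set(param_names))
print(buffer_only)
```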

Model visualization (using pytorchviz)

....

4. Model training and testing

Training code for a classification model

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
 
# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)
 
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
 
        # Backward pass and optimization step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
 
        if (i+1) % 100 == 0:
            print('Epoch: [{}/{}], Step: [{}/{}], Loss: {}'
                  .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

Testing code for a classification model

# Test the model
model.eval()  # eval mode(batch norm uses moving mean/variance 
              #instead of mini-batch mean/variance)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
 
    print('Test accuracy of the model on the 10000 test images: {} %'
          .format(100 * correct / total))
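
The training and testing loops depend on a model, data loaders and hyperparameters defined elsewhere. The following self-contained sketch fills those in with synthetic data and a linear classifier so the whole cycle can be run end to end; the dataset, model size, learning rate and epoch count are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Synthetic 2-class data: the label is 1 when the feature sum is positive.
features = torch.randn(512, 8)
labels = (features.sum(dim=1) > 0).long()
train_loader = DataLoader(TensorDataset(features[:384], labels[:384]),
                          batch_size=64, shuffle=True)
test_loader = DataLoader(TensorDataset(features[384:], labels[384:]), batch_size=64)

model = nn.Linear(8, 2).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

epoch_losses = []
for epoch in range(20):
    model.train()
    running = 0.0
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        loss = criterion(model(x), y)   # forward pass
        optimizer.zero_grad()           # clear gradients from the previous step
        loss.backward()                 # backward pass
        optimizer.step()                # parameter update
        running += loss.item()
    epoch_losses.append(running / len(train_loader))

model.eval()
correct = total = 0
with torch.no_grad():
    for x, y in test_loader:
        x, y = x.to(device), y.to(device)
        predicted = model(x).argmax(dim=1)
        total += y.size(0)
        correct += (predicted == y).sum().item()
accuracy = 100 * correct / total
print('first/last epoch loss:', epoch_losses[0], epoch_losses[-1])
print('test accuracy: {} %'.format(accuracy))
```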

Custom loss

Subclass torch.nn.Module to write your own loss.

class MyLoss(torch.nn.Module):
    def __init__(self):
        super(MyLoss, self).__init__()
 
    def forward(self, x, y):
        loss = torch.mean((x - y) ** 2)
        return loss
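
A custom loss defined this way is used exactly like a built-in one. The sketch below restates the class, applies it to hand-picked values and compares the result with torch.nn.functional.mse_loss, which computes the same quantity; the tensors `pred` and `target` are illustrative.

```python
import torch
import torch.nn.functional as F


class MyLoss(torch.nn.Module):
    def forward(self, x, y):
        return torch.mean((x - y) ** 2)


criterion = MyLoss()
pred = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
target = torch.tensor([1.0, 1.0, 1.0])

loss = criterion(pred, target)
loss.backward()   # gradients flow through the custom loss via autograd
print(loss.item(), pred.grad)
```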

Label smoothing

Create a file named label_smoothing.py, import it in the training code, and use LSR in place of the cross-entropy loss. The contents of label_smoothing.py are as follows:

import torch
import torch.nn as nn
 
 
class LSR(nn.Module):
 
    def __init__(self, e=0.1, reduction='mean'):
        super().__init__()
 
        self.log_softmax = nn.LogSoftmax(dim=1)
        self.e = e
        self.reduction = reduction
 
    def _one_hot(self, labels, classes, value=1):
        """
            Convert labels to one hot vectors
 
        Args:
            labels: torch tensor in format [label1, label2, label3, ...]
            classes: int, number of classes
            value: label value in one hot vector, default to 1
 
        Returns:
            return one hot format labels in shape [batchsize, classes]
        """
 
        one_hot = torch.zeros(labels.size(0), classes)
 
        #labels and value_added  size must match
        labels = labels.view(labels.size(0), -1)
        value_added = torch.Tensor(labels.size(0), 1).fill_(value)
 
        value_added = value_added.to(labels.device)
        one_hot = one_hot.to(labels.device)
 
        one_hot.scatter_add_(1, labels, value_added)
 
        return one_hot
 
    def _smooth_label(self, target, length, smooth_factor):
        """convert targets to one-hot format, and smooth
        them.
        Args:
            target: target in form with [label1, label2, label_batchsize]
            length: length of one-hot format(number of classes)
            smooth_factor: smooth factor for label smooth
 
        Returns:
            smoothed labels in one hot format
        """
        one_hot = self._one_hot(target, length, value=1 - smooth_factor)
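        # Spread the smoothing mass uniformly over all classes so that each row sums to 1
        # (the true class ends up with 1 - smooth_factor + smooth_factor / length).
        one_hot += smooth_factor / length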
 
        return one_hot.to(target.device)
 
    def forward(self, x, target):
 
        if x.size(0) != target.size(0):
            raise ValueError('Expected input batchsize ({}) to match target batch_size({})'
                    .format(x.size(0), target.size(0)))
 
        if x.dim() < 2:
            raise ValueError('Expected input tensor to have at least 2 dimensions (got {})'
                    .format(x.dim()))
 
        if x.dim() != 2:
            raise ValueError('Only 2 dimension tensor are implemented, (got {})'
                    .format(x.size()))
 
 
        smoothed_target = self._smooth_label(target, x.size(1), self.e)
        x = self.log_softmax(x)
        loss = torch.sum(- x * smoothed_target, dim=1)
 
        if self.reduction == 'none':
            return loss
 
        elif self.reduction == 'sum':
            return torch.sum(loss)
 
        elif self.reduction == 'mean':
            return torch.mean(loss)
 
        else:
            raise ValueError('unrecognized option, expect reduction to be one of none, mean, sum')
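
As an alternative to a hand-written class, torch.nn.CrossEntropyLoss accepts a label_smoothing argument (assuming PyTorch 1.10 or later). The sketch below compares it with a manual computation that puts 1 - e on the true class and spreads e uniformly over all classes; the logits and targets are random stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 5)             # [batch, classes]
target = torch.tensor([0, 2, 1, 4])
e = 0.1                                # smoothing factor

# Built-in label smoothing.
builtin = nn.CrossEntropyLoss(label_smoothing=e)(logits, target)

# Manual version: (1 - e) * one_hot + e / K, then cross-entropy against it.
num_classes = logits.size(1)
smoothed = F.one_hot(target, num_classes).float() * (1 - e) + e / num_classes
manual = torch.sum(-F.log_softmax(logits, dim=1) * smoothed, dim=1).mean()
print(builtin.item(), manual.item())
```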

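Visualizing model training

PyTorch can use TensorBoard to visualize the training process. Install and run TensorBoard:

pip install tensorboard
tensorboard --logdir=runs

Use the SummaryWriter class to collect and visualize the relevant data. For easier viewing, scalars can be organized into different groups (folders), such as 'Loss/train' and 'Loss/test'.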

from torch.utils.tensorboard import SummaryWriter
import numpy as np
 
writer = SummaryWriter()
 
for n_iter in range(100):
    writer.add_scalar('Loss/train', np.random.random(), n_iter)
    writer.add_scalar('Loss/test', np.random.random(), n_iter)
    writer.add_scalar('Accuracy/train', np.random.random(), n_iter)
    writer.add_scalar('Accuracy/test', np.random.random(), n_iter)

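Saving and loading checkpoints

Note that to be able to resume training, we need to save the state of both the model and the optimizer, as well as the current epoch number.

import os
import shutil

start_epoch = 0
best_acc = 0.0  # so the comparison below also works on a fresh run
# Load checkpoint.
if resume:  # resume is a flag: set it to 0 for the first run and to 1 when resuming an interrupted run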
    model_path = os.path.join('model', 'best_checkpoint.pth.tar')
    assert os.path.isfile(model_path)
    checkpoint = torch.load(model_path)
    best_acc = checkpoint['best_acc']
    start_epoch = checkpoint['epoch']
    model.load_state_dict(checkpoint['model'])
    optimizer.load_state_dict(checkpoint['optimizer'])
    print('Load checkpoint at epoch {}.'.format(start_epoch))
    print('Best accuracy so far {}.'.format(best_acc))
 
# Train the model
for epoch in range(start_epoch, num_epochs): 
    ... 
 
    # Test the model
    ...
 
    # save checkpoint
    is_best = current_acc > best_acc
    best_acc = max(current_acc, best_acc)
    checkpoint = {
        'best_acc': best_acc,
        'epoch': epoch + 1,
        'model': model.state_dict(),
        'optimizer': optimizer.state_dict(),
    }
    model_path = os.path.join('model', 'checkpoint.pth.tar')
    best_model_path = os.path.join('model', 'best_checkpoint.pth.tar')
    torch.save(checkpoint, model_path)
    if is_best:
        shutil.copy(model_path, best_model_path)
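
The save and restore steps can be tried in isolation with a tiny model and a temporary directory. This is a sketch only: the model, the stored accuracy and the epoch number are placeholder values, and map_location='cpu' is added so the file also loads on a machine without a GPU.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One optimization step so the optimizer has state worth saving.
model(torch.randn(3, 4)).sum().backward()
optimizer.step()

checkpoint = {
    'best_acc': 0.9,   # placeholder value
    'epoch': 5,        # placeholder value
    'model': model.state_dict(),
    'optimizer': optimizer.state_dict(),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'checkpoint.pth.tar')
    torch.save(checkpoint, path)
    restored = torch.load(path, map_location='cpu')

# Restore into freshly constructed objects, as a resumed run would.
new_model = nn.Linear(4, 2)
new_optimizer = torch.optim.Adam(new_model.parameters(), lr=1e-3)
new_model.load_state_dict(restored['model'])
new_optimizer.load_state_dict(restored['optimizer'])
start_epoch = restored['epoch']
print('resuming from epoch', start_epoch)
```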

5. Other tips

Do not use overly large linear layers. nn.Linear(m, n) uses O(mn) memory, so a very large linear layer can easily exceed the available GPU memory.
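
A rough estimate of the parameter memory makes the O(mn) cost concrete. The helper `linear_param_bytes` below is hypothetical and counts only float32 weights and biases; gradients and optimizer state would add further multiples of this.

```python
import torch.nn as nn


def linear_param_bytes(m, n, bytes_per_element=4):
    """Bytes needed for the weight (m*n) and bias (n) of nn.Linear(m, n)."""
    return (m * n + n) * bytes_per_element


layer = nn.Linear(256, 128)
measured = sum(p.numel() * p.element_size() for p in layer.parameters())
estimate = linear_param_bytes(256, 128)
print(measured, estimate)

# Estimated size in GiB of a 50000 x 50000 layer, computed without allocating it.
large = linear_param_bytes(50000, 50000) / 1024 ** 3
print(round(large, 2))
```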

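Do not use RNNs on overly long sequences. RNN backpropagation uses the BPTT algorithm, whose memory requirement grows linearly with the length of the input sequence.

Before calling model(x), switch the network state with model.train() or model.eval().

Wrap code blocks that do not need gradients in with torch.no_grad().

The difference between model.eval() and torch.no_grad(): model.eval() switches the network to evaluation mode, in which layers such as BN and dropout compute differently than during training. torch.no_grad() turns off PyTorch's autograd mechanism to reduce memory use and speed up computation; results obtained this way cannot be used with loss.backward().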
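
The two mechanisms are independent, which a short experiment shows. In this sketch (toy model and variable names are illustrative), eval mode makes dropout deterministic while autograd keeps recording, and no_grad stops the recording.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(2, 4)

model.eval()                                     # dropout becomes the identity
out_eval = model(x)
same_in_eval = torch.equal(out_eval, model(x))   # deterministic in eval mode
tracks_grad_in_eval = out_eval.requires_grad     # autograd is still recording

with torch.no_grad():                            # autograd off, layer behaviour unchanged
    out_no_grad = model(x)
tracks_grad_no_grad = out_no_grad.requires_grad

print(same_in_eval, tracks_grad_in_eval, tracks_grad_no_grad)
```

For inference you normally want both: model.eval() for correct layer behaviour and torch.no_grad() for lower memory use.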

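model.zero_grad() zeroes the gradients of all parameters of the model, whereas optimizer.zero_grad() only zeroes the gradients of the parameters that were passed to that optimizer.

The input to torch.nn.CrossEntropyLoss does not need to go through Softmax. torch.nn.CrossEntropyLoss is equivalent to torch.nn.functional.log_softmax followed by torch.nn.NLLLoss.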
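
The equivalence can be checked numerically on raw logits. A minimal sketch with random inputs:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)             # raw, un-normalized scores
labels = torch.tensor([0, 2, 1, 1])

ce = torch.nn.CrossEntropyLoss()(logits, labels)
nll = torch.nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)
print(ce.item(), nll.item())
```

Applying Softmax before CrossEntropyLoss would therefore normalize twice and distort the loss.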

Call optimizer.zero_grad() before loss.backward() to clear accumulated gradients.
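
The accumulation is easy to observe on a single scalar parameter. In this sketch (the parameter `w` and the SGD optimizer are illustrative), two backward passes without clearing add their gradients together:

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

(2 * w).backward()
(2 * w).backward()
accumulated = w.grad.item()   # gradients from both passes are summed

optimizer.zero_grad()         # clear before the next backward pass
(2 * w).backward()
fresh = w.grad.item()
print(accumulated, fresh)
```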

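In torch.utils.data.DataLoader, set pin_memory=True where possible; for very small datasets such as MNIST, pin_memory=False can actually be slightly faster. The best value of num_workers has to be found experimentally.

Use del to delete intermediate variables as soon as they are no longer needed, to save GPU memory.

In-place operations can save GPU memory, for example:

x = torch.nn.functional.relu(x, inplace=True)

Reduce data transfers between the CPU and the GPU. For example, if you want the loss and accuracy of every mini-batch in an epoch, accumulating them on the GPU and transferring them to the CPU once at the end of the epoch is faster than doing a GPU-to-CPU transfer for every mini-batch.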
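
One way to follow this advice is to keep a running sum as a tensor on the device and call .item() only once. In the sketch below, `batch_losses` is a stand-in for the per-batch losses a real training loop would produce.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-ins for the per-batch losses of one epoch.
batch_losses = [torch.tensor(float(i), device=device) for i in range(1, 5)]

running = torch.zeros((), device=device)
for loss in batch_losses:
    running += loss.detach()    # stays on the device; no host synchronization

mean_loss = (running / len(batch_losses)).item()   # single device-to-host transfer
print(mean_loss)
```

Calling loss.item() inside the loop would instead force a synchronization on every mini-batch.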

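Half-precision floats via half() give some speed-up; the actual gain depends on the GPU model. Watch out for stability problems caused by the low numerical precision.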
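
The precision limits are easy to hit: float16 cannot represent values above roughly 65504 and flushes very small values to zero. A short sketch with hand-picked values:

```python
import torch

x = torch.tensor([70000.0, 1e-8, 1.0])
h = x.half()    # convert to float16

# 70000 overflows to inf, 1e-8 underflows to 0, 1.0 is exact.
print(h)
```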

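Use assert tensor.size() == (N, D, H, W) frequently as a debugging aid to make sure tensor dimensions match what you expect.

Apart from the labels y, avoid 1-D tensors where possible and use n*1 2-D tensors instead; this avoids some unexpected results from computations on 1-D tensors.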
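
A typical surprise is silent broadcasting between an [n, 1] tensor and an [n] tensor. In this sketch, `pred` and `y` are illustrative stand-ins for a model output and a 1-D target:

```python
import torch

pred = torch.zeros(3, 1)    # e.g. a model output of shape [n, 1]
y = torch.ones(3)           # a 1-D tensor of shape [n]

broadcast = pred - y                # silently broadcasts to [3, 3]
matched = pred - y.unsqueeze(1)     # shapes agree: [3, 1]

print(broadcast.shape, matched.shape)
```

A shape assertion right after the subtraction would catch the first case immediately.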


Author: lishuaics
Link: https://www.cnblogs.com/L-shuai/p/15813677.html
License: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.

Posted 2022-01-17 15:24 by lishuaics
