Common PyTorch Code Snippets

1. Basic Configuration

Importing packages and checking versions
```python
import torch
import torch.nn as nn
import torchvision

print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())
print(torch.cuda.get_device_name(0))
```
GPU configuration

If you only need one GPU:
```python
# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```
If you need to specify multiple GPUs, e.g. GPUs 0 and 1:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
```
You can also select GPUs on the command line when launching the script:
```bash
CUDA_VISIBLE_DEVICES=0,1 python train.py
```
Clearing GPU memory (this releases PyTorch's cached but unoccupied memory; it does not free tensors that are still referenced):
```python
torch.cuda.empty_cache()
```
You can also reset the GPU from the command line:
```bash
nvidia-smi --gpu-reset -i [gpu_id]
```
2. Tensor Operations

Basic tensor information
```python
tensor = torch.randn(3, 4, 5)
print(tensor.type())  # data type
print(tensor.size())  # shape of the tensor; a torch.Size (tuple-like)
print(tensor.dim())   # number of dimensions
```
Named tensors

Tensor naming is a very useful feature: dimensions can be indexed and manipulated by name, which greatly improves readability and usability and helps prevent mistakes.
```python
# Before PyTorch 1.3, dimension order had to be tracked in comments
# Tensor[N, C, H, W]
images = torch.randn(32, 3, 56, 56)
images.sum(dim=1)
images.select(dim=1, index=0)

# PyTorch 1.3 and later
NCHW = ['N', 'C', 'H', 'W']
images = torch.randn(32, 3, 56, 56, names=NCHW)
images.sum('C')
images.select('C', index=0)

# Names can also be set like this
tensor = torch.rand(3, 4, 1, 2, names=('C', 'N', 'H', 'W'))

# align_to conveniently reorders dimensions
tensor = tensor.align_to('N', 'C', 'H', 'W')
```
Data type conversion
```python
# Set the default type; in PyTorch, FloatTensor is much faster than DoubleTensor
torch.set_default_tensor_type(torch.FloatTensor)

# Type conversions
tensor = tensor.cuda()
tensor = tensor.cpu()
tensor = tensor.float()
tensor = tensor.long()
```
Converting between torch.Tensor and np.ndarray

Except for CharTensor, all CPU tensors support conversion to the numpy format and back.
```python
ndarray = tensor.cpu().numpy()
tensor = torch.from_numpy(ndarray).float()
tensor = torch.from_numpy(ndarray.copy()).float()  # if ndarray has negative stride
```
Converting between torch.Tensor and PIL.Image
```python
# PyTorch tensors use the [N, C, H, W] order by default, with values in [0, 1],
# so conversion requires permuting dimensions and rescaling

# torch.Tensor -> PIL.Image
image = PIL.Image.fromarray(torch.clamp(tensor * 255, min=0, max=255
    ).byte().permute(1, 2, 0).cpu().numpy())
image = torchvision.transforms.functional.to_pil_image(tensor)  # equivalent way

# PIL.Image -> torch.Tensor
path = r'./figure.jpg'
tensor = torch.from_numpy(np.asarray(PIL.Image.open(path))
    ).permute(2, 0, 1).float() / 255
tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path))  # equivalent way
```
Converting between np.ndarray and PIL.Image
```python
image = PIL.Image.fromarray(ndarray.astype(np.uint8))
ndarray = np.asarray(PIL.Image.open(path))
```
Extracting the value from a single-element tensor
```python
value = torch.rand(1).item()
```
3. Model Definition and Operations

An example of a simple two-layer convolutional network
```python
# Convolutional neural network (2 convolutional layers)
class ConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Linear(7 * 7 * 32, num_classes)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out

model = ConvNet(num_classes).to(device)
```
Bilinear pooling
```python
X = torch.reshape(X, (N, D, H * W))                   # assume X has shape N*D*H*W
X = torch.bmm(X, torch.transpose(X, 1, 2)) / (H * W)  # bilinear pooling
assert X.size() == (N, D, D)
X = torch.reshape(X, (N, D * D))
X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)   # signed-sqrt normalization
X = torch.nn.functional.normalize(X)                  # L2 normalization
```
Synchronized BN (batch normalization) across multiple GPUs

When code runs on multiple GPUs with torch.nn.DataParallel, PyTorch's BN layers by default compute the mean and standard deviation independently on each card. Synchronized BN instead computes the BN statistics from the data on all cards together, which mitigates the inaccurate estimates that arise when the batch size is small; it is an effective performance trick in tasks such as object detection.
```python
sync_bn = torch.nn.SyncBatchNorm(num_features, eps=1e-05, momentum=0.1,
                                 affine=True, track_running_stats=True)
```
Converting all BN layers in an existing network to synchronized BN layers
```python
def convertBNtoSyncBN(module, process_group=None):
    '''Recursively replace all BN layers with SyncBN layers.

    Args:
        module[torch.nn.Module]. Network
    '''
    if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
        sync_bn = torch.nn.SyncBatchNorm(module.num_features, module.eps, module.momentum,
                                         module.affine, module.track_running_stats, process_group)
        sync_bn.running_mean = module.running_mean
        sync_bn.running_var = module.running_var
        if module.affine:
            sync_bn.weight = module.weight.clone().detach()
            sync_bn.bias = module.bias.clone().detach()
        return sync_bn
    else:
        for name, child_module in module.named_children():
            setattr(module, name, convertBNtoSyncBN(child_module, process_group=process_group))
        return module
```
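PyTorch also provides a built-in helper that performs the same recursive conversion, torch.nn.SyncBatchNorm.convert_sync_batchnorm; a minimal sketch:

```python
# Built-in equivalent of the manual conversion above (available since PyTorch 1.1)
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
```

Note that torch.nn.SyncBatchNorm is designed to be used with torch.nn.parallel.DistributedDataParallel rather than DataParallel.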
Counting the total number of model parameters
```python
num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())
```
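To count only the trainable parameters, a small variation filters on requires_grad:

```python
num_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
```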
Inspecting the parameters in a network

All current trainable parameters (including those inherited from parent classes) can be viewed with model.state_dict() or model.named_parameters().
```python
params = list(model.named_parameters())
(name, param) = params[28]
print(name)
print(param.grad)
print('-------------------------------------------------')
(name2, param2) = params[29]
print(name2)
print(param2.grad)
print('-------------------------------------------------')
(name1, param1) = params[30]
print(name1)
print(param1.grad)
```
Model visualization (using pytorchviz)
....
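A minimal sketch of typical pytorchviz usage (assuming the torchviz package, installed via pip install torchviz, plus a system Graphviz install; the input shape is illustrative, chosen to match the ConvNet above):

```python
from torchviz import make_dot

x = torch.randn(1, 1, 28, 28).to(device)  # dummy MNIST-like input
y = model(x)
dot = make_dot(y, params=dict(model.named_parameters()))
dot.render('convnet', format='png')        # writes convnet.png
```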
4. Model Training and Testing

Classification model training code
```python
# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print('Epoch: [{}/{}], Step: [{}/{}], Loss: {}'
                  .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))
```
Classification model test code
```python
# Test the model
model.eval()  # eval mode (batch norm uses moving mean/variance
              # instead of mini-batch mean/variance)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Test accuracy of the model on the 10000 test images: {} %'
          .format(100 * correct / total))
```
Custom loss

Subclass torch.nn.Module to write your own loss.
```python
class MyLoss(torch.nn.Module):
    def __init__(self):
        super(MyLoss, self).__init__()

    def forward(self, x, y):
        loss = torch.mean((x - y) ** 2)
        return loss
```
Label smoothing

Write a file label_smoothing.py, import it in the training code, and use LSR in place of the cross-entropy loss. The contents of label_smoothing.py:
```python
import torch
import torch.nn as nn


class LSR(nn.Module):

    def __init__(self, e=0.1, reduction='mean'):
        super().__init__()

        self.log_softmax = nn.LogSoftmax(dim=1)
        self.e = e
        self.reduction = reduction

    def _one_hot(self, labels, classes, value=1):
        """
        Convert labels to one hot vectors

        Args:
            labels: torch tensor in format [label1, label2, label3, ...]
            classes: int, number of classes
            value: label value in one hot vector, default to 1

        Returns:
            return one hot format labels in shape [batchsize, classes]
        """
        one_hot = torch.zeros(labels.size(0), classes)

        # labels and value_added sizes must match
        labels = labels.view(labels.size(0), -1)
        value_added = torch.Tensor(labels.size(0), 1).fill_(value)

        value_added = value_added.to(labels.device)
        one_hot = one_hot.to(labels.device)

        one_hot.scatter_add_(1, labels, value_added)

        return one_hot

    def _smooth_label(self, target, length, smooth_factor):
        """Convert targets to one-hot format, and smooth them.

        Args:
            target: target in form with [label1, label2, label_batchsize]
            length: length of one-hot format (number of classes)
            smooth_factor: smooth factor for label smoothing

        Returns:
            smoothed labels in one hot format
        """
        one_hot = self._one_hot(target, length, value=1 - smooth_factor)
        one_hot += smooth_factor / (length - 1)

        return one_hot.to(target.device)

    def forward(self, x, target):

        if x.size(0) != target.size(0):
            raise ValueError('Expected input batchsize ({}) to match target batch_size ({})'
                             .format(x.size(0), target.size(0)))

        if x.dim() < 2:
            raise ValueError('Expected input tensor to have at least 2 dimensions (got {})'
                             .format(x.dim()))

        if x.dim() != 2:
            raise ValueError('Only 2-dimensional tensors are implemented (got {})'
                             .format(x.size()))

        smoothed_target = self._smooth_label(target, x.size(1), self.e)
        x = self.log_softmax(x)
        loss = torch.sum(-x * smoothed_target, dim=1)

        if self.reduction == 'none':
            return loss
        elif self.reduction == 'sum':
            return torch.sum(loss)
        elif self.reduction == 'mean':
            return torch.mean(loss)
        else:
            raise ValueError('unrecognized option, expect reduction to be one of none, mean, sum')
```
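A sketch of how LSR would replace the cross-entropy loss in the training code above (the smoothing factor e is illustrative):

```python
from label_smoothing import LSR

criterion = LSR(e=0.1)             # instead of nn.CrossEntropyLoss()
loss = criterion(outputs, labels)  # outputs are raw logits of shape [N, num_classes]
```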
Visualizing model training

PyTorch can use TensorBoard to visualize the training process. Install and run TensorBoard:
```bash
pip install tensorboard
tensorboard --logdir=runs
```
Use the SummaryWriter class to collect and visualize the data. For easier browsing, group tags into folders, e.g. 'Loss/train' and 'Loss/test'.
```python
from torch.utils.tensorboard import SummaryWriter
import numpy as np

writer = SummaryWriter()

for n_iter in range(100):
    writer.add_scalar('Loss/train', np.random.random(), n_iter)
    writer.add_scalar('Loss/test', np.random.random(), n_iter)
    writer.add_scalar('Accuracy/train', np.random.random(), n_iter)
    writer.add_scalar('Accuracy/test', np.random.random(), n_iter)
```
Saving and loading checkpoints

Note that in order to resume training, we need to save the state of both the model and the optimizer, along with the current epoch number.
```python
start_epoch = 0
# Load checkpoint.
if resume:  # `resume` is a flag: 0 for the first run, 1 when resuming after an interruption
    model_path = os.path.join('model', 'best_checkpoint.pth.tar')
    assert os.path.isfile(model_path)
    checkpoint = torch.load(model_path)
    best_acc = checkpoint['best_acc']
    start_epoch = checkpoint['epoch']
    model.load_state_dict(checkpoint['model'])
    optimizer.load_state_dict(checkpoint['optimizer'])
    print('Load checkpoint at epoch {}.'.format(start_epoch))
    print('Best accuracy so far {}.'.format(best_acc))

# Train the model
for epoch in range(start_epoch, num_epochs):
    ...

    # Test the model
    ...

    # Save checkpoint
    is_best = current_acc > best_acc
    best_acc = max(current_acc, best_acc)
    checkpoint = {
        'best_acc': best_acc,
        'epoch': epoch + 1,
        'model': model.state_dict(),
        'optimizer': optimizer.state_dict(),
    }
    model_path = os.path.join('model', 'checkpoint.pth.tar')
    best_model_path = os.path.join('model', 'best_checkpoint.pth.tar')
    torch.save(checkpoint, model_path)
    if is_best:
        shutil.copy(model_path, best_model_path)
```
5. Other Notes
Don't use overly large linear layers: nn.Linear(m, n) uses O(mn) memory, so a linear layer that is too large can easily exceed available GPU memory, as the sketch below illustrates.
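A rough worked example of that O(mn) cost (the layer size is illustrative):

```python
layer = torch.nn.Linear(16384, 16384)
num_params = sum(p.numel() for p in layer.parameters())
print(num_params * 4 / 1024 ** 3)  # ~1.0 GB of float32 weights for a single layer
```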
Don't use RNNs on overly long sequences: RNN backpropagation uses the BPTT algorithm, whose memory requirement grows linearly with the input sequence length.
Switch the network state with model.train() or model.eval() before calling model(x).
Wrap code blocks that don't need gradient computation in with torch.no_grad().
The difference between model.eval() and torch.no_grad(): model.eval() switches the network to evaluation mode, where e.g. BN and dropout behave differently than during training; torch.no_grad() turns off PyTorch's autograd machinery to save memory and speed up computation, so results produced under it cannot be used for loss.backward().
model.zero_grad() zeroes the gradients of all parameters in the entire model, whereas optimizer.zero_grad() only zeroes the gradients of the parameters that were passed to that optimizer, as sketched below.
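A small sketch of the difference (here the optimizer only covers the fc layer of the ConvNet defined earlier):

```python
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
model.zero_grad()      # zeroes the gradients of every parameter in the model
optimizer.zero_grad()  # zeroes only the gradients of model.fc's parameters
```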
The input to torch.nn.CrossEntropyLoss does not need to go through Softmax: torch.nn.CrossEntropyLoss is equivalent to torch.nn.functional.log_softmax followed by torch.nn.NLLLoss.
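A quick check of that equivalence on random logits and labels:

```python
logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss1 = torch.nn.CrossEntropyLoss()(logits, labels)
loss2 = torch.nn.NLLLoss()(torch.nn.functional.log_softmax(logits, dim=1), labels)
assert torch.allclose(loss1, loss2)
```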
Call optimizer.zero_grad() before loss.backward() to clear accumulated gradients.
In torch.utils.data.DataLoader, prefer setting pin_memory=True; for very small datasets such as MNIST, pin_memory=False can actually be faster. The best value of num_workers needs to be found experimentally.
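A minimal sketch (train_dataset, the batch size, and num_workers are illustrative):

```python
train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=64, shuffle=True,
    pin_memory=True, num_workers=4)
```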
Use del to drop unneeded intermediate variables promptly to save GPU memory.
In-place operations can save GPU memory, e.g.
```python
x = torch.nn.functional.relu(x, inplace=True)
```
Reduce data transfers between the CPU and GPU. For example, if you want the loss and accuracy of every mini-batch in an epoch, accumulating them on the GPU and transferring them back to the CPU once at the end of the epoch is faster than performing a GPU-to-CPU transfer for every mini-batch.
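A sketch of this accumulate-on-GPU pattern (model, criterion, train_loader, and device as defined earlier):

```python
running_loss = torch.zeros((), device=device)               # scalar accumulator on the GPU
correct = torch.zeros((), dtype=torch.long, device=device)
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    outputs = model(images)
    running_loss += criterion(outputs, labels).detach()     # no CPU sync here
    correct += (outputs.argmax(dim=1) == labels).sum()
# a single GPU -> CPU transfer at the end of the epoch
print(running_loss.item() / len(train_loader), correct.item())
```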
Using half-precision floats via half() gives some speedup; the exact gain depends on the GPU model. Watch out for stability problems caused by the reduced numerical precision.
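A minimal inference-only sketch (images is assumed to come from a data loader; precision-sensitive layers may need extra care):

```python
model = model.half().to(device)  # convert parameters to fp16
with torch.no_grad():
    outputs = model(images.half().to(device))
```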
Regularly use assert tensor.size() == (N, D, H, W) as a debugging aid to make sure tensor shapes match your expectations.
Apart from the labels y, avoid 1-D tensors where possible; use n*1 2-D tensors instead to sidestep surprising broadcasting results, as in the example below.
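An example of the kind of surprise broadcasting can produce with 1-D tensors:

```python
a = torch.randn(5)     # shape (5,)
b = torch.randn(5, 1)  # shape (5, 1)
print((a - b).shape)   # torch.Size([5, 5]) -- broadcast, probably not what was intended
```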