
U-Net Network Walkthrough

Study notes on the U-Net architecture
Imports
#!/usr/bin/python
# coding=utf-8
import numpy as np
import torch
# torch.nn gives access to PyTorch's layers (fully connected, convolution, pooling, ...),
# loss functions (cross-entropy, mean squared error, ...) and activation modules (ReLU, Sigmoid, ...).
import torch.nn as nn
# torch.nn.functional holds the stateless versions of common operations (activation, pooling,
# normalization, ...). They keep no learnable parameters and are called directly in forward().
import torch.nn.functional as F
# torch.optim.lr_scheduler contains the schedulers used to adjust the learning rate.
from torch.optim import lr_scheduler, optimizer
import torchvision
import os, sys
import cv2 as cv
from torch.utils.data import DataLoader, sampler
# Data loading
class SegmentationDataset(object):
    def __init__(self, image_dir, mask_dir):
        self.images = []
        self.masks = []
        # os.listdir lists every entry of a directory in arbitrary order, so both
        # listings are sorted to keep image i and mask i aligned (this assumes the
        # image and mask file names sort the same way).
        files = sorted(os.listdir(image_dir))
        sfiles = sorted(os.listdir(mask_dir))
        for i in range(len(sfiles)):
            # os.path.join combines path components into one path
            img_file = os.path.join(image_dir, files[i])
            mask_file = os.path.join(mask_dir, sfiles[i])
            # print(img_file, mask_file)
            # self.images and self.masks are parallel lists of file paths, so the
            # image at index i corresponds to the mask at index i
            self.images.append(img_file)
            self.masks.append(mask_file)

    def __len__(self):
        # lets len(dataset) work; DataLoader relies on it
        return len(self.images)

    def num_of_samples(self):
        # same value as __len__; an explicit helper that the training loop
        # calls when it averages the loss
        return len(self.images)

    def __getitem__(self, idx):
        # lets instances be indexed with [], e.g. dataset[3]
        if torch.is_tensor(idx):
            idx = idx.tolist()  # convert a tensor index to a plain Python value
        image_path = self.images[idx]
        mask_path = self.masks[idx]
        # cv.imread takes the file path and a read flag; cv.IMREAD_GRAYSCALE loads a
        # single-channel image with values from 0 (black) to 255 (white)
        img = cv.imread(image_path, cv.IMREAD_GRAYSCALE)
        mask = cv.imread(mask_path, cv.IMREAD_GRAYSCALE)

        # Input image: convert to float32, scale to [0, 1], add a channel axis -> (1, H, W)
        img = np.float32(img) / 255.0
        img = np.expand_dims(img, 0)

        # Target labels: binarize the mask into two classes, 0 (absent) and 1 (present)
        mask[mask <= 128] = 0
        mask[mask > 128] = 1
        mask = np.expand_dims(mask, 0)
        # torch.from_numpy converts the arrays to tensors
        sample = {'image': torch.from_numpy(img), 'mask': torch.from_numpy(mask),}
        return sample
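
The dataset class above pairs images and masks purely by list position. A minimal sketch of a stricter alternative pairs them by file stem instead. It assumes that each image and its mask share a name apart from the extension; the function `pair_by_stem` and the file names below are hypothetical and not part of the original code.

```python
import os

def pair_by_stem(image_names, mask_names):
    """Pair image and mask file names that share a stem (name without extension)."""
    masks_by_stem = {os.path.splitext(m)[0]: m for m in mask_names}
    pairs = []
    for name in sorted(image_names):
        stem = os.path.splitext(name)[0]
        if stem in masks_by_stem:        # skip images that have no mask
            pairs.append((name, masks_by_stem[stem]))
    return pairs

# Hypothetical listings, deliberately in different orders
images = ["003.jpg", "001.jpg", "002.jpg"]
masks = ["002.png", "003.png", "001.png", "notes.txt"]
pairs = pair_by_stem(images, masks)
print(pairs)
```

Pairing by stem also ignores stray files such as `notes.txt`, which would shift every index in a purely positional pairing.
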
# torch.nn.Module is the base class of all neural network modules; custom models subclass it.
class UNetModel(torch.nn.Module):

    # init_features sets the number of output channels of the first convolution block
    def __init__(self, in_features=1, out_features=2, init_features=32):
        super(UNetModel, self).__init__()
        features = init_features  # base channel count; doubled at each encoder level
        self.encode_layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=in_features, out_channels=features, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features),  # batch normalization layer
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features, out_channels=features, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features),
            torch.nn.ReLU()  # end of encoder block 1 (two 3x3 convolutions)
        )
        self.pool1 = torch.nn.MaxPool2d(kernel_size=2, stride=2)  # pooling 1: halves H and W
        self.encode_layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features, out_channels=features*2, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features*2),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*2, out_channels=features*2, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 2),
            torch.nn.ReLU()  # end of encoder block 2
        )
        self.pool2 = torch.nn.MaxPool2d(kernel_size=2, stride=2)  # pooling 2
        self.encode_layer3 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*2, out_channels=features*4, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 4),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*4, out_channels=features*4, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 4),
            torch.nn.ReLU()  # end of encoder block 3
        )
        self.pool3 = torch.nn.MaxPool2d(kernel_size=2, stride=2)  # pooling 3
        self.encode_layer4 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*4, out_channels=features*8, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 8),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*8, out_channels=features*8, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 8),
            torch.nn.ReLU(),  # end of encoder block 4
        )
        self.pool4 = torch.nn.MaxPool2d(kernel_size=2, stride=2)  # pooling 4
        self.encode_decode_layer = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*8, out_channels=features*16, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 16),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*16, out_channels=features*16, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 16),
            torch.nn.ReLU()  # bottleneck block between the encoder and the decoder
        )
        self.upconv4 = torch.nn.ConvTranspose2d(
            features * 16, features * 8, kernel_size=2, stride=2  # transposed convolution: upsamples by 2 in the decoder
        )
        self.decode_layer4 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*16, out_channels=features*8, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features*8),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*8, out_channels=features*8, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 8),
            torch.nn.ReLU(),  # end of decoder block 4
        )
        self.upconv3 = torch.nn.ConvTranspose2d(
            features * 8, features * 4, kernel_size=2, stride=2  # upsampling to the level-3 resolution
        )
        self.decode_layer3 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*8, out_channels=features*4, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 4),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*4, out_channels=features*4, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 4),
            torch.nn.ReLU()  # end of decoder block 3
        )
        self.upconv2 = torch.nn.ConvTranspose2d(
            features * 4, features * 2, kernel_size=2, stride=2  # upsampling to the level-2 resolution
        )
        self.decode_layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*4, out_channels=features*2, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 2),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features*2, out_channels=features*2, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features * 2),
            torch.nn.ReLU()  # end of decoder block 2
        )
        self.upconv1 = torch.nn.ConvTranspose2d(
            features * 2, features, kernel_size=2, stride=2  # upsampling to the input resolution
        )
        self.decode_layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features*2, out_channels=features, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features),
            torch.nn.ReLU(),
            torch.nn.Conv2d(in_channels=features, out_channels=features, kernel_size=3, padding=1, stride=1),
            torch.nn.BatchNorm2d(num_features=features),
            torch.nn.ReLU()  # end of decoder block 1
        )
        self.out_layer = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=features, out_channels=out_features, kernel_size=1, padding=0, stride=1),
        )

    def forward(self, x):
        enc1 = self.encode_layer1(x)
        enc2 = self.encode_layer2(self.pool1(enc1))
        enc3 = self.encode_layer3(self.pool2(enc2))
        enc4 = self.encode_layer4(self.pool3(enc3))

        bottleneck = self.encode_decode_layer(self.pool4(enc4))
        dec4 = self.upconv4(bottleneck)
        dec4 = torch.cat((dec4, enc4), dim=1)  # skip connection: concatenate along the channel dimension
        dec4 = self.decode_layer4(dec4)

        dec3 = self.upconv3(dec4)
        dec3 = torch.cat((dec3, enc3), dim=1)
        dec3 = self.decode_layer3(dec3)

        dec2 = self.upconv2(dec3)
        dec2 = torch.cat((dec2, enc2), dim=1)
        dec2 = self.decode_layer2(dec2)

        dec1 = self.upconv1(dec2)
        dec1 = torch.cat((dec1, enc1), dim=1)
        dec1 = self.decode_layer1(dec1)

        out = self.out_layer(dec1)
        return out  # forward pass result: raw class scores of shape (N, out_features, H, W)
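
The channel and size bookkeeping of the model above can be checked without PyTorch. The sketch below is a hypothetical helper (`trace_unet_shapes` is not part of the original code). It mirrors the layer definitions: 3x3 convolutions with padding 1 keep the size, each pooling step halves it, and each transposed convolution doubles it. It raises an error when an upsampled map would not match the encoder map it is concatenated with.

```python
def trace_unet_shapes(h, w, features=32, levels=4):
    """Return (name, channels, height, width) per stage, or raise if a skip would not fit."""
    stages, skips = [], []
    c = features
    for level in range(1, levels + 1):
        stages.append((f"enc{level}", c, h, w))
        skips.append((c, h, w))
        h, w = h // 2, w // 2          # MaxPool2d(kernel_size=2, stride=2)
        c *= 2
    stages.append(("bottleneck", c, h, w))
    for level in range(levels, 0, -1):
        c //= 2
        h, w = h * 2, w * 2            # ConvTranspose2d(kernel_size=2, stride=2)
        skip_c, skip_h, skip_w = skips.pop()
        if (h, w) != (skip_h, skip_w):
            raise ValueError(f"dec{level}: upsampled {h}x{w} != skip {skip_h}x{skip_w}")
        # concatenation gives c + skip_c channels; the decoder block reduces them to c
        stages.append((f"dec{level}", c, h, w))
    return stages

# Hypothetical 320x480 input (height x width)
shapes = trace_unet_shapes(320, 480)
for stage in shapes:
    print(stage)
```

Under these assumptions the concatenations only line up when the height and width are divisible by 16; a 100x100 input, for example, fails at the third decoder level.
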
# Training
if __name__ == '__main__':
    index = 0  # global step counter across all epochs, used to log the loss every 100 steps
    num_epochs = 50 
    train_on_gpu = True
    unet = UNetModel().cuda()
    # model_dict = unet.load_state_dict(torch.load('unet_road_model-100.pt'))

    # torch.load() loads a saved PyTorch object (a model or tensors) from disk;
    # load_state_dict() is the model method that copies a saved state dict (the parameters) into the model.

    image_dir = r'D:\daima\CrackForest-dataset-master\CrackForest-dataset-master\train'
    mask_dir = r'D:\daima\CrackForest-dataset-master\CrackForest-dataset-master\png'
    dataloader = SegmentationDataset(image_dir, mask_dir)
    optimizer = torch.optim.SGD(unet.parameters(), lr=0.01, momentum=0.9)
    train_loader = DataLoader(
        dataloader, batch_size=1, shuffle=False)
    # shuffle=True would randomize the sample order each epoch, which helps avoid bias toward,
    # or overfitting to, one particular data order; it is left off (False) here.
    for epoch in range(num_epochs):
        train_loss = 0.0
        # enumerate(train_loader) yields the batch index (i_batch) and the batch data (sample_batched)
        for i_batch, sample_batched in enumerate(train_loader):
            # unpack the batch into input images and target masks
            images_batch, target_labels = \
                sample_batched['image'], sample_batched['mask']
            print(target_labels.min())
            print(target_labels.max())

            if train_on_gpu:
                images_batch, target_labels = images_batch.cuda(), target_labels.cuda()
            optimizer.zero_grad()

            # forward pass: compute predicted outputs by passing inputs to the model
            m_label_out_ = unet(images_batch)  # forward pass; output shape (N, 2, H, W)
            # print(m_label_out_)
            # calculate the batch loss
            # flatten the mask (N, 1, H, W) to one label per pixel, shape (N*H*W,);
            # contiguous() guarantees the memory layout that view() requires
            target_labels = target_labels.contiguous().view(-1)
            # the two transposes reorder (N, C, H, W) to (N, H, W, C); view(-1, 2) then
            # flattens to (N*H*W, 2), one row of two class scores per pixel
            m_label_out_ = m_label_out_.transpose(1,3).transpose(1, 2).contiguous().view(-1, 2)
            target_labels = target_labels.long()
            # cross_entropy computes the cross-entropy loss from raw scores and class indices
            loss = torch.nn.functional.cross_entropy(m_label_out_, target_labels)
            print(loss)
            # backward pass: compute gradient of the loss with respect to model parameters
            loss.backward()

            # perform a single optimization step (parameter update)
            optimizer.step()

            # update training loss
            train_loss += loss.item()
            if index % 100 == 0:
                print('step: {} \tcurrent Loss: {:.6f} '.format(index, loss.item()))
            index += 1
            # test(unet)
        # average loss per sample for this epoch
        train_loss = train_loss / dataloader.num_of_samples()
        # report the training loss (no validation set is used here)
        print('Epoch: {} \tTraining Loss: {:.6f} '.format(epoch, train_loss))
        # test(unet)
    # save model
    unet.eval()
    torch.save(unet.state_dict(), 'unet_road_model.pt')  # save the trained parameters
# Testing
def test(unet):
    # load the weights saved in 'unet_road_model.pt' into the U-Net instance
    model_dict=unet.load_state_dict(torch.load('unet_road_model.pt'))
    unet.eval()  # inference mode: BatchNorm uses its running statistics
    root_dir = r'D:\daima\CrackForest-dataset-master\CrackForest-dataset-master\test'
    fileNames = os.listdir(root_dir)
    for f in fileNames:
        image = cv.imread(os.path.join(root_dir, f), cv.IMREAD_GRAYSCALE)
        h, w = image.shape
        img = np.float32(image) /255.0
        img = np.expand_dims(img, 0)
        # shape (1, 1, h, w): one sample, one (grayscale) channel, height h, width w
        x_input = torch.from_numpy(img).view( 1, 1, h, w)
        probs = unet(x_input.cuda())  # raw class scores, shape (1, 2, h, w)
        m_label_out_ = probs.transpose(1, 3).transpose(1, 2).contiguous().view(-1, 2)
        # max(dim=1) scans each row (one pixel) across its two class scores and returns
        # the maximum value and its column index, i.e. the predicted class
        max_scores, output = m_label_out_.data.max(dim=1)
        output[output > 0] = 255
        predic_ = output.view(h, w).cpu().detach().numpy()

        # print(predic_)
        # print(predic_.max())
        # print(predic_.min())

        # print(predic_)
        # print(predic_.shape)
        # cv.imshow("input", image)
        # np.uint8 converts the prediction to unsigned 8-bit; cv.resize(src, dsize) resizes
        # it to (w, h), which is already its size here
        result = cv.resize(np.uint8(predic_), (w, h))

        cv.imshow("unet-segmentation-demo", result)
        cv.waitKey(0)  # wait for a key press
        cv.destroyAllWindows()  # close all windows created by OpenCV's HighGUI

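1. Skip connections

A skip connection is a common architectural technique in deep learning, and it matters most in tasks such as image segmentation and object detection. Skip connections are one of the defining features of U-Net: they let the decoder combine the encoder's low-level, high-resolution features with its own high-level features, which improves segmentation accuracy.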

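What skip connections do

1. Feature fusion: skip connections let the model combine the encoder's high-resolution features with the decoder's upsampled features, which helps recover spatial information lost during downsampling and improves segmentation accuracy.

2. Gradient flow: in deep networks, gradients can vanish or explode during backpropagation, which makes training difficult. Skip connections give gradients a shorter path back to the shallow layers, which eases these problems, vanishing gradients in particular.

3. Information reuse: through skip connections the model reuses encoder features that are still valuable during decoding, which improves overall performance.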

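Skip connection variants in U-Net

Skip connections in U-Net-style models usually take one of the following forms:

1. Basic skip connection: the simplest form. Between corresponding encoder and decoder levels, the encoder feature map (cropped or padded if needed to match the decoder feature map) is concatenated with the decoder's upsampled feature map. This fuses features from different levels directly. The model above uses this form, and no cropping is needed as long as the input height and width are divisible by 16.

2. Convolutional skip connection: the encoder feature map passes through a convolution before the connection, to adjust its channel count or transform the features in some other way. This adds an extra non-linear transformation and strengthens the feature representation.

3. Residual skip connection: similar to the residual connection in ResNet. The encoder feature map is convolved and then added to the decoder's upsampled feature map rather than concatenated. This preserves feature information and helps with vanishing gradients in deep networks.

4. Pyramid skip connection: a more advanced form that connects the encoder feature map not only to the decoder feature map of the same level but also to feature maps at other scales. It exploits features from several levels and scales to improve segmentation.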
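
The difference between the basic (concatenating) and the residual (adding) variants shows up in the channel count. The following is a minimal sketch with hypothetical tensor sizes; only `torch.cat` corresponds to what the model above actually does.

```python
import torch

# Hypothetical feature maps for one decoder level, layout (N, C, H, W)
encoder_feat = torch.randn(1, 64, 40, 60)   # saved from the encoder
decoder_feat = torch.randn(1, 64, 40, 60)   # output of the transposed convolution

# Basic skip connection (used by the model above): concatenate along channels
concat = torch.cat((decoder_feat, encoder_feat), dim=1)

# Residual-style skip connection: element-wise sum, channel count unchanged
added = decoder_feat + encoder_feat

print(tuple(concat.shape))  # (1, 128, 40, 60)
print(tuple(added.shape))   # (1, 64, 40, 60)
```

Concatenation doubles the channels, which is why each decoder block above takes twice as many input channels as it outputs.
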

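2. The softmax activation function

Softmax is an activation function commonly used in multi-class classification. It converts a vector (typically the output of a network's last layer) into another vector whose elements each lie in (0, 1) and sum to 1. The result can be read as a probability distribution in which each element is the probability of one class.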

import torch
import torch.nn as nn
import torch.nn.functional as F

# A simple neural network
class SimpleNet(nn.Module):
    def __init__(self, num_inputs, num_outputs):
        super(SimpleNet, self).__init__()
        self.fc = nn.Linear(num_inputs, num_outputs)  # fully connected layer

    def forward(self, x):
        # only the fully connected layer; softmax is not applied explicitly
        x = self.fc(x)
        return x

# Instantiate the network: 10 input features, 3 output classes
model = SimpleNet(num_inputs=10, num_outputs=3)

# Example input and label
inputs = torch.randn(1, 10)  # batch size 1, 10 features
labels = torch.tensor([2], dtype=torch.long)  # example class label

# Loss function (applies softmax internally, in its log form)
criterion = nn.CrossEntropyLoss()

# Forward pass
outputs = model(inputs)

# Compute the loss (softmax is applied inside the loss)
loss = criterion(outputs, labels)
print(loss)
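
To make the definition concrete, here is a minimal pure-Python sketch of softmax itself. Subtracting the maximum score before exponentiating is a standard stability trick and does not change the result; the input scores are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)        # roughly [0.659, 0.242, 0.099]
print(sum(probs))   # 1.0 up to rounding
```
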

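3. The cross-entropy loss function

Cross-entropy loss is widely used in deep learning and machine learning. It measures the difference between the probability distribution predicted by the model and the distribution of the true labels. For a single true class, it reduces to the negative log of the probability the model assigns to that class.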
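
As a worked example, the loss of one sample can be computed by hand from raw scores as -log(softmax(scores)[target]). The sketch below is plain Python with made-up scores and is not the PyTorch implementation.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Two-class scores, in the spirit of the per-pixel scores used in these notes
confident_right = cross_entropy([0.0, 4.0], target=1)
confident_wrong = cross_entropy([0.0, 4.0], target=0)
uniform = cross_entropy([0.0, 0.0], target=1)   # equals log(2)
print(confident_right, confident_wrong, uniform)
```

A confident correct prediction gives a loss near zero, an uninformative one gives log(2), and a confident wrong one is penalized heavily.
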

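4. sample_batched: the batch produced by the DataLoader. In this code it is a dictionary with the keys 'image' and 'mask' (other pipelines often return a tuple of images and labels instead).

sample_batched['image']: the image samples, organized as a batch. Deep learning code usually processes several samples at once rather than one image at a time, which improves computational efficiency and memory use. The data is a multi-dimensional array (a NumPy array or, here, a PyTorch Tensor) whose dimensions typically include the batch size, the number of channels (3 for RGB), the image height and the image width. In this code the shape is (batch_size, 1, H, W), because the images are read as grayscale.

sample_batched['mask']: the masks that correspond one-to-one to the images. Masks are central to tasks such as image segmentation: they mark which regions of the image belong to the target and which are background. A mask is also a multi-dimensional array with dimensions similar to the image, but each pixel value is a class identifier rather than a color (0 for background, 1 for the target region).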
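
The sketch below shows how the DataLoader's default collation turns per-sample dictionaries into one batched dictionary. `ToyDataset` is a hypothetical in-memory stand-in for SegmentationDataset, with small random tensors instead of image files.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Hypothetical in-memory stand-in for SegmentationDataset."""
    def __len__(self):
        return 4

    def __getitem__(self, idx):
        image = torch.rand(1, 8, 12)                          # (C, H, W), float32
        mask = (torch.rand(1, 8, 12) > 0.5).to(torch.uint8)   # (1, H, W), values 0/1
        return {'image': image, 'mask': mask}

loader = DataLoader(ToyDataset(), batch_size=2, shuffle=False)
sample_batched = next(iter(loader))

print(type(sample_batched))                   # a dict, not a tuple
print(tuple(sample_batched['image'].shape))   # (2, 1, 8, 12)
print(tuple(sample_batched['mask'].shape))    # (2, 1, 8, 12)
```

Each value of the dictionary gains a leading batch dimension, while the keys stay the same.
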

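5. target_labels = target_labels.contiguous().view(-1)

target_labels.contiguous(): ensures that the tensor is contiguous in memory. After operations such as slicing or transposing, a tensor can become non-contiguous: its elements are no longer laid out linearly in memory, and some later operations, view() among them, may then fail. contiguous() returns a tensor with the same data in contiguous memory. If target_labels is already contiguous, the call returns the tensor itself without copying.

.view(-1): changes the shape of the tensor without changing its data. -1 is a special value that tells PyTorch to infer the size of that dimension so that the total number of elements stays the same; used on its own, it flattens a multi-dimensional tensor to one dimension. Here target_labels has shape (N, 1, H, W), so .view(-1) gives a 1-D tensor of shape (N*H*W,), one label per pixel.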
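
A small experiment illustrates the interplay between contiguous() and view(). The tensor here is a made-up 2x3 example rather than a real mask.

```python
import torch

x = torch.arange(6).reshape(2, 3)   # contiguous, rows [0, 1, 2] and [3, 4, 5]
xt = x.t()                          # transposed view, shape (3, 2): same storage, new strides

print(x.is_contiguous())    # True
print(xt.is_contiguous())   # False

view_failed = False
try:
    xt.view(-1)             # view() needs a compatible memory layout
except RuntimeError:
    view_failed = True
print("view on the transposed tensor failed:", view_failed)

flat = xt.contiguous().view(-1)     # copy into contiguous memory, then flatten
print(flat.tolist())                # [0, 3, 1, 4, 2, 5]

# On an already contiguous tensor, contiguous() makes no copy
same_storage = x.contiguous().data_ptr() == x.data_ptr()
print(same_storage)                 # True
```
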

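6. m_label_out_ = m_label_out_.transpose(1,3).transpose(1, 2).contiguous().view(-1, 2)

m_label_out_.transpose(1,3): swaps dimensions 1 and 3 of m_label_out_. The model output has PyTorch's (N, C, H, W) layout, so the result has shape (N, W, H, C). A transpose leaves the tensor non-contiguous, which matters for the steps that follow.

.transpose(1, 2): swaps dimensions 1 and 2 of the previous result, giving (N, H, W, C). Together the two transposes move the class dimension to the end, that is, from channels-first (CHW) to channels-last (HWC) per sample.

.contiguous(): as in item 5, this makes the twice-transposed tensor contiguous in memory so that view() can be applied.

.view(-1, 2): reshapes the tensor to two dimensions with the second dimension fixed at 2. The model produces two scores per pixel (one per class), so after this step each row of the (N*H*W, 2) tensor holds the two class scores of one pixel.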
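
The chain can be verified on a small tensor with known values. The sketch below uses made-up sizes and also shows `permute(0, 2, 3, 1)` as an equivalent single call; the original code does not use permute.

```python
import torch

N, C, H, W = 1, 2, 3, 4
out = torch.arange(N * C * H * W, dtype=torch.float32).reshape(N, C, H, W)

step1 = out.transpose(1, 3)             # (N, W, H, C)
step2 = step1.transpose(1, 2)           # (N, H, W, C)
flat = step2.contiguous().view(-1, C)   # (N*H*W, C)

same = out.permute(0, 2, 3, 1).reshape(-1, C)   # single-call alternative

print(tuple(step1.shape), tuple(step2.shape), tuple(flat.shape))
print(torch.equal(flat, same))          # True

# Row k holds the C scores of pixel k, counted row by row over (H, W);
# pixel (h=1, w=1) is row 1 * W + 1 = 5
print(flat[5].tolist(), out[0, :, 1, 1].tolist())   # [5.0, 17.0] [5.0, 17.0]
```
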

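7. torch.nn.functional.cross_entropy

This PyTorch function computes the cross-entropy loss, a loss commonly used for classification. It combines log_softmax and the negative log-likelihood loss (NLLLoss) in a single function, which makes the computation more efficient.

m_label_out_ holds the model's predictions, with shape [batch_size, num_classes, ...], where ... stands for any further dimensions (the spatial dimensions in image segmentation). Before calling cross_entropy, the training code reshapes it to [batch_size * ..., num_classes], flattening everything except the class dimension: the class dimension is moved to the end first, and .view(-1, num_classes) then does the flattening.

8. target_labels holds the true labels as a tensor of shape [batch_size * ...], matching the first dimension of the flattened m_label_out_. Each element is an integer class index.

In the code above, .transpose and .view(-1, 2) turn m_label_out_ into a 2-D tensor whose second dimension is 2, which means the model performs two-class classification. target_labels has to be flattened to match: a 1-D tensor whose elements are the class indices 0 and 1.

The second dimension of m_label_out_ holds the raw scores (logits) of the two classes. No sigmoid or softmax is applied beforehand, because cross_entropy applies log_softmax internally. Given the class indices in target_labels, the call computes the cross-entropy loss of every pixel and averages over all of them (the default reduction is 'mean').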
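
The following sketch checks three equivalent ways to obtain the same loss on random, made-up data: the flattened form used in these notes, a direct call on the unflattened tensors (cross_entropy accepts (N, C, H, W) scores with (N, H, W) targets), and log_softmax followed by nll_loss.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 2, 4, 5
logits = torch.randn(N, C, H, W)               # raw scores, as returned by the model
target = torch.randint(0, C, (N, 1, H, W))     # integer class index per pixel (int64)

# Flattened form, as in the training loop of these notes
flat_logits = logits.transpose(1, 3).transpose(1, 2).contiguous().view(-1, C)
flat_target = target.contiguous().view(-1).long()
loss_flat = F.cross_entropy(flat_logits, flat_target)

# cross_entropy also accepts (N, C, H, W) scores with (N, H, W) targets directly
loss_direct = F.cross_entropy(logits, target.squeeze(1))

# Same value as log_softmax followed by nll_loss
loss_manual = F.nll_loss(F.log_softmax(flat_logits, dim=1), flat_target)

print(loss_flat.item(), loss_direct.item(), loss_manual.item())
```

All three values agree up to floating-point rounding, so the explicit transpose and view steps are a matter of style rather than a requirement.
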

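9. probs.transpose(1, 3):

Swaps dimensions 1 and 3 of probs. The model output follows PyTorch's (N, C, H, W) layout, where N is the batch size, C the number of channels (here the two class scores), H the height and W the width, so the result has shape (N, W, H, C).

.transpose(1, 2):

Swaps dimensions 1 and 2 of the previous result, giving (N, H, W, C). The class dimension is now last and height and width are back in order, ready for the view() that follows.

.contiguous():

Ensures that the tensor is contiguous in memory. After transpose the layout is non-contiguous, and the following view() would raise an error. contiguous() returns a new tensor with the same data, laid out contiguously.

.view(-1, 2):

Changes the shape of the tensor. -1 tells PyTorch to infer the size of that dimension so that the total number of elements stays the same, and 2 fixes the number of columns, so the result has one row per pixel and two columns. Despite the variable name, probs holds raw scores (logits) rather than probabilities. That does not affect the max taken afterwards, because softmax preserves the ordering of the scores.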
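
For prediction, the flatten-then-max route used in test() gives the same mask as taking argmax over the class dimension. The sketch below compares the two on a random stand-in for the model output; the sizes are hypothetical.

```python
import torch

torch.manual_seed(0)
h, w = 4, 6
probs = torch.randn(1, 2, h, w)    # stand-in for the model output (raw scores)

# Route used in test(): flatten to (h*w, 2), then take the per-row maximum
flat = probs.transpose(1, 3).transpose(1, 2).contiguous().view(-1, 2)
max_scores, output = flat.max(dim=1)
mask_a = output.view(h, w)

# Shorter equivalent: argmax over the class dimension
mask_b = probs.argmax(dim=1)[0]

print(torch.equal(mask_a, mask_b))

# 0 / 255 image for display, as in test()
binary = (mask_b > 0).to(torch.uint8) * 255
print(tuple(binary.shape), binary.dtype)
```
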

Posted on 2024-07-24 16:30 by 风起-
