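Study Notes 20: Image Semantic Segmentation

An Intuitive Description of Image Semantic Segmentation

Image semantic segmentation means recognizing an image at the pixel level, that is, labeling every pixel in the image with the class of the object it belongs to.

Goal: the input is typically an RGB image (height*width*3) or a grayscale image (height*width*1), and the output is a segmentation map in which every pixel holds its class label (height*width*1).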
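
To make the input/output relationship concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one score per class for every pixel; the scores, the 2x3 image size, and the two class names are made up for illustration. The segmentation map is obtained by picking the highest-scoring class at each pixel.

```python
# Hypothetical per-pixel class scores for a 2x3 image and 2 classes,
# laid out as scores[class][row][col] (values are made up for illustration).
scores = [
    [[0.9, 0.2, 0.4], [0.7, 0.1, 0.8]],  # class 0 (background)
    [[0.1, 0.8, 0.6], [0.3, 0.9, 0.2]],  # class 1 (foreground)
]

num_classes = len(scores)
height, width = len(scores[0]), len(scores[0][0])

# For every pixel, pick the class with the highest score.
label_map = [
    [max(range(num_classes), key=lambda c: scores[c][r][col]) for col in range(width)]
    for r in range(height)
]

print(label_map)  # [[0, 1, 1], [0, 1, 0]]
```

The result has one label per pixel, which is the height*width*1 output described above.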

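U-Net Network Architecture

The left side of U-Net consists of convolution layers
The right side consists of upsampling layers
In the convolution layers, the output just before each pooling layer is concatenated with the output of the corresponding upsampling layer
The first half performs feature extraction and the second half performs upsampling. Some of the literature calls this kind of structure an encoder-decoder structure.
The upsampling part fuses in the outputs of the feature-extraction part, which in effect merges multi-scale features. Take the last upsampling step as an example: its features come both from the output of the first convolution block (same-scale features) and from the upsampled output (larger-scale features).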
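
The concatenation described above can be sketched with two stand-in tensors. The tensor sizes and the names encoder_feat and decoder_feat are assumptions chosen only for illustration; the point is that the two feature maps share the same height and width and are stacked along the channel dimension.

```python
import torch

# Stand-in feature maps in (batch, channels, height, width) layout.
# Sizes are small and arbitrary, chosen only for illustration.
encoder_feat = torch.zeros(1, 4, 8, 8)   # same-scale features from the encoder
decoder_feat = torch.ones(1, 4, 8, 8)    # upsampled features from the decoder

# Skip connection: stack the two maps along the channel dimension (dim=1).
fused = torch.cat([encoder_feat, decoder_feat], dim=1)

print(fused.shape)  # torch.Size([1, 8, 8, 8])
```

The channel count doubles while the spatial size stays the same, which is why the convolution that follows a concatenation takes twice as many input channels.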

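Getting the Paths of the Original Images and the Segmentation Maps

import glob

# sorted() makes the order deterministic so that images and masks pair up by position
all_pics = sorted(glob.glob(r'E:\HKdataset\HKdataset\training\*.png')) # all PNG files
images = [p for p in all_pics if 'matte' not in p] # original images
annotations = [p for p in all_pics if 'matte' in p] # segmentation maps (file names containing 'matte')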
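
The filtering step only works if the i-th image and the i-th segmentation map belong together. The sketch below checks this on a few hypothetical file names (the real dataset's naming may differ); it assumes each mask shares its image's name plus a `_matte` suffix.

```python
# Hypothetical file names; the real dataset's naming may differ.
all_pics = ['00002_matte.png', '00001.png', '00002.png', '00001_matte.png']

# Sorting first makes the order deterministic, so that the i-th photo
# and the i-th mask belong together after filtering.
all_pics = sorted(all_pics)
images = [p for p in all_pics if 'matte' not in p]
annotations = [p for p in all_pics if 'matte' in p]

pairs = list(zip(images, annotations))
print(pairs)
# [('00001.png', '00001_matte.png'), ('00002.png', '00002_matte.png')]
```

If an image were missing its mask (or the other way round), the two lists would have different lengths, so comparing len(images) with len(annotations) is a cheap sanity check.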

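Building the Dataset

import numpy as np

np.random.seed(2021)
index = np.random.permutation(len(images)) # one shared random order
images = np.array(images)[index]
anno = np.array(annotations)[index] # same order, so every image keeps its mask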

all_test_pics = sorted(glob.glob(r'E:\HKdataset\HKdataset\testing\*.png')) # sorted so images and masks pair up by position
test_images = [p for p in all_test_pics if 'matte' not in p]
test_anno = [p for p in all_test_pics if 'matte' in p]

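from torchvision import transforms

# Resize every image to 256x256 and convert it to a float tensor in [0, 1]
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])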

import torch
from torch.utils import data
from PIL import Image

class Portrait_dataset(data.Dataset):
    def __init__(self, img_paths, anno_paths):
        self.imgs = img_paths
        self.annos = anno_paths

    def __getitem__(self, index):
        img = self.imgs[index]
        anno = self.annos[index]
        pil_img = Image.open(img)
        img_tensor = transform(pil_img)
        pil_anno = Image.open(anno)
        anno_tensor = transform(pil_anno)
        anno_tensor = torch.squeeze(anno_tensor) # drop dimensions of size 1
        # Binarize before casting: ToTensor scales pixels to [0, 1], so casting
        # to long first would truncate every value below 1.0 to 0
        anno_tensor = (anno_tensor > 0).type(torch.long) # segmentation map with only the values 0 and 1
        return img_tensor, anno_tensor

    def __len__(self):
        return len(self.imgs)

train_dataset = Portrait_dataset(images, anno)
test_dataset = Portrait_dataset(test_images, test_anno)
train_dl = data.DataLoader(train_dataset, batch_size=4, shuffle=True)
test_dl = data.DataLoader(test_dataset, batch_size=4)
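
To see what one batch from such a loader looks like without the image files, here is a small sketch that swaps the real dataset for random tensors. The tensors, the 16x16 size, and the names fake_images and fake_masks are stand-ins, not part of the original data.

```python
import torch
from torch.utils import data

# Synthetic stand-ins: 8 RGB "images" and 8 binary "masks" (random data, 16x16).
fake_images = torch.rand(8, 3, 16, 16)
fake_masks = torch.randint(0, 2, (8, 16, 16))  # torch.int64 (long) by default

loader = data.DataLoader(data.TensorDataset(fake_images, fake_masks), batch_size=4)

img_batch, mask_batch = next(iter(loader))
print(img_batch.shape, mask_batch.shape, mask_batch.dtype)
```

Images come out as (batch, channels, height, width) floats, while the masks have no channel dimension and hold integer class indices, matching what Portrait_dataset returns.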

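Defining the Model

Downsampling Module

A downsampling module consists of one pooling layer followed by two convolution layers.

The first convolution changes the number of channels from in_channels to out_channels, and the second keeps it at out_channels (out_channels->out_channels).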

import torch.nn as nn

class Downsample(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Downsample, self).__init__()
        # Two 3x3 convolutions; padding=1 keeps the spatial size unchanged
        self.conv_relu = nn.Sequential(
                            nn.Conv2d(in_channels, out_channels, 
                                      kernel_size=3, padding=1),
                            nn.ReLU(inplace=True),
                            nn.Conv2d(out_channels, out_channels, 
                                      kernel_size=3, padding=1),
                            nn.ReLU(inplace=True)
            )
        # 2x2 max pooling halves the height and width
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x, is_pool=True):
        # Pooling is skipped (is_pool=False) where the resolution must be kept
        if is_pool:
            x = self.pool(x)
        x = self.conv_relu(x)
        return x
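
Why the module halves the resolution exactly once can be checked with the usual output-size formula for convolution and pooling layers. The helper conv_out_size below is a hypothetical name introduced for this sketch, and it assumes no dilation.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 256
after_pool = conv_out_size(size, kernel_size=2, stride=2)           # MaxPool2d(kernel_size=2)
after_conv1 = conv_out_size(after_pool, kernel_size=3, padding=1)   # 3x3 conv, padding=1
after_conv2 = conv_out_size(after_conv1, kernel_size=3, padding=1)  # 3x3 conv, padding=1

print(after_pool, after_conv1, after_conv2)  # 128 128 128
```

Only the pooling layer changes the spatial size; the padded 3x3 convolutions change the channel count alone.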

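Upsampling Module

An upsampling module consists of two convolution layers followed by one upsampling layer; the upsampling is done with a transposed convolution (deconvolution).

The first convolution changes the number of channels from 2 * channels to channels, and the second keeps it at channels (channels->channels).

The upsampling layer then halves the number of channels.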

class Upsample(nn.Module):
    def __init__(self, channels):
        super(Upsample, self).__init__()
        self.conv_relu = nn.Sequential(
                            nn.Conv2d(2*channels, channels, 
                                      kernel_size=3, padding=1),
                            nn.ReLU(inplace=True),
                            nn.Conv2d(channels, channels,  
                                      kernel_size=3, padding=1),
                            nn.ReLU(inplace=True)
            )
        self.upconv_relu = nn.Sequential(
                               nn.ConvTranspose2d(channels, 
                                                  channels//2, 
                                                  kernel_size=3,
                                                  stride=2,
                                                  padding=1,
                                                  output_padding=1),
                               nn.ReLU(inplace=True)
            )
        
    def forward(self, x):
        x = self.conv_relu(x)
        x = self.upconv_relu(x)
        return x
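
The transposed-convolution settings used here (kernel_size=3, stride=2, padding=1, output_padding=1) double the spatial size exactly. A quick sketch of the output-size formula, assuming dilation is 1, makes this visible; conv_transpose_out_size is a hypothetical helper name.

```python
def conv_transpose_out_size(size, kernel_size, stride=1, padding=0, output_padding=0):
    """Spatial output size of a transposed convolution (dilation = 1)."""
    return (size - 1) * stride - 2 * padding + (kernel_size - 1) + output_padding + 1

for size in (16, 32, 64, 128):
    out = conv_transpose_out_size(size, kernel_size=3, stride=2,
                                  padding=1, output_padding=1)
    print(size, '->', out)
```

Without output_padding=1 the same settings would give 2 * size - 1, which would no longer match the encoder feature map it has to be concatenated with.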

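The Model

The model consists of 5 downsampling modules, 1 upsampling layer, 3 upsampling modules, 2 convolution layers (3*3), and a final convolution layer (1*1) that produces the output.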

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.down1 = Downsample(3, 64)
        self.down2 = Downsample(64, 128)
        self.down3 = Downsample(128, 256)
        self.down4 = Downsample(256, 512)
        self.down5 = Downsample(512, 1024)
        
        self.up = nn.Sequential(
                               nn.ConvTranspose2d(1024, 
                                                  512, 
                                                  kernel_size=3,
                                                  stride=2,
                                                  padding=1,
                                                  output_padding=1),
                               nn.ReLU(inplace=True)
            )
        
        self.up1 = Upsample(512)
        self.up2 = Upsample(256)
        self.up3 = Upsample(128)
        
        self.conv_2 = Downsample(128, 64)
        self.last = nn.Conv2d(64, 2, kernel_size=1)

    def forward(self, x):
        x1 = self.down1(x, is_pool=False)
        x2 = self.down2(x1)
        x3 = self.down3(x2)
        x4 = self.down4(x3)
        x5 = self.down5(x4)
        
        x5 = self.up(x5)
        
        x5 = torch.cat([x4, x5], dim=1)           # 32*32*1024
        x5 = self.up1(x5)                         # 64*64*256
        x5 = torch.cat([x3, x5], dim=1)           # 64*64*512  
        x5 = self.up2(x5)                         # 128*128*128
        x5 = torch.cat([x2, x5], dim=1)           # 128*128*256
        x5 = self.up3(x5)                         # 256*256*64
        x5 = torch.cat([x1, x5], dim=1)           # 256*256*128
        
        x5 = self.conv_2(x5, is_pool=False)       # 256*256*64
        
        x5 = self.last(x5)                        # 256*256*2
        return x5
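
The shape comments in forward can be reproduced with simple bookkeeping. The sketch below does not run the network; it only tracks channel counts and spatial sizes, using the channel widths from the Net definition and assuming a 3 x 256 x 256 input.

```python
# Channel widths taken from the Net definition; input assumed to be 3 x 256 x 256.
size, trace = 256, []

# Encoder: down1 skips pooling, down2..down5 halve the spatial size first.
for i, channels in enumerate([64, 128, 256, 512, 1024]):
    if i > 0:
        size //= 2
    trace.append((f'down{i + 1}', channels, size))

# Decoder: each transposed convolution doubles the size and halves the channels;
# concatenating the matching encoder output then doubles the channels again.
for name, channels in [('up', 512), ('up1', 256), ('up2', 128), ('up3', 64)]:
    size *= 2
    trace.append((name, channels, size))
    trace.append((name + ' + skip', channels * 2, size))

trace.append(('conv_2', 64, size))
trace.append(('last', 2, size))

for name, channels, s in trace:
    print(f'{name:12s} {channels:5d} x {s} x {s}')
```

The last row shows 2 output channels at 256 x 256, one score map per class (background and foreground).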

posted @ 2021-02-07 16:14 pbc的成长之路