Neural Network Architecture Reference: 2-2 Convolution

DenseNet

Structure

Layer name	Type	Input size (H x W x C)	Output size (H x W x C)	Kernel size	Stride	Parameters
Initial Conv	Conv2D	224 x 224 x 3	112 x 112 x 64	7 x 7	2	9,408
Max Pooling	MaxPool2D	112 x 112 x 64	56 x 56 x 64	3 x 3	2	0
Dense Block 1	Composite	56 x 56 x 64	56 x 56 x 256	-	-	-
Bottleneck 1.1	Conv2D	56 x 56 x 64	56 x 56 x 128	1 x 1	1	8,192
Conv 1.1	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
...	...	...	...	...	...	...
Bottleneck 1.6	Conv2D	56 x 56 x 224	56 x 56 x 128	1 x 1	1	28,672
Conv 1.6	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
Transition Layer 1	Composite	56 x 56 x 256	28 x 28 x 128	-	-	-
Conv	Conv2D	56 x 56 x 256	56 x 56 x 128	1 x 1	1	32,768
Average Pooling	AveragePool2D	56 x 56 x 128	28 x 28 x 128	2 x 2	2	0
Dense Block 2	Composite	28 x 28 x 128	28 x 28 x 512	-	-	-
Bottleneck 2.1	Conv2D	28 x 28 x 128	28 x 28 x 128	1 x 1	1	16,384
Conv 2.1	Conv2D	28 x 28 x 128	28 x 28 x 32	3 x 3	1	36,864
...	...	...	...	...	...	...
Bottleneck 2.12	Conv2D	28 x 28 x 480	28 x 28 x 128	1 x 1	1	61,440
Conv 2.12	Conv2D	28 x 28 x 128	28 x 28 x 32	3 x 3	1	36,864
Transition Layer 2	Composite	28 x 28 x 512	14 x 14 x 256	-	-	-
Conv	Conv2D	28 x 28 x 512	28 x 28 x 256	1 x 1	1	131,072
Average Pooling	AveragePool2D	28 x 28 x 256	14 x 14 x 256	2 x 2	2	0
...	...	...	...	...	...	...

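Below is an example structure table for a Dense Block, using the first Dense Block of DenseNet-121. This block contains 6 dense layers, each made up of a 1x1 bottleneck convolution followed by a 3x3 convolution. Note that the input to each layer is the concatenation of the block input with the feature maps produced by all preceding layers, so the input channel count grows by the growth rate (32) from one layer to the next.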

Layer name	Type	Input size (H x W x C)	Output size (H x W x C)	Kernel size	Stride	Parameters
Bottleneck 1.1	Conv2D	56 x 56 x 64	56 x 56 x 128	1 x 1	1	8,192
Conv 1.1	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
Bottleneck 1.2	Conv2D	56 x 56 x 96	56 x 56 x 128	1 x 1	1	12,288
Conv 1.2	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
Bottleneck 1.3	Conv2D	56 x 56 x 128	56 x 56 x 128	1 x 1	1	16,384
Conv 1.3	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
Bottleneck 1.4	Conv2D	56 x 56 x 160	56 x 56 x 128	1 x 1	1	20,480
Conv 1.4	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
Bottleneck 1.5	Conv2D	56 x 56 x 192	56 x 56 x 128	1 x 1	1	24,576
Conv 1.5	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
Bottleneck 1.6	Conv2D	56 x 56 x 224	56 x 56 x 128	1 x 1	1	28,672
Conv 1.6	Conv2D	56 x 56 x 128	56 x 56 x 32	3 x 3	1	36,864
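
The channel bookkeeping of a Dense Block can be checked with a few lines of plain Python. This is a minimal sketch, assuming bias-free convolutions (as in the PyTorch code in this post), a growth rate of 32, and a bottleneck width of 4 x growth rate; the function name `dense_block_layers` is hypothetical.

```python
def dense_block_layers(in_channels, growth_rate=32, num_layers=6, bn_size=4):
    """Return (layer index, input channels, 1x1 params, 3x3 params) per dense layer."""
    rows = []
    inter = bn_size * growth_rate  # bottleneck width: 128 when growth_rate is 32
    for i in range(num_layers):
        # Each layer sees the block input plus the outputs of all earlier layers.
        c_in = in_channels + i * growth_rate
        bottleneck_params = 1 * 1 * c_in * inter    # 1x1 conv weights, no bias
        conv_params = 3 * 3 * inter * growth_rate   # 3x3 conv weights, no bias
        rows.append((i + 1, c_in, bottleneck_params, conv_params))
    return rows


rows = dense_block_layers(64)
for idx, c_in, p1, p3 in rows:
    print(f"layer 1.{idx}: input channels={c_in}, 1x1 params={p1}, 3x3 params={p3}")

# Channels leaving the block: block input plus one growth_rate slice per layer.
block_out_channels = 64 + 6 * 32
print("block output channels:", block_out_channels)
```

Only the 1x1 bottleneck grows with depth; the 3x3 convolution always maps 128 channels to 32, so its weight count stays constant.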

Below is an example structure table for a Transition Layer, using the first Transition Layer of DenseNet-121:

Layer name	Type	Input size (H x W x C)	Output size (H x W x C)	Kernel size	Stride	Parameters
Conv (Transition)	Conv2D	56 x 56 x 256	56 x 56 x 128	1 x 1	1	32,768
Avg Pooling	AveragePooling2D	56 x 56 x 128	28 x 28 x 128	2 x 2	2	0
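
To see how Dense Blocks and Transition Layers alternate across the whole network, the channel count can be traced stage by stage. The sketch below assumes the DenseNet-121 settings used in this post (block sizes 6, 12, 24, 16, growth rate 32, 64 initial features) and that each Transition Layer halves the channels; `densenet_channel_trace` is a hypothetical helper name.

```python
def densenet_channel_trace(block_config=(6, 12, 24, 16), growth_rate=32,
                           num_init_features=64):
    """Return [(stage name, output channels)] for every dense block and transition."""
    trace = []
    channels = num_init_features
    for i, num_layers in enumerate(block_config):
        # A dense block adds growth_rate channels per layer.
        channels += num_layers * growth_rate
        trace.append((f"denseblock{i + 1}", channels))
        # A transition layer follows every block except the last and halves the channels.
        if i != len(block_config) - 1:
            channels //= 2
            trace.append((f"transition{i + 1}", channels))
    return trace


trace = densenet_channel_trace()
for name, channels in trace:
    print(f"{name}: {channels} channels")
```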

PyTorch source code

import torch
import torch.nn as nn
import torch.nn.functional as F
# Define a single Dense Layer within a Dense Block
class DenseLayer(nn.Module):
    def __init__(self, in_channels, growth_rate):
        super(DenseLayer, self).__init__()
        inter_channels = 4 * growth_rate
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv1 = nn.Conv2d(in_channels, inter_channels, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(inter_channels)
        self.conv2 = nn.Conv2d(inter_channels, growth_rate, kernel_size=3, padding=1, bias=False)
    def forward(self, x):
        out = self.bn1(x)
        out = self.relu(out)
        out = self.conv1(out)
        out = self.bn2(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = torch.cat([x, out], 1)
        return out
# Define the Dense Block
class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate, num_layers):
        super(DenseBlock, self).__init__()
        layers = []
        for i in range(num_layers):
            layers.append(DenseLayer(in_channels + i * growth_rate, growth_rate))
        self.layers = nn.Sequential(*layers)
    def forward(self, x):
        return self.layers(x)
# Define the Transition Layer
class TransitionLayer(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(TransitionLayer, self).__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.pool = nn.AvgPool2d(kernel_size=2, stride=2)
    def forward(self, x):
        out = self.bn(x)
        out = self.relu(out)
        out = self.conv(out)
        out = self.pool(out)
        return out
# Define DenseNet
class DenseNet(nn.Module):
    def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64, bn_size=4, drop_rate=0, num_classes=1000):
        super(DenseNet, self).__init__()
        # Initial convolution layer
        self.features = nn.Sequential(
            nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(num_init_features),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        )
        # Number of channels entering each Dense Block
        num_features = num_init_features
        for i, num_layers in enumerate(block_config):
            # Add a Dense Block
            self.features.add_module('denseblock%d' % (i + 1),
                                     DenseBlock(num_features, growth_rate, num_layers))
            # Update the channel count
            num_features += num_layers * growth_rate
            # Add a Transition Layer between Dense Blocks, except after the last one
            if i != len(block_config) - 1:
                self.features.add_module('transition%d' % (i + 1),
                                         TransitionLayer(num_features, num_features // 2))
                num_features = num_features // 2
        # Final BatchNorm and ReLU
        self.features.add_module('bn', nn.BatchNorm2d(num_features))
        self.features.add_module('relu', nn.ReLU(inplace=True))
        # Classifier (global average pooling is applied in forward)
        self.classifier = nn.Linear(num_features, num_classes)
        # Initialize weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.constant_(m.bias, 0)
    def forward(self, x):
        features = self.features(x)
        out = F.adaptive_avg_pool2d(features, (1, 1))
        out = torch.flatten(out, 1)
        out = self.classifier(out)
        return out
# Create a DenseNet-121 model
densenet121 = DenseNet(growth_rate=32, block_config=(6, 12, 24, 16))
# Print the model structure
print(densenet121)
# Assume the input tensor is 3x224x224
input_tensor = torch.randn(1, 3, 224, 224)
# Forward pass
output = densenet121(input_tensor)
print(output.shape)  # Should print torch.Size([1, 1000]): batch size 1, 1000 classes
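
The "121" in the model name is usually explained as a count of weighted layers. A short check of that arithmetic, assuming the `block_config` used above and counting two convolutions per dense layer, one convolution per transition, the initial convolution, and the final classifier:

```python
block_config = (6, 12, 24, 16)

dense_convs = 2 * sum(block_config)        # 1x1 bottleneck + 3x3 conv per dense layer
transition_convs = len(block_config) - 1   # one 1x1 conv per transition layer
initial_conv = 1                           # the 7x7 stem convolution
classifier = 1                             # the final fully connected layer

depth = initial_conv + dense_convs + transition_convs + classifier
print(depth)
```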

MobileNet

Structure

Layer name	Type	Input size (HWC)	Output size (HWC)	Kernel size	Stride	Parameters
Conv2d_0	Conv2d	224x224x3	112x112x32	3x3	2	864
DepthwiseConv2d_1	DepthwiseConv2d	112x112x32	112x112x32	3x3	1	288
Conv2d_2	Conv2d	112x112x32	112x112x64	1x1	1	2048
DepthwiseConv2d_3	DepthwiseConv2d	112x112x64	56x56x64	3x3	2	576
Conv2d_4	Conv2d	56x56x64	56x56x128	1x1	1	8192
...	...	...	...	...	...	...
DepthwiseConv2d_12	DepthwiseConv2d	14x14x512	14x14x512	3x3	1	4608
Conv2d_13	Conv2d	14x14x512	14x14x1024	1x1	1	524288
DepthwiseConv2d_14	DepthwiseConv2d	14x14x1024	7x7x1024	3x3	2	9216
Conv2d_15	Conv2d	7x7x1024	7x7x1024	1x1	1	1048576
AvgPool2d_16	AvgPool2d	7x7x1024	1x1x1024	7x7	1	0
Flatten_17	Flatten	1x1x1024	1024	-	-	0
Linear_18	Linear	1024	1000	-	-	1025000
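
The table pairs each depthwise 3x3 convolution with a pointwise 1x1 convolution. The sketch below compares the weight count of such a pair with a standard 3x3 convolution for the same channel change, assuming bias-free layers; both function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights of an ordinary k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k=3):
    """Weights of a depthwise k x k conv followed by a pointwise 1x1 conv (no bias)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


std = standard_conv_params(32, 64)
sep = separable_conv_params(32, 64)
ratio = sep / std
print(std, sep, round(ratio, 4))
```

For 32 input and 64 output channels the separable pair uses 288 + 2,048 weights, and the ratio to the standard convolution equals 1/c_out + 1/k².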

PyTorch source code

import torch
import torch.nn as nn
import torch.nn.functional as F
class MobileNetV1(nn.Module):
    def __init__(self, num_classes=1000):
        super(MobileNetV1, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(32)
        self.layers = self._make_layers(in_channels=32)
        self.conv2 = nn.Conv2d(1024, 1024, kernel_size=1, stride=1, bias=False)
        self.bn2 = nn.BatchNorm2d(1024)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(1024, num_classes)
    def _make_layers(self, in_channels):
        layers = []
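        # Per-block configuration: (pointwise output channels, depthwise stride)
        cfg = [
            (64, 1),
            (128, 2), (128, 1),
            (256, 2), (256, 1),
            (512, 2),
            (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
            (1024, 2),
        ]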
        for x, stride in cfg:
            # Depthwise convolution (one 3x3 filter per input channel)
            layers.append(nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=stride, padding=1, groups=in_channels, bias=False))
            layers.append(nn.BatchNorm2d(in_channels))
            layers.append(nn.ReLU6(inplace=True))
            # Pointwise (1x1) convolution
            layers.append(nn.Conv2d(in_channels, x, kernel_size=1, stride=1, padding=0, bias=False))
            layers.append(nn.BatchNorm2d(x))
            layers.append(nn.ReLU6(inplace=True))
            in_channels = x
        return nn.Sequential(*layers)
    def forward(self, x):
        x = F.relu6(self.bn1(self.conv1(x)))
        x = self.layers(x)
        x = F.relu6(self.bn2(self.conv2(x)))
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x
# Create a model instance
model = MobileNetV1(num_classes=1000)
print(model)

Spatial Attention Network

Structure

Layer name	Type	Input size (HWC)	Output size (HWC)	Kernel size	Stride	Parameters
Input	-	224x224x3	-	-	-	0
Conv1	Conv2D	224x224x3	112x112x64	7x7	2	9472
BatchNorm1	BatchNorm	112x112x64	112x112x64	-	-	256
ReLU1	ReLU	112x112x64	112x112x64	-	-	0
MaxPool1	MaxPooling	112x112x64	56x56x64	3x3	2	0
Conv2	Conv2D	56x56x64	56x56x128	3x3	1	73856
BatchNorm2	BatchNorm	56x56x128	56x56x128	-	-	512
ReLU2	ReLU	56x56x128	56x56x128	-	-	0
SpatialAttn1	SpatialAttn	56x56x128	56x56x128	-	-	8192
Conv3	Conv2D	56x56x128	28x28x256	3x3	2	295168
BatchNorm3	BatchNorm	28x28x256	28x28x256	-	-	1024
ReLU3	ReLU	28x28x256	28x28x256	-	-	0
SpatialAttn2	SpatialAttn	28x28x256	28x28x256	-	-	32768
Conv4	Conv2D	28x28x256	14x14x512	3x3	2	1180160
BatchNorm4	BatchNorm	14x14x512	14x14x512	-	-	2048
ReLU4	ReLU	14x14x512	14x14x512	-	-	0
SpatialAttn3	SpatialAttn	14x14x512	14x14x512	-	-	131072
AvgPool	AvgPooling	14x14x512	7x7x512	2x2	2	0
Flatten	Flatten	7x7x512	25088	-	-	0
FC1	Dense	25088	4096	-	-	102764544
ReLU5	ReLU	4096	4096	-	-	0
Dropout	Dropout	4096	4096	-	-	0
FC2	Dense	4096	1000	-	-	4097000
Softmax	Softmax	1000	1000	-	-	0

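Below is the structure table for a simplified spatial attention module. Note that this table is only an example; an actual network may be structured differently.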

Layer name	Type	Input size (HWC)	Output size (HWC)	Kernel size	Stride	Parameters
Input	-	HxWxC	-	-	-	0
Conv1	Conv2D	HxWxC	HxWx1	1x1	1	C
Sigmoid	Sigmoid	HxWx1	HxWx1	-	-	0
Multiply	Element-wise Mul	HxWxC	HxWxC	-	-	0
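
The simplified module reduces to three steps per pixel: a 1x1 convolution collapses the C channels into one score, a sigmoid turns the score into a gate in (0, 1), and the gate scales every channel of that pixel. A dependency-free sketch follows, assuming a bias-free 1x1 convolution and nested Python lists in H x W x C layout; `spatial_attention` and the toy values are illustrative only.

```python
import math


def spatial_attention(feature_map, weights):
    """feature_map: H x W x C nested lists; weights: C weights of the 1x1 conv."""
    out = []
    for row in feature_map:
        out_row = []
        for pixel in row:
            # 1x1 convolution: weighted sum over channels gives one score per pixel.
            score = sum(w * v for w, v in zip(weights, pixel))
            # Sigmoid maps the score to an attention weight between 0 and 1.
            gate = 1.0 / (1.0 + math.exp(-score))
            # The single attention weight is applied to all channels of the pixel.
            out_row.append([gate * v for v in pixel])
        out.append(out_row)
    return out


fmap = [[[1.0, 2.0], [0.0, 0.0]],
        [[-1.0, 1.0], [3.0, -3.0]]]
attended = spatial_attention(fmap, weights=[0.5, -0.5])
print(attended)
```

The output keeps the H x W x C shape of the input, and the module has exactly C learnable weights.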

PyTorch source code

import torch
import torch.nn as nn
import torch.nn.functional as F
class SpatialAttentionModule(nn.Module):
    def __init__(self, kernel_size=7):
        super(SpatialAttentionModule, self).__init__()
        assert kernel_size % 2 == 1, "Kernel size must be odd"
        self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=kernel_size//2, bias=False)
        self.sigmoid = nn.Sigmoid()
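    def forward(self, x):
        # Channel-wise average and max of the input feature map
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        attn = torch.cat([avg_out, max_out], dim=1)
        attn = self.sigmoid(self.conv1(attn))
        # Re-weight the original features; the 1-channel map broadcasts over channels
        return x * attn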
class SpatialAttentionNetwork(nn.Module):
    def __init__(self):
        super(SpatialAttentionNetwork, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.spatial_attention = SpatialAttentionModule(kernel_size=7)
        self.layer1 = self._make_layer(64, 64, 3)
        self.layer2 = self._make_layer(64, 128, 4, stride=2)
        self.layer3 = self._make_layer(128, 256, 6, stride=2)
        self.layer4 = self._make_layer(256, 512, 3, stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, 1000)
    def _make_layer(self, in_channels, out_channels, blocks, stride=1):
        layers = []
        layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False))
        layers.append(nn.BatchNorm2d(out_channels))
        layers.append(nn.ReLU(inplace=True))
        for i in range(1, blocks):
            layers.append(nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False))
            layers.append(nn.BatchNorm2d(out_channels))
            layers.append(nn.ReLU(inplace=True))
        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        
        x = self.spatial_attention(x)
        
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        
        return x
# Instantiate the network
san = SpatialAttentionNetwork()
# Print the network structure
print(san)

Convolutional Variational Autoencoder

Structure 1 (transposed convolution)

Layer name	Type	Input size (HWC)	Output size (HWC)	Kernel size	Stride	Parameters
Input	-	128x128x3	-	-	-	0
Conv1	Conv2D	128x128x3	64x64x32	3x3	2x2	896
ReLU1	ReLU	64x64x32	64x64x32	-	-	0
Conv2	Conv2D	64x64x32	32x32x64	3x3	2x2	18496
ReLU2	ReLU	32x32x64	32x32x64	-	-	0
Conv3	Conv2D	32x32x64	16x16x128	3x3	2x2	73856
ReLU3	ReLU	16x16x128	16x16x128	-	-	0
Flatten	Flatten	16x16x128	32768	-	-	0
FC4	Dense	32768	512	-	-	16777728
FC_mean	Dense	512	10	-	-	5130
FC_log_var	Dense	512	10	-	-	5130
Sampling	Sampling	10	10	-	-	0
FC5	Dense	10	512	-	-	5632
FC6	Dense	512	32768	-	-	16809984
Reshape	Reshape	32768	16x16x128	-	-	0
Deconv1	Conv2DTranspose	16x16x128	32x32x64	3x3	2x2	73792
ReLU4	ReLU	32x32x64	32x32x64	-	-	0
Deconv2	Conv2DTranspose	32x32x64	64x64x32	3x3	2x2	18464
ReLU5	ReLU	64x64x32	64x64x32	-	-	0
Deconv3	Conv2DTranspose	64x64x32	128x128x3	3x3	2x2	867
Sigmoid	Sigmoid	128x128x3	128x128x3	-	-	0
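
The halving and doubling of the spatial size in this table follows from the usual output-size formulas for strided and transposed convolutions. The sketch below assumes `padding=1` for all layers and `output_padding=1` for the transposed ones; the table does not state these values, so treat them as one choice that reproduces the listed sizes.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output size of a strided convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output size of a transposed convolution along one spatial axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Encoder: three stride-2 convolutions starting from 128.
encoder_sizes = [128]
for _ in range(3):
    encoder_sizes.append(conv_out(encoder_sizes[-1]))

# Decoder: three stride-2 transposed convolutions starting from the encoder output.
decoder_sizes = [encoder_sizes[-1]]
for _ in range(3):
    decoder_sizes.append(conv_transpose_out(decoder_sizes[-1]))

print(encoder_sizes, decoder_sizes)
```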

Structure 2 (pooling + upsampling)

Layer name	Type	Input size (HWC)	Output size (HWC)	Kernel size	Stride	Parameters
Input	-	128x128x3	-	-	-	0
Conv1	Conv2D	128x128x3	128x128x32	3x3	1x1	896
ReLU1	ReLU	128x128x32	128x128x32	-	-	0
Pool1	MaxPooling2D	128x128x32	64x64x32	2x2	2x2	0
Conv2	Conv2D	64x64x32	64x64x64	3x3	1x1	18496
ReLU2	ReLU	64x64x64	64x64x64	-	-	0
Pool2	MaxPooling2D	64x64x64	32x32x64	2x2	2x2	0
Conv3	Conv2D	32x32x64	32x32x128	3x3	1x1	73856
ReLU3	ReLU	32x32x128	32x32x128	-	-	0
Pool3	MaxPooling2D	32x32x128	16x16x128	2x2	2x2	0
Flatten	Flatten	16x16x128	32768	-	-	0
FC4	Dense	32768	512	-	-	16777728
FC_mean	Dense	512	10	-	-	5130
FC_log_var	Dense	512	10	-	-	5130
Sampling	Sampling	10	10	-	-	0
FC5	Dense	10	512	-	-	5632
FC6	Dense	512	32768	-	-	16809984
Reshape	Reshape	32768	16x16x128	-	-	0
Deconv1	Conv2DTranspose	16x16x128	16x16x64	3x3	1x1	73792
Upsample1	UpSampling2D	16x16x64	32x32x64	2x2	2x2	0
Deconv2	Conv2DTranspose	32x32x64	32x32x32	3x3	1x1	18464
Upsample2	UpSampling2D	32x32x32	64x64x32	2x2	2x2	0
Deconv3	Conv2DTranspose	64x64x32	64x64x3	3x3	1x1	867
Sigmoid	Sigmoid	64x64x3	64x64x3	-	-	0
Upsample3	UpSampling2D	64x64x3	128x128x3	2x2	2x2	0

Source code

import torch
import torch.nn as nn
import torch.nn.functional as F
class CVAE(nn.Module):
    def __init__(self):
        super(CVAE, self).__init__()
        # Encoder
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),  # Output: 128x128x32
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # Output: 64x64x32
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),  # Output: 64x64x64
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # Output: 32x32x64
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),  # Output: 32x32x128
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # Output: 16x16x128
        )
        # Fully connected layers producing the mean and log-variance
        self.fc_mean = nn.Linear(16*16*128, 10)
        self.fc_log_var = nn.Linear(16*16*128, 10)
        # Decoder
        self.decoder = nn.Sequential(
            nn.Linear(10, 16*16*128),
            nn.ReLU(),
            nn.Unflatten(1, (128, 16, 16)),
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=1, padding=1),  # Output: 16x16x64
            nn.ReLU(),
            nn.Upsample(scale_factor=2),  # Output: 32x32x64
            nn.ConvTranspose2d(64, 32, kernel_size=3, stride=1, padding=1),  # Output: 32x32x32
            nn.ReLU(),
            nn.Upsample(scale_factor=2),  # Output: 64x64x32
            nn.ConvTranspose2d(32, 3, kernel_size=3, stride=1, padding=1),  # Output: 64x64x3
            nn.Sigmoid(),
            nn.Upsample(scale_factor=2),  # Output: 128x128x3
        )
    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5*logvar)
        eps = torch.randn_like(std)
        return mu + eps*std
    def forward(self, x):
        # Encode
        encoded = self.encoder(x)
        encoded = encoded.view(encoded.size(0), -1)
        mu = self.fc_mean(encoded)
        logvar = self.fc_log_var(encoded)
        # Reparameterize
        z = self.reparameterize(mu, logvar)
        # Decode
        decoded = self.decoder(z)
        return decoded, mu, logvar
# Instantiate the model
cvae = CVAE()
# Print the model structure
print(cvae)
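
The model returns `mu` and `logvar` alongside the reconstruction because a VAE loss typically adds a KL divergence term to the reconstruction error. The post does not define a loss, so the following is only a sketch of the standard closed-form KL term against a unit Gaussian prior, written in plain Python for a single latent vector; `kl_divergence` is a hypothetical helper.

```python
import math


def kl_divergence(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, 1)), summed over the latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


# A posterior equal to the prior has zero divergence.
kl_zero = kl_divergence([0.0] * 10, [0.0] * 10)
# Shifting every mean by 1 with unit variance costs 0.5 per dimension.
kl_shifted = kl_divergence([1.0] * 10, [0.0] * 10)
print(kl_zero, kl_shifted)
```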

U-Net

Structure

Layer name	Type	Input size (HxWxC)	Output size (HxWxC)	Kernel size	Stride	Parameters
Input	-	512x512x1	-	-	-	-
Conv2D_1	Conv2D	512x512x1	512x512x64	3x3	1	640
BatchNorm_1	BatchNorm	512x512x64	512x512x64	-	-	256
ReLU_1	ReLU	512x512x64	512x512x64	-	-	0
MaxPool2D_1	MaxPool2D	512x512x64	256x256x64	2x2	2	0
Conv2D_2	Conv2D	256x256x64	256x256x128	3x3	1	73856
BatchNorm_2	BatchNorm	256x256x128	256x256x128	-	-	512
ReLU_2	ReLU	256x256x128	256x256x128	-	-	0
MaxPool2D_2	MaxPool2D	256x256x128	128x128x128	2x2	2	0
Conv2D_3	Conv2D	128x128x128	128x128x256	3x3	1	295168
BatchNorm_3	BatchNorm	128x128x256	128x128x256	-	-	1024
ReLU_3	ReLU	128x128x256	128x128x256	-	-	0
MaxPool2D_3	MaxPool2D	128x128x256	64x64x256	2x2	2	0
Conv2D_4	Conv2D	64x64x256	64x64x512	3x3	1	1180160
BatchNorm_4	BatchNorm	64x64x512	64x64x512	-	-	2048
ReLU_4	ReLU	64x64x512	64x64x512	-	-	0
MaxPool2D_4	MaxPool2D	64x64x512	32x32x512	2x2	2	0
Conv2D_5	Conv2D	32x32x512	32x32x1024	3x3	1	4719616
BatchNorm_5	BatchNorm	32x32x1024	32x32x1024	-	-	4096
ReLU_5	ReLU	32x32x1024	32x32x1024	-	-	0
UpConv2D_1	ConvTranspose	32x32x1024	64x64x512	2x2	2	2097664
Concat_1	Concat	64x64x512 + 64x64x512	64x64x1024	-	-	0
Conv2D_6	Conv2D	64x64x1024	64x64x512	3x3	1	4719104
BatchNorm_6	BatchNorm	64x64x512	64x64x512	-	-	2048
ReLU_6	ReLU	64x64x512	64x64x512	-	-	0
UpConv2D_2	ConvTranspose	64x64x512	128x128x256	2x2	2	524544
Concat_2	Concat	128x128x256 + 128x128x256	128x128x512	-	-	0
Conv2D_7	Conv2D	128x128x512	128x128x256	3x3	1	1179904
BatchNorm_7	BatchNorm	128x128x256	128x128x256	-	-	1024
ReLU_7	ReLU	128x128x256	128x128x256	-	-	0
UpConv2D_3	ConvTranspose	128x128x256	256x256x128	2x2	2	131200
Concat_3	Concat	256x256x128 + 256x256x128	256x256x256	-	-	0
Conv2D_8	Conv2D	256x256x256	256x256x128	3x3	1	295040
BatchNorm_8	BatchNorm	256x256x128	256x256x128	-	-	512
ReLU_8	ReLU	256x256x128	256x256x128	-	-	0
UpConv2D_4	ConvTranspose	256x256x128	512x512x64	2x2	2	32832
Concat_4	Concat	512x512x64 + 512x512x64	512x512x128	-	-	0
Conv2D_9	Conv2D	512x512x128	512x512x64	3x3	1	73792
BatchNorm_9	BatchNorm	512x512x64	512x512x64	-	-	256
ReLU_9	ReLU	512x512x64	512x512x64	-	-	0
Conv2D_10	Conv2D	512x512x64	512x512x1	1x1	1	65
Sigmoid	Sigmoid	512x512x1	512x512x1	-	-	0
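
With size-preserving (padded) 3x3 convolutions, the skip concatenations only work when each upsampled map has exactly the size of its encoder counterpart. The sketch below traces sizes through four 2x2 poolings and four 2x up-convolutions to check this for a given input size; it assumes `padding=1` convolutions as in the PyTorch code of this post, and `unet_sizes` is a hypothetical helper.

```python
def unet_sizes(size, depth=4):
    """Return (encoder sizes, whether every decoder stage matches its skip connection)."""
    encoder = [size]
    for _ in range(depth):
        encoder.append(encoder[-1] // 2)   # 2x2 max pooling floors odd sizes
    skips_match = True
    current = encoder[-1]
    for skip in reversed(encoder[:-1]):
        current *= 2                       # transposed conv with kernel 2, stride 2
        if current != skip:
            skips_match = False
            break
    return encoder, skips_match


sizes_512, ok_512 = unet_sizes(512)
sizes_572, ok_572 = unet_sizes(572)
print(sizes_512, ok_512)
print(sizes_572, ok_572)
```

Under these assumptions an input side length that is a multiple of 16, such as 512, lines up at every stage, while 572 reaches an odd size (143) on the way down and no longer matches on the way up.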

Source code

import torch
import torch.nn as nn
import torch.nn.functional as F
class UNet(nn.Module):
    def __init__(self, in_channels=1, out_channels=1):
        super(UNet, self).__init__()
        
        # Encoder path
        self.conv1 = self.conv_block(in_channels, 64)
        self.conv2 = self.conv_block(64, 128)
        self.conv3 = self.conv_block(128, 256)
        self.conv4 = self.conv_block(256, 512)
        self.conv5 = self.conv_block(512, 1024)
        
        # Decoder path: each up-convolution halves the channels and doubles the size
        self.upconv4 = self.up_conv_block(1024, 512)
        self.upconv3 = self.up_conv_block(512, 256)
        self.upconv2 = self.up_conv_block(256, 128)
        self.upconv1 = self.up_conv_block(128, 64)
        
        # Convolutions applied after each skip concatenation (channels: 2*C -> C)
        self.dec_conv4 = self.conv_block(1024, 512)
        self.dec_conv3 = self.conv_block(512, 256)
        self.dec_conv2 = self.conv_block(256, 128)
        self.dec_conv1 = self.conv_block(128, 64)
        
        # Output
        self.out = nn.Conv2d(64, out_channels, kernel_size=1)
        
    def conv_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )
        return block
    
    def up_conv_block(self, in_channels, out_channels):
        block = nn.Sequential(
            nn.ConvTranspose2d(in_channels, out_channels, kernel_size=2, stride=2),
            nn.ReLU(inplace=True)
        )
        return block
    
    def forward(self, x):
        # Encoder path
        enc1 = self.conv1(x)
        enc2 = self.conv2(F.max_pool2d(enc1, 2))
        enc3 = self.conv3(F.max_pool2d(enc2, 2))
        enc4 = self.conv4(F.max_pool2d(enc3, 2))
        enc5 = self.conv5(F.max_pool2d(enc4, 2))
        
        # Decoder path: upsample, concatenate the skip connection, then convolve
        dec4 = self.upconv4(enc5)
        dec4 = self.dec_conv4(torch.cat((enc4, dec4), dim=1))
        dec3 = self.upconv3(dec4)
        dec3 = self.dec_conv3(torch.cat((enc3, dec3), dim=1))
        dec2 = self.upconv2(dec3)
        dec2 = self.dec_conv2(torch.cat((enc2, dec2), dim=1))
        dec1 = self.upconv1(dec2)
        dec1 = self.dec_conv1(torch.cat((enc1, dec1), dim=1))
        
        # Output
        out = self.out(dec1)
        return out
# Example usage (height and width must be divisible by 16 so the skip connections line up):
# unet = UNet()
# input_tensor = torch.randn(1, 1, 512, 512)
# output = unet(input_tensor)  # shape: (1, 1, 512, 512)

Posted 2024-11-14 12:02 by 绝不原创的飞龙
