An explanation of named tensors

import torch

Merge the 3 RGB color channels into a single grayscale channel

Define the variables, using random numbers as stand-ins for real data

img_t = torch.randn(3, 5, 5)       # a single image: channels, rows, columns
batch_t = torch.randn(2, 3, 5, 5)  # a batch of images: batch, channels, rows, columns
weights = torch.randn(3)           # one weight per color channel

The naive approach

Use the mean over the channels dimension as the grayscale value

img_gray_naive = img_t.mean(-3)      # -3 indexes the channels dimension from the end
batch_gray_naive = batch_t.mean(-3)  # the same index works for the batch, since channels is still third from the end
img_gray_naive.shape, batch_gray_naive.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]))

The weighted approach

The naive approach effectively splits the weight evenly, giving each channel a weight of 1/3.
weights holds the true weight of each channel; each value in img_t has to be multiplied by its channel's weight to get its true contribution.
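
As a quick sanity check (an addition to the original; equal_weights is just an illustrative name), the naive mean should agree with a weighted sum in which every channel weight is 1/3:

equal_weights = torch.full((3,), 1.0 / 3).unsqueeze(-1).unsqueeze(-1)
torch.allclose(img_gray_naive, (img_t * equal_weights).sum(-3))  # expected: True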

First, expand weights so that it has the same dimensional structure as img_t

unsqueeze_weights = weights.unsqueeze(-1).unsqueeze(-1)
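
The two unsqueeze(-1) calls append trailing size-1 dimensions, turning the (3,) weight vector into a (3, 1, 1) tensor that can broadcast against the images; checking the shape (not shown in the original) confirms this:

unsqueeze_weights.shape
torch.Size([3, 1, 1])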

Next, multiply img_t by unsqueeze_weights elementwise (broadcasting handles the mismatched shapes), which applies the weights to img_t

img_weights = img_t * unsqueeze_weights
batch_weights = batch_t * unsqueeze_weights
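
Broadcasting matches shapes from the trailing dimensions, so the (3, 1, 1) weights expand against both the (3, 5, 5) image and the (2, 3, 5, 5) batch; an extra shape check (not in the original) makes this visible:

img_weights.shape, batch_weights.shape
(torch.Size([3, 5, 5]), torch.Size([2, 3, 5, 5]))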

Finally, sum the weighted values over the channel dimension to obtain the grayscale result

img_gray_weighted = img_weights.sum(-3)
batch_gray_weighted = batch_weights.sum(-3)

Checking the shapes shows that the color channel dimension is gone

img_gray_weighted.shape, batch_gray_weighted.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]))

Advanced approach: the Einstein summation convention

img_gray_weighted_fancy = torch.einsum('...chw,c->...hw', img_t, weights)
batch_gray_weighted_fancy = torch.einsum('...chw,c->...hw', batch_t, weights)
img_gray_weighted_fancy.shape, batch_gray_weighted_fancy.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]))
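
In the subscript string '...chw,c->...hw', c, h and w stand for channels, rows and columns, and '...' absorbs any leading batch dimensions; since c is absent from the output subscripts, einsum sums over it. A quick comparison (added here, not part of the original) against the manual weighted sum should confirm they agree:

torch.allclose(img_gray_weighted_fancy, img_gray_weighted)  # expected: True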

The named-tensor approach

Use the refine_names method to give each dimension of a tensor a name

weights_named = weights.refine_names(..., 'channels')
img_named = img_t.refine_names(..., 'channels', 'rows', 'columns')
batch_named = batch_t.refine_names(..., 'channels', 'rows', 'columns')
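
Passing ... to refine_names leaves any leading dimensions unnamed and names the remaining ones from the right; inspecting the names attribute (an extra check, not in the original) shows that the batch dimension of batch_named stays unnamed:

img_named.names, batch_named.names
(('channels', 'rows', 'columns'), (None, 'channels', 'rows', 'columns'))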

Align the one-dimensional weights_named tensor with the multi-dimensional tensor img_named

weights_aligned = weights_named.align_as(img_named)

Multiply elementwise, then sum over the named dimension (passing the dimension name instead of a numeric index)

img_gray_named = (img_named * weights_aligned).sum('channels')
batch_gray_named = (batch_named * weights_aligned).sum('channels')
img_gray_named.shape, batch_gray_named.shape, weights_aligned.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]), torch.Size([3, 1, 1]))
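
As a final sanity check (added here; it compares on unnamed views), rename(None) strips the dimension names so the named result can be compared with the earlier unnamed one:

torch.allclose(img_gray_named.rename(None), img_gray_weighted)  # expected: True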