An explanation of named tensors
import torch
Merge the 3 RGB color channels into a single grayscale channel
Define the variables, using random numbers as stand-in data
img_t = torch.randn(3, 5, 5)
batch_t = torch.randn(2, 3, 5, 5)
weights = torch.randn(3)
The naive approach
Take the mean over the channels dimension as the grayscale value
img_gray_naive = img_t.mean(-3)
batch_gray_naive = batch_t.mean(-3)
img_gray_naive.shape, batch_gray_naive.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]))
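As a quick sanity check (a sketch not in the original, using a made-up constant tensor), `mean(-3)` really does collapse the channel axis by averaging across it:

```python
import torch

# A 3x2x2 "image" whose three channels are constant 0, 1, 2.
t = torch.stack([torch.full((2, 2), float(c)) for c in range(3)])
gray = t.mean(-3)  # average over the channel dimension
print(gray)        # every pixel is (0 + 1 + 2) / 3 = 1.0
print(gray.shape)  # torch.Size([2, 2]) -- channel dim is gone
```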
The weighted approach
The naive approach effectively splits the weight evenly, i.e. each channel gets a weight of 1/3.
weights holds the true weight of each channel; each value in img_t must be multiplied by its channel's weight to get the true contribution.
First, expand weights so its shape broadcasts against img_t
unsqueeze_weights = weights.unsqueeze(-1).unsqueeze(-1)
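To see what the two `unsqueeze` calls buy us, here is a small sketch: the weights go from shape `(3,)` to `(3, 1, 1)`, and broadcasting then stretches the two trailing size-1 dims to line up with both `(3, 5, 5)` and `(2, 3, 5, 5)` tensors:

```python
import torch

weights = torch.randn(3)
uw = weights.unsqueeze(-1).unsqueeze(-1)
print(uw.shape)  # torch.Size([3, 1, 1])
# Broadcasting aligns shapes from the right, so (3, 1, 1) pairs with
# the (channels, rows, columns) tail of both the image and the batch.
print((torch.randn(3, 5, 5) * uw).shape)     # torch.Size([3, 5, 5])
print((torch.randn(2, 3, 5, 5) * uw).shape)  # torch.Size([2, 3, 5, 5])
```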
Next, multiply img_t by unsqueeze_weights elementwise (broadcasting applies each channel's weight across that channel)
img_weights = img_t * unsqueeze_weights
batch_weights = batch_t * unsqueeze_weights
Finally, sum over the channel dimension to get the final grayscale values
img_gray_weighted = img_weights.sum(-3)
batch_gray_weighted = batch_weights.sum(-3)
Checking the shapes confirms the color channel dimension is gone
img_gray_weighted.shape, batch_gray_weighted.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]))
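As a cross-check (a sketch, not from the original), with uniform weights of 1/3 per channel the weighted sum reduces to the naive mean:

```python
import torch

img_t = torch.randn(3, 5, 5)
# Uniform weights of 1/3 per channel, expanded for broadcasting.
uniform = torch.full((3,), 1 / 3).unsqueeze(-1).unsqueeze(-1)
weighted = (img_t * uniform).sum(-3)
print(torch.allclose(weighted, img_t.mean(-3), atol=1e-6))  # True
```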
A more advanced approach: the Einstein summation convention (einsum)
img_gray_weighted_fancy = torch.einsum('...chw,c->...hw', img_t, weights)
batch_gray_weighted_fancy = torch.einsum('...chw,c->...hw', batch_t, weights)
img_gray_weighted_fancy.shape, batch_gray_weighted_fancy.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]))
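The einsum spec `'...chw,c->...hw'` multiplies along the shared `c` index and sums it away, which is exactly the unsqueeze-multiply-sum recipe above. A quick equivalence check (a sketch with fresh random data):

```python
import torch

img_t = torch.randn(3, 5, 5)
weights = torch.randn(3)
manual = (img_t * weights.unsqueeze(-1).unsqueeze(-1)).sum(-3)
fancy = torch.einsum('...chw,c->...hw', img_t, weights)
print(torch.allclose(manual, fancy, atol=1e-6))  # True
```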
The named-tensor approach
Use the refine_names method to give each tensor dimension a name (named tensors are an experimental PyTorch feature)
weights_named = weights.refine_names(..., 'channels')
img_named = img_t.refine_names(..., 'channels', 'rows', 'columns')
batch_named = batch_t.refine_names(..., 'channels', 'rows', 'columns')
Align the 1-D weights_named tensor with the multi-dimensional img_named tensor
weights_aligned = weights_named.align_as(img_named)
Elementwise multiplication, then sum over the dimension referred to by its name, 'channels'
img_gray_named = (img_named * weights_aligned).sum('channels')
batch_gray_named = (batch_named * weights_aligned).sum('channels')
img_gray_named.shape, batch_gray_named.shape, weights_aligned.shape
(torch.Size([5, 5]), torch.Size([2, 5, 5]), torch.Size([3, 1, 1]))
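One practical note (a sketch, not from the original): many PyTorch operations do not yet support named tensors, so names usually have to be dropped with `rename(None)` before handing the tensor to such code. The `names` attribute shows the current dimension names:

```python
import torch

img_named = torch.randn(3, 5, 5).refine_names('channels', 'rows', 'columns')
print(img_named.names)  # ('channels', 'rows', 'columns')
# Drop the names so the tensor works with name-unaware ops.
img_plain = img_named.rename(None)
print(img_plain.names)  # (None, None, None)
```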