One-hot encoding a class-label tensor in PyTorch

References:

- torch.index_select (PyTorch documentation)
- One-hot encoding a class-label tensor in PyTorch

Main text

index_select
torch.index_select(input, dim, index, *, out=None) → Tensor

- input (Tensor) – the input tensor.
- dim (int) – the dimension in which we index
- index (IntTensor or LongTensor) – the 1-D tensor containing the indices to index

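As its name suggests, this function uses a 1-D tensor of indices to select sub-tensors along the given dimension of a tensor. The result has the same number of dimensions as the input, and its size along dim equals the length of index.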
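To make the behaviour concrete, here is a minimal sketch on a small matrix. The tensor `x` and the index values are made up for illustration only.

```python
import torch

# A 3x4 matrix whose entries make rows and columns easy to recognize.
x = torch.arange(12).reshape(3, 4)
# tensor([[ 0,  1,  2,  3],
#         [ 4,  5,  6,  7],
#         [ 8,  9, 10, 11]])

# The index must be a 1-D integer tensor; repeated entries are allowed.
index = torch.tensor([2, 0, 2])

rows = torch.index_select(x, dim=0, index=index)  # rows 2, 0, 2 of x
cols = torch.index_select(x, dim=1, index=index)  # columns 2, 0, 2 of x

print(rows.shape)  # torch.Size([3, 4])
print(cols.shape)  # torch.Size([3, 3])
```

Only the size along the indexed dimension changes; it becomes the length of `index`.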

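To understand the motivation for this approach, it helps to work backwards and look at one-hot encoding from the point of view of the class labels.

If the class indices 0, 1, ..., num_classes - 1 are listed in ascending order, the matrix formed by their one-hot encodings is exactly the identity matrix. Each class therefore corresponds to one particular row (or, equivalently, column) of that identity matrix. This is precisely what index_select provides, so we can implement one-hot encoding with it: use the class indices to pick the matching rows or columns of the identity matrix. Here is an example: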

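import torch


def bhw_to_onehot_by_index_select(bhw_tensor: torch.Tensor, num_classes: int) -> torch.Tensor:
    """One-hot encode a label map by selecting rows of an identity matrix.

    Args:
        bhw_tensor: integer class indices of shape (b, h, w)
        num_classes: number of classes; every index must lie in [0, num_classes)
    Returns: float tensor of shape (b, h, w, num_classes)
    """
    assert bhw_tensor.ndim == 3, bhw_tensor.shape
    assert bhw_tensor.min() >= 0 and bhw_tensor.max() < num_classes, torch.unique(bhw_tensor)
    # Row k of the identity matrix is the one-hot vector of class k.
    eye = torch.eye(num_classes, device=bhw_tensor.device)
    # index_select needs a 1-D integer index, so flatten the labels and cast to int64.
    one_hot = eye.index_select(dim=0, index=bhw_tensor.reshape(-1).long())
    # Restore the (b, h, w) layout and append the class dimension.
    return one_hot.reshape(*bhw_tensor.shape, num_classes)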
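As a sanity check, the result can be compared with PyTorch's built-in `torch.nn.functional.one_hot`. The sketch below restates the same identity-matrix idea in a compact helper so that it runs on its own; the helper name `onehot_by_index_select` and the shapes b=2, h=3, w=4 are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F


def onehot_by_index_select(labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    # Compact restatement of the approach above (hypothetical helper name):
    # row k of the identity matrix is the one-hot code of class k.
    eye = torch.eye(num_classes, device=labels.device)
    flat = eye.index_select(dim=0, index=labels.reshape(-1).long())
    return flat.reshape(*labels.shape, num_classes)


torch.manual_seed(0)
num_classes = 5
labels = torch.randint(0, num_classes, (2, 3, 4))  # made-up b=2, h=3, w=4 label map

ours = onehot_by_index_select(labels, num_classes)      # float32, shape (2, 3, 4, 5)
reference = F.one_hot(labels, num_classes=num_classes)  # int64, same shape

# Same values as the built-in, up to dtype.
assert torch.equal(ours.long(), reference)
# argmax over the class dimension recovers the original labels.
assert torch.equal(ours.argmax(dim=-1), labels)

# Move the class dimension to position 1 if a (b, c, h, w) layout is needed.
bchw = ours.permute(0, 3, 1, 2)
```

One difference worth noting: `torch.eye` produces floating-point values by default, so this version returns a float tensor, whereas `F.one_hot` returns int64.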

Posted 2021-12-19 09:41 by 麦克斯的园丁
