RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'
```
Traceback (most recent call last):
  File "E:/MyWorkspace/EEG/Pytorch/Train.py", line 79, in <module>
    opti='Adam')
  File "E:\MyWorkspace\EEG\Pytorch\Utils.py", line 133, in TrainTest_Model
    validation_loss, validation_acc = Test_Model(net, testloader, criterion,True)
  File "E:\MyWorkspace\EEG\EEGLearn-Pytorch\Utils.py", line 82, in Test_Model
    loss = criterion(outputs, labels.cuda()) # GPU
  File "D:\coson\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\coson\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\loss.py", line 1166, in forward
    label_smoothing=self.label_smoothing)
  File "D:\coson\anaconda3\envs\pytorch\lib\site-packages\torch\nn\functional.py", line 3014, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

Process finished with exit code 1
```
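The key failure is at `criterion(outputs, labels.cuda())`. In this project, `criterion` is an instance of `CrossEntropyLoss` at run time, i.e. `criterion = nn.CrossEntropyLoss()`, so the error is raised while the loss is being computed, and the cause is a dtype mismatch. But which argument has the wrong dtype? (It is in fact `labels`. I built the labels by hand as int32, which is the `Int` in the message, so in my case it was easy to tell. But what if you don't know that? Read on.)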
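When the dtypes are not known up front, one quick way to narrow things down is to print the dtype of each tensor right before the loss call. The snippet below is only a minimal sketch: `outputs` and `labels` are made-up stand-ins for the real tensors, and `describe` is a hypothetical helper, not something from the project.

```python
import torch

# Hypothetical stand-ins for the tensors passed to criterion(...).
outputs = torch.randn(4, 3)                              # logits, float32 by default
labels = torch.tensor([0, 2, 1, 1], dtype=torch.int32)   # int32 labels, as in this project


def describe(name, tensor):
    """Return a one-line summary of a tensor's dtype, shape and device."""
    return (f"{name}: dtype={tensor.dtype}, "
            f"shape={tuple(tensor.shape)}, device={tensor.device}")


print(describe("outputs", outputs))
print(describe("labels", labels))
```

A dtype of `torch.int32` on one of the tensors matches the `'Int'` named in the error message, which points at that tensor as the likely culprit.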
Look at line 1166 of torch\nn\modules\loss.py, i.e. `label_smoothing=self.label_smoothing)`. That line is only the tail of a function call; the complete code is:
```python
def forward(self, input: Tensor, target: Tensor) -> Tensor:
    return F.cross_entropy(input, target, weight=self.weight,
                           ignore_index=self.ignore_index, reduction=self.reduction,
                           label_smoothing=self.label_smoothing)
```
Open the definition of `cross_entropy`. Its header looks like this:
```python
def cross_entropy(
    input: Tensor,
    target: Tensor,
    weight: Optional[Tensor] = None,
    size_average: Optional[bool] = None,
    ignore_index: int = -100,
    reduce: Optional[bool] = None,
    reduction: str = "mean",
    label_smoothing: float = 0.0,
) -> Tensor:
```
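The arguments we pass correspond to `input` and `target` of `cross_entropy`. The signature only says that both are tensors and does not state which dtype each must have, so which argument is the mismatched one? Reading further down `cross_entropy`, we find that it calls into a C function, and we're stuck: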
```python
torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
```
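We can't see the underlying implementation of this C function, so what now? Let's look at the official docstring and examples. Here things turn around: the docstring of `cross_entropy` contains the following example: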
```python
# Example of target with class indices
input = torch.randn(3, 5, requires_grad=True)
target = torch.randint(5, (3,), dtype=torch.int64)
loss = F.cross_entropy(input, target)
loss.backward()
```
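The official example uses int64, i.e. the long type, for `target`.
So we can conclude that the error is caused by the dtype of the `labels` argument in `criterion(outputs, labels.cuda())`.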
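This conclusion can be checked with a small experiment that calls `F.cross_entropy` once with an int64 target and once with an int32 target. The sketch below runs on CPU with made-up data, and `try_loss` is a hypothetical helper. Note the assumptions: on CPU the error text differs from the CUDA message in the traceback, and whether an int32 target is rejected at all may depend on the installed PyTorch version, so the code records the outcome instead of assuming it.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(3, 5)                               # hypothetical batch: 3 samples, 5 classes
target_long = torch.randint(5, (3,), dtype=torch.int64)  # dtype used in the official example
target_int = target_long.to(torch.int32)                 # same class indices stored as int32


def try_loss(target):
    """Return (ok, info): the loss value if the call works, else the error text."""
    try:
        return True, F.cross_entropy(logits, target).item()
    except RuntimeError as err:
        return False, str(err)


ok_long, info_long = try_loss(target_long)
ok_int, info_int = try_loss(target_int)
print("int64 target:", ok_long, info_long)
print("int32 target:", ok_int, info_int)
```

If the int32 call fails while the int64 call succeeds, the target dtype is confirmed as the cause.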
Given the above, we can convert the dtype of `labels`:
```python
labels.long().cuda()
```
so the call becomes:
```python
criterion(outputs, labels.long().cuda())
```
After this change, the code runs normally.
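If the loss is computed in several places, the cast can be kept in one spot. The following is a minimal sketch under stated assumptions: `loss_with_long_labels` is a hypothetical helper (not part of the project or of PyTorch), and the tensors are made-up examples. It moves the labels to whatever device `outputs` lives on instead of hard-coding `.cuda()`, so the same code also runs on CPU.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()


def loss_with_long_labels(criterion, outputs, labels):
    """Cast labels to int64 on the same device as outputs, then compute the loss."""
    labels = labels.to(device=outputs.device, dtype=torch.long)
    return criterion(outputs, labels)


outputs = torch.randn(4, 3)                              # hypothetical logits: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 1], dtype=torch.int32)   # int32 labels, as in this post
loss = loss_with_long_labels(criterion, outputs, labels)
print(loss.item())
```

Another option, where the data pipeline is under your control, is to build the label tensor as int64 from the start so that no cast is needed at the loss call.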