Several Loss Functions in Deep Learning
In what follows, \(x_i\) denotes an element of the model output vector \(X\), and \(y_i\) denotes an element of the ground-truth vector \(Y\). Note that when a model actually runs there is also a batch dimension; here we only consider a single sample, i.e. batch_size = 1.
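As a quick hedged illustration of the batch case (the batched tensors below are made up for this sketch, not from the comparisons that follow): with batch_size > 1, the built-in losses with their default reduction='mean' simply average over every element, so the single-sample formulas below carry over directly.
import torch
import torch.nn as nn
# Hypothetical batched example: 2 samples with 4 values each
pred = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                     [2.0, 3.0, 4.0, 5.0]])
target = torch.tensor([[1.5, 2.5, 3.5, 4.5],
                       [2.5, 3.5, 4.5, 5.5]])
# Default reduction='mean' averages over all 2 x 4 = 8 elements
loss = nn.L1Loss()(pred, target)
print(loss)  # tensor(0.5000), same value as the single-sample case below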
For each loss we implement the formula by hand and compare the result with the official PyTorch implementation, to confirm the formula and deepen understanding.
L1 Loss
L1 Loss is the 1-norm loss, commonly called the Mean Absolute Error (MAE).
\[MAE = \frac{1}{n} \sum_{i=1}^n |x_i - y_i|
\]
import torch
import torch.nn as nn

# Define the predictions and the ground truth
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])    # model predictions
target = torch.tensor([1.5, 2.5, 3.5, 4.5])  # corresponding ground truth

# Official implementation
L1_loss = nn.L1Loss()
loss = L1_loss(pred, target)
print("pytorch :", loss)

# Custom implementation
def L1Loss(pred, target):
    loss = torch.mean(torch.abs(pred - target))
    return loss

loss = L1Loss(pred, target)
print("ours :", loss)
pytorch : tensor(0.5000)
ours : tensor(0.5000)
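A small aside (not part of the original comparison, just the standard nn.L1Loss options): the reduction argument controls whether the element-wise errors are averaged, summed, or returned as-is.
import torch
import torch.nn as nn
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])
target = torch.tensor([1.5, 2.5, 3.5, 4.5])
# 'sum' returns the summed absolute error, 'none' keeps the element-wise errors
print(nn.L1Loss(reduction='sum')(pred, target))   # tensor(2.)
print(nn.L1Loss(reduction='none')(pred, target))  # tensor([0.5000, 0.5000, 0.5000, 0.5000])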
L2 Loss
L2 Loss is the 2-norm loss, commonly called the Mean Squared Error (MSE).
\[MSE = \frac{1}{n} \sum_{i=1}^{n}(x_i-y_i)^2
\]
import torch
import torch.nn as nn

# Define the predictions and the ground truth
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])    # model predictions
target = torch.tensor([1.5, 2.5, 3.5, 4.5])  # corresponding ground truth

# Official implementation
L2_loss = nn.MSELoss()
loss = L2_loss(pred, target)
print("pytorch :", loss)

# Custom implementation
def L2Loss(pred, target):
    loss = torch.mean((pred - target) ** 2)
    return loss

loss = L2Loss(pred, target)
print("ours :", loss)
pytorch : tensor(0.2500)
ours : tensor(0.2500)
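For completeness, a minimal sketch of the functional counterparts in torch.nn.functional, which compute the same values as the module forms above without constructing a loss object.
import torch
import torch.nn.functional as F
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])
target = torch.tensor([1.5, 2.5, 3.5, 4.5])
# Functional forms of L1 Loss and L2 Loss; the default reduction is also 'mean'
print(F.l1_loss(pred, target))   # tensor(0.5000)
print(F.mse_loss(pred, target))  # tensor(0.2500)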
CrossEntropy Loss
CrossEntropy Loss is the cross-entropy loss. The model output is generally first passed through a softmax function to convert it into probabilities, and the cross entropy is then computed against the target.
\[\begin{cases}
p_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}} \\
L = -{\sum^{n}_{i=1} {y_i \cdot log(p_i)}}
\end{cases}
\]
import torch
import torch.nn as nn

# Define the predictions and the ground truth
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])    # model predictions (raw scores)
target = torch.tensor([1.5, 2.5, 3.5, 4.5])  # corresponding ground truth

# Official implementation (applies log-softmax to the prediction internally)
CE_loss = nn.CrossEntropyLoss()
loss = CE_loss(pred, target)
print("pytorch :", loss)

# Custom implementation
def CELoss(pred, target):
    softmax = nn.Softmax(dim=0)
    pred = softmax(pred)
    loss = -torch.sum(target * torch.log(pred))
    return loss

loss = CELoss(pred, target)
print("ours :", loss)
pytorch : tensor(18.2823)
ours : tensor(18.2823)
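The example above uses a probability-style (soft) target, which recent PyTorch versions accept. In the more common classification setup the target is a class index, and the loss is the negative log-softmax value at that index; a minimal sketch (the logits and label here are invented for illustration):
import torch
import torch.nn as nn
# Hypothetical logits for a batch of one sample with four classes
logits = torch.tensor([[1.0, 2.0, 3.0, 4.0]])
label = torch.tensor([3])  # class-index target
CE_loss = nn.CrossEntropyLoss()
print("pytorch :", CE_loss(logits, label))
# Equivalent manual computation: negative log-softmax at the target index
log_prob = torch.log_softmax(logits, dim=1)
print("manual  :", -log_prob[0, label[0]])  # both print roughly tensor(0.4402)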
KLDiv Loss
KLDiv Loss is the KL-divergence loss. Note that the PyTorch KLDivLoss interface assumes the model output \(x_i\) has already been passed through the log function, while the target \(y_i\) is by default not log-processed. Because of this the code looks a little odd next to the formula, but with this in mind the two match up.
\[L = \frac{1}{n}{\sum^{n}_{i=1}{y_i \cdot log(\frac{y_i}{x_i})}}
\]
import torch
import torch.nn as nn

# Define the predictions and the ground truth
pred = torch.tensor([1.0, 2.0, 3.0, 4.0])    # model predictions (treated as log-values)
target = torch.tensor([1.5, 2.5, 3.5, 4.5])  # corresponding ground truth

# Official implementation
KLDivLoss = nn.KLDivLoss()
loss = KLDivLoss(pred, target)
print("pytorch not log_target:", loss)
KLDivLoss = nn.KLDivLoss(log_target=True)
loss = KLDivLoss(pred, target)
print("pytorch log_target:", loss)

# Custom implementation
def KLDivLoss(pred, target, log_target=False):
    if not log_target:
        # target is a plain probability, pred is already log(x)
        loss = torch.mean(target * (torch.log(target) - pred))
    else:
        # target is also given in log space
        loss = torch.mean(torch.exp(target) * (target - pred))
    return loss

loss = KLDivLoss(pred, target)
print("ours not log_target:", loss)
loss = KLDivLoss(pred, target, log_target=True)
print("ours log_target:", loss)
pytorch not log_target: tensor(-5.2370)
pytorch log_target: tensor(17.4746)
ours not log_target: tensor(-5.2370)
ours log_target: tensor(17.4746)
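To round off, a minimal sketch of how KLDivLoss is typically used in practice (the logits here are invented for illustration): the input is produced with log_softmax, the target with softmax, and reduction='batchmean' is the reduction the PyTorch documentation recommends as matching the mathematical definition of KL divergence.
import torch
import torch.nn as nn
import torch.nn.functional as F
# Hypothetical raw scores for a batch of one sample with four classes
pred_logits = torch.tensor([[1.0, 2.0, 3.0, 4.0]])
target_logits = torch.tensor([[4.0, 3.0, 2.0, 1.0]])
# Input must be log-probabilities, target plain probabilities (log_target=False)
log_p = F.log_softmax(pred_logits, dim=1)
q = F.softmax(target_logits, dim=1)
# 'batchmean' divides the summed divergence by the batch size
kl = nn.KLDivLoss(reduction='batchmean')
print(kl(log_p, q))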