Keras (2): Loss Functions
This chapter introduces Keras losses (keras.losses). Functionally, loss functions fall into three categories:
- Probabilistic losses: mainly used for classification
- Regression losses: used for regression problems
- Hinge losses: also known as "maximum-margin" losses, mainly used for SVMs to maximize the margin around the separating hyperplane
Using loss functions
With compile() and fit()
A loss is usually passed as the loss argument when the model is compiled. There are two ways to write it:
from tensorflow import keras
from tensorflow.keras import layers
model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(layers.Activation('softmax'))
loss_fn = keras.losses.SparseCategoricalCrossentropy()
model.compile(loss=loss_fn, optimizer='adam')
# Or pass the loss by its string name: default parameters will be used
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
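With the loss configured, training proceeds through fit() as usual. A minimal sketch with random NumPy data (the shapes and the 64 classes here are made-up values matching the toy model above):
import numpy as np
# Hypothetical toy data: 100 samples, 10 features, integer labels for 64 classes
x = np.random.random((100, 10))
y = np.random.randint(0, 64, size=(100,))
# The loss passed to compile() is evaluated on every batch during fit()
model.fit(x, y, batch_size=32, epochs=2)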
Standalone usage
A loss function can also be used on its own: simply pass in two tensors as y_true and y_pred. There is also an optional sample_weight argument.
tf.keras.losses.mean_squared_error(tf.ones((2, 2,)), tf.zeros((2, 2)))
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 1.], dtype=float32)>
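The class form of a loss also accepts the optional sample_weight argument when called; a small sketch (the weight values are arbitrary):
mse = tf.keras.losses.MeanSquaredError()
# Per-sample weights (arbitrary values); the weighted per-sample losses are
# reduced to a single scalar according to the loss's reduction setting.
mse(tf.ones((2, 2)), tf.zeros((2, 2)), sample_weight=tf.constant([0.7, 0.3])).numpy()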
Custom losses
You can also pass a custom function as the loss argument. The definition is simple: it takes the two tensors y_true and y_pred as inputs and returns a tensor of per-sample loss values.
def my_loss_fn(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)  # Note the `axis=-1`

model.compile(optimizer='adam', loss=my_loss_fn)
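If the loss needs its own parameters, it can also be written as a class by subclassing tf.keras.losses.Loss and implementing call(); a minimal sketch (the class name and the factor parameter are invented for illustration):
import tensorflow as tf

class ScaledMSE(tf.keras.losses.Loss):
    """Hypothetical MSE variant scaled by a constant factor."""
    def __init__(self, factor=1.0, name="scaled_mse"):
        super().__init__(name=name)
        self.factor = factor

    def call(self, y_true, y_pred):
        # Return per-sample losses; Keras applies the reduction afterwards.
        return self.factor * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

model.compile(optimizer='adam', loss=ScaledMSE(factor=0.5))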
add_loss()
add_loss() is a method of the Layer class (tf.keras.layers.Layer). When writing a custom layer, you can call it inside call() to add extra loss terms to the training process.
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import Layer

class MyActivityRegularizer(Layer):
    """Layer that creates an activity sparsity regularization loss."""

    def __init__(self, rate=1e-2):
        super(MyActivityRegularizer, self).__init__()
        self.rate = rate

    def call(self, inputs):
        # We use `add_loss` to create a regularization loss
        # that depends on the inputs.
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs
class SparseMLP(Layer):
    """Stack of Linear layers with a sparsity regularization loss."""

    def __init__(self, output_dim):
        super(SparseMLP, self).__init__()
        self.dense_1 = layers.Dense(32, activation=tf.nn.relu)
        self.regularization = MyActivityRegularizer(1e-2)
        self.dense_2 = layers.Dense(output_dim)

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.regularization(x)
        return self.dense_2(x)
mlp = SparseMLP(1)
y = mlp(tf.ones((10, 10)))
print(mlp.losses) # List containing one float32 scalar
[<tf.Tensor: shape=(), dtype=float32, numpy=0.800574>]
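Losses registered through add_loss() are collected in the layer's (or model's) losses property. In a custom training loop they are typically summed into the main loss; a minimal sketch, assuming a regression target and MSE as the main loss:
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

x = tf.ones((10, 10))
y = tf.zeros((10, 1))

with tf.GradientTape() as tape:
    y_pred = mlp(x)
    # Main loss plus the regularization terms registered via add_loss()
    loss = loss_fn(y, y_pred) + sum(mlp.losses)

grads = tape.gradient(loss, mlp.trainable_weights)
optimizer.apply_gradients(zip(grads, mlp.trainable_weights))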
Loss function types
Probabilistic losses
For classification problems over probabilities, cross-entropy is the usual choice of loss function.
BinaryCrossentropy(BCE)
BinaryCrossentropy is the cross-entropy for binary (0/1) labels. Its formula is:
loss = -mean(y_true * log(y_pred) + (1 - y_true) * log(1 - y_pred))
Cross-entropy measures the distance between two probability distributions: the smaller it is, the closer the two distributions are.
For more background on BinaryCrossentropy, see:
https://towardsdatascience.com/understanding-binary-cross-entropy-log-loss-a-visual-explanation-a3ac6025181a
Example:
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
y_true = [0, 1, 0, 0]
y_pred = [0.1, 1.29, -1, -6.2]  # raw logits
bce(y_true, y_pred).numpy()
0.32571107
Here from_logits=True means y_pred holds raw logits in (-inf, inf), whose sign indicates whether the prediction leans toward 0 or 1; use from_logits=False when y_pred holds probabilities in [0, 1].
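As a quick sanity check, passing logits with from_logits=True should give (up to internal clipping) the same value as applying sigmoid first and using from_logits=False:
logits = tf.constant([0.1, 1.29, -1., -6.2])
labels = tf.constant([0., 1., 0., 0.])
bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
bce_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)
print(bce_logits(labels, logits).numpy())            # on raw logits
print(bce_probs(labels, tf.sigmoid(logits)).numpy())  # on probabilities, nearly identical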
CategoricalCrossentropy(CCE)
Both CategoricalCrossentropy and SparseCategoricalCrossentropy are used for multi-class classification; CategoricalCrossentropy is used when the labels are one-hot encoded.
cce = tf.keras.losses.CategoricalCrossentropy()
y_true = [[0, 1, 0], [0, 0, 1]]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
print(cce(y_true, y_pred).numpy())
y_pred = [[0.05, 0.95, 0], [0.1, 0.1, 0.8]]
print(cce(y_true, y_pred).numpy())
1.1769392
0.13721842
SparseCategoricalCrossentropy(SCCE)
SparseCategoricalCrossentropy is used for multi-class classification with integer labels.
y_true = [1, 2]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
# Using 'auto'/'sum_over_batch_size' reduction type.
scce = tf.keras.losses.SparseCategoricalCrossentropy()
print(scce(y_true, y_pred).numpy())
y_pred = [[0.05, 0.95, 0], [0.1, 0.1, 0.8]]
print(scce(y_true, y_pred).numpy())
1.1769392
0.13721849
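The two losses compute the same quantity; only the label format differs. A small check showing that integer labels with SCCE match one-hot labels with CCE:
y_int = [1, 2]
y_onehot = tf.keras.utils.to_categorical(y_int, num_classes=3)
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
scce = tf.keras.losses.SparseCategoricalCrossentropy()
cce = tf.keras.losses.CategoricalCrossentropy()
print(scce(y_int, y_pred).numpy())    # integer labels
print(cce(y_onehot, y_pred).numpy())  # one-hot labels, same value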
Poisson
Poisson loss: loss = y_pred - y_true * log(y_pred).
It is the negative log-likelihood of the true labels under a Poisson distribution.
y_true = [[0., 1.], [0., 0.]]
y_pred = [[1., 1.], [0., 0.]]
# Using 'auto'/'sum_over_batch_size' reduction type.
p = tf.keras.losses.Poisson()
print(p(y_true, y_pred).numpy())
y_pred = [[0., 1.], [0., 0.]]
print(p(y_true, y_pred).numpy())
0.49999997
0.24999997
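The value can be reproduced directly from the formula (Keras adds a small epsilon inside the log for numerical stability); a quick NumPy check:
import numpy as np
eps = 1e-7  # small constant standing in for Keras' internal epsilon
y_true = np.array([[0., 1.], [0., 0.]])
y_pred = np.array([[1., 1.], [0., 0.]])
print(np.mean(y_pred - y_true * np.log(y_pred + eps)))  # ~0.5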
KLDivergence
Kullback-Leibler divergence loss: loss = y_true * log(y_true / y_pred)
y_true = [[0., 1.], [0., 0.]]
y_pred = [[1., 1.], [0., 0.]]
# Using 'auto'/'sum_over_batch_size' reduction type.
p = tf.keras.losses.KLDivergence()
print(p(y_true, y_pred).numpy())
y_pred = [[0., 1.], [0., 0.]]
print(p(y_true, y_pred).numpy())
-8.059048e-07
0.0
Regression losses
The following are loss functions commonly used in regression problems.
MeanSquaredError(MSE)
The most common loss function, the mean squared error: loss = square(y_true - y_pred)
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
mse = tf.keras.losses.MeanSquaredError()
print(mse(y_true, y_pred).numpy())
0.0075336643
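The same value can be reproduced with NumPy, which makes the formula concrete:
import numpy as np
y_true = np.array([0.8, 0.1, 0.3])
y_pred = np.array([0.81, 0.101, 0.45])
print(np.mean(np.square(y_true - y_pred)))  # ~0.0075337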
MeanAbsoluteError(MAE)
loss = abs(y_true - y_pred)
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
mae = tf.keras.losses.MeanAbsoluteError()
print(mae(y_true, y_pred).numpy())
0.053666655
MeanAbsolutePercentageError(MAPE)
loss = 100 * abs((y_true - y_pred) / y_true)
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
mape = tf.keras.losses.MeanAbsolutePercentageError()
print(mape(y_true, y_pred).numpy())
17.416664
MeanSquaredLogarithmicError(MSLE)
loss = square(log(y_true + 1.) - log(y_pred + 1.))
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
msle = tf.keras.losses.MeanSquaredLogarithmicError()
print(msle(y_true, y_pred).numpy())
0.003985339
CosineSimilarity
loss = -sum(l2_norm(y_true) * l2_norm(y_pred))
Note that the result is a number between -1 and 1. When it is a negative number between -1 and 0, 0 indicates orthogonality and values closer to -1 indicate greater similarity; values closer to 1 indicate greater dissimilarity.
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
cs = tf.keras.losses.CosineSimilarity()
print(cs(y_true, y_pred).numpy())
-0.9891268
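The value follows directly from the formula: L2-normalize both vectors and take the negative dot product. A quick NumPy check:
import numpy as np
y_true = np.array([0.8, 0.1, 0.3])
y_pred = np.array([0.81, 0.101, 0.45])
a = y_true / np.linalg.norm(y_true)
b = y_pred / np.linalg.norm(y_pred)
print(-np.sum(a * b))  # ~ -0.9891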
Huber
Huber loss is a parameterized loss function for regression; its advantage over the squared-error loss (MSE) is better robustness to outliers. With the residual a = y_pred - y_true and threshold δ (1.0 by default in Keras):
loss = 0.5 * a^2                    if |a| <= δ
loss = 0.5 * δ^2 + δ * (|a| - δ)    if |a| > δ
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
huber = tf.keras.losses.Huber()
print(huber(y_true, y_pred).numpy())
0.0037668322
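A small NumPy sketch of the piecewise formula with the default δ = 1.0:
import numpy as np
delta = 1.0
y_true = np.array([0.8, 0.1, 0.3])
y_pred = np.array([0.81, 0.101, 0.45])
a = y_pred - y_true
loss = np.where(np.abs(a) <= delta,
                0.5 * a ** 2,
                0.5 * delta ** 2 + delta * (np.abs(a) - delta))
print(np.mean(loss))  # ~0.0037668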
LogCosh
logcosh = log((exp(x) + exp(-x)) / 2), where x = y_pred - y_true
y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
lc = tf.keras.losses.LogCosh()
print(lc(y_true, y_pred).numpy())
0.0037528474
Hinge losses
Hinge
The Hinge loss is defined as:
loss = maximum(1 - y_true * y_pred, 0)
y_true values are expected to be -1 or 1; binary 0/1 labels are converted to -1/1 internally.
y_true = [[0., 1.], [0., 0.]]
y_pred = [[0.6, 0.4], [0.4, 0.6]]
h = tf.keras.losses.Hinge()
print(h(y_true, y_pred).numpy())
1.3
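A quick NumPy check that reproduces the value once the 0/1 labels are converted to -1/1:
import numpy as np
y_true = np.array([[0., 1.], [0., 0.]])
y_pred = np.array([[0.6, 0.4], [0.4, 0.6]])
y_signed = 2. * y_true - 1.  # convert 0/1 labels to -1/1
print(np.mean(np.maximum(1. - y_signed * y_pred, 0.)))  # ~1.3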
SquaredHinge
The element-wise square of the Hinge loss:
loss = square(maximum(1 - y_true * y_pred, 0))
y_true = [[0., 1.], [0., 0.]]
y_pred = [[0.6, 0.4], [0.4, 0.6]]
h = tf.keras.losses.SquaredHinge()
h(y_true, y_pred).numpy()
1.86
CategoricalHinge
neg = max((1 - y_true) * y_pred)
pos = sum(y_true * y_pred)
loss = max(neg - pos + 1, 0)
import numpy as np

y_true = np.random.randint(0, 3, size=(2,))
y_true = tf.keras.utils.to_categorical(y_true, num_classes=3)
y_pred = np.random.random(size=(2, 3))
loss = tf.keras.losses.categorical_hinge(y_true, y_pred)
print(loss.numpy())
assert loss.shape == (2,)
pos = np.sum(y_true * y_pred, axis=-1)
neg = np.amax((1. - y_true) * y_pred, axis=-1)
assert np.array_equal(loss.numpy(), np.maximum(0., neg - pos + 1.))