Multi-Output Perceptron and Its Gradient
Multi-output Perceptron
$E = \frac{1}{2}\sum_i (O_i^1 - t_i)^2$
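In a multi-output perceptron, each output node depends only on the input x, the weights w feeding into that node, and the activation σ applied at that node.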
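The per-node dependence can be checked numerically. The sketch below is an illustration only: it assumes σ is the sigmoid (the text does not say which activation is meant), in which case the chain rule gives $\partial E / \partial w_{jk} = (O_k - t_k)\,O_k\,(1 - O_k)\,x_j$, so the gradient for weight $w_{jk}$ involves only output node $k$. The helper names and sample values are made up for the example, and the closed form is compared against central finite differences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w):
    """O_k = sigmoid(sum_j x[j] * w[j][k]) for each output node k."""
    n_out = len(w[0])
    return [sigmoid(sum(x[j] * w[j][k] for j in range(len(x)))) for k in range(n_out)]

def loss(x, w, t):
    """E = 1/2 * sum_k (O_k - t_k)^2."""
    o = forward(x, w)
    return 0.5 * sum((ok - tk) ** 2 for ok, tk in zip(o, t))

def analytic_grad(x, w, t):
    """dE/dw[j][k] = (O_k - t_k) * O_k * (1 - O_k) * x_j (sigmoid assumed)."""
    o = forward(x, w)
    return [[(o[k] - t[k]) * o[k] * (1.0 - o[k]) * x[j] for k in range(len(t))]
            for j in range(len(x))]

def numeric_grad(x, w, t, eps=1e-6):
    """Central finite differences, used only as a sanity check."""
    grad = [[0.0] * len(w[0]) for _ in w]
    for j in range(len(w)):
        for k in range(len(w[0])):
            orig = w[j][k]
            w[j][k] = orig + eps
            e_plus = loss(x, w, t)
            w[j][k] = orig - eps
            e_minus = loss(x, w, t)
            w[j][k] = orig  # restore the weight
            grad[j][k] = (e_plus - e_minus) / (2 * eps)
    return grad

# Made-up sample: 4 inputs, 3 output nodes, target is class 2 as one-hot.
x = [0.5, -1.0, 2.0, 0.1]
w = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.5], [-0.3, 0.2, 0.1], [0.05, -0.1, 0.2]]
t = [0.0, 0.0, 1.0]

ga = analytic_grad(x, w, t)
gn = numeric_grad(x, w, t)
max_diff = max(abs(ga[j][k] - gn[j][k]) for j in range(4) for k in range(3))
print("largest gap between analytic and numeric gradient:", max_diff)
```

The gradient has the same shape as w (here 4 by 3), one entry per connection between input j and output node k.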
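import tensorflow as tf

x = tf.random.normal([2, 4])  # batch of 2 samples, 4 features each
w = tf.random.normal([4, 3])  # weights: 4 inputs -> 3 output nodes
b = tf.zeros([3])
y = tf.constant([2, 0])  # integer class labels for the 2 samples

with tf.GradientTape() as tape:
    # w and b are plain tensors, not tf.Variable, so they must be watched explicitly
    tape.watch([w, b])
    # axis=1: normalize over the class dimension (size 3) of the [b, 3] result
    prob = tf.nn.softmax(x @ w + b, axis=1)
    # one-hot labels: 2 --> [0, 0, 1]; 0 --> [1, 0, 0]
    loss = tf.reduce_mean(tf.losses.MSE(tf.one_hot(y, depth=3), prob))

grads = tape.gradient(loss, [w, b])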
grads[0]
<tf.Tensor: id=92, shape=(4, 3), dtype=float32, numpy=
array([[ 0.00842961, -0.02221732, 0.01378771],
[ 0.02969089, -0.04625662, 0.01656573],
[ 0.05807886, -0.08139262, 0.02331377],
[-0.06571108, 0.11157083, -0.04585974]], dtype=float32)>
grads[1]
<tf.Tensor: id=90, shape=(3,), dtype=float32, numpy=array([-0.05913186, 0.09886257, -0.03973071], dtype=float32)>
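Note that the printed bias gradient sums to zero, as does each row of the weight gradient. This is expected with softmax: unlike an element-wise activation, every softmax output depends on all the logits, and shifting all logits by the same constant leaves the probabilities unchanged. The following pure-Python sketch reproduces that property by hand for the same loss (mean squared error averaged over batch and classes). The logits are made-up values, not the random ones from the run above, and `bias_grad` is a hypothetical helper written for this illustration.

```python
import math

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def bias_grad(logits, labels, n_classes):
    """Gradient of mean((softmax(z + b) - one_hot)^2) w.r.t. b, evaluated at b = 0."""
    batch = len(logits)
    grad_b = [0.0] * n_classes
    for z, y in zip(logits, labels):
        p = softmax(z)
        t = [1.0 if k == y else 0.0 for k in range(n_classes)]
        # dL/dp_k, where L averages the squared error over batch and classes
        g = [2.0 * (p[k] - t[k]) / (batch * n_classes) for k in range(n_classes)]
        # Softmax Jacobian: dp_k/dz_i = p_k * (delta_ki - p_i)
        dot = sum(g[k] * p[k] for k in range(n_classes))
        for i in range(n_classes):
            grad_b[i] += p[i] * (g[i] - dot)
    return grad_b

logits = [[0.2, -1.0, 0.5], [1.5, 0.3, -0.7]]  # made-up values standing in for x @ w
labels = [2, 0]
gb = bias_grad(logits, labels, 3)
print("bias gradient:", gb)
print("sum of components:", sum(gb))
```

The components cancel up to floating-point rounding, which matches the pattern in the TensorFlow output.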