Neural Networks and Deep Learning (1.2): Using neural networks to recognize handwritten digits

1. Learning with gradient descent

Cost function:
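One common choice, and the one assumed in the formula below, is the quadratic (mean squared error) cost. The symbols are assumptions for illustration: w and b are the network's weights and biases, n is the number of training inputs, y(x) is the desired output for input x, and a is the network's actual output for x.

```latex
C(w, b) \equiv \frac{1}{2n} \sum_{x} \lVert y(x) - a \rVert^{2}
```

C is never negative, and it approaches 0 when a is close to y(x) for every training input x, which is why training aims to make it as small as possible.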



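Treat C as a function of two variables, v1 and v2, and imagine a ball rolling down its surface. When we move the ball a small amount Δv1 in the v1 direction and a small amount Δv2 in the v2 direction, calculus tells us that C changes as follows:

$$\Delta C \approx \frac{\partial C}{\partial v_1} \Delta v_1 + \frac{\partial C}{\partial v_2} \Delta v_2$$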

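Define the gradient of C to be the vector of partial derivatives:

$$\nabla C \equiv \left( \frac{\partial C}{\partial v_1}, \frac{\partial C}{\partial v_2} \right)^{T}$$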

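Writing Δv ≡ (Δv1, Δv2)^T for the vector of changes in v, the expression for ΔC can be rewritten as:

$$\Delta C \approx \nabla C \cdot \Delta v$$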

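Choose Δv as:

$$\Delta v = -\eta \nabla C \tag{10}$$

where η is a small, positive parameter known as the learning rate.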


This guarantees that ΔC ≤ 0: within the limits of the approximation, C will always decrease, never increase.
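The reason is a one-line substitution. The derivation below assumes the update Δv = −η∇C with a small positive learning rate η, plugged into the first-order approximation ΔC ≈ ∇C · Δv:

```latex
\begin{align}
\Delta C &\approx \nabla C \cdot \Delta v \\
         &= \nabla C \cdot (-\eta \nabla C) \\
         &= -\eta \, \lVert \nabla C \rVert^{2} \;\le\; 0
\end{align}
```

Because ‖∇C‖² is never negative and η is positive, the change in C cannot be positive. The argument relies on the linear approximation, so it only holds when η is small enough for that approximation to be good.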

That is, we'll use Equation (10) to compute a value for Δv, then move the ball's position v by that amount:

$$v \to v' = v - \eta \nabla C$$

Then we'll use this update rule again to make another move. If we keep doing this, over and over, we'll keep decreasing C until, we hope, we reach a global minimum.
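The repeated update can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the cost C(v) = v1² + 2·v2², the starting point, the learning rate eta = 0.1, and the function names are all assumptions chosen so the gradient is easy to write by hand.

```python
import numpy as np

def cost(v):
    """Hypothetical example cost C(v) = v1^2 + 2*v2^2, with its minimum at the origin."""
    return v[0] ** 2 + 2.0 * v[1] ** 2

def grad_cost(v):
    """Gradient of the example cost: (dC/dv1, dC/dv2)."""
    return np.array([2.0 * v[0], 4.0 * v[1]])

def gradient_descent(v0, eta=0.1, steps=100):
    """Apply v -> v - eta * grad C(v) repeatedly; return the final v and the cost history."""
    v = np.asarray(v0, dtype=float)
    history = [cost(v)]
    for _ in range(steps):
        v = v - eta * grad_cost(v)  # the update rule: move against the gradient
        history.append(cost(v))
    return v, history

# Start the "ball" away from the minimum and let it roll downhill.
v_final, history = gradient_descent([3.0, -2.0])
print("final position:", v_final)
print("first and last cost:", history[0], history[-1])
```

For this particular bowl-shaped cost, every step lowers C and the position approaches the origin. With a much larger eta the steps would overshoot, which is the practical meaning of the approximation only holding for small moves.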

Stochastic gradient descent:

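Stochastic gradient descent works by picking out a randomly chosen mini-batch of training inputs X_1, X_2, ..., X_m and training with those. Each weight w_k and bias b_l is updated using the gradient averaged over the mini-batch, where C_{X_j} is the cost for the single training input X_j:

$$w_k \to w_k' = w_k - \frac{\eta}{m} \sum_{j} \frac{\partial C_{X_j}}{\partial w_k}$$

$$b_l \to b_l' = b_l - \frac{\eta}{m} \sum_{j} \frac{\partial C_{X_j}}{\partial b_l}$$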
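A minimal sketch of mini-batch training, under stated assumptions: instead of a neural network it fits a one-parameter linear model a = w·x + b with a per-example quadratic cost (a − y)²/2, so the gradients can be written by hand. The function name `sgd_linear`, the synthetic data, and all hyperparameters are hypothetical choices for illustration.

```python
import numpy as np

def sgd_linear(x, y, eta=0.1, batch_size=10, epochs=200, seed=0):
    """Fit a = w*x + b by mini-batch stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        # Shuffle once per epoch, then walk through the data in mini-batches.
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = w * x[idx] + b - y[idx]      # (a - y) for each example in the batch
            grad_w = np.mean(err * x[idx])     # gradient w.r.t. w, averaged over the batch
            grad_b = np.mean(err)              # gradient w.r.t. b, averaged over the batch
            w -= eta * grad_w
            b -= eta * grad_b
    return w, b

# Synthetic, noise-free data generated from y = 2x - 1 (an assumption for the demo).
data_rng = np.random.default_rng(1)
x_train = data_rng.uniform(-1.0, 1.0, size=100)
y_train = 2.0 * x_train - 1.0

w_fit, b_fit = sgd_linear(x_train, y_train)
print("fitted w, b:", w_fit, b_fit)
```

Each update uses only 10 of the 100 examples, so it is a noisy estimate of the full gradient, but it is much cheaper to compute. Over many mini-batches the estimates average out and the parameters move toward the values that generated the data.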






posted @ 2017-03-03 10:51  zhoulixue