Multilayer Neural Networks and the Backpropagation Learning Algorithm

A multilayer perceptron is a feedforward neural network with one or more hidden layers. 

The network consists of an input layer of source neurons, at least one middle or hidden layer of computational neurons, and an output layer of computational neurons. 

The input signals are propagated in a forward direction on a layer-by-layer basis.

Backpropagation Training Algorithm

Algorithm:

Step 1: Initialisation
Set all the weights and threshold levels of the network to random numbers uniformly distributed inside a small range:

(-2.4 / Fi, +2.4 / Fi)
where Fi is the total number of inputs of neuron i in the network. The weight initialisation is done on a neuron-by-neuron basis.
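As a minimal sketch of this step (the function name, NumPy usage, and layer-size arguments are my own, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng()

def init_layer(n_inputs, n_neurons):
    # Each neuron receiving n_inputs connections draws its weights
    # and its threshold uniformly from (-2.4/Fi, +2.4/Fi), Fi = n_inputs.
    limit = 2.4 / n_inputs
    weights = rng.uniform(-limit, limit, size=(n_inputs, n_neurons))
    thresholds = rng.uniform(-limit, limit, size=n_neurons)
    return weights, thresholds
```

For the XOR network in the example below, this would be called as init_layer(2, 2) for the hidden layer and init_layer(2, 1) for the output layer.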

Step 2: Activation

Activate the back-propagation neural network by applying inputs x1(p), x2(p),…, xn(p) and desired outputs yd,1(p), yd,2(p),…, yd,n(p).

(a) Calculate the actual outputs of the neurons in the hidden layer:

yj(p) = sigmoid[ Σi xi(p) · wij(p) - θj ]

where n is the number of inputs of neuron j in the hidden layer (the sum runs over i = 1, …, n), and sigmoid is the sigmoid activation function.
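Continuing the NumPy sketch, sub-step (a) might look like this (w_ih holds the input-to-hidden weights wij and theta_h the thresholds θj; all names are placeholders of mine):

```python
def sigmoid(z):
    # The sigmoid activation function: 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def hidden_outputs(x, w_ih, theta_h):
    # y_j(p) = sigmoid( sum_i x_i(p) * w_ij(p) - theta_j )
    return sigmoid(x @ w_ih - theta_h)
```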

(b) Calculate the actual outputs of the neurons in the output layer:

yk(p) = sigmoid[ Σj yj(p) · wjk(p) - θk ]

where m is the number of inputs of neuron k in the output layer (the sum runs over j = 1, …, m).
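Sub-step (b) is the same computation one layer up (w_ho holds the hidden-to-output weights wjk, theta_o the thresholds θk):

```python
def output_outputs(y_hidden, w_ho, theta_o):
    # y_k(p) = sigmoid( sum_j y_j(p) * w_jk(p) - theta_k )
    return sigmoid(y_hidden @ w_ho - theta_o)
```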

Step 3: Weight training

Update the weights in the back-propagation network by propagating backward the errors associated with the output neurons. A code sketch covering both sub-steps follows the formulas below.
(a) Calculate the error gradient for the neurons in the output layer:

δk(p) = yk(p) · [1 - yk(p)] · ek(p)

where

ek(p) = yd,k(p) - yk(p)
Calculate the weight corrections:

Δwjk(p) = α · yj(p) · δk(p)
Update the weights at the output neurons:

wjk(p+1) = wjk(p) + Δwjk(p)
(b) Calculate the error gradient for the neurons in the hidden layer:

δj(p) = yj(p) · [1 - yj(p)] · Σk δk(p) · wjk(p)

where the sum runs over all neurons k in the output layer.
Calculate the weight corrections:

Δwij(p) = α · xi(p) · δj(p)
Update the weights at the hidden neurons:

wij(p+1) = wij(p) + Δwij(p)
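Here is the promised sketch of one weight-training pass, covering sub-steps (a) and (b). Note that the hidden-layer gradients use the output-layer weights wjk(p) from before the correction is applied, so both gradients are computed first:

```python
def train_step(x, y_hidden, y_out, y_desired,
               w_ih, theta_h, w_ho, theta_o, alpha=0.1):
    # (a) Output layer: e_k = y_dk - y_k, delta_k = y_k (1 - y_k) e_k
    delta_o = y_out * (1.0 - y_out) * (y_desired - y_out)
    # (b) Hidden layer: delta_j = y_j (1 - y_j) sum_k delta_k w_jk,
    #     computed with the not-yet-updated weights w_jk(p).
    delta_h = y_hidden * (1.0 - y_hidden) * (w_ho @ delta_o)
    # Corrections: Delta w = alpha * input * delta. The thresholds are
    # treated as weights on a fixed input of -1 (see the example below).
    w_ho += alpha * np.outer(y_hidden, delta_o)
    theta_o += alpha * -1.0 * delta_o
    w_ih += alpha * np.outer(x, delta_h)
    theta_h += alpha * -1.0 * delta_h
```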
Step 4: Iteration

Increase iteration p by one, go back to Step 2 and repeat the process until the selected error criterion is satisfied.
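Putting the four steps together gives a training loop like the following (the stopping threshold of 0.001 matches the example below; X and Yd are assumed to hold the training inputs and desired outputs row by row):

```python
def train(X, Yd, w_ih, theta_h, w_ho, theta_o, alpha=0.1, tolerance=0.001):
    while True:
        sse = 0.0  # sum of squared errors over one pass through the set
        for x, y_d in zip(X, Yd):
            y_hidden = hidden_outputs(x, w_ih, theta_h)
            y_out = output_outputs(y_hidden, w_ho, theta_o)
            sse += float(np.sum((y_d - y_out) ** 2))
            train_step(x, y_hidden, y_out, y_d,
                       w_ih, theta_h, w_ho, theta_o, alpha)
        if sse < tolerance:  # the selected error criterion
            break
```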

Example: Three-layer network for solving the Exclusive-OR operation

(Figure: a three-layer network for XOR — inputs x1 and x2 feed hidden neurons 3 and 4, which feed output neuron 5.)

The effect of the threshold applied to a neuron in the hidden or output layer is represented by its weight, θ, connected to a fixed input equal to -1.

The initial weights and threshold levels are set randomly as follows:

w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = -1.2, w45 = 1.1, θ3 = 0.8, θ4 = -0.1 and θ5 = 0.3.

We consider a training set where inputs x1 and x2 are equal to 1 and the desired output yd,5 is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as

y3 = sigmoid(x1·w13 + x2·w23 - θ3) = 1 / (1 + e^-(1·0.5 + 1·0.4 - 0.8)) = 0.5250
y4 = sigmoid(x1·w14 + x2·w24 - θ4) = 1 / (1 + e^-(1·0.9 + 1·1.0 + 0.1)) = 0.8808
Then the actual output of neuron 5 in the output layer is determined as:

y5 = sigmoid(y3·w35 + y4·w45 - θ5) = 1 / (1 + e^-(-0.5250·1.2 + 0.8808·1.1 - 0.3)) = 0.5097
Thus, the following error is obtained:

e = yd,5 - y5 = 0 - 0.5097 = -0.5097

The next step is weight training. We calculate the error gradient for neuron 5 in the output layer:

δ5 = y5 · (1 - y5) · e = 0.5097 · (1 - 0.5097) · (-0.5097) = -0.1274
Then we determine the weight corrections, assuming that the learning rate parameter, α, is equal to 0.1:

Δw35 = α · y3 · δ5 = 0.1 · 0.5250 · (-0.1274) = -0.0067
Δw45 = α · y4 · δ5 = 0.1 · 0.8808 · (-0.1274) = -0.0112
Δθ5 = α · (-1) · δ5 = 0.1 · (-1) · (-0.1274) = 0.0127

Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:

δ3 = y3 · (1 - y3) · δ5 · w35 = 0.5250 · 0.4750 · (-0.1274) · (-1.2) = 0.0381
δ4 = y4 · (1 - y4) · δ5 · w45 = 0.8808 · 0.1192 · (-0.1274) · 1.1 = -0.0147
We then determine the weight corrections for the hidden neurons and update all weights and thresholds:

Δw13 = α · x1 · δ3 = 0.1 · 1 · 0.0381 = 0.0038
Δw23 = α · x2 · δ3 = 0.1 · 1 · 0.0381 = 0.0038
Δθ3 = α · (-1) · δ3 = -0.0038
Δw14 = α · x1 · δ4 = 0.1 · 1 · (-0.0147) = -0.0015
Δw24 = α · x2 · δ4 = 0.1 · 1 · (-0.0147) = -0.0015
Δθ4 = α · (-1) · δ4 = 0.0015

w13 = 0.5 + 0.0038 = 0.5038
w14 = 0.9 - 0.0015 = 0.8985
w23 = 0.4 + 0.0038 = 0.4038
w24 = 1.0 - 0.0015 = 0.9985
w35 = -1.2 - 0.0067 = -1.2067
w45 = 1.1 - 0.0112 = 1.0888
θ3 = 0.8 - 0.0038 = 0.7962
θ4 = -0.1 + 0.0015 = -0.0985
θ5 = 0.3 + 0.0127 = 0.3127
The training process is repeated until the sum of squared errors is less than 0.001.
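The worked example can be reproduced with the sketches above; the array layout below is my own choice (columns of w_ih correspond to neurons 3 and 4, rows of w_ho to w35 and w45):

```python
import numpy as np

# Initial weights and thresholds from the example.
w_ih = np.array([[0.5, 0.9],        # w13, w14
                 [0.4, 1.0]])       # w23, w24
theta_h = np.array([0.8, -0.1])     # theta3, theta4
w_ho = np.array([[-1.2],            # w35
                 [1.1]])            # w45
theta_o = np.array([0.3])           # theta5

x = np.array([1.0, 1.0])            # x1 = x2 = 1
y_d = np.array([0.0])               # desired output yd,5 = 0

y_hidden = hidden_outputs(x, w_ih, theta_h)      # ~[0.5250, 0.8808]
y_out = output_outputs(y_hidden, w_ho, theta_o)  # ~[0.5097]
train_step(x, y_hidden, y_out, y_d, w_ih, theta_h, w_ho, theta_o)
# One pass gives w13 ~ 0.5038, w23 ~ 0.4038, theta3 ~ 0.7962,
# w14 ~ 0.8985, w24 ~ 0.9985, theta4 ~ -0.0985,
# w35 ~ -1.2067, w45 ~ 1.0888, theta5 ~ 0.3127.

# Full XOR training set; repeat until the error criterion is met.
X = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]])
Yd = np.array([[0.0], [1.0], [1.0], [0.0]])
train(X, Yd, w_ih, theta_h, w_ho, theta_o)
```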

 
