Building a Simple Neural Network
1. Model Description
Suppose we have the following set of data:
Input 1 | Input 2 | Input 3 | Output |
---|---|---|---|
0 | 0 | 1 | 0 |
1 | 1 | 1 | 1 |
1 | 0 | 1 | 1 |
0 | 1 | 1 | 0 |
Looking at the table above, one pattern we can find is:
the first input column is identical to the output.
With 3 input columns, each taking one of the two values 0 and 1, there are \(2^3=8\) possible input combinations, yet only 4 of them appear here. Given such limited data, how should we predict the outputs for the remaining cases?
This is where a neural network comes into its own!
The model is a single sigmoid unit: inputs 1, 2, and 3 become the input variables \(x_1, x_2, x_3\), and the output is denoted \(o\).
Each input \(x_i\) is multiplied by its corresponding weight, and the weighted sum is fed into the sigmoid unit; the sigmoid's output is defined as \(o\).
The sigmoid function is \(\sigma(x) = \frac{1}{1+e^{-x}}\). It has a very useful property: its derivative is easily expressed in terms of its own output, \(\sigma'(x) = \sigma(x)\,(1 - \sigma(x))\). We will use this property later.
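As a quick sanity check, here is a minimal sketch (plain numpy; the function names are my own, not from the original post) of the sigmoid, its derivative written in terms of the output, and the forward pass of the single unit described above:

```python
import numpy as np

def sigmoid(x):
    # S-shaped squashing function mapping any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative_from_output(o):
    # Derivative expressed via the sigmoid's own output: o * (1 - o)
    return o * (1.0 - o)

# Forward pass of the single unit: weighted sum of the inputs, then sigmoid
x = np.array([1.0, 0.0, 1.0])       # one training example
w = np.array([0.5, -0.2, 0.1])      # arbitrary example weights
z = np.dot(w, x)
o = sigmoid(z)

# Numerically confirm that sigma'(z) equals o * (1 - o)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(o, sigmoid_derivative_from_output(o), numeric)  # last two agree closely
```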
Define the training error: \(E(\vec{w}) = \frac{1}{2}\sum_{d \in D}(t_d - o_d)^2\),
where \(\vec{w} = (w_1, w_2, w_3)\), \(D\) is the set of training examples, and \(t_d\) and \(o_d\) are respectively the target output of training example \(d\) and the output of the model defined above.
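A minimal sketch of evaluating this error on the four training examples above (the helper names are mine):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def training_error(w, inputs, targets):
    # E(w) = 1/2 * sum over examples d of (t_d - o_d)^2
    outputs = sigmoid(inputs.dot(w))
    return 0.5 * np.sum((targets - outputs) ** 2)

inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]], dtype=float)
targets = np.array([0, 1, 1, 0], dtype=float)
print(training_error(np.array([0.5, -0.2, 0.1]), inputs, targets))
```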
The task is then to search the hypothesis space of \(\vec{w}\) (a three-dimensional space) for the weight vector that best fits the training examples, i.e. the one that minimizes the error defined above.
Standard (batch) gradient descent algorithm (a one-step sketch in numpy follows this list):
1. Initialize the network weights w1, w2, w3 to small random values.
2. Until the termination condition is met:
2.1 Initialize each Δwi to 0.
2.2 For each training example <x, t>:
  Feed x into the unit and compute the output o.
  For each weight adjustment Δwi:
    Δwi = Δwi + η·o·(1-o)·(t-o)·xi
2.3 For each weight wi:
  wi = wi + Δwi
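Here is a minimal numpy sketch of one such batch update (the learning rate value and the function names are my own assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def batch_update(w, inputs, targets, eta=0.1):
    # One iteration of step 2: accumulate delta_w over all examples, then apply it once
    delta_w = np.zeros_like(w)
    for x, t in zip(inputs, targets):
        o = sigmoid(np.dot(w, x))                     # step 2.2: compute the output
        delta_w += eta * o * (1 - o) * (t - o) * x    # accumulate the weight adjustment
    return w + delta_w                                # step 2.3: apply the summed adjustment

inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]], dtype=float)
targets = np.array([0, 1, 1, 0], dtype=float)
w = np.random.uniform(-1, 1, size=3)   # small random initial weights (step 1)
for _ in range(10000):
    w = batch_update(w, inputs, targets)
print(w)
```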
Stochastic gradient descent also searches for the best \(\vec{w}\). The algorithm is as follows (a per-example sketch follows this list):
1. Initialize the network weights w1, w2, w3 to small random values.
2. Until the termination condition is met:
For each training example <x, t> (for instance, the four examples in the model above):
2.1 Feed x into the network and compute the network output o.
2.2 Compute the error term of the output unit: δ = o·(1-o)·(t-o), where t is the target output given by the training example.
2.3 Update each weight: wi ← wi + Δwi, with Δwi = η·δ·xi, where η is the learning rate, a fairly small value.
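And a minimal sketch of the per-example update (again, the names and the learning rate are assumptions, not part of the original algorithm statement):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_epoch(w, inputs, targets, eta=0.1):
    # One pass over the training set, updating the weights after every example
    for x, t in zip(inputs, targets):
        o = sigmoid(np.dot(w, x))        # step 2.1: forward pass
        delta = o * (1 - o) * (t - o)    # step 2.2: error term of the output unit
        w = w + eta * delta * x          # step 2.3: immediate weight update
    return w

inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]], dtype=float)
targets = np.array([0, 1, 1, 0], dtype=float)
w = np.random.uniform(-1, 1, size=3)
for _ in range(10000):
    w = sgd_epoch(w, inputs, targets)
print(w)
```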
Notes on the algorithm:
We want the minimum of the error term \(E(\vec{w}) = \frac{1}{2}\sum_{d \in D}(t_d - o_d)^2\), which is a function of the variables \(w_1, w_2, w_3\). To determine a weight vector that minimizes \(E\), gradient descent search starts from an arbitrary initial weight vector and then repeatedly modifies it in very small steps. At each step it alters the weight vector in the direction that produces the steepest descent along the error surface, and the process continues until the minimum-error point is reached.
How do we find the steepest direction?
Compute the derivative of \(E\) with respect to each component of \(\vec{w}\); the gradient at that point, \(\nabla E(\vec{w}) = \left[\frac{\partial E}{\partial w_1}, \frac{\partial E}{\partial w_2}, \frac{\partial E}{\partial w_3}\right]\), gives the direction of steepest change.
With the steepest direction given by the gradient, the gradient descent training rule is \(\vec{w} \leftarrow \vec{w} + \Delta\vec{w}\), where \(\Delta\vec{w} = -\eta\,\nabla E(\vec{w})\).
Written in component form: \(w_i \leftarrow w_i + \Delta w_i\), where \(\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i}\).
What remains is to compute \(\frac{\partial E}{\partial w_i}\).
The standard gradient descent algorithm works with the error summed over all training examples. Differentiating it:
\[
\frac{\partial E}{\partial w_i}
= \frac{\partial}{\partial w_i}\,\frac{1}{2}\sum_{d \in D}(t_d - o_d)^2
= \sum_{d \in D}(t_d - o_d)\,\frac{\partial}{\partial w_i}(t_d - o_d)
= -\sum_{d \in D}(t_d - o_d)\,\frac{\partial o_d}{\partial w_i}
\]
Since \(o_d = \sigma(\vec{w}\cdot\vec{x}_d)\) and \(\sigma'(y) = \sigma(y)(1-\sigma(y))\), we have \(\frac{\partial o_d}{\partial w_i} = o_d(1-o_d)\,x_{id}\), and therefore
\[
\frac{\partial E}{\partial w_i} = -\sum_{d \in D}(t_d - o_d)\,o_d(1-o_d)\,x_{id}
\]
Substituting into \(\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i}\) above gives
\[
\Delta w_i = \eta \sum_{d \in D}(t_d - o_d)\,o_d(1-o_d)\,x_{id},
\]
which is exactly the update used in the standard gradient descent algorithm above.
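As a numerical sanity check of this result (a sketch; the helper names are mine), the analytic gradient can be compared against a finite-difference approximation of \(E\):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def error(w, inputs, targets):
    o = sigmoid(inputs.dot(w))
    return 0.5 * np.sum((targets - o) ** 2)

def analytic_gradient(w, inputs, targets):
    # dE/dw_i = -sum_d (t_d - o_d) * o_d * (1 - o_d) * x_id
    o = sigmoid(inputs.dot(w))
    return -inputs.T.dot((targets - o) * o * (1 - o))

inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]], dtype=float)
targets = np.array([0, 1, 1, 0], dtype=float)
w = np.array([0.5, -0.2, 0.1])

eps = 1e-6
numeric = np.array([
    (error(w + eps * np.eye(3)[i], inputs, targets) -
     error(w - eps * np.eye(3)[i], inputs, targets)) / (2 * eps)
    for i in range(3)
])
print(analytic_gradient(w, inputs, targets))
print(numeric)  # the two gradients should agree to several decimal places
```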
Next, the stochastic gradient descent algorithm:
In stochastic gradient descent the weights are updated by examining each training example in turn; provided the learning rate η is sufficiently small, it can approximate standard gradient descent arbitrarily closely.
Because stochastic gradient descent updates the weights from a single example, define an analogous per-example error function:
\(E_d(\vec{w}) = \frac{1}{2}(t_d - o_d)^2\), where \(t_d\) and \(o_d\) are the target output value and the unit's output value for training example \(d\). Stochastic gradient descent iterates over each training example \(d\) in the training set \(D\), and at each iteration changes the weights according to the gradient of \(E_d\).
Now compute \(\frac{\partial E_d}{\partial w_i}\) for a single example.
The derivation, which mirrors the batch case but with a single term, gives:
\[
\frac{\partial E_d}{\partial w_i}
= (t_d - o_d)\,\frac{\partial}{\partial w_i}(t_d - o_d)
= -(t_d - o_d)\,o_d(1-o_d)\,x_{id},
\]
so \(\Delta w_i = -\eta\,\frac{\partial E_d}{\partial w_i} = \eta\,(t_d - o_d)\,o_d(1-o_d)\,x_{id} = \eta\,\delta\,x_{id}\), which matches the stochastic gradient descent algorithm above.
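The same kind of check works for the per-example error \(E_d\) (sketch, with my own helper names): the update term \(\delta\,x_i\) should equal \(-\partial E_d/\partial w_i\).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def error_d(w, x, t):
    # Per-example error E_d(w) = 1/2 * (t - o)^2
    return 0.5 * (t - sigmoid(np.dot(w, x))) ** 2

w = np.array([0.5, -0.2, 0.1])
x = np.array([1.0, 0.0, 1.0])
t = 1.0

o = sigmoid(np.dot(w, x))
delta = o * (1 - o) * (t - o)
analytic = delta * x   # should equal -dE_d/dw

eps = 1e-6
numeric = np.array([
    -(error_d(w + eps * np.eye(3)[i], x, t) -
      error_d(w - eps * np.eye(3)[i], x, t)) / (2 * eps)
    for i in range(3)
])
print(analytic, numeric)  # the two should agree closely
```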
Below is a code implementation based on the standard gradient descent algorithm:
```python
from numpy import exp, array, random, dot


class NeuralNetwork():
    def __init__(self):
        # Seed the random number generator, so it generates the same numbers
        # every time the program runs.
        random.seed(1)

        # We model a single neuron, with 3 input connections and 1 output connection.
        # We assign random weights to a 3 x 1 matrix, with values in the range -1 to 1
        # and mean 0.
        self.synaptic_weights = 2 * random.random((3, 1)) - 1

    # The Sigmoid function, which describes an S shaped curve.
    # We pass the weighted sum of the inputs through this function to
    # normalise them between 0 and 1.
    def __sigmoid(self, x):
        return 1 / (1 + exp(-x))

    # The derivative of the Sigmoid function, expressed in terms of the
    # sigmoid's output. This is the gradient of the Sigmoid curve.
    # It indicates how confident we are about the existing weight.
    def __sigmoid_derivative(self, x):
        return x * (1 - x)

    # We train the neural network through a process of trial and error,
    # adjusting the synaptic weights each time.
    def train(self, training_set_inputs, training_set_outputs, number_of_training_iterations):
        for iteration in range(number_of_training_iterations):
            # Pass the training set through our neural network (a single neuron).
            output = self.think(training_set_inputs)

            # Calculate the error (the difference between the desired output
            # and the predicted output).
            error = training_set_outputs - output

            # Multiply the error by the input and again by the gradient of the Sigmoid curve.
            # This means less confident weights are adjusted more.
            # This means inputs, which are zero, do not cause changes to the weights.
            adjustment = dot(training_set_inputs.T, error * self.__sigmoid_derivative(output))

            # Adjust the weights.
            self.synaptic_weights += adjustment

    # The neural network thinks.
    def think(self, inputs):
        # Pass inputs through our neural network (our single neuron).
        return self.__sigmoid(dot(inputs, self.synaptic_weights))
```
```python
# Initialise a single neuron neural network.
neural_network = NeuralNetwork()

print("Random starting synaptic weights: ")
print(neural_network.synaptic_weights)

# The training set. We have 4 examples, each consisting of 3 input values
# and 1 output value.
training_set_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
training_set_outputs = array([[0, 1, 1, 0]]).T

# Train the neural network using the training set.
# Do it 10,000 times and make small adjustments each time.
neural_network.train(training_set_inputs, training_set_outputs, 10000)

print("New synaptic weights after training: ")
print(neural_network.synaptic_weights)

# Test the neural network with a new situation.
print("Considering new situation [1, 0, 0] -> ?: ")
print(neural_network.think(array([1, 0, 0])))
```
Output:
Random starting synaptic weights:
[[-0.16595599]
[ 0.44064899]
[-0.99977125]]
New synaptic weights after training:
[[ 9.67299303]
[-0.2078435 ]
[-4.62963669]]
Considering new situation [1, 0, 0] -> ?:
[ 0.99993704]
Below, the code above is modified to use the stochastic gradient descent algorithm:
```python
'''
Created on 2017-02-22
@author: LBX
'''
from numpy import exp, array, random, dot


class NeuralNetwork():
    def __init__(self):
        random.seed(1)
        # Initialise the weights to random values in the range -1 to 1.
        self.synaptic_weights = 2 * random.random((3, 1)) - 1

    def __sigmoid(self, x):
        return 1 / (1 + exp(-x))

    # Derivative of the sigmoid, expressed in terms of its output.
    def __sigmoid_derivative(self, x):
        return x * (1 - x)

    def train(self, training_set_inputs, training_set_outputs, number_of_training_iterations):
        for iteration in range(number_of_training_iterations):
            # Stochastic gradient descent: update the weights after each example.
            for i in range(training_set_inputs.shape[0]):
                output = self.think(training_set_inputs[i])
                error = training_set_outputs[i][0] - output
                # Weight adjustment for this single example.
                adjustment = training_set_inputs[i] * (error * self.__sigmoid_derivative(output))
                for j in range(self.synaptic_weights.size):
                    self.synaptic_weights[j] += adjustment[j]

    def think(self, inputs):
        return self.__sigmoid(dot(inputs, self.synaptic_weights))


neural_network = NeuralNetwork()
print("Random starting synaptic weights: ")
print(neural_network.synaptic_weights)

training_set_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])  # 4 x 3
training_set_outputs = array([[0, 1, 1, 0]]).T

neural_network.train(training_set_inputs, training_set_outputs, 10000)
print("New synaptic weights after training: ")
print(neural_network.synaptic_weights)

# Test the neural network with a new situation.
print("Considering new situation [1, 0, 0] -> ?: ")
print(neural_network.think(array([1, 0, 0])))
```
Output:
Random starting synaptic weights:
[[-0.16595599]
[ 0.44064899]
[-0.99977125]]
New synaptic weights after training:
[[ 9.67299303]
[-0.2078435 ]
[-4.62963669]]
Considering new situation [1, 0, 0] -> ?:
[ 0.99993704]
References
1. Machine Learning (《机器学习》)
2. http://www.jianshu.com/p/15db29e72719