机器学习(四):4层BP神经网络(只用numpy不调包)用于训练鸢尾花数据集|准确率96%

题目:

  1. 设计四层BP网络,以g(x)=sigmoid(x)为激活函数,

  2. 神经网络结构为:[4,10,6, 3],其中,输入层为4个节点,第一个隐含层神经元个数为10个节点;第二个隐含层神经元个数为6个节点,输出层为3个节点

  3. 利用训练数据iris-train.txt对BP神经网络分别进行训练,对训练后的模型统计识别正确率,并计算对测试数据iris-test.txt的正确率。

参考函数:

def show_accuracy(n, X, y):
	h = n.test(X)
	y_pred = np.argmax(h, axis=1)
	print(classification_report(y, y_pred))

解:

首先题目已经给出了3行函数,我们可以进行如下拆解:

show_accuracy(n, X, y)n代表BP神经网络模型,采用面向对象的方式编写,X是测试数据输入特征X_testy是测试数据输出特征y_testh = n.test(X)的含义是调用训练好的BPNN的预测方法predict,将X输入进行前向传播即可获得输出层[0,1,2]分别的概率,然后通过np.argmax(h, axis=1)取输出层的最大概率,返回下标0,1,2,最后调用sklearn的分类报告方法即可打印分类准确情况。

主要代码如下:

导入包及数据

import pandas
import numpy as np
from sklearn.metrics import classification_report
# 导入txt数据
iris_train = pandas.read_table("iris/iris-train.txt", header=None)
iris_train.columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm', 'Species']
iris_test = pandas.read_table("iris/iris-test.txt", header=None)
iris_test.columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm', 'Species']

编写识别正确率函数

def show_accuracy(self, X, Y):
	count=0
	Y_pred=[]
    for i in range(X.shape[0]):
    	h=self.update(X[i, 0:4])
    	y_pred=np.argmax(h)
    	Y_pred.append(y_pred)
    	if y_pred==Y[i]:
    		count+=1
    print("准确率为:",count/X.shape[0])
    print(count)
    print(classification_report(Y, Y_pred))

完整程序代码

import pandas
import numpy as np
from sklearn.metrics import classification_report
# 导入txt数据
iris_train = pandas.read_table("iris/iris-train.txt", header=None)
iris_train.columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm', 'Species']
iris_test = pandas.read_table("iris/iris-test.txt", header=None)
iris_test.columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm', 'Species']

#打乱顺序
array = iris_train.values#
np.random.seed(1377)
np.random.shuffle(array)


#独热编码
def onehot(targets, num_out):
    onehot = np.zeros((num_out, targets.shape[0]))
    for idx, val in enumerate(targets.astype(int)):
        onehot[val, idx] = 1.
    return onehot.T

#生成一个矩阵,大小为m*n,并且设置默认零矩阵
def makematrix(m, n, fill=0.0):
    X_train = []
    for i in range(m):
        X_train.append([fill] * n)
    return X_train


#函数sigmoid()
def sigmoid(x):
    a = 1 / (1 + np.exp(-x))
    return a



#函数
def derived_sigmoid(x):
    return x * (1 - x)
    # return 1.0 - x ** 2


#构造四层BP网络架构
class BPNN:
    def __init__(self, num_in, num_hidden1, num_hidden2, num_out):
        # 输入层,隐藏层,输出层的节点数
        self.num_in = num_in + 1  # 增加一个偏置结点 4
        self.num_hidden1 = num_hidden1 + 1  # 增加一个偏置结点 4
        self.num_hidden2 = num_hidden2 + 1
        self.num_out = num_out

        # 激活神经网络的所有节点
        self.active_in = [1.0] * self.num_in
        self.active_hidden1 = [1.0] * self.num_hidden1
        self.active_hidden2 = [1.0] * self.num_hidden2
        self.active_out = [1.0] * self.num_out

        # 创建权重矩阵
        self.wight_in = makematrix(self.num_in, self.num_hidden1)
        self.wight_h1h2 = makematrix(self.num_hidden1, self.num_hidden2)
        self.wight_out = makematrix(self.num_hidden2, self.num_out)

        # 对权值矩阵赋初值
        for i in range(self.num_in):
            for j in range(self.num_hidden1):
                self.wight_in[i][j] = np.random.normal(0.0, pow(self.num_hidden1, -0.5))  # 输出num_in行,num_hidden列权重矩阵,随机生成满足正态分布的权重
        for i in range(self.num_hidden1):
            for j in range(self.num_hidden2):
                self.wight_h1h2[i][j] = np.random.normal(0.0, pow(self.num_hidden2, -0.5))
        for i in range(self.num_hidden2):
            for j in range(self.num_out):
                self.wight_out[i][j] = np.random.normal(0.0, pow(self.num_out, -0.5))

        # 最后建立动量因子(矩阵)
        self.ci = makematrix(self.num_in, self.num_hidden1)
        self.ch1h2 = makematrix(self.num_hidden1, self.num_hidden2)
        self.co = makematrix(self.num_hidden2, self.num_out)


        # 信号正向传播

    def update(self, inputs):
        a=len(inputs)
        if len(inputs) != self.num_in - 1:
            raise ValueError('与输入层节点数不符')

        # 数据输入输入层
        for i in range(self.num_in - 1):
            # self.active_in[i] = sigmoid(inputs[i])  #或者先在输入层进行数据处理
            self.active_in[i] = inputs[i]  # active_in[]是输入数据的矩阵

        # 数据在隐藏层1的处理
        for i in range(self.num_hidden1):
            sum = 0.0
            for j in range(self.num_in):
                sum = sum + self.active_in[j] * self.wight_in[j][i]
            self.active_hidden1[i] = sigmoid(sum)  # active_hidden[]是处理完输入数据之后存储,作为输出层的输入数据

        # 数据在隐藏层2的处理
        for i in range(self.num_hidden2):
            sum = 0.0
            for j in range(self.num_hidden1):
                sum = sum + self.active_hidden1[j] * self.wight_h1h2[j][i]
            self.active_hidden2[i] = sigmoid(sum)  # active_hidden[]是处理完输入数据之后存储,作为输出层的输入数据

        # 数据在输出层的处理
        for i in range(self.num_out):
            sum = 0.0
            for j in range(self.num_hidden2):
                sum = sum + self.active_hidden2[j] * self.wight_out[j][i]
            self.active_out[i] = sigmoid(sum)  # 与上同理

        return self.active_out[:]


    # 误差反向传播
    def errorbackpropagate(self, targets, lr, m):  # lr是学习率, m是动量因子
        if len(targets) != self.num_out:
            raise ValueError('与输出层节点数不符!')

        # 首先计算输出层的误差
        out_deltas = [0.0] * self.num_out
        for i in range(self.num_out):
            error = targets[i] - self.active_out[i]
            out_deltas[i] = derived_sigmoid(self.active_out[i]) * error

        # 计算隐藏层2的误差
        hidden2_deltas = [0.0] * self.num_hidden2
        for i in range(self.num_hidden2):
            error = 0.0
            for j in range(self.num_out):
                error = error + out_deltas[j] * self.wight_out[i][j]
            hidden2_deltas[i] = derived_sigmoid(self.active_hidden2[i]) * error

        # 计算隐藏层1的误差
        hidden1_deltas = [0.0] * self.num_hidden1
        for i in range(self.num_hidden1):
            error = 0.0
            for j in range(self.num_hidden2):
                error = error + hidden2_deltas[j] * self.wight_h1h2[i][j]
            hidden1_deltas[i] = derived_sigmoid(self.active_hidden1[i]) * error



        # 更新输出层权值
        for i in range(self.num_hidden2):
            for j in range(self.num_out):
                change = out_deltas[j] * self.active_hidden2[i]
                self.wight_out[i][j] = self.wight_out[i][j] + lr * change + m * self.co[i][j]
                self.co[i][j] = change

        # 更新隐藏层间权值
        for i in range(self.num_hidden1):
            for j in range(self.num_hidden2):
                change = hidden2_deltas[j] * self.active_hidden1[i]
                self.wight_h1h2[i][j] = self.wight_h1h2[i][j] + lr * change + m * self.ch1h2[i][j]
                self.ch1h2[i][j] = change

        # 然后更新输入层权值
        for i in range(self.num_in):
            for j in range(self.num_hidden1):
                change = hidden1_deltas[j] * self.active_in[i]
                self.wight_in[i][j] = self.wight_in[i][j] + lr * change + m * self.ci[i][j]
                self.ci[i][j] = change

        # 计算总误差
        error = 0.0
        for i in range(self.num_out):
            error = error + 0.5 * (targets[i] - self.active_out[i]) ** 2
        return error

    # 测试
    def test(self, X_test):
        for i in range(X_test.shape[0]):
            print(X_test[i, 0:4], '->', self.update(X_test[i, 0:4]))

    # 权重
    def weights(self):
        print("输入层权重")
        for i in range(self.num_in):
            print(self.wight_in[i])
        print("输出层权重")
        for i in range(self.num_hidden2):
            print(self.wight_out[i])

    def train(self, train, itera=100, lr=0.1, m=0.1):
        for i in range(itera):
            error = 0.0
            for j in range(100):#训练集的大小
                inputs = train[j, 0:4]
                d = onehot(train[:,4], self.num_out)
                targets = d[j, :]
                self.update(inputs)
                error = error + self.errorbackpropagate(targets, lr, m)
            if i % 100 == 0:
                print('误差 %-.5f' % error)

    def show_accuracy(self, X, Y):
        count=0
        Y_pred=[]
        for i in range(X.shape[0]):
            h=self.update(X[i, 0:4])
            y_pred=np.argmax(h)
            Y_pred.append(y_pred)
            if y_pred==Y[i]:
                count+=1
        print("准确率为:",count/X.shape[0])
        print(count)
        print(classification_report(Y, Y_pred))

# 实例
def Mytrain(train,X_test, Y_test):
    # 创建神经网络,4个输入节点,10个隐藏层1节点,6个隐藏层2节点,3个输出层节点
    n = BPNN(4, 10, 6, 3)
    # 训练神经网络
    print("start training\n--------------------")
    n.train(train,itera=1000)
    n.weights()
    # n.test(X_test)
    n.show_accuracy(X_test, Y_test)


if __name__ == '__main__':
    train = array[:, :]  # 训练集
    X_test = iris_test.values[:, :]  # 测试集
    Y_test = iris_test.values[:, 4]  # 测试集的标签
    Mytrain(train, X_test, Y_test)

输出结果:

start training
--------------------
误差 35.40337
误差 1.43231
误差 1.08501
误差 1.05931
误差 1.04053
误差 1.02761
误差 1.02126
误差 1.01507
误差 1.01000
误差 1.00646
输入层权重
[0.056033408540934464, 2.8041275894047875, 0.16855148969297182, 1.0200135486931827, -1.4263718396216152, 0.46417722366714737, 3.951369301666286, -1.68522617998046, -0.07767181609266427, -0.3263575324395308, 2.0699554193776044]
[-1.4377068931180876, 1.851234250520833, 0.15780177315246371, -0.05179352554189774, -2.3563287594546423, 1.7623583488440409, 3.091780632711021, -1.372026038986551, -1.726220905551624, -1.1669182260086637, 1.6321410768033273]
[0.9560967359135075, -4.531261602111315, -0.6120871721438043, 0.3990936425721157, 2.9004962990324032, -1.6411239366055017, -6.471695892441071, 2.6775054457912315, 1.4507421345886298, 1.962983608402704, -3.3617954044955485]
[0.9386911995204238, -2.5918217255311093, -0.19942943923868559, -0.12162074195611947, 2.299246125556896, -1.5047700193638123, -3.760625731974951, 2.069744371129818, 0.8219878214614533, 0.8239600952803267, -2.5028309920744243]
[-0.5943509557690846, 3.1298630885285528, 0.182341529267033, -0.3904817160983978, -1.477792159823391, 1.242189984057302, 3.993237791636666, -1.3449814648342135, -0.42506681041697547, -0.4405383184743436, 2.466930731983345]
输出层权重
[-0.09150582827392847, 3.142269157863363, -4.641439345864634]
[-1.4871365396516583, -0.48293082562473966, -0.20849376992641952]
[-1.8532830432907033, -6.6108611902097465, 5.752994457916077]
[3.247964863909582, -3.6518302173463195, -1.7172925811709747]
[-0.9773159253935711, -0.3682438453855819, 0.09500142611662453]
[3.513238795753454, -4.578599999073612, -1.5312361861597554]
[-3.9400826690223867, 3.100329638184805, -1.3037079300155898]
准确率为: 0.96
48
              precision    recall  f1-score   support

         0.0       1.00      1.00      1.00        16
         1.0       1.00      0.88      0.94        17
         2.0       0.89      1.00      0.94        17

    accuracy                           0.96        50
   macro avg       0.96      0.96      0.96        50
weighted avg       0.96      0.96      0.96        50


Process finished with exit code 0

参考文献:

[1] 四层BP网络python代码实现_odd~的博客-CSDN博客

[2] MachineLearning_Ass2/bpNetwork.py at master · wangtuntun/MachineLearning_Ass2 · GitHub

ps:参考文献[1]代码对于数据处理有严重问题,并且没有写计算正确率的函数,训练结果有严重错误,我部分参考代码[2]进行了修复,代码[2]是7年前的老代码,应该是基于python2的,并不能直接运行在python3.

数据集下载链接:https://wwke.lanzoub.com/i2KGW0seo4yf

posted @ 2023-04-08 13:52  孤飞  阅读(456)  评论(0编辑  收藏  举报