关闭页面特效

人工智能深度学习入门练习之（28）TensorFlow – 例子：循环神经网络(RNN)

阅读目录

数据集
训练
实现

循环神经网络(RNN)是一种用于处理序列数据的人工神经网络，序列数据是相互依赖的（有限或无限）数据流，比如时间序列数据、信息性的字符串、对话等。

长短时记忆网络(LSTM)是一类特殊的循环神经网络，具有学习长时依赖关系的能力，是目前最常用的循环神经网络。

注意: 关于循环神经网络的介绍，可参考我们的教程深度学习 – 循环神经网络(RNN)。

我们的例子是训练一个LSTM模型，训练时模型会学习一段短文，完成训练后，模型可以在输入一个短句后，预测接下来的单词。

回到顶部

数据集

数据集是一段短文，《伊索寓言》中的老鼠给猫挂铃铛的故事：

long ago , the mice had a general council to consider what measures they could take to outwit their common enemy , the cat . some said this , and some said that but at last a young mouse got up and said he had a proposal to make , which he thought would meet the case . you will all agree , said he , that our chief danger consists in the sly and treacherous manner in which the enemy approaches us . now , if we could receive some signal of her approach , we could easily escape from her . i venture , therefore , to propose that a small bell be procured , and attached by a ribbon round the neck of the cat . by this means we should always know when she was about , and could easily retire while she was in the neighborhood . this proposal met with general applause , until an old mouse got up and said that is all very well , but who is to bell the cat ? the mice looked at one another and nobody spoke . then the old mouse said it is easy to propose impossible remedies .

这篇短文有112个不重复的符号，单词和标点符号都被认为是符号。

回到顶部

训练

如果我们向LSTM输入3个正确排序的符号，和一个标签符号，模型最终将学会正确预测下一个符号(图1)。

图1. 具有三个输入和一个输出的LSTM单元。

技术上来说，LSTM只能理解数字。因此，需要对上面的短文作一些处理，每个不重复的单词和标点都用数字代替，总共有112个不重复的符号。

我们将创建2个字典，一个可以从单词/标点映射到数字，另一个可以从数字映射到单词/标点(反向字典)。例如，上面的文本中有112个独特的符号，第一个字典包含以下条目[ “,” : 0 ] [ “the” : 1 ], …, [ “council” : 37 ],…,[ “spoke” : 111 ]。同时构建一个反向字典，将用于解码LSTM的输出。

LSTM输入输出都是数字，LSTM应该输出一个符号数字代号，用来表示符号。例如，如果输出是37，表示单词”council”。

但是，LSTM输出的实际上是一个112个元素的向量，每个元素值表示对应符号的概率，概率最大的符号就是最终符号(图2)。

图2. 每个输入符号都被替换成数字代号。输出是一个表示本次输出各符号概率的向量，读取概率最大的符号代号，结合反向字典，最终查出符号。

回到顶部

实现

下面是实现代码：

from __future__ import print_function

import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn
import random
import collections
import time

start_time = time.time()
def elapsed(sec):
    if sec<60:
        return str(sec) + " sec"
    elif sec<(60*60):
        return str(sec/60) + " min"
    else:
        return str(sec/(60*60)) + " hr"


# 日志目录
logs_path = './train/rnn_words'
writer = tf.summary.FileWriter(logs_path)

# 训练用的短文
training_file = 'belling_the_cat.txt'

# 读取短文函数
def read_data(fname):
    with open(fname) as f:
        content = f.readlines()
    content = [x.strip() for x in content]
    content = [word for i in range(len(content)) for word in content[i].split()]
    content = np.array(content)
    return content

training_data = read_data(training_file)
print("Loaded training data...")

# 短文符号->数字字典，数字->短文符号反向字典
def build_dataset(words):
    count = collections.Counter(words).most_common()
    dictionary = dict()
    for word, _ in count:
        dictionary[word] = len(dictionary)
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return dictionary, reverse_dictionary

# 创建字典与反向字典
dictionary, reverse_dictionary = build_dataset(training_data)
# 符号数量
vocab_size = len(dictionary)

# 参数
learning_rate = 0.001
training_iters = 50000
display_step = 1000
n_input = 3

# RNN cell中的神经数量
n_hidden = 512

# tf Graph input
x = tf.placeholder("float", [None, n_input, 1])
y = tf.placeholder("float", [None, vocab_size])

# RNN 输出节点的 weights 与 biases
weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, vocab_size]))
}
biases = {
    'out': tf.Variable(tf.random_normal([vocab_size]))
}

def RNN(x, weights, biases):

    # reshape 到 [-1, n_input]
    x = tf.reshape(x, [-1, n_input])

    # Generate a n_input-element sequence of inputs
    # (eg. [had] [a] [general] -> [20] [6] [33])
    x = tf.split(x,n_input,1)

    # 2-layer LSTM, each layer has n_hidden units.
    # Average Accuracy= 95.20% at 50k iter
    rnn_cell = rnn.MultiRNNCell([rnn.BasicLSTMCell(n_hidden),rnn.BasicLSTMCell(n_hidden)])

    # 1-layer LSTM with n_hidden units but with lower accuracy.
    # Average Accuracy= 90.60% 50k iter
    # Uncomment line below to test but comment out the 2-layer rnn.MultiRNNCell above
    # rnn_cell = rnn.BasicLSTMCell(n_hidden)

    # generate prediction
    outputs, states = rnn.static_rnn(rnn_cell, x, dtype=tf.float32)

    # there are n_input outputs but
    # we only want the last output
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

pred = RNN(x, weights, biases)

# 计算损失
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
# 优化
optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate).minimize(cost)

# 模型评估
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# 初始化变量
init = tf.global_variables_initializer()

# 执行图
with tf.Session() as session:
    session.run(init)
    step = 0
    offset = random.randint(0,n_input+1)
    end_offset = n_input + 1
    acc_total = 0
    loss_total = 0

    writer.add_graph(session.graph)

    while step < training_iters:
        # Generate a minibatch. Add some randomness on selection process.
        if offset > (len(training_data)-end_offset):
            offset = random.randint(0, n_input+1)

        # 准备样本数据和标签
        symbols_in_keys = [ [dictionary[ str(training_data[i])]] for i in range(offset, offset+n_input) ]
        symbols_in_keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])

        symbols_out_onehot = np.zeros([vocab_size], dtype=float)
        symbols_out_onehot[dictionary[str(training_data[offset+n_input])]] = 1.0
        symbols_out_onehot = np.reshape(symbols_out_onehot,[1,-1])

        # 执行optimizer, accuracy, cost, pred
        _, acc, loss, onehot_pred = session.run([optimizer, accuracy, cost, pred], \
                                                feed_dict={x: symbols_in_keys, y: symbols_out_onehot})
        loss_total += loss
        acc_total += acc

        # 每过一定步数(display_step)，打印信息
        if (step+1) % display_step == 0:
            print("Iter= " + str(step+1) + ", Average Loss= " + \
                  "{:.6f}".format(loss_total/display_step) + ", Average Accuracy= " + \
                  "{:.2f}%".format(100*acc_total/display_step))
            acc_total = 0
            loss_total = 0
            symbols_in = [training_data[i] for i in range(offset, offset + n_input)]
            symbols_out = training_data[offset + n_input]
            symbols_out_pred = reverse_dictionary[int(tf.argmax(onehot_pred, 1).eval())]
            print("%s - [%s] vs [%s]" % (symbols_in,symbols_out,symbols_out_pred))

        # 递增step, offset
        step += 1
        offset += (n_input+1)

    # 训练完成，打印输出
    print("Optimization Finished!")
    print("Elapsed time: ", elapsed(time.time() - start_time))
    print("Run on command line.")
    print("\ttensorboard --logdir=%s" % (logs_path))
    print("Point your web browser to: http://localhost:6006/")

    # 测试：接受用户输入，生成输出
    while True:
        prompt = "%s words: " % n_input
        sentence = input(prompt)
        sentence = sentence.strip()
        words = sentence.split(' ')
        if len(words) != n_input:
            continue
        try:
            symbols_in_keys = [dictionary[str(words[i])] for i in range(len(words))]

            # 连续进行32次
            for i in range(32):
                keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])
                onehot_pred = session.run(pred, feed_dict={x: keys})
                onehot_pred_index = int(tf.argmax(onehot_pred, 1).eval())
                sentence = "%s %s" % (sentence,reverse_dictionary[onehot_pred_index])
                symbols_in_keys = symbols_in_keys[1:]
                symbols_in_keys.append(onehot_pred_index)
            print(sentence)
        except:
            print("Word not in dictionary")

输出

Iter= 1000, Average Loss= 4.428141, Average Accuracy= 5.10%
['nobody', 'spoke', '.'] - [then] vs [then]
Iter= 2000, Average Loss= 2.937925, Average Accuracy= 17.60%
['?', 'the', 'mice'] - [looked] vs [looked]
Iter= 3000, Average Loss= 2.401870, Average Accuracy= 31.00%
['an', 'old', 'mouse'] - [got] vs [got]
Iter= 4000, Average Loss= 2.079050, Average Accuracy= 46.10%
['when', 'she', 'was'] - [about] vs [about]
Iter= 5000, Average Loss= 1.756826, Average Accuracy= 52.40%
['a', 'small', 'bell'] - [be] vs [be]
Iter= 6000, Average Loss= 1.620517, Average Accuracy= 57.80%
['from', 'her', '.'] - [i] vs [cat]
Iter= 7000, Average Loss= 1.410994, Average Accuracy= 61.50%
['some', 'signal', 'of'] - [her] vs [be]
Iter= 8000, Average Loss= 1.340336, Average Accuracy= 65.60%
['enemy', 'approaches', 'us'] - [.] vs [,]

...

Iter= 48000, Average Loss= 0.447481, Average Accuracy= 90.60%
['general', 'council', 'to'] - [consider] vs [consider]
Iter= 49000, Average Loss= 0.527762, Average Accuracy= 89.30%
['ago', ',', 'the'] - [mice] vs [mice]
Iter= 50000, Average Loss= 0.375872, Average Accuracy= 91.50%
['spoke', '.', 'then'] - [the] vs [the]
Optimization Finished!
Elapsed time:  31.482173871994018 min
Run on command line.
        tensorboard --logdir=./train/rnn_words
Point your web browser to: http://localhost:6006/
3 words: the mice had
the mice had a general council to said that is all a consider what measures they enemy , the cat . by this means we should always know when when when when when when when

posted on 2020-06-20 21:38 大码王阅读(470) 评论(0) 编辑收藏举报

刷新页面返回顶部

登录后才能查看或发表评论，立即登录或者逛逛博客园首页

公告

青青陵上柏，磊磊X@a`

运行时长：2258天0小时58分22秒

您的浏览器不兼容canvas

昵称：大码王
园龄： 5年8个月
粉丝： 233
关注： 30

+加关注

2025年3月

日

一

二

三

四

五

六

随笔分类 (719)

clickhouse(4)

flink源码分析(2)

Groovy(1)

Java(34)

Linux(3)

office(10)

OpenStack入门(1)

Phoenix+hbase(11)

photoshop(10)

python之绘图(7)

python之爬虫(15)

python之入门到实战(26)

shell大全(1)

SparkCore(14)

sparkGraphx(2)

sparksql(8)

sparkstreaming(17)

spark源码分析(11)

博客园美化(6)

操作系统(1)

随笔档案 (693)

2024年5月(4)

2024年3月(3)

2023年9月(1)

2023年4月(2)

2023年3月(4)

2023年2月(1)

2022年12月(1)

2022年11月(1)

2022年9月(2)

2022年8月(17)

2022年7月(5)

2022年5月(3)

2022年4月(18)

2021年9月(1)

2021年6月(9)

2021年5月(19)

2021年2月(1)

2021年1月(17)

2020年12月(7)

2020年11月(19)

文章分类 (35)

airflow(4)

azkban(1)

canal(1)

Cassandra(1)

datax(1)

druid(1)

Elasticsearch(8)

java(11)

mongodb(2)

redis(3)

scala(2)

文章档案 (40)

2024年4月(2)

2023年5月(2)

2023年4月(1)

2023年1月(1)

2020年6月(9)

2020年5月(25)

数据集

训练

实现

公告

搜索

常用链接

最新随笔

积分与排名

随笔分类 (719)

随笔档案 (693)

文章分类 (35)

文章档案 (40)

阅读排行榜

评论排行榜

推荐排行榜

最新评论

喜欢请打赏

目录导航