Keras, Francois Chollet, Deep Learning with Python - Classifying Movie Reviews: A Binary Classification Problem

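About

Keras, Francois Chollet, Deep Learning with Python
Classifying movie reviews: a binary classification problem
3.5-classifying-movie-reviews.ipynb
Running the notebook code raises the problems below.
They are probably due to the book dating from 2017; it is now 2021, and the dataset and the APIs have been updated: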

Problem 1:

from keras.datasets import imdb

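If the GPU libraries are not installed, the import prints the following messages. They are only a warning and an info message, and the second one says they can be ignored on a machine without a GPU: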

2021-03-06 16:56:17.006782: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-03-06 16:56:17.012966: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
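
If the messages are distracting, TensorFlow's C++ logging can be turned down with the `TF_CPP_MIN_LOG_LEVEL` environment variable. The sketch below is a minimal example; the helper name `quiet_tensorflow_logs` is made up for illustration, and it assumes the variable is set before the first TensorFlow or Keras import in the process.

```python
import os


def quiet_tensorflow_logs(level="2"):
    """Set TF_CPP_MIN_LOG_LEVEL (hypothetical helper, not part of Keras).

    "0" shows everything, "1" hides INFO, "2" hides INFO and WARNING,
    "3" hides ERROR as well.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level
    return os.environ["TF_CPP_MIN_LOG_LEVEL"]


# Call this first, and only import Keras afterwards, for example:
#   from keras.datasets import imdb
quiet_tensorflow_logs()
```

Level "2" should be enough here, since the first line is a warning (`W`) and the second is an info message (`I`).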

Problem 2:

The dataset

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

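The data has to be downloaded from an overseas server. The file is about 17 MB, so the download can be slow.
Reference 1: https://www.cnblogs.com/wangle1001986/p/11336471.html
Reference 2: https://www.cnblogs.com/wt7018/p/13092512.html
Solution: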
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
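
When the automatic download keeps stalling, one workaround is to fetch `imdb.npz` separately and put it where Keras looks for cached datasets. The sketch below assumes the default cache location `~/.keras/datasets/`. The helper names `keras_dataset_path` and `ensure_imdb_cached` and the `keras_home` parameter are made up for illustration; the URL is the one printed in the log above.

```python
import os
import urllib.request

IMDB_URL = "https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz"


def keras_dataset_path(filename="imdb.npz", keras_home=None):
    """Return the path where Keras is assumed to cache a dataset file."""
    if keras_home is None:
        # Assumption: the default Keras directory in the user's home folder.
        keras_home = os.path.join(os.path.expanduser("~"), ".keras")
    return os.path.join(keras_home, "datasets", filename)


def ensure_imdb_cached(url=IMDB_URL, target=None):
    """Download imdb.npz into the cache directory unless it is already there."""
    if target is None:
        target = keras_dataset_path()
    if not os.path.exists(target):
        os.makedirs(os.path.dirname(target), exist_ok=True)
        urllib.request.urlretrieve(url, target)
    return target
```

Calling `ensure_imdb_cached()` once before `imdb.load_data(num_words=10000)` should let the loader pick up the cached file. The file can also be downloaded by any other means (a browser or a download manager) and copied to the path returned by `keras_dataset_path()`.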

Problem 3:

# Decode the review back to words
# word_index is a dictionary mapping words to an integer index
word_index = imdb.get_word_index()  # this downloads a dictionary file
# We reverse it, mapping integer indices to words
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
# We decode the review; note that our indices were offset by 3
# because 0, 1 and 2 are reserved indices for "padding", "start of sequence", and "unknown".
# Dropping the offset, as in the commented-out line below, looks up the wrong word for every index:
# decoded_review = ' '.join([reverse_word_index.get(i, '?') for i in train_data[0]])
# Keep i - 3, which matches the default index_from=3 of imdb.load_data:
decoded_review = ' '.join([reverse_word_index.get(i - 3, '?') for i in train_data[0]])
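
Whether the lookup needs the `- 3` depends on the `index_from` argument of `imdb.load_data`, which defaults to 3. The sketch below checks this on a made-up four-word vocabulary, without downloading anything. The names `toy_word_index`, `encode`, and `decode` are hypothetical, and `encode` only mimics the shifting that `load_data` applies (start marker 1, unknown marker 2, every word index raised by `index_from`).

```python
# Toy stand-in for imdb.get_word_index(): word -> rank, starting at 1.
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

START_CHAR = 1   # marks the beginning of a sequence
OOV_CHAR = 2     # replaces words outside the vocabulary
INDEX_FROM = 3   # shift applied to every real word index


def encode(words, word_index, index_from=INDEX_FROM):
    """Mimic load_data: prepend the start marker and shift word indices."""
    ids = [START_CHAR]
    for w in words:
        ids.append(word_index[w] + index_from if w in word_index else OOV_CHAR)
    return ids


def decode(ids, word_index, offset):
    """Map indices back to words after subtracting the given offset."""
    reverse = {v: k for k, v in word_index.items()}
    return " ".join(reverse.get(i - offset, "?") for i in ids)


encoded = encode("the movie was great".split(), toy_word_index)
with_offset = decode(encoded, toy_word_index, offset=3)
without_offset = decode(encoded, toy_word_index, offset=0)

print(encoded)         # [1, 4, 5, 6, 7]
print(with_offset)     # ? the movie was great
print(without_offset)  # the great ? ? ?
```

The leading `?` in the correctly decoded text comes from the start marker, which has no entry in the dictionary. Without the offset, the marker itself is read as the most frequent word and every real word is shifted by three ranks.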

posted @ 2021-03-06 17:03 boyang987
