Keras, Francois Chollet, Deep Learning with Python - Classifying movie reviews: a binary classification problem

About

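  • Keras, Francois Chollet, Deep Learning with Python
  • Classifying movie reviews: a binary classification problem
  • 3.5-classifying-movie-reviews.ipynb
    Running the notebook's code raised the issues below.
    This is probably because the book was written in 2017 and it is now 2021; the dataset and the API have been updated since: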

Issue 1:

from keras.datasets import imdb  
  • If the GPU (CUDA) libraries are not installed, the following messages are printed

2021-03-06 16:56:17.006782: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-03-06 16:56:17.012966: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
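
As the second log line says, these messages can be ignored on a machine without a GPU. If they are distracting, a minimal sketch for hiding them is to set the TF_CPP_MIN_LOG_LEVEL environment variable before TensorFlow is imported. The helper name set_tf_log_level is hypothetical and used only for illustration.

```python
import os


def set_tf_log_level(level="2"):
    """Set TensorFlow's C++ log threshold via an environment variable.

    "0" shows everything, "1" hides INFO, "2" hides INFO and WARNING,
    "3" hides ERROR as well. This only takes effect if it runs before
    TensorFlow (or Keras on the TensorFlow backend) is first imported.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return os.environ["TF_CPP_MIN_LOG_LEVEL"]


set_tf_log_level("2")
# Import Keras only after the variable is set, for example:
# from keras.datasets import imdb
```

This only silences the log output; it does not change whether a GPU is used.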

Issue 2:

Dataset

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000) 

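The data has to be downloaded from a server outside China. The file is about 17 MB, so the download can be slow.
Reference 1: https://www.cnblogs.com/wangle1001986/p/11336471.html
Reference 2: https://www.cnblogs.com/wt7018/p/13092512.html
Solution: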
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
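
One common workaround for a slow download, which is not necessarily the one the linked posts describe, is to fetch imdb.npz once by any means available and place it where Keras caches datasets. The sketch below assumes the default cache directory ~/.keras/datasets (it can differ, for example when KERAS_HOME is set); the helper names keras_cache_path and fetch_if_missing are hypothetical.

```python
import os
import urllib.request

# URL taken from the "Downloading data from ..." message printed by Keras.
IMDB_URL = "https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz"


def keras_cache_path(fname, cache_dir=None):
    """Return the path where Keras is assumed to cache a dataset file."""
    if cache_dir is None:
        # Assumption: default Keras cache location.
        cache_dir = os.path.join(os.path.expanduser("~"), ".keras", "datasets")
    return os.path.join(cache_dir, fname)


def fetch_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(dest):
        return False
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    return True


# Typical use (commented out so nothing is downloaded on import):
# fetch_if_missing(IMDB_URL, keras_cache_path("imdb.npz"))
# After that, imdb.load_data(num_words=10000) should find the cached file.
```

The file can also be copied into that directory by hand, for instance after downloading it on another machine or through a mirror.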

Issue 3:

# Decode a review back into words
# word_index is a dictionary mapping words to an integer index
word_index = imdb.get_word_index()  # this downloads a dictionary file
# We reverse it, mapping integer indices to words
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
# We decode the review; note that our indices were offset by 3
# because 0, 1 and 2 are reserved indices for "padding", "start of sequence", and "unknown".
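# Dropping the offset, i.e. reverse_word_index.get(i, '?'), looks up the wrong words,
# because imdb.load_data() shifts every word index by 3 by default (index_from=3),
# while imdb.get_word_index() returns the unshifted indices. Keep i - 3:
decoded_review = ' '.join([reverse_word_index.get(i - 3, '?') for i in train_data[0]])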
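
To see which lookup decodes correctly without downloading anything, here is a small self-contained sketch. The toy word index and the encoded review are invented for illustration; the only assumption carried over from Keras is that imdb.load_data() shifts word indices by 3 by default (index_from=3) and uses 1 as the start-of-sequence marker.

```python
# Toy stand-in for imdb.get_word_index(): word -> unshifted index.
toy_word_index = {"the": 1, "film": 2, "was": 3, "brilliant": 4}
reverse_toy_index = {index: word for word, index in toy_word_index.items()}

# Toy stand-in for one entry of train_data: 1 marks the start of the
# sequence, and every real word index has been shifted up by 3.
encoded_review = [1, 4, 5, 6, 7]


def decode(encoded, offset):
    """Map encoded indices back to words, subtracting the given offset."""
    return " ".join(reverse_toy_index.get(i - offset, "?") for i in encoded)


with_offset = decode(encoded_review, offset=3)
without_offset = decode(encoded_review, offset=0)

print(with_offset)     # ? the film was brilliant
print(without_offset)  # the brilliant ? ? ?
```

With the offset, the start marker becomes '?' and the words come out in order. Without it, every index points at the wrong entry or at nothing.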
posted @ 2021-03-06 17:03  boyang987