[TensorFlow] Cookbook - CNN
Convolutional Neural Networks (CNNs) are responsible for the major breakthroughs in image recognition made in the past few years. In this chapter we will cover:
- Implementing a Simpler CNN
- Implementing an Advanced CNN
- Retraining Existing CNN models
- Applying Stylenet/Neural-Style
- Implementing DeepDream
After getting familiar with the basic operations, we move on to building concrete models and visualizing them.
Let's look at how to achieve this with TensorFlow.
Log walkthrough:
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes. Extracting temp/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes. Extracting temp/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes. Extracting temp/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes. Extracting temp/t10k-labels-idx1-ubyte.gz
Generation # 5. Train Loss: 2.28. Train Acc (Test Acc): 14.00 (14.00)
Generation # 10. Train Loss: 2.22. Train Acc (Test Acc): 14.00 (22.00)
Generation # 15. Train Loss: 2.11. Train Acc (Test Acc): 40.00 (33.40)
Generation # 20. Train Loss: 2.05. Train Acc (Test Acc): 48.00 (50.00)
Generation # 25. Train Loss: 1.93. Train Acc (Test Acc): 52.00 (57.00)
Generation # 30. Train Loss: 1.69. Train Acc (Test Acc): 64.00 (62.60)
Generation # 35. Train Loss: 1.43. Train Acc (Test Acc): 67.00 (64.40)
Generation # 40. Train Loss: 1.22. Train Acc (Test Acc): 63.00 (70.80)
Generation # 45. Train Loss: 0.87. Train Acc (Test Acc): 82.00 (76.80)
Generation # 50. Train Loss: 0.76. Train Acc (Test Acc): 80.00 (77.20)
Generation # 55. Train Loss: 0.66. Train Acc (Test Acc): 80.00 (75.40)
Generation # 60. Train Loss: 0.59. Train Acc (Test Acc): 81.00 (80.80)
Generation # 65. Train Loss: 0.55. Train Acc (Test Acc): 79.00 (85.60)
Generation # 70. Train Loss: 0.41. Train Acc (Test Acc): 85.00 (81.80)
Generation # 75. Train Loss: 0.57. Train Acc (Test Acc): 83.00 (85.20)
Generation # 80. Train Loss: 0.39. Train Acc (Test Acc): 90.00 (86.00)
Generation # 85. Train Loss: 0.39. Train Acc (Test Acc): 90.00 (86.00)
Generation # 90. Train Loss: 0.26. Train Acc (Test Acc): 92.00 (90.60)
Generation # 95. Train Loss: 0.32. Train Acc (Test Acc): 90.00 (87.60)
Generation # 100. Train Loss: 0.39. Train Acc (Test Acc): 89.00 (89.80)
Generation # 105. Train Loss: 0.49. Train Acc (Test Acc): 85.00 (90.00)
Generation # 110. Train Loss: 0.34. Train Acc (Test Acc): 88.00 (90.00)
Generation # 115. Train Loss: 0.24. Train Acc (Test Acc): 91.00 (89.20)
Generation # 120. Train Loss: 0.30. Train Acc (Test Acc): 92.00 (91.40)
Generation # 125. Train Loss: 0.29. Train Acc (Test Acc): 89.00 (91.60)
Generation # 130. Train Loss: 0.31. Train Acc (Test Acc): 93.00 (90.20)
Generation # 135. Train Loss: 0.41. Train Acc (Test Acc): 85.00 (91.40)
Generation # 140. Train Loss: 0.22. Train Acc (Test Acc): 94.00 (91.40)
Generation # 145. Train Loss: 0.39. Train Acc (Test Acc): 85.00 (92.60)
Generation # 150. Train Loss: 0.38. Train Acc (Test Acc): 93.00 (90.00)
Generation # 155. Train Loss: 0.17. Train Acc (Test Acc): 96.00 (91.60)
Generation # 160. Train Loss: 0.22. Train Acc (Test Acc): 93.00 (93.20)
Generation # 165. Train Loss: 0.15. Train Acc (Test Acc): 97.00 (92.00)
Generation # 170. Train Loss: 0.24. Train Acc (Test Acc): 92.00 (93.40)
Generation # 175. Train Loss: 0.21. Train Acc (Test Acc): 93.00 (92.40)
Generation # 180. Train Loss: 0.35. Train Acc (Test Acc): 90.00 (91.80)
Generation # 185. Train Loss: 0.15. Train Acc (Test Acc): 95.00 (93.80)
Generation # 190. Train Loss: 0.17. Train Acc (Test Acc): 96.00 (91.60)
Generation # 195. Train Loss: 0.26. Train Acc (Test Acc): 89.00 (92.20)
Generation # 200. Train Loss: 0.32. Train Acc (Test Acc): 91.00 (91.40)
Generation # 205. Train Loss: 0.20. Train Acc (Test Acc): 93.00 (93.60)
Generation # 210. Train Loss: 0.16. Train Acc (Test Acc): 97.00 (93.80)
Generation # 215. Train Loss: 0.18. Train Acc (Test Acc): 95.00 (91.60)
Generation # 220. Train Loss: 0.21. Train Acc (Test Acc): 96.00 (92.40)
Generation # 225. Train Loss: 0.23. Train Acc (Test Acc): 93.00 (94.80)
Generation # 230. Train Loss: 0.16. Train Acc (Test Acc): 97.00 (96.60)
Generation # 235. Train Loss: 0.19. Train Acc (Test Acc): 93.00 (94.80)
Generation # 240. Train Loss: 0.12. Train Acc (Test Acc): 97.00 (95.20)
Generation # 245. Train Loss: 0.16. Train Acc (Test Acc): 96.00 (92.20)
Generation # 250. Train Loss: 0.22. Train Acc (Test Acc): 92.00 (93.80)
Generation # 255. Train Loss: 0.22. Train Acc (Test Acc): 94.00 (95.00)
Generation # 260. Train Loss: 0.22. Train Acc (Test Acc): 90.00 (93.40)
Generation # 265. Train Loss: 0.23. Train Acc (Test Acc): 93.00 (94.40)
Generation # 270. Train Loss: 0.11. Train Acc (Test Acc): 96.00 (92.00)
Generation # 275. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (93.20)
Generation # 280. Train Loss: 0.17. Train Acc (Test Acc): 97.00 (95.60)
Generation # 285. Train Loss: 0.25. Train Acc (Test Acc): 90.00 (95.20)
Generation # 290. Train Loss: 0.17. Train Acc (Test Acc): 95.00 (95.80)
Generation # 295. Train Loss: 0.20. Train Acc (Test Acc): 93.00 (96.40)
Generation # 300. Train Loss: 0.12. Train Acc (Test Acc): 96.00 (93.60)
Generation # 305. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (94.20)
Generation # 310. Train Loss: 0.37. Train Acc (Test Acc): 88.00 (94.80)
Generation # 315. Train Loss: 0.19. Train Acc (Test Acc): 96.00 (93.40)
Generation # 320. Train Loss: 0.17. Train Acc (Test Acc): 95.00 (96.20)
Generation # 325. Train Loss: 0.16. Train Acc (Test Acc): 92.00 (93.80)
Generation # 330. Train Loss: 0.17. Train Acc (Test Acc): 96.00 (94.00)
Generation # 335. Train Loss: 0.14. Train Acc (Test Acc): 96.00 (95.20)
Generation # 340. Train Loss: 0.15. Train Acc (Test Acc): 96.00 (96.60)
Generation # 345. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (95.60)
Generation # 350. Train Loss: 0.27. Train Acc (Test Acc): 91.00 (97.00)
Generation # 355. Train Loss: 0.11. Train Acc (Test Acc): 98.00 (94.60)
Generation # 360. Train Loss: 0.15. Train Acc (Test Acc): 95.00 (95.20)
Generation # 365. Train Loss: 0.08. Train Acc (Test Acc): 98.00 (96.40)
Generation # 370. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (93.80)
Generation # 375. Train Loss: 0.21. Train Acc (Test Acc): 92.00 (96.60)
Generation # 380. Train Loss: 0.21. Train Acc (Test Acc): 96.00 (94.40)
Generation # 385. Train Loss: 0.07. Train Acc (Test Acc): 99.00 (95.40)
Generation # 390. Train Loss: 0.19. Train Acc (Test Acc): 94.00 (95.40)
Generation # 395. Train Loss: 0.12. Train Acc (Test Acc): 97.00 (94.40)
Generation # 400. Train Loss: 0.14. Train Acc (Test Acc): 96.00 (96.60)
Generation # 405. Train Loss: 0.17. Train Acc (Test Acc): 95.00 (96.60)
Generation # 410. Train Loss: 0.16. Train Acc (Test Acc): 93.00 (96.40)
Generation # 415. Train Loss: 0.18. Train Acc (Test Acc): 93.00 (95.60)
Generation # 420. Train Loss: 0.11. Train Acc (Test Acc): 95.00 (94.80)
Generation # 425. Train Loss: 0.22. Train Acc (Test Acc): 91.00 (95.20)
Generation # 430. Train Loss: 0.07. Train Acc (Test Acc): 98.00 (96.20)
Generation # 435. Train Loss: 0.11. Train Acc (Test Acc): 97.00 (95.80)
Generation # 440. Train Loss: 0.07. Train Acc (Test Acc): 97.00 (95.20)
Generation # 445. Train Loss: 0.15. Train Acc (Test Acc): 99.00 (97.80)
Generation # 450. Train Loss: 0.09. Train Acc (Test Acc): 98.00 (95.00)
Generation # 455. Train Loss: 0.07. Train Acc (Test Acc): 97.00 (95.80)
Generation # 460. Train Loss: 0.08. Train Acc (Test Acc): 98.00 (94.60)
Generation # 465. Train Loss: 0.07. Train Acc (Test Acc): 98.00 (95.40)
Generation # 470. Train Loss: 0.14. Train Acc (Test Acc): 98.00 (94.40)
Generation # 475. Train Loss: 0.24. Train Acc (Test Acc): 93.00 (96.40)
Generation # 480. Train Loss: 0.08. Train Acc (Test Acc): 99.00 (94.40)
Generation # 485. Train Loss: 0.16. Train Acc (Test Acc): 96.00 (96.40)
Generation # 490. Train Loss: 0.09. Train Acc (Test Acc): 95.00 (96.40)
Generation # 495. Train Loss: 0.13. Train Acc (Test Acc): 95.00 (96.20)
Generation # 500. Train Loss: 0.09. Train Acc (Test Acc): 99.00 (95.80)
Code walkthrough:
- Load the data
# Introductory CNN Model: MNIST Digits
#---------------------------------------
#
# In this example, we will download the MNIST handwritten
# digits and create a simple CNN network to predict the
# digit category (0-9)

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Start a graph session
sess = tf.Session()

# Load data
data_dir = 'temp'
mnist = read_data_sets(data_dir)

# Reshape images into 28x28 (they are downloaded as 1x784)
train_xdata = np.array([np.reshape(x, (28, 28)) for x in mnist.train.images])
test_xdata = np.array([np.reshape(x, (28, 28)) for x in mnist.test.images])

# Labels are integer class IDs (0-9), not one-hot vectors, which
# matches the sparse softmax loss used below
train_labels = mnist.train.labels
test_labels = mnist.test.labels
Each image array pairs with its label array: train_xdata --> train_labels; test_xdata --> test_labels.
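A quick sanity check of the resulting shapes (assuming read_data_sets' default split of 55,000 training and 10,000 test images; this check is not part of the original script):

print(train_xdata.shape)   # (55000, 28, 28)
print(train_labels.shape)  # (55000,) -- integer class IDs 0-9
print(test_xdata.shape)    # (10000, 28, 28)
print(test_labels.shape)   # (10000,)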
- Set model parameters
# Set model parameters
batch_size = 100
learning_rate = 0.005
evaluation_size = 500
image_width = train_xdata[0].shape[0]
image_height = train_xdata[0].shape[1]
target_size = max(train_labels) + 1
num_channels = 1  # greyscale = 1 channel
generations = 500
eval_every = 5
conv1_features = 25
conv2_features = 50
max_pool_size1 = 2  # NxN window for 1st max pool layer
max_pool_size2 = 2  # NxN window for 2nd max pool layer
fully_connected_size1 = 100
- Wire up the graph (placeholders and variables)
# Declare model placeholders
x_input_shape = (batch_size, image_width, image_height, num_channels) # (100, 28, 28, 1)
x_input = tf.placeholder(tf.float32, shape=x_input_shape)
y_target = tf.placeholder(tf.int32, shape=(batch_size))
# Test/Evaluation
eval_input_shape = (evaluation_size, image_width, image_height, num_channels) # (500, 28, 28, 1)
eval_input = tf.placeholder(tf.float32, shape=eval_input_shape)
eval_target = tf.placeholder(tf.int32, shape=(evaluation_size))
# Declare model parameters
# First convolutional layer
conv1_weight = tf.Variable(tf.truncated_normal([4, 4, num_channels, conv1_features], stddev=0.1, dtype=tf.float32)) # conv1_features = 25 conv kernels
conv1_bias = tf.Variable(tf.zeros([conv1_features], dtype=tf.float32))
# Second convolutional layer
conv2_weight = tf.Variable(tf.truncated_normal([4, 4, conv1_features, conv2_features], stddev=0.1, dtype=tf.float32)) # conv2_features = 50 conv kernels
conv2_bias = tf.Variable(tf.zeros([conv2_features], dtype=tf.float32))
# fully connected variables
resulting_width = image_width // (max_pool_size1 * max_pool_size2)
resulting_height = image_height // (max_pool_size1 * max_pool_size2)
# After two rounds of 2x2 pooling, each spatial dimension shrinks to 1/4 (28 // 4 = 7)
# First fully connected layer
full1_input_size = resulting_width * resulting_height * conv2_features # total flattened size of the 50 downsampled feature maps: 7 * 7 * 50 = 2450
full1_weight = tf.Variable(tf.truncated_normal([full1_input_size, fully_connected_size1], stddev=0.1, dtype=tf.float32)) # fully_connected_size1 = 100
full1_bias = tf.Variable(tf.truncated_normal([fully_connected_size1], stddev=0.1, dtype=tf.float32))
# Second fully connected layer
full2_weight = tf.Variable(tf.truncated_normal([fully_connected_size1, target_size], stddev=0.1, dtype=tf.float32))
full2_bias = tf.Variable(tf.truncated_normal([target_size], stddev=0.1, dtype=tf.float32))
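As a rough check on model size (a worked count, not from the original notes), the trainable parameters can be tallied:

# conv1: 4*4*1*25   =     400 weights +  25 biases
# conv2: 4*4*25*50  =  20,000 weights +  50 biases
# full1: 2450*100   = 245,000 weights + 100 biases
# full2: 100*10     =   1,000 weights +  10 biases
n_params = sum(int(np.prod(v.get_shape().as_list())) for v in tf.trainable_variables())
print(n_params)  # 266585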
- Build the graph structure (the model function)
# Initialize Model Operations
def my_conv_net(input_data):
    # First Conv-ReLU-MaxPool Layer
    conv1 = tf.nn.conv2d(input_data, conv1_weight, strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_bias))
    max_pool1 = tf.nn.max_pool(relu1,
                               ksize=[1, max_pool_size1, max_pool_size1, 1],
                               strides=[1, max_pool_size1, max_pool_size1, 1],
                               padding='SAME')

    # Second Conv-ReLU-MaxPool Layer
    conv2 = tf.nn.conv2d(max_pool1, conv2_weight, strides=[1, 1, 1, 1], padding='SAME')
    relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_bias))
    max_pool2 = tf.nn.max_pool(relu2,
                               ksize=[1, max_pool_size2, max_pool_size2, 1],
                               strides=[1, max_pool_size2, max_pool_size2, 1],
                               padding='SAME')

    # Flatten output into a 1xN layer for the next fully connected layer
    final_conv_shape = max_pool2.get_shape().as_list()
    final_shape = final_conv_shape[1] * final_conv_shape[2] * final_conv_shape[3]
    flat_output = tf.reshape(max_pool2, [final_conv_shape[0], final_shape])

    # First Fully Connected Layer
    fully_connected1 = tf.nn.relu(tf.add(tf.matmul(flat_output, full1_weight), full1_bias))

    # Second Fully Connected Layer
    final_model_output = tf.add(tf.matmul(fully_connected1, full2_weight), full2_bias)

    return final_model_output
model_output = my_conv_net(x_input)
test_model_output = my_conv_net(eval_input)
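Both calls to my_conv_net reuse the same tf.Variable weights declared above, so the training and evaluation graphs share parameters. As a sanity check, the static shapes work out as follows (a sketch assuming batch_size = 100 and 28x28 greyscale inputs):

# x_input          (100, 28, 28, 1)
# conv1/relu1      (100, 28, 28, 25)  SAME padding, stride 1
# max_pool1        (100, 14, 14, 25)  2x2 pool, stride 2
# conv2/relu2      (100, 14, 14, 50)
# max_pool2        (100, 7, 7, 50)
# flat_output      (100, 2450)        7 * 7 * 50
# fully_connected1 (100, 100)
# final output     (100, 10)
print(model_output.get_shape())  # (100, 10)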
- Loss and optimizer
# Declare Loss Function (softmax cross entropy)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=model_output, labels=y_target))
# Create an optimizer
my_optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9)
train_step = my_optimizer.minimize(loss)
The loss is reduced via gradient descent; here the Momentum variant of GD is used.
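As a reminder of what Momentum adds over plain gradient descent, here is a minimal NumPy sketch of the update rule (matching the update described in the TF 1.x MomentumOptimizer docs; the toy quadratic loss and arrays are hypothetical):

import numpy as np

w = np.array([1.0, -2.0])         # hypothetical parameters
momentum, lr = 0.9, 0.005         # same values as the optimizer above
v = np.zeros_like(w)              # the accumulation starts at zero
for _ in range(3):
    grad = 2.0 * w                # gradient of a toy loss ||w||^2
    v = momentum * v + grad       # decaying sum of past gradients
    w = w - lr * v                # step along the accumulated direction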
- Train on the data
# Create a prediction function
prediction = tf.nn.softmax(model_output)
test_prediction = tf.nn.softmax(test_model_output)

# Create accuracy function
def get_accuracy(logits, targets):
    batch_predictions = np.argmax(logits, axis=1)
    num_correct = np.sum(np.equal(batch_predictions, targets))
    return 100. * num_correct / batch_predictions.shape[0]

# Initialize Variables (tf.initialize_all_variables() is the deprecated spelling)
init = tf.global_variables_initializer()
sess.run(init)

# Start training loop
train_loss = []
train_acc = []
test_acc = []
for i in range(generations):
    rand_index = np.random.choice(len(train_xdata), size=batch_size)
    # train_xdata is 3-D: (sample, height, width)
    rand_x = train_xdata[rand_index]
    # expand_dims appends a 4th dimension for the (single) channel
    rand_x = np.expand_dims(rand_x, 3)
    rand_y = train_labels[rand_index]
    # When feeding, x_input must be 4-D (sample, height, width, channel),
    # as required by the conv2d API
    train_dict = {x_input: rand_x, y_target: rand_y}
    # Training step
    sess.run(train_step, feed_dict=train_dict)

    # Loss
    temp_train_loss = sess.run(loss, feed_dict=train_dict)

    # Accuracy
    temp_train_preds = sess.run(prediction, feed_dict=train_dict)
    temp_train_acc = get_accuracy(temp_train_preds, rand_y)
    # The steps above run every generation, but a log entry is only
    # printed every five generations.
    # NB: the logging step also performs the test-set evaluation.
    if (i + 1) % eval_every == 0:
        eval_index = np.random.choice(len(test_xdata), size=evaluation_size)
        eval_x = test_xdata[eval_index]
        eval_x = np.expand_dims(eval_x, 3)
        eval_y = test_labels[eval_index]
        test_dict = {eval_input: eval_x, eval_target: eval_y}
        test_preds = sess.run(test_prediction, feed_dict=test_dict)
        temp_test_acc = get_accuracy(test_preds, eval_y)

        # Record and print results: three metrics are tracked
        # (no test loss is needed)
        train_loss.append(temp_train_loss)
        train_acc.append(temp_train_acc)
        test_acc.append(temp_test_acc)
        acc_and_loss = [(i + 1), temp_train_loss, temp_train_acc, temp_test_acc]
        acc_and_loss = [np.round(x, 2) for x in acc_and_loss]
        print('Generation # {}. Train Loss: {:.2f}. Train Acc (Test Acc): {:.2f} ({:.2f})'.format(*acc_and_loss))
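To make get_accuracy concrete, a tiny worked example using the function defined above (the arrays are hypothetical, not part of the training script):

logits = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1]])
targets = np.array([1, 2])
# argmax picks classes [1, 0]; only the first matches, so accuracy is 50%
print(get_accuracy(logits, targets))  # 50.0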
- Show the results
# Matplotlib code to plot the loss and accuracies
eval_indices = range(0, generations, eval_every)

# Plot loss over time
plt.plot(eval_indices, train_loss, 'k-')
plt.title('Softmax Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Softmax Loss')
plt.show()

# Plot train and test accuracy
plt.plot(eval_indices, train_acc, 'k-', label='Train Set Accuracy')
plt.plot(eval_indices, test_acc, 'r--', label='Test Set Accuracy')
plt.title('Train and Test Accuracy')
plt.xlabel('Generation')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()

# Plot some samples: 6 images from the last batch
actuals = rand_y[0:6]
predictions = np.argmax(temp_train_preds, axis=1)[0:6]
images = np.squeeze(rand_x[0:6])
Nrows = 2
Ncols = 3
for i in range(6):
    plt.subplot(Nrows, Ncols, i + 1)
    plt.imshow(np.reshape(images[i], [28, 28]), cmap='Greys_r')
    plt.title('Actual: ' + str(actuals[i]) + ' Pred: ' + str(predictions[i]), fontsize=10)
    frame = plt.gca()
    frame.axes.get_xaxis().set_visible(False)
    frame.axes.get_yaxis().set_visible(False)
plt.show()
NB: this is much more convenient to use than tf.nn.conv2d!
conv1 = tf.layers.conv2d(X,
                         convlayer_sizes[0],  # number of filters = number of output feature maps
                         kernel_size=filter_shape,
                         padding=padding,
                         activation=tf.nn.relu,
                         bias_initializer=tf.zeros_initializer(),
                         kernel_regularizer=tf.nn.l2_loss,
                         bias_regularizer=tf.nn.l2_loss,
                         name="conv1")
It also runs somewhat more efficiently.
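For comparison, the first Conv-ReLU-MaxPool block of the simple CNN above could be rewritten with this API roughly as follows (a sketch reusing the parameter names from earlier; this is not code from the original recipe):

# Hypothetical tf.layers version of the first block (TF 1.x)
conv1_l = tf.layers.conv2d(x_input, filters=conv1_features, kernel_size=[4, 4],
                           padding='same', activation=tf.nn.relu, name='layers_conv1')
pool1_l = tf.layers.max_pooling2d(conv1_l, pool_size=[max_pool_size1, max_pool_size1],
                                  strides=max_pool_size1, padding='same')
# Weights and biases are created and initialized internally, so the explicit
# conv1_weight / conv1_bias variables are no longer needed.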
$ python train.py
Extracting data/mnist/train-images-idx3-ubyte.gz
Extracting data/mnist/train-labels-idx1-ubyte.gz
Extracting data/mnist/t10k-images-idx3-ubyte.gz
Extracting data/mnist/t10k-labels-idx1-ubyte.gz
Starting Training...
Epoch 0, Training Loss: 1.60813545597, Test accuracy: 0.938201121795, time: 5.14s, total time: 6.15s
Epoch 1, Training Loss: 1.52455213715, Test accuracy: 0.950721153846, time: 4.53s, total time: 11.64s
Epoch 2, Training Loss: 1.51011738271, Test accuracy: 0.964743589744, time: 4.59s, total time: 17.22s
Epoch 3, Training Loss: 1.50035703349, Test accuracy: 0.966746794872, time: 4.48s, total time: 22.65s
Epoch 4, Training Loss: 1.49455949921, Test accuracy: 0.965044070513, time: 4.53s, total time: 28.17s
Epoch 5, Training Loss: 1.49036714219, Test accuracy: 0.969851762821, time: 4.58s, total time: 33.72s
Epoch 6, Training Loss: 1.48730323059, Test accuracy: 0.973657852564, time: 4.46s, total time: 39.21s
Epoch 7, Training Loss: 1.48489333978, Test accuracy: 0.972155448718, time: 4.51s, total time: 44.72s
Epoch 8, Training Loss: 1.48286200987, Test accuracy: 0.975260416667, time: 4.52s, total time: 50.22s
Epoch 9, Training Loss: 1.48113634647, Test accuracy: 0.973657852564, time: 4.5s, total time: 55.69s
Epoch 10, Training Loss: 1.47969393491, Test accuracy: 0.974559294872, time: 4.56s, total time: 61.21s
Epoch 11, Training Loss: 1.47822354946, Test accuracy: 0.975360576923, time: 4.69s, total time: 66.84s
Epoch 12, Training Loss: 1.47717202571, Test accuracy: 0.975360576923, time: 4.61s, total time: 72.51s
Epoch 13, Training Loss: 1.47643559991, Test accuracy: 0.974859775641, time: 4.53s, total time: 77.97s
Epoch 14, Training Loss: 1.4753200641, Test accuracy: 0.977864583333, time: 4.66s, total time: 83.59s
Epoch 15, Training Loss: 1.47432387181, Test accuracy: 0.97796474359, time: 4.44s, total time: 88.93s
Epoch 16, Training Loss: 1.47376608154, Test accuracy: 0.978565705128, time: 4.52s, total time: 94.4s
Epoch 17, Training Loss: 1.47339749864, Test accuracy: 0.976362179487, time: 4.53s, total time: 99.92s
Epoch 18, Training Loss: 1.47282876363, Test accuracy: 0.979967948718, time: 4.61s, total time: 105.53s
Epoch 19, Training Loss: 1.47205684374, Test accuracy: 0.980268429487, time: 4.53s, total time: 110.99s
Total training time: 110.99s
Confusion Matrix:
[[ 972    1    1    0    1    2    8    2    5    3]
 [   0 1123    6    0    0    0    2    2    0    4]
 [   1    4 1006    3    1    2    1   14    5    1]
 [   0    1    2  993    0    5    1    4    4    5]
 [   0    0    2    0  970    0    4    1    1    6]
 [   2    3    0    5    0  877    1    0    2    5]
 [   2    0    3    0    1    2  936    0    1    0]
 [   1    1    6    4    1    1    0  999    4   10]
 [   2    2    6    4    2    3    5    3  951    5]
 [   0    0    0    1    6    0    0    3    1  970]]
Training Complete
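In the confusion matrix above, rows are the true digit classes and columns are the predictions, so the diagonal holds the correctly classified counts. A minimal NumPy sketch of how such a matrix can be assembled (a hypothetical helper, not the train.py source):

import numpy as np

def confusion_matrix(true_labels, pred_labels, num_classes=10):
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1  # row = true class, column = predicted class
    return cm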