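Code written following the Chinese edition of the official TensorFlow documentation (TensorFlow 官方文档中文版).
The supplementary parameter notes (originally in Chinese) are drawn from: tf.truncated_normal与tf.random_normal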
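The two initializers named above differ in one respect: tf.random_normal samples a plain normal distribution, while tf.truncated_normal drops and re-picks any value more than 2 standard deviations from the mean. The following is a minimal pure-Python sketch of that re-picking rule, not TensorFlow's implementation; the function name truncated_normal and its seed handling are assumptions made for illustration.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n samples, re-picking any value more than 2 stddev from the mean."""
    rng = random.Random(seed)  # hypothetical seeding, for reproducibility only
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:  # keep only values within 2 stddev
            samples.append(value)
    return samples


weights = truncated_normal(1000, stddev=0.1, seed=0)
print(max(abs(w) for w in weights) <= 0.2)  # True: every sample lies within 2 * 0.1
```

With stddev=0.1, as used for the weights in this script, no initial weight can exceed 0.2 in magnitude, which avoids the occasional large outlier a plain normal draw could produce.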
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


''' Intro. for this python file.
Objective:
    Implement a Convolutional Neural Network on MNIST.
Operating Environment:
    python = 3.6.4
    tensorflow = 1.5.0
'''


# To avoid repeating the initialization code for every weight/bias variable, we define two helper functions.
''' tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
Explanation:
    Outputs random values from a truncated normal distribution. The generated values follow a normal distribution with
    specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations from
    the mean are dropped and re-picked.
Args:
    shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
    mean: A 0-D Tensor or Python value of type dtype. The mean of the truncated normal distribution.
    stddev: A 0-D Tensor or Python value of type dtype. The standard deviation of the normal distribution, before
        truncation.
    dtype: The type of the output.
    seed: A Python integer. Used to create a random seed for the distribution. See tf.set_random_seed for behavior.
    name: A name for the operation (optional).
Returns:
    A tensor of the specified shape filled with random truncated normal values.
'''


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(value=0.1, shape=shape)
    return tf.Variable(initial)


# Convolution
''' tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format="NHWC", dilations=[1, 1, 1, 1], name=None)
Explanation:
    Computes a 2-D convolution given 4-D input and filter tensors. Given an input tensor of shape [batch, in_height,
    in_width, in_channels] and a filter / kernel tensor of shape [filter_height, filter_width, in_channels,
    out_channels], this op performs the following:
    1. Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
    2. Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width,
       filter_height * filter_width * in_channels].
    3. For each patch, right-multiplies the filter matrix and the image patch vector.
    In detail, with the default NHWC format,
        output[b, i, j, k] = sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] * filter[di, dj, q, k]
    Must have strides[0] = strides[3] = 1. For the most common case of the same horizontal and vertical strides,
    strides = [1, stride, stride, 1].
Args:
    input: A Tensor. Must be one of the following types: half, bfloat16, float32, float64. A 4-D tensor. The dimension
        order is interpreted according to the value of data_format, see below for details.
    filter: A Tensor. Must have the same type as input. A 4-D tensor of shape [filter_height, filter_width,
        in_channels, out_channels].
    strides: A list of ints. 1-D tensor of length 4. The stride of the sliding window for each dimension of input. The
        dimension order is determined by the value of data_format, see below for details.
    padding: A string from: "SAME", "VALID". The type of padding algorithm to use.
    use_cudnn_on_gpu: An optional bool. Defaults to True.
    data_format: An optional string from: "NHWC", "NCHW". Defaults to "NHWC". Specify the data format of the input and
        output data. With the default format "NHWC", the data is stored in the order of: [batch, height, width,
        channels]. Alternatively, the format could be "NCHW", the data storage order of: [batch, channels,
        height, width].
    dilations: An optional list of ints. Defaults to [1, 1, 1, 1]. 1-D tensor of length 4. The dilation factor for each
        dimension of input. If set to k > 1, there will be k-1 skipped cells between each filter element on that
        dimension. The dimension order is determined by the value of data_format, see above for details. Dilations
        in the batch and depth dimensions must be 1.
    name: A name for the operation (optional).
Notes:
    input (1st argument): the images to convolve. A 4-D Tensor of shape [batch, in_height, in_width, in_channels],
        i.e. [number of images in a training batch, image height, image width, number of image channels]. Its type
        must be one of the floating-point types listed above.
    filter (2nd argument): the convolution kernel of the CNN. A Tensor of shape [filter_height, filter_width,
        in_channels, out_channels], i.e. [kernel height, kernel width, number of image channels, number of kernels],
        with the same type as input. Its third dimension, in_channels, equals the fourth dimension of input.
    strides (3rd argument): the step of the convolution along each dimension of the image, a 1-D vector of length 4.
        strides[0] controls batch, strides[1] height, strides[2] width, and strides[3] channels. The first and last
        entries are rarely changed, because changing them would skip some of the data and leave it out of the
        computation. To downsample the input, change the height and width entries.
    padding (4th argument): a string, either "SAME" or "VALID"; it selects how the convolution treats the borders.
    use_cudnn_on_gpu (5th argument): a bool; whether to use cuDNN acceleration. Defaults to True.
Returns:
    A Tensor. Has the same type as input.
    The returned Tensor is what is commonly called the feature map; its shape still has the form
    [batch, height, width, channels].
'''


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")


''' tf.nn.max_pool(value, ksize, strides, padding, data_format="NHWC", name=None)
Explanation:
    Performs the max pooling on the input.
Args:
    value: A 4-D Tensor of the format specified by data_format.
    ksize: A list or tuple of 4 ints. The size of the window for each dimension of the input tensor.
    strides: A list or tuple of 4 ints. The stride of the sliding window for each dimension of the input tensor.
    padding: A string, either 'VALID' or 'SAME'. The padding algorithm.
    data_format: A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
    name: Optional name for the operation.
Notes:
    value (1st argument): the input to the pooling operation. A pooling layer usually follows a convolutional layer,
        so the input is typically a feature map, again of shape [batch, height, width, channels].
    ksize (2nd argument): the size of the pooling window, a vector of length 4, usually [1, height, width, 1]. We do
        not want to pool over batch or channels, so those two entries are set to 1.
    strides (3rd argument): as with convolution, the step of the window along each dimension, usually
        [1, stride, stride, 1].
    padding (4th argument): as with convolution, either "VALID" or "SAME".
Returns:
    A Tensor of format specified by data_format. The max pooled output tensor.
    The returned Tensor keeps the input type; its shape still has the form [batch, height, width, channels].
'''


# Pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")


# Placeholder for the input images. The None dimension lets the model accept an arbitrary number of images.
x = tf.placeholder("float", [None, 784])

# Placeholder 'y_' for the ground-truth labels (one-hot, 10 classes).
y_ = tf.placeholder("float", [None, 10])

# 1st Convolutional Layer: 5x5 kernels, 1 input channel, 32 output channels
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

# To feed x to this layer, we reshape it to a 4-D tensor: [batch, height, width, channels].
x_image = tf.reshape(x, [-1, 28, 28, 1])

# 1st ReLU & Max-Pooling
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# 2nd Convolutional Layer: 5x5 kernels, 32 input channels, 64 output channels
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully Connected Layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout Layer
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output Layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Training and Evaluation
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# Launch the graph in a session.
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # Without a default session, eval() must be given the session explicitly:
        # train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})  # ValueError
        train_accuracy = accuracy.eval(session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    # train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})  # ValueError
    # sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})  # Correct
    train_step.run(session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print("test accuracy %g" % accuracy.eval(session=sess, feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

# The training accuracy stabilizes at 1.
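The 7 * 7 * 64 used for W_fc1 and h_pool2_flat can be checked by hand. The following is a minimal sketch, assuming TensorFlow's documented rule that "SAME" padding yields ceil(input_size / stride) outputs along each spatial dimension; the helper name same_out_size is hypothetical and not part of TensorFlow.

```python
import math


def same_out_size(size, stride):
    """Spatial output size under "SAME" padding: ceil(size / stride)."""
    return math.ceil(size / stride)


side = 28                      # MNIST images are 28 x 28
side = same_out_size(side, 1)  # 1st convolution, stride 1: stays 28
side = same_out_size(side, 2)  # 1st 2x2 max-pool, stride 2: 14
side = same_out_size(side, 1)  # 2nd convolution, stride 1: stays 14
side = same_out_size(side, 2)  # 2nd 2x2 max-pool, stride 2: 7
flat = side * side * 64        # 64 output channels from the 2nd convolution
print(side, flat)  # 7 3136
```

If the kernel stride, pooling stride, or padding mode is changed, the first dimension of W_fc1 and the reshape of h_pool2 have to be recomputed the same way.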