TensorFlow - Implementing a Convolutional Neural Network on MNIST - LZ_Jaja

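The annotations originally written in Chinese are drawn from two posts: "tf.truncated_normal and tf.random_normal" and "tf.nn.max_pool: parameter meanings and usage".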
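
The note above cites a comparison of tf.truncated_normal and tf.random_normal. As a rough illustration of the documented behaviour (draws more than 2 standard deviations from the mean are dropped and re-picked), here is a minimal pure-Python sketch based on rejection sampling. The helper `truncated_normal_sample` is a hypothetical name introduced for this example and is not how TensorFlow implements the op; the standard deviation of 0.1 mirrors the value passed by `weight_variable` in the listing.

```python
import random


def truncated_normal_sample(mean=0.0, stddev=1.0, rng=random):
    """Draw one value, re-picking any draw more than 2 stddev from the mean.

    Hypothetical helper for illustration only (assumption: plain rejection sampling).
    """
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(0)  # fixed seed so that repeated runs draw the same values
plain = [rng.gauss(0.0, 0.1) for _ in range(10000)]
truncated = [truncated_normal_sample(0.0, 0.1, rng) for _ in range(10000)]

print("largest |plain| value:    ", max(abs(v) for v in plain))
print("largest |truncated| value:", max(abs(v) for v in truncated))
```

With a standard deviation of 0.1, every truncated draw lies within 0.2 of the mean, whereas the plain normal draws occasionally fall outside that range.
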
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST (downloading it to MNIST_data/ if necessary) with one-hot labels.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


''' Introduction to this Python file.
Objective:
    Implement a Convolutional Neural Network on MNIST.
Operating Environment:
    python = 3.6.4
    tensorflow = 1.5.0
'''


# To avoid repeating the weight/bias initialization code, we define two helper functions.
''' tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
Explanation:
    Outputs random values from a truncated normal distribution. The generated values follow a normal distribution with
    the specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations
    from the mean are dropped and re-picked.
Args:
    shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
    mean: A 0-D Tensor or Python value of type dtype. The mean of the truncated normal distribution.
    stddev: A 0-D Tensor or Python value of type dtype. The standard deviation of the normal distribution, before truncation.
    dtype: The type of the output.
    seed: A Python integer. Used to create a random seed for the distribution. See tf.set_random_seed for behavior.
    name: A name for the operation (optional).
Returns:
    A tensor of the specified shape filled with random truncated normal values.
'''


def weight_variable(shape):
    # Small truncated-normal noise (stddev 0.1) breaks the symmetry between units.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    # A slightly positive constant bias (0.1) helps avoid "dead" ReLU units.
    initial = tf.constant(value=0.1, shape=shape)
    return tf.Variable(initial)


# Convolution
''' tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format="NHWC", dilations=[1, 1, 1, 1], name=None)
Explanation:
    Computes a 2-D convolution given 4-D input and filter tensors. Given an input tensor of shape [batch, in_height,
    in_width, in_channels] and a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels],
    this op performs the following:
        1. Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
        2. Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width,
           filter_height * filter_width * in_channels].
        3. For each patch, right-multiplies the filter matrix and the image patch vector.
    In detail, with the default NHWC format,
        output[b, i, j, k] = sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] * filter[di, dj, q, k]
    Must have strides[0] = strides[3] = 1. For the most common case of the same horizontal and vertical strides,
    strides = [1, stride, stride, 1].
Args:
    input: A Tensor. Must be one of the following types: half, bfloat16, float32, float64. A 4-D tensor. The dimension
           order is interpreted according to the value of data_format, see below for details.
    filter: A Tensor. Must have the same type as input. A 4-D tensor of shape [filter_height, filter_width, in_channels,
            out_channels].
    strides: A list of ints. 1-D tensor of length 4. The stride of the sliding window for each dimension of input. The
             dimension order is determined by the value of data_format, see below for details.
    padding: A string from: "SAME", "VALID". The type of padding algorithm to use.
    use_cudnn_on_gpu: An optional bool. Defaults to True.
    data_format: An optional string from: "NHWC", "NCHW". Defaults to "NHWC". Specify the data format of the input and
                 output data. With the default format "NHWC", the data is stored in the order of: [batch, height, width,
                 channels]. Alternatively, the format could be "NCHW", the data storage order of: [batch, channels,
                 height, width].
    dilations: An optional list of ints. Defaults to [1, 1, 1, 1]. 1-D tensor of length 4. The dilation factor for each
               dimension of input. If set to k > 1, there will be k-1 skipped cells between each filter element on that
               dimension. The dimension order is determined by the value of data_format, see above for details. Dilations
               in the batch and depth dimensions must be 1.
    name: A name for the operation (optional).
    1st argument, input: The input images to convolve. Must be a Tensor of shape [batch, in_height, in_width,
                         in_channels], i.e. [number of images in a training batch, image height, image width, number of
                         image channels]. Note that this is a 4-D Tensor of one of the floating-point types listed above.
    2nd argument, filter: The convolution kernel of the CNN. Must be a Tensor of shape [filter_height, filter_width,
                          in_channels, out_channels], i.e. [kernel height, kernel width, number of input channels, number
                          of kernels], with the same type as input. The third dimension, in_channels, must equal the
                          fourth dimension of input.
    3rd argument, strides: The stride of the convolution along each dimension of the input (strides[0] for batch,
                           strides[1] for height, strides[2] for width, strides[3] for channels). The first and last
                           strides are rarely changed, because a larger value would make the op skip part of the data
                           entirely. To reduce the spatial size of the output, change the height and width strides.
                           This is a 1-D list of length 4.
    4th argument, padding: A string, either "SAME" or "VALID". It selects the padding scheme used by the convolution.
    5th argument, use_cudnn_on_gpu: A bool. Whether to use cuDNN acceleration. Defaults to True.
Returns:
    A Tensor. Has the same type as input.
    The returned Tensor is what is usually called the feature map. Its shape is still of the form [batch, height, width,
    channels].
'''


def conv2d(x, W):
    # Stride 1 with "SAME" padding keeps the output height and width equal to the input's.
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")


''' tf.nn.max_pool(value, ksize, strides, padding, data_format="NHWC", name=None)
Explanation:
    Performs the max pooling on the input.
Args:
    value: A 4-D Tensor of the format specified by data_format.
    ksize: A list or tuple of 4 ints. The size of the window for each dimension of the input tensor.
    strides: A list or tuple of 4 ints. The stride of the sliding window for each dimension of the input tensor.
    padding: A string, either 'VALID' or 'SAME'. The padding algorithm.
    data_format: A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
    name: Optional name for the operation.
    1st argument, value: The input to the pooling op. A pooling layer usually follows a convolutional layer, so the input
                         is typically a feature map, again of shape [batch, height, width, channels].
    2nd argument, ksize: The size of the pooling window, a list of 4 ints, usually [1, height, width, 1]. We do not want
                         to pool over the batch or channel dimensions, so both are set to 1.
    3rd argument, strides: As with convolution, the step of the window along each dimension, usually
                           [1, stride, stride, 1].
    4th argument, padding: As with convolution, either "VALID" or "SAME".
Returns:
    A Tensor of format specified by data_format. The max pooled output tensor.
    The returned Tensor has the same type as the input. Its shape is still of the form [batch, height, width, channels].
'''


# Pooling
def max_pool_2x2(x):
    # A 2x2 window moved with stride 2 halves the height and width.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")


# Placeholder for the input images. The first dimension is None so that any number of images can be fed in.
x = tf.placeholder(tf.float32, [None, 784])

# Placeholder 'y_' for the ground-truth one-hot labels.
y_ = tf.placeholder(tf.float32, [None, 10])

# 1st Convolutional Layer: 5x5 kernels, 1 input channel, 32 output channels.
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

# To apply this layer, we reshape x into a 4-D tensor [batch, height, width, channels].
x_image = tf.reshape(x, [-1, 28, 28, 1])

# 1st ReLU & Max-Pooling: 28x28x1 -> 28x28x32 -> 14x14x32.
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# 2nd Convolutional Layer: 5x5 kernels, 32 input channels, 64 output channels.
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

# 2nd ReLU & Max-Pooling: 14x14x32 -> 14x14x64 -> 7x7x64.
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully Connected Layer: flatten the 7x7x64 feature maps and map them to 1024 units.
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout Layer: keep_prob is fed at run time (0.5 for training, 1.0 for evaluation).
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output Layer: softmax over the 10 digit classes.
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Training and Evaluation
# Note: tf.log(y_conv) returns -inf when a predicted probability underflows to 0, which can turn the loss into NaN.
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Launch the graph in a session.
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # `sess` is not registered as the default session, so accuracy.eval(feed_dict=...) without
        # `session=sess` raises a ValueError.
        train_accuracy = accuracy.eval(session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    # Likewise, train_step.run(feed_dict=...) raises a ValueError unless `session=sess` is passed.
    # The equivalent sess.run(train_step, feed_dict=...) also works.
    train_step.run(session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print("test accuracy %g" % accuracy.eval(session=sess, feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

# The training accuracy stabilizes at 1.
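
The flattened size 7 * 7 * 64 used for W_fc1 in the listing follows from the padding rule: with "SAME" padding, the output height and width are ceil(input / stride). The short sketch below traces the spatial size through the two convolution and pooling stages in plain Python. The helper names `same_padding_output_size` and `trace_feature_map_sizes` are hypothetical and introduced only for this illustration; they are not TensorFlow functions.

```python
import math


def same_padding_output_size(size, stride):
    """Spatial output size under "SAME" padding: ceil(size / stride)."""
    return math.ceil(size / stride)


def trace_feature_map_sizes(size=28, num_stages=2):
    """Return the spatial size after each conv (stride 1) + pool (stride 2) stage."""
    sizes = []
    for _ in range(num_stages):
        size = same_padding_output_size(size, 1)  # convolution keeps the size
        size = same_padding_output_size(size, 2)  # 2x2 max pooling halves it
        sizes.append(size)
    return sizes


sizes = trace_feature_map_sizes()
flat_features = sizes[-1] * sizes[-1] * 64  # 64 channels after the 2nd conv layer
print(sizes, flat_features)  # prints: [14, 7] 3136
```

Under these assumptions a 28x28 input shrinks to 14x14 and then 7x7, so the tensor fed to the fully connected layer has 7 * 7 * 64 = 3136 features per image.
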
Posted on 2018-08-28 20:57 by LZ_Jaja