cs20_3-1
0. Review
-
Computation graph
-
TensorFlow separates definition of computations from their execution
-
Phase 1: assemble a graph
Phase 2: use a session to execute operations in the graph.
-
-
TensorBoard
-
tf.constant and tf.Variable
-
Constant values are stored in the graph definition
Sessions allocate memory to store variable values
-
tf.placeholder and feed_dict
Feed values into placeholders with a dictionary (feed_dict)
Easy to use but poor performance
-
-
Avoid lazy loading
- Separate the assembling of graph and executing ops
- Use a Python @property (a cached attribute) so the op is added to the graph only the first time the function is called; see the sketch below
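A minimal sketch of that pattern, with made-up names (Model, prediction), assuming TF 1.x:

import tensorflow as tf

class Model:
    def __init__(self):
        self.x = tf.placeholder(tf.float32, shape=[None, 10])
        self._prediction = None

    @property
    def prediction(self):
        # The matmul op is added to the graph only on the first access;
        # later calls to sess.run(model.prediction) reuse the same node
        # instead of adding a new node every time (the lazy-loading bug).
        if self._prediction is None:
            w = tf.Variable(tf.zeros([10, 1]))
            b = tf.Variable(tf.zeros([1]))
            self._prediction = tf.matmul(self.x, w) + b
        return self._prediction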
1. Linear regression on birth_life data
-
Phase 1: Assemble our graph
- Read in data
- Create placeholders for inputs and labels
- Create weight and bias
- Inference
- Specify loss function
- Create optimizer
-
Phase 2: Train our model
- Initialize variables
- Run optimizer
-
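A minimal sketch of the two phases listed above, assuming the birth_life data has already been read into a NumPy array data of (birth_rate, life_expectancy) rows:

import tensorflow as tf

# Phase 1: assemble the graph
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')
w = tf.get_variable('weights', initializer=tf.constant(0.0))
b = tf.get_variable('bias', initializer=tf.constant(0.0))
Y_predicted = w * X + b                          # inference
loss = tf.square(Y - Y_predicted, name='loss')   # squared error
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)

# Phase 2: train the model
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize variables
    for i in range(100):                         # 100 epochs
        for x, y in data:
            sess.run(optimizer, feed_dict={X: x, Y: y})
    w_out, b_out = sess.run([w, b])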
After finishing Phase 1, use tf.summary to create a log file of the graph so TensorBoard can visualize it.
Write log files using a FileWriter:
writer = tf.summary.FileWriter('./graphs/linear_reg', sess.graph)
-
See it on TensorBoard
- run the code so the log files are written
- run the tensorboard command: tensorboard --logdir ./graphs/linear_reg, then open http://localhost:6006 in a browser
2. Control Flow
-
e.g. Implementing Huber loss
-
description:
Huber loss is quadratic for small residuals and linear for large ones, which makes it less sensitive to outliers than squared error:
L(y, y_pred) = 0.5 * (y - y_pred)^2 if |y - y_pred| < delta, else delta * |y - y_pred| - 0.5 * delta^2
code:
def huber_loss(labels, predictions, delta=14.0):
    residual = tf.abs(labels - predictions)
    def f1(): return 0.5 * tf.square(residual)
    def f2(): return delta * residual - 0.5 * tf.square(delta)
    return tf.cond(residual < delta, f1, f2)
-
-
TF control flow ops (table)
Control Flow Ops: tf.group, tf.count_up_to, tf.cond, tf.case, tf.while_loop, ...
Comparison Ops: tf.equal, tf.not_equal, tf.less, tf.greater, tf.where, ...
Logical Ops: tf.logical_and, tf.logical_not, tf.logical_or, tf.logical_xor
Debugging Ops: tf.is_finite, tf.is_inf, tf.is_nan, tf.Assert, tf.Print, ...
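tf.cond expects a scalar predicate, so huber_loss above works one example at a time. For a whole tensor of residuals, tf.where (from the comparison ops listed here) gives an element-wise version; a minimal sketch, assuming labels and predictions have the same shape:

def huber_loss_vectorized(labels, predictions, delta=14.0):
    residual = tf.abs(labels - predictions)
    small_res = 0.5 * tf.square(residual)
    large_res = delta * residual - 0.5 * tf.square(delta)
    # element-wise select: quadratic where the residual is small, linear otherwise
    return tf.where(residual < delta, small_res, large_res)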
3. tf.data
-
Placeholder
-
Pro: put the data processing outside TensorFlow, making it easy to do in Python
-
Con: users often end up processing their data in a single thread, creating a data bottleneck that slows execution down.
-
-
tf.data:
-
Instead of doing inference with placeholders and feeding in data later, do inference directly with data
-
tf.data.Dataset
tf.data.Iterator
-
tf.data.Dataset
-
from variables
tf.data.Dataset.from_tensor_slices((features, labels))
tf.data.Dataset.from_generator(gen, output_types, output_shapes)

dataset = tf.data.Dataset.from_tensor_slices((data[:, 0], data[:, 1]))
-
from file
tf.data.TextLineDataset(filenames)
tf.data.FixedLengthRecordDataset(filenames)
tf.data.TFRecordDataset(filenames)
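As a sketch of the text-file route (the file name, header row, and tab-separated column layout are assumptions), each line can be parsed inside the pipeline with map:

# assumed format: country \t birth_rate \t life_expectancy, with one header line
dataset = tf.data.TextLineDataset('birth_life_2010.txt')
dataset = dataset.skip(1)  # skip the header line

def parse_line(line):
    country, birth, life = tf.decode_csv(line, record_defaults=[[''], [0.0], [0.0]], field_delim='\t')
    return birth, life

dataset = dataset.map(parse_line)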
-
-
tf.data.Iterator
-
Create an iterator to iterate through samples in Dataset
-
e.g.
# make_one_shot_iterator: iterates through the dataset exactly once; no initialization needed
iterator = dataset.make_one_shot_iterator()
# make_initializable_iterator: iterates through the dataset as many times as we want;
# needs to be initialized at the start of each epoch
iterator = dataset.make_initializable_iterator()

# with a one-shot iterator:
iterator = dataset.make_one_shot_iterator()
X, Y = iterator.get_next()          # X is the birth rate, Y is the life expectancy
with tf.Session() as sess:
    print(sess.run([X, Y]))         # >> [1.822, 74.82825]
    print(sess.run([X, Y]))         # >> [3.869, 70.81949]
    print(sess.run([X, Y]))         # >> [3.911, 72.15066]

# with an initializable iterator:
iterator = dataset.make_initializable_iterator()
...
for i in range(100):                # 100 epochs
    sess.run(iterator.initializer)  # re-initialize at the start of each epoch
    total_loss = 0
    try:
        while True:
            sess.run([optimizer])
    except tf.errors.OutOfRangeError:
        pass
-
-
Handling data in TensorFlow
-
e.g.
dataset = dataset.shuffle(1000)
dataset = dataset.repeat(100)
dataset = dataset.batch(128)
dataset = dataset.map(lambda x: tf.one_hot(x, 10))  # convert each element of the dataset to a one-hot vector
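Each of these transformations returns a new Dataset, so the same pipeline is often written as one chain (a sketch; names follow the snippet above):

dataset = (dataset.shuffle(1000)
           .repeat(100)
           .batch(128)
           .map(lambda x: tf.one_hot(x, 10)))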
-
-
Does tf.data really perform better?
-
Should we always use tf.data?
- For prototyping, feed dict can be faster and easier to write (pythonic)
- tf.data is tricky to use when you have complicated preprocessing or multiple data sources
- NLP data is normally just a sequence of integers. In this case, transferring the data over to GPU is pretty quick, so the speedup of tf.data isn't that large
-
How does TensorFlow know which variables to update? Through the optimizer.
-
4. Optimizers, gradients
-
e.g.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)
_, l = sess.run([optimizer, loss], feed_dict={X: x, Y: y})
# The session looks at all trainable variables that loss depends on and updates them.
# e.g. weights, bias, and the computed gradients all feed into GradientDescentOptimizer,
# but only weights and bias are updated, and only if they are trainable.
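The same machinery can be used directly: tf.gradients returns the symbolic gradients of one tensor with respect to others. A small sketch with made-up values:

x = tf.Variable(2.0)
y = 2.0 * (x ** 3)
z = 3.0 + y ** 2
grad_z = tf.gradients(z, [x, y])   # [dz/dx, dz/dy]
with tf.Session() as sess:
    sess.run(x.initializer)
    print(sess.run(grad_z))        # >> [768.0, 32.0]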
-
Trainable variables
tf.Variable(initial_value=None, trainable=True, ...)
# `trainable` specifies whether a variable should be trained or not.
# By default, all variables are trainable.
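A common use of trainable=False (an illustrative sketch; loss is assumed to be defined elsewhere) is a global step counter that the optimizer increments but never trains:

global_step = tf.Variable(0, trainable=False, name='global_step')
optimizer = tf.train.AdamOptimizer(0.01).minimize(loss, global_step=global_step)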
-
List of optimizers in TF
tf.train.GradientDescentOptimizer
tf.train.AdagradOptimizer
tf.train.MomentumOptimizer
tf.train.AdamOptimizer
tf.train.FtrlOptimizer
tf.train.RMSPropOptimizer
...
# "Advanced" optimizers work better when tuned, but are generally harder to tune.
-
question:
- How to know that our model is correct?
- How to improve our model?
5. Logistic regression on MNIST
-
MNIST Database: Each image is a 28x28 array, flattened out to be a 1-d tensor of size 784.
-
X: image of a handwritten digit
Y: the digit value
Recognize the digit in the image
-
Model:
Inference: Y_predicted = softmax(X * w + b)
Cross entropy loss: -sum(Y * log(Y_predicted)), which for a one-hot label Y is just -log of the predicted probability of the true class
-
Process data:
from tensorflow.examples.tutorials.mnist import input_data
MNIST = input_data.read_data_sets('data/mnist', one_hot=True)
# MNIST.train: 55,000 examples
# MNIST.validation: 5,000 examples
# MNIST.test: 10,000 examples
# There is no immediate way to convert Python generators to tf.data.Dataset,
# so read the data into NumPy arrays instead:
mnist_folder = 'data/mnist'
utils.download_mnist(mnist_folder)
train, val, test = utils.read_mnist(mnist_folder, flatten=True)

train_data = tf.data.Dataset.from_tensor_slices(train)
train_data = train_data.shuffle(10000)  # optional
test_data = tf.data.Dataset.from_tensor_slices(test)

iterator = train_data.make_initializable_iterator()
img, label = iterator.get_next()
# Problem: this iterator can only do inference with train_data;
# we would need to build another subgraph with another iterator for test_data.
# Improvement: train_data and test_data come from the same dataset, so they share the
# same types and shapes and are only used for different purposes; they can therefore
# share one iterator built from the common structure:
iterator = tf.data.Iterator.from_structure(train_data.output_types,
                                           train_data.output_shapes)
img, label = iterator.get_next()
train_init = iterator.make_initializer(train_data)  # initializer for train_data
test_init = iterator.make_initializer(test_data)    # initializer for test_data

# Initialize the iterator with the dataset you want:
with tf.Session() as sess:
    ...
    for i in range(n_epochs):
        sess.run(train_init)        # use train_init during the training loop
        try:
            while True:
                _, l = sess.run([optimizer, loss])
        except tf.errors.OutOfRangeError:
            pass

    # test the model
    sess.run(test_init)             # use test_init during testing
    try:
        while True:
            sess.run(accuracy)
    except tf.errors.OutOfRangeError:
        pass
-
Phase 1: Assemble our graph
-
Step 1: Read in data
-
Step 2: Create datasets and iterator
train_data = tf.data.Dataset.from_tensor_slices(train)
train_data = train_data.shuffle(10000)  # optional, but the training set should be shuffled
train_data = train_data.batch(batch_size)

test_data = tf.data.Dataset.from_tensor_slices(test)
test_data = test_data.batch(batch_size)

iterator = tf.data.Iterator.from_structure(train_data.output_types,
                                           train_data.output_shapes)
img, label = iterator.get_next()
train_init = iterator.make_initializer(train_data)
test_init = iterator.make_initializer(test_data)
-
Step 3: Create weights and biases
use tf.get_variable()
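A sketch of this step, assuming 784 flattened pixels and 10 classes as in the MNIST setup above:

w = tf.get_variable(name='weights', shape=(784, 10),
                    initializer=tf.random_normal_initializer(0, 0.01))
b = tf.get_variable(name='bias', shape=(1, 10),
                    initializer=tf.zeros_initializer())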
-
Step 4: Build model to predict Y
logits = tf.matmul(img, w) + b
# We don't apply softmax here; softmax is done together with the cross-entropy loss.
# It's more efficient to compute gradients w.r.t. logits than w.r.t. the softmax output.
-
Step 5: Specify loss function
entropy = tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logits)
# Equivalent to first applying softmax to the logits, then computing cross entropy
# between the resulting probabilities and the one-hot (0/1) labels.
loss = tf.reduce_mean(entropy)
-
Step 6: Create optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
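The test loop in the code above runs an accuracy op; a minimal sketch of how it could be defined as part of Phase 1 (names follow the earlier snippets):

preds = tf.nn.softmax(logits)
correct_preds = tf.equal(tf.argmax(preds, 1), tf.argmax(label, 1))
# number of correct predictions in the batch; divide by the test-set size for the accuracy rate
accuracy = tf.reduce_sum(tf.cast(correct_preds, tf.float32))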
-
-
Phase 2: Train our model
- Step 1: Initialize variables
- Step 2: Run optimizer op
-
TensorBoard it
6. Loss functions
- cross entropy: (add a link to my own write-up here)