cs20_2-2

1. TensorBoard

  1. Using TensorBoard

    # write an event file for the graph
    # define the graph...
    writer = tf.summary.FileWriter([logdir], [graph])
    # run the graph...
    writer.close()
    
    # launch TensorBoard:
    tensorboard --logdir="./graphs" --port 12345
    # then open 127.0.0.1:12345 in a browser to view the graph
    
    • Note: If you've run your code several times, there will be multiple event files in your [logdir]. TF will show only the latest graph and display the warning of multiple event files. To get rid of the warning, delete the event files you no longer need.
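
    • A minimal runnable sketch of this pattern (the constant names and the ./graphs logdir are my own choices):

      import tensorflow as tf
      
      a = tf.constant(2, name="a")
      b = tf.constant(3, name="b")
      x = tf.add(a, b, name="add")
      
      with tf.Session() as sess:
      	# write the graph definition so TensorBoard can visualize it
      	writer = tf.summary.FileWriter("./graphs", sess.graph)
      	print(sess.run(x))  # >> 5
      writer.close()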

2. Constant op

# tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)
# constant of 1d tensor (vector)
a = tf.constant([2, 2], name="vector")
#
# constant of 2x2 tensor (matrix)
b = tf.constant([[0, 1], [2, 3]], name="matrix")
#
# You can create a tensor of a specific dimension and fill it with a specific value, similar to Numpy.
# create a tensor of shape and all elements are zeros
tf.zeros([2, 3], tf.int32) # == > [[0, 0, 0], [0, 0, 0]]
#
# tf.zeros_like(input_tensor, dtype=None, name=None, optimize=True)
# create a tensor of the same shape and type as input_tensor (unless dtype is specified) but with all elements set to zero.
input_tensor = tf.constant([[0, 1], [2, 3], [4, 5]])
tf.zeros_like(input_tensor) # == > [[0, 0], [0, 0], [0, 0]]
#
# tf.ones(shape, dtype=tf.float32, name=None)
# create a tensor of shape and all elements are ones
tf.ones([2, 3], tf.int32) # == > [[1, 1, 1], [1, 1, 1]]
#
# tf.ones_like(input_tensor, dtype=None, name=None, optimize=True)
# create a tensor of the same shape and type as input_tensor (unless dtype is specified) but with all elements set to one.
input_tensor = tf.constant([[0, 1], [2, 3], [4, 5]])
tf.ones_like(input_tensor) # == > [[1, 1], [1, 1], [1, 1]]
#
# tf.fill(dims, value, name=None)
# create a tensor filled with a scalar value.
tf.fill([2, 3], 8) # == > [[8, 8, 8], [8, 8, 8]]
#
# You can create constants that are sequences
# tf.lin_space(start, stop, num, name=None)
# create a sequence of num evenly-spaced values beginning at start. If num > 1, the values in the sequence increase by (stop - start) / (num - 1), so that the last one is exactly stop.
# comparable to but slightly different from numpy.linspace
# note: unlike Python's half-open ranges [start, stop), lin_space includes stop
tf.lin_space(10.0, 13.0, 4, name="linspace") # == > [10.0 11.0 12.0 13.0]
#
# tf.range([start], limit=None, delta=1, dtype=None, name='range')
# create a sequence of numbers that begins at start and extends by increments of delta up to but not including limit
# analogous to list slicing: [start:limit:delta]
#
# slightly different from range in Python
#
# 'start' is 3, 'limit' is 18, 'delta' is 3
tf.range(start, limit, delta) #  == > [3, 6, 9, 12, 15]
# 'start' is 3, 'limit' is 1,  'delta' is -0.5
tf.range(start, limit, delta) # == > [3, 2.5, 2, 1.5]
# 'limit' is 5
tf.range(limit) # == > [0, 1, 2, 3, 4]
#
# Note that unlike NumPy or Python sequences, TensorFlow sequences are not iterable.
for _ in np.linspace(0, 10, 4): pass  # OK
for _ in tf.linspace(0.0, 10.0, 4): pass  # TypeError: 'Tensor' object is not iterable.

for _ in range(4): pass  # OK
for _ in tf.range(4): pass  # TypeError: 'Tensor' object is not iterable.
#
# You can also generate random constants from certain distributions. See details at:
# https://www.tensorflow.org/api_guides/python/constant_op
# tf.random_normal
# tf.truncated_normal
# tf.random_uniform
# tf.random_shuffle
# tf.random_crop
# tf.multinomial
# tf.random_gamma
# tf.set_random_seed
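#
# A small sketch (the shapes, seed, and bounds are my own choices) showing two of
# these random ops with a graph-level seed for reproducible results across runs:
tf.set_random_seed(42)
norm = tf.random_normal([2, 3], mean=0.0, stddev=1.0)   # samples from N(0, 1)
unif = tf.random_uniform([2, 2], minval=0, maxval=10)   # samples from U[0, 10)
with tf.Session() as sess:
    print(sess.run(norm))
    print(sess.run(unif))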

3. Math Operations

  1. tf.div:

    • Make sure you read the documentation to understand which one to use. At a high level, tf.div does TensorFlow’s style division, while tf.divide does exactly Python’s style division.
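
    • For example (my own illustration): with integer inputs, tf.div truncates, while tf.divide follows Python 3 semantics and returns a float.

      a = tf.constant(7)
      b = tf.constant(2)
      with tf.Session() as sess:
      	print(sess.run(tf.div(a, b)))     # >> 3    TensorFlow-style division
      	print(sess.run(tf.divide(a, b)))  # >> 3.5  Python-style true division
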
  2. tf.add_n

    # Allows you to add multiple tensors.
    tf.add_n([a, b, b])  # equivalent to a + b + b
    
  3. Dot product in TensorFlow

    • Note that tf.matmul no longer does dot product. It multiplies matrices of rank greater than or equal to 2. To do a dot product in TensorFlow, we use tf.tensordot.

    • e.g.

      a = tf.constant([10, 20], name='a')
      b = tf.constant([2, 3], name='b')
      with tf.Session() as sess:
      	print(sess.run(tf.multiply(a, b)))      # >> [20 60]  element-wise multiplication
      	print(sess.run(tf.tensordot(a, b, 1)))  # >> 80       dot product
      
  4. Data Types

    • Python Native Types
      • TensorFlow takes in Python native types such as Python boolean values, numeric values (integers, floats), and strings. Single values will be converted to 0-d tensors (or scalars), lists of values will be converted to 1-d tensors (vectors), lists of lists of values will be converted to 2-d tensors (matrices), and so on.
    • TensorFlow Native Types
      • Like NumPy, TensorFlow has its own data types, as you've seen: tf.int32, tf.float32, together with more exotic types such as tf.bfloat16, tf.complex64, and tf.quint8. The full list of TensorFlow data types can be found in the tf.DType class.
    • NumPy Data Types
      • By now, you've probably noticed the similarity between NumPy and TensorFlow. TensorFlow was designed to integrate seamlessly with Numpy, the package that has become the lingua franca of data science. TensorFlow's data types are based on those of NumPy; in fact, np.int32 == tf.int32 returns True. You can pass NumPy types to TensorFlow ops.
      • Remember our best friend tf.Session.run()? If the requested object is a Tensor, the output will be a NumPy array (a small sketch follows after these notes).
      • Note 1: There is a catch here for string data types. For numeric and boolean types, TensorFlow and NumPy dtypes match down the line. However, tf.string does not have an exact match in NumPy due to the way NumPy handles strings. TensorFlow can still import string arrays from NumPy perfectly fine -- just don't specify a dtype in NumPy!
      • Note 2: Both TensorFlow and NumPy are n-d array libraries. NumPy supports ndarrays, but it doesn't offer automatic differentiation or GPU support. There have been numerous efforts to create a “NumPy for GPU”, such as Numba, PyCUDA, and gnumpy, but none has really taken off, so I guess TensorFlow is “NumPy for GPU”. Please correct me if I’m wrong here.
      • Note 3: Using Python types to specify TensorFlow objects is quick and easy, and it is useful for prototyping ideas. However, there is an important pitfall in doing it this way. Python types lack the ability to explicitly state the data type, while TensorFlow's data types are more explicit. For example, all integers are the same type, but TensorFlow has 8-bit, 16-bit, 32-bit, and 64-bit integers available. Therefore, if you use a Python type, TensorFlow has to infer which data type you mean.
    • Note 4: It's possible to convert the data into the appropriate type when you pass it into TensorFlow, but certain data types still may be difficult to declare correctly, such as complex numbers. Because of this, it is common to create hand-defined Tensor objects as NumPy arrays. However, always use TensorFlow types when possible.
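    • A small sketch of the NumPy interop described above (the example values are my own):

      import numpy as np
      import tensorflow as tf
      
      print(np.int32 == tf.int32)       # >> True
      
      a = tf.ones([2, 2], np.float32)   # a NumPy dtype passed to a TF op
      with tf.Session() as sess:
      	out = sess.run(a)
      	print(type(out))                # >> <class 'numpy.ndarray'>
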
  5. Variables

    • The differences between a constant and a variable:

      1. A constant is, well, constant. Often, you’d want your weights and biases to be updated during training, so they should be variables rather than constants.
      2. A constant's value is stored in the graph and replicated wherever the graph is loaded. A variable is stored separately, and may live on a parameter server.
    • Point 2 means that constants are stored in the graph definition. When constants are memory-expensive, such as a weight matrix with millions of entries, loading the graph becomes slow every time. To see what’s stored in the graph definition, simply print out the graph's protobuf. Protobuf stands for protocol buffer, “Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.”

      import tensorflow as tf
      my_const = tf.constant([1.0, 2.0], name="my_const")
      print(tf.get_default_graph().as_graph_def())
      
      # Output
      node {
        name: "my_const"
        op: "Const"
        attr {
          key: "dtype"
          value {
            type: DT_FLOAT
          }
        }
        attr {
          key: "value"
          value {
            tensor {
              dtype: DT_FLOAT
              tensor_shape {
                dim {
                  size: 2
                }
              }
              tensor_content: "\000\000\200?\000\000\000@"
            }
          }
        }
      }
      versions {
        producer: 24
      }

    • Creating variables:

      • To declare a variable, you create an instance of the class tf.Variable. Note that it's written tf.constant with lowercase ‘c’ but tf.Variable with uppercase ‘V’. That’s because tf.constant is an op, while tf.Variable is a class that holds multiple ops (see the sketch below).
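
        For example, a sketch of both construction styles (the names and shapes are my own; tf.get_variable is often preferred because it makes variable sharing easier):

        s = tf.Variable(2, name="scalar")
        m = tf.Variable([[0, 1], [2, 3]], name="matrix")
        W = tf.Variable(tf.zeros([784, 10]), name="big_matrix")
        
        s2 = tf.get_variable("scalar2", initializer=tf.constant(2))
        W2 = tf.get_variable("big_matrix2", shape=(784, 10),
                             initializer=tf.zeros_initializer())
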
    • Initialize variables:
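
      • The usual TF 1.x initialization patterns, as a sketch (the variable shapes are my own examples):

        W = tf.Variable(tf.zeros([784, 10]))
        b = tf.Variable(tf.zeros([10]))
        with tf.Session() as sess:
        	# initialize all variables at once
        	sess.run(tf.global_variables_initializer())
        	# or initialize only a subset of variables
        	sess.run(tf.variables_initializer([W, b]))
        	# or initialize a single variable
        	sess.run(W.initializer)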

    • Evaluate values of variables:
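
      • A sketch of reading a variable's value back out (my own example):

        W = tf.Variable(tf.truncated_normal([2, 3]))
        with tf.Session() as sess:
        	sess.run(W.initializer)
        	print(W.eval())   # equivalent to print(sess.run(W))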

    • Assign values to variables

      • an interesting case:

        W = tf.Variable(10)
        
        assign_op = W.assign(100)
        with tf.Session() as sess:
        	sess.run(assign_op)
        	print(W.eval()) # >> 100
        

        Note that we don't have to initialize W in this case, because assign() does it for us. In fact, the initializer op is an assign op that assigns the variable's initial value to the variable itself.

        # in the source code
        self._initializer_op = state_ops.assign(self._variable, self._initial_value,
                                                validate_shape=validate_shape).op
        
      • another interesting case:

        When you have a variable that depends on another variable, suppose you want to declare U = W * 2

        # W is a random 700 x 10 tensor
        W = tf.Variable(tf.truncated_normal([700, 10]))
        U = tf.Variable(W * 2)
        

        In this case, you should use initialized_value() to make sure that W is initialized before its value is used to initialize U.

        U = tf.Variable(W.initialized_value() * 2)  # W must be initialized before its value can be used for U
        

4. Interactive Session

  • You sometimes see InteractiveSession instead of Session. The only difference is that an InteractiveSession makes itself the default session, so you can call run() or eval() without explicitly calling the session. This is convenient in interactive shells and IPython notebooks, as it avoids having to pass an explicit session object to run ops. However, it complicates things when you have multiple sessions to run.

  • e.g.

    sess = tf.InteractiveSession()
    a = tf.constant(5.0)
    b = tf.constant(6.0)
    c = a * b
    print(c.eval()) # we can use 'c.eval()' without explicitly stating a session (i.e., no need for sess.run(c))
    sess.close()
    
  • tf.get_default_session() returns the default session for the current thread. The returned Session will be the innermost session on which a Session or Session.as_default() context has been entered.

5. Control Dependencies

  • Sometimes, we have two or more independent ops and we'd like to specify which ops should be run first. In this case, we use tf.Graph.control_dependencies([control_inputs]).

  • e.g.

    # your graph g has 5 ops: a, b, c, d, e
    with g.control_dependencies([a, b, c]):
      # `d` and `e` will only run after `a`, `b`, and `c` have executed.
      d = ...
      e = ...
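
  • A small runnable sketch of the same idea with concrete ops (the variable and op names are my own):

    import tensorflow as tf
    
    x = tf.Variable(0, name='x')
    first = tf.assign(x, 10)
    # `second` is created inside the block, so it only runs after `first` has executed
    with tf.get_default_graph().control_dependencies([first]):
    	second = tf.assign_add(x, 1)
    
    with tf.Session() as sess:
    	sess.run(tf.global_variables_initializer())
    	print(sess.run(second))  # >> 11  (x is set to 10 first, then incremented)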
    

6. Importing Data

  1. The old way: placeholders and feed_dict

    • tf.placeholder(dtype, shape=None, name=None)

    • Note 1: dtype, shape, and name are self-explanatory. The only thing to note here is when you set the shape of the placeholder to None. shape=None means that tensors of any shape will be accepted. Using shape=None makes it easy to construct graphs, but nightmarish for debugging. You should always define the shape of your placeholders in as much detail as possible. shape=None also breaks all subsequent shape inference, which makes many ops not work because they expect a certain rank.

    • An interesting case: you can feed values to tensors that aren't placeholders. Any tensor that is feedable can be fed. To check whether a tensor is feedable, use: tf.Graph.is_feedable(tensor)

      e.g.

      a = tf.add(2, 5)
      b = tf.multiply(a, 3)
      
      with tf.Session() as sess:
      	print(sess.run(b))                       # >> 21
      	# compute the value of b given that a is 15
      	# check whether a is feedable: sess.graph.is_feedable(a)
      	print(sess.run(b, feed_dict={a: 15}))    # >> 45
      
    • feed_dict can be extremely useful to test your model. When you have a large graph and just want to test out certain parts, you can provide dummy values so TensorFlow won't waste time doing unnecessary computations.
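
    • A minimal placeholder/feed_dict sketch (the shape and values are my own):

      # a placeholder for a vector of 3 floats
      a = tf.placeholder(tf.float32, shape=[3], name='a')
      b = tf.constant([5, 5, 5], tf.float32)
      c = a + b   # short for tf.add(a, b)
      
      with tf.Session() as sess:
      	# feed [1, 2, 3] into placeholder a through feed_dict
      	print(sess.run(c, feed_dict={a: [1, 2, 3]}))  # >> [6. 7. 8.]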

  2. The new way: tf.data
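
    • A minimal sketch of the tf.data pattern in TF 1.x (the arrays and batch size are my own choices):

      import numpy as np
      import tensorflow as tf
      
      features = np.arange(10, dtype=np.float32)
      labels = features * 2
      
      # build a Dataset directly from in-memory NumPy arrays
      dataset = tf.data.Dataset.from_tensor_slices((features, labels))
      dataset = dataset.shuffle(10).batch(4)
      
      iterator = dataset.make_one_shot_iterator()
      x, y = iterator.get_next()
      
      with tf.Session() as sess:
      	print(sess.run([x, y]))  # one batch of (features, labels)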

7. The trap of lazy loading 【*】

  • Lazy loading is, in principle, a good mechanism for saving computation in TF, but it also has a trap.

  • One of the most common TensorFlow non-bug bugs I see is what my friend Danijar and I call “lazy loading”. Lazy loading is a term that refers to a programming pattern in which you defer declaring/initializing an object until it is loaded. In the context of TensorFlow, it means you defer creating an op until you need to compute it. For example, this is normal loading: you create the op z when you assemble the graph.

    x = tf.Variable(10, name='x')
    y = tf.Variable(20, name='y')
    z = tf.add(x, y)
    
    with tf.Session() as sess:
    	sess.run(tf.global_variables_initializer())
    	writer = tf.summary.FileWriter('graphs/normal_loading', sess.graph)
    	for _ in range(10):
    		sess.run(z)
    	writer.close()
    

    This is what happens when someone decides to be clever and use lazy loading to save one line of code:

    x = tf.Variable(10, name='x')
    y = tf.Variable(20, name='y')
    # z is omitted here, saving one line of code (too clever by half)
    
    with tf.Session() as sess:
    	sess.run(tf.global_variables_initializer())
    	writer = tf.summary.FileWriter('graphs/lazy_loading', sess.graph)
    	for _ in range(10):
    		sess.run(tf.add(x, y)) # the add op is created anonymously, inside the loop
    	print(tf.get_default_graph().as_graph_def()) 
    	writer.close()
    
    

    Let's see the graphs for them on TensorBoard. Note that you can open TensorBoard with logdir='graphs' and easily switch between the normal_loading graph and the lazy_loading graph (just edit the path in the browser: graphs/normal_loading or graphs/lazy_loading).

    • Let's look at the graph definition. Remember that to print out the graph definition, we use:

      print(tf.get_default_graph().as_graph_def())

      • The protobuf for the graph in normal loading has only 1 node “Add”:
      node {
        name: "Add"
        op: "Add"
        input: "x/read"
        input: "y/read"
        attr {
            key: "T"
            value {
                type: DT_INT32
            }
        }
      }
      
      • On the other hand, the protobuf for the graph in lazy loading has 10 copies of the node “Add”. It adds a new node “Add” every time you want to compute z!

        
        node {
          name: "Add_0"
          op: "Add"
          input: "x_1/read"
          input: "y_1/read"
          attr {
            key: "T"
            value {
              type: DT_INT32
            }
          }
        }
        …
        …
        …
        node {
          name: "Add_9"
          op: "Add"
          ...
        }
        
        
  • You probably think: “This is stupid. Why would I want to compute the same value more than once?” and assume it's a bug that nobody would ever commit. It happens more often than you think. For example, you might want to compute the same loss function or make the same prediction for every batch of training samples. If you aren’t careful, you can add thousands of unnecessary nodes to your graph. Your graph definition becomes bloated, slow to load, and expensive to pass around.

  • There are two ways to avoid this bug. First, always separate the definition of ops from their execution when you can. When that is not possible because you want to group related ops into classes, you can use the Python @property decorator to ensure that each op is created only once, the first time it is needed; a sketch follows below. If you want to know more, check out this wonderful blog post by Danijar Hafner.
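
  • A sketch of that property-based approach (the class and attribute names are my own; Danijar's post uses a more general decorator):

    import tensorflow as tf
    
    class Model(object):
    	def __init__(self, x, y):
    		self.x = x
    		self.y = y
    		self._add = None
    
    	@property
    	def add(self):
    		# build the op only on first access, then reuse it
    		if self._add is None:
    			self._add = tf.add(self.x, self.y)
    		return self._add
    
    x = tf.Variable(10, name='x')
    y = tf.Variable(20, name='y')
    model = Model(x, y)
    
    with tf.Session() as sess:
    	sess.run(tf.global_variables_initializer())
    	for _ in range(10):
    		sess.run(model.add)  # only one "Add" node is ever added to the graph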

8. Reference

[1]. https://docs.google.com/document/d/1FSPNZFQsnaUVeTo0OQ2RrEZ0f4el9bIGI5sQALbG_F0/edit#
