TensorFlow - Usage

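Rules of thumb for setting the batch size:

At the extreme of batch_size=1, the direction of each update depends on a single sample, so training lurches around and struggles to converge. Increasing the batch size within a reasonable range improves memory utilization and reduces the number of iterations needed to finish one epoch. It cannot be increased blindly, though: memory can run out, reaching the same accuracy takes longer because each epoch contains fewer updates, and beyond a certain size the descent direction the batch determines essentially stops changing. Values of roughly 10-100 are typical; common sizes are 16, 32, 64 and 128.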
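The relation between batch size and iterations per epoch is simple arithmetic. Here is a small sketch, assuming a hypothetical training set of 55,000 samples (the figure and the helper name iterations_per_epoch are chosen only for illustration):

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """Number of parameter updates needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

num_samples = 55000  # hypothetical dataset size, for illustration only
for batch_size in (1, 16, 32, 64, 128):
    # Larger batches mean fewer (but more expensive) updates per epoch.
    print(batch_size, iterations_per_epoch(num_samples, batch_size))
```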

About placeholder:

During training and testing we want to feed in different values, so we use placeholders.

batch_size = tf.placeholder(tf.int32)  # note: the dtype must be tf.int32
# For version 1.0 and later, use:
# keep_prob = tf.placeholder(tf.float32, [])
# batch_size = tf.placeholder(tf.int32, [])
_X = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, class_num])
keep_prob = tf.placeholder(tf.float32)

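tf.placeholder(tf.float32, shape=(None, 1024)): 1024 is the size of each data item, and None stands for the batch size, so that dimension can be any number.

tf.placeholder(tf.float32, shape=[None, img_height, img_width, channels]): similarly, the last three entries are the image dimensions, and the first entry is None, standing for the batch size.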
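The role of None in a placeholder shape can be pictured as a wildcard. The sketch below is plain Python and only models the idea; the helper matches is hypothetical and is not a TensorFlow function:

```python
def matches(placeholder_shape, actual_shape):
    """Return True if actual_shape fits placeholder_shape; None matches any size."""
    if len(placeholder_shape) != len(actual_shape):
        return False
    return all(p is None or p == a
               for p, a in zip(placeholder_shape, actual_shape))

spec = (None, 1024)              # same shape as the placeholder above
print(matches(spec, (32, 1024)))   # a batch of 32 fits
print(matches(spec, (128, 1024)))  # so does a batch of 128
print(matches(spec, (32, 784)))    # the data size must still be 1024
```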

tf.transpose

# 'x' is
[[1 2 3]
 [4 5 6]]

tf.transpose(x) ==>
[[1 4]
 [2 5]
 [3 6]]

# Equivalently
tf.transpose(x, perm=[1, 0])

# If perm is not given, it is set to (n-1...0), where n is the rank of the input.

# 'x' is
[[[1 2 3]
  [4 5 6]]
 [[7 8 9]
  [10 11 12]]]

tf.transpose(x, perm=[0, 2, 1]) ==>
[[[1 4]
  [2 5]
  [3 6]]

 [[7 10]
  [8 11]
  [9 12]]]
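To make the effect of perm concrete, the two cases above can be reproduced with nested Python lists. This is only an illustration of what the permutation does to the indices, not how TensorFlow implements it; transpose_2d and transpose_last_two are hypothetical helper names:

```python
x2 = [[1, 2, 3],
      [4, 5, 6]]

def transpose_2d(m):
    """Swap rows and columns: result[j][i] == m[i][j] (perm=[1, 0])."""
    return [list(col) for col in zip(*m)]

x3 = [[[1, 2, 3], [4, 5, 6]],
      [[7, 8, 9], [10, 11, 12]]]

def transpose_last_two(t):
    """Counterpart of perm=[0, 2, 1]: keep axis 0, swap axes 1 and 2."""
    return [transpose_2d(m) for m in t]

print(transpose_2d(x2))        # [[1, 4], [2, 5], [3, 6]]
print(transpose_last_two(x3))  # [[[1, 4], [2, 5], [3, 6]], [[7, 10], [8, 11], [9, 12]]]
```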

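How to feed the MNIST data as input:

input_size = 28, each row has 28 pixels

timestep_size = 28, 28 rows have to be fed in before each prediction

hidden_size = 256, the number of units in each hidden layer

class_num = 10, the number of output classes; for a regression task it would be 1

# Restore the 784 pixel values of each character to a 28 * 28 image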

_X = tf.placeholder(tf.float32, [None, 784])

# **Step 1: the RNN input shape = (batch_size, timestep_size, input_size)
X = tf.reshape(_X, [-1, 28, 28])
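What the reshape does to a single image can be sketched in plain Python: the 784 values are cut into 28 consecutive rows, and each row becomes the input for one timestep. The helper reshape_to_rows and the stand-in data are assumptions made for illustration:

```python
def reshape_to_rows(flat, timestep_size=28, input_size=28):
    """Split a flat pixel list into timestep_size rows of input_size pixels each."""
    assert len(flat) == timestep_size * input_size
    return [flat[t * input_size:(t + 1) * input_size]
            for t in range(timestep_size)]

flat_image = list(range(784))  # stand-in for one MNIST image; real pixels are floats
steps = reshape_to_rows(flat_image)

print(len(steps), len(steps[0]))  # 28 28
print(steps[1][:3])               # [28, 29, 30]: row 1 starts at pixel 28
```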

Defining one LSTM cell layer for the MNIST data

Only hidden_size needs to be specified; the cell automatically matches the dimensions of the input X.

lstm_cell = rnn.BasicLSTMCell(num_units=hidden_size, forget_bias=1.0, state_is_tuple=True)

# **Step 3: add a dropout layer; usually only output_keep_prob is set

tf.truncated_normal

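Generates random numbers from a truncated normal distribution: values fall in [ mean - 2 * stddev, mean + 2 * stddev ], and any draw outside that range is discarded and re-drawn. The defaults are stddev=1.0 and mean=0.0.

Example: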

initial = tf.truncated_normal(shape=[3,3], mean=0, stddev=1)
print(tf.Session().run(initial))
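The re-draw rule can be sketched with the standard library alone. This is a plain-Python illustration of the idea, not TensorFlow's implementation, and the function name truncated_normal here is just a local helper:

```python
import random

def truncated_normal(mean=0.0, stddev=1.0):
    """Draw from N(mean, stddev^2), re-drawing until the value is within 2 stddev."""
    while True:
        value = random.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            return value

samples = [truncated_normal(mean=0.0, stddev=1.0) for _ in range(1000)]
# Every sample lies inside [mean - 2 * stddev, mean + 2 * stddev].
print(min(samples), max(samples))
```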

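This is used when initializing a model's biases. In this setup the hidden-layer w is initialized with zeros, the output-layer w with a normal distribution, and the bias of every layer with a normal distribution. (More generally, weights are usually initialized randomly, because all-zero weights in a plain feed-forward layer leave its units identical to one another.)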

 self.br = tf.Variable(tf.truncated_normal(
            [self.hidden_layer_size], mean=1))

tf.nn.sigmoid_cross_entropy_with_logits

Applies the sigmoid to the logits first, then computes the cross-entropy against the labels, element by element.

Example

x = tf.constant([1,2,3,4,5,6,7],dtype=tf.float64)
y = tf.constant([1,1,1,0,0,1,0],dtype=tf.float64)  # labels: same type and shape as logits
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels = y,logits = x)
with tf.Session() as sess:
    print (sess.run(loss))
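For reference, the per-element loss can be checked by hand. The sketch below uses the same logits and labels as the example and compares the textbook definition with the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)) given in the TensorFlow documentation; the function names are local helpers:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def naive_loss(logit, label):
    """Cross-entropy written directly from the definition."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def stable_loss(logit, label):
    """Equivalent form that avoids overflow for large |logit|."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

logits = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
labels = [1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
for x, z in zip(logits, labels):
    print(x, z, naive_loss(x, z), stable_loss(x, z))
```

A confident wrong prediction (for example logit 7 with label 0) produces a loss close to the logit itself, while a confident correct one produces a loss near zero.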

tf.reduce_mean

Computes the mean of the elements of the tensor input_tensor, over all elements or along a given axis.

#!/usr/bin/python

import tensorflow as tf
import numpy as np

initial = [[1.,1.],[2.,2.]]
x = tf.Variable(initial,dtype=tf.float32)
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
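    print(sess.run(tf.reduce_mean(x)))    # mean of all elements: 1.5
    print(sess.run(tf.reduce_mean(x,0)))  # mean along axis 0, one value per column: [1.5 1.5]
    print(sess.run(tf.reduce_mean(x,1)))  # mean along axis 1, one value per row: [1. 2.]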

logits

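The term logits shows up constantly in deep learning source code.
logits: unnormalized scores (log-odds) rather than probabilities; in general they are the input to the softmax layer. That is why logits and labels have the same shape.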

$\text{logit}(p) = \log\frac{p}{1-p}$
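The formula can be tried out directly: the sigmoid is the inverse of logit, and softmax turns a vector of logits into probabilities that sum to 1. A minimal sketch with local helper functions:

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Inverse of logit: maps a real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    """Normalize a list of logits into probabilities."""
    shifted = [v - max(logits) for v in logits]  # shift for numerical stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]

print(logit(0.5))                 # 0.0: even odds
print(sigmoid(logit(0.8)))        # recovers 0.8 up to rounding
print(softmax([2.0, 1.0, 0.1]))   # three probabilities summing to 1
```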

graph

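Once your graph has been initialized, you can use the Graph.as_default() method to obtain its context manager and add operations (Ops) to it. Combined with a with statement, this tells TensorFlow that we want the operations added to that particular graph.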

import tensorflow as tf

g1 = tf.get_default_graph()  # handle to the default graph that TensorFlow creates automatically

g2 = tf.Graph()  # a new, empty graph, independent of the default one

with g1.as_default():
    # Define g1 operations, tensors, etc.
    pass

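You may wonder why, in the earlier examples, we did not need to specify a graph when adding Ops. For convenience, TensorFlow automatically creates a graph when the library is loaded and designates it as the default graph. So any operations, tensors, and so on defined outside a Graph.as_default() block are automatically placed in the default graph.

tf.Graph.__init__()
Creates a new, empty graph.

tf.Graph.as_default()
Returns a context manager that makes this graph the default graph.

A TensorFlow graph must be evaluated inside a session (Session).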

import tensorflow as tf

# Build a graph.
a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])
c = a * b

# Launch the graph in a session.
sess = tf.Session()

# Evaluate the tensor 'c'.
print(sess.run(c))
sess.close()

Session

A session is responsible for executing a graph. Once a session is open, you can use its main method, run(), to compute the tensor outputs you want.

import tensorflow as tf
g1 = tf.Graph()
with g1.as_default():
    c1 = tf.constant([1.0])
with tf.Graph().as_default() as g2:
    c2 = tf.constant([2.0])

with tf.Session(graph=g1) as sess1:
    print(sess1.run(c1))
with tf.Session(graph=g2) as sess2:
    print(sess2.run(c2))

>> result:
>> [ 1.0 ]
>> [ 2.0 ]

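Session.run() takes one required argument, fetches, and three optional arguments: feed_dict, options and run_metadata.

In the feed_dict example further down, fetches is set to the tensor b (the output of the tf.mul() operation, which has been called tf.multiply() since TensorFlow 1.0). This tells TensorFlow that the Session should find the nodes needed to compute b, execute them in dependency order, and return the value of b.

When fetches is a list, the result of run() is a list as well:

sess.run([a,b]) ## returns a list such as [7, 21], i.e. the value of a followed by b = a * 3

tf.initialize_all_variables()

Returns an op that initializes all TensorFlow variables; it has to be run before the variables are used. It is deprecated in favor of tf.global_variables_initializer().

The feed_dict argument can be used to override the values of Tensors in the graph.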

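a = tf.add(2,3)

b = tf.multiply(a,3)  # tf.mul was renamed tf.multiply in TensorFlow 1.0

## Open a Session on the default graph

sess = tf.Session()

## Define a dictionary that replaces the original value of a with 15

replace_dict={a:15}

# Pass replace_dict as the value of feed_dict:

sess.run(b, feed_dict=replace_dict) # returns 45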

This means that if you have a huge graph and want to test part of it with dummy values, TensorFlow does not waste time on computations that are no longer needed.
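The idea that a fed value short-circuits the computation behind it can be modeled with a toy graph in plain Python. This is only a sketch of the concept; Node and run are hypothetical names and say nothing about how TensorFlow is implemented internally:

```python
class Node:
    """A toy graph node: a function plus the nodes it takes as inputs."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs
        self.calls = 0  # how many times this node was actually computed

def run(fetch, feed_dict=None):
    """Evaluate `fetch`, using a value from feed_dict instead of computing that node."""
    feed_dict = feed_dict or {}
    if fetch in feed_dict:
        return feed_dict[fetch]  # fed value: nothing upstream is executed
    args = [run(node, feed_dict) for node in fetch.inputs]
    fetch.calls += 1
    return fetch.fn(*args)

a = Node(lambda: 2 + 3)
b = Node(lambda value: value * 3, a)

print(run(b))           # 15
print(run(b, {a: 15}))  # 45
print(a.calls)          # 1: the second run never computed a
```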

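feed_dict is also very useful for specifying input values, as covered in the section on placeholders.

InteractiveSession is another type of TensorFlow session, but we will not be using it.

An InteractiveSession automatically installs itself as the default session when it is created. That is convenient in an interactive Python shell, because you can call a.eval() or a.run() without explicitly writing sess.run([a]). It gets tricky, however, if you need to juggle several sessions. Keeping a consistent way of running graphs makes debugging easier, so we stick with the regular Session object.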

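The difference between name scope and variable scope

TF has two kinds of scope:

name scope, created with tf.name_scope or tf.op_scope;
variable scope, created with tf.variable_scope or tf.variable_op_scope;

To understand name_scope and variable_scope, first be clear about what each is for. Compared with ordinary models, a neural network has a great many nodes, and a great many connections (weight matrices) between them. So we painstakingly build a network, end up with the network in Figure 1, and, WTF! With so many variables, once the network is built we look at it and wonder: which layer does this variable even belong to?

For variables created with tf.Variable(), the two kinds of scope have the same effect: both prepend the scope name to the variable name.

For variables created with tf.get_variable(), only the variable scope name is prepended to the variable name; the name scope is not used as a prefix.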

with tf.name_scope("my_name_scope"):
    v1 = tf.get_variable("var1", [1], dtype=tf.float32) 
    v2 = tf.Variable(1, name="var2", dtype=tf.float32)
    a = tf.add(v1, v2)
    print(v1.name)
    print(v2.name) 
    print(a.name)
Output:
var1:0
my_name_scope/var2:0
my_name_scope/Add:0

with tf.variable_scope("my_variable_scope"):
    ...
Output:
my_variable_scope/var1:0
my_variable_scope/var2:0
my_variable_scope/Add:0

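Inside a variable_scope, both tf.get_variable() and tf.Variable() receive the scope name as a prefix. Consequently, within a tf.variable_scope, get_variable() can retrieve a variable that has already been created, which is how variable sharing is achieved within the range that the scope covers. When reusing variables, be sure to call scope.reuse_variables() explicitly in the code.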
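The sharing rule can be pictured as a lookup table keyed by "scope/name". The sketch below is a toy model in plain Python: the store and this get_variable are stand-ins written for illustration, not TensorFlow's API. It mirrors the behavior that creating an existing name without reuse is an error, and that asking to reuse a name that does not exist is an error too:

```python
_store = {}  # toy variable store: full name -> mutable one-element list

def get_variable(scope, name, initial=0.0, reuse=False):
    """Create scope/name, or return the existing one when reuse=True."""
    full_name = scope + "/" + name
    if reuse:
        if full_name not in _store:
            raise ValueError("Variable %s does not exist" % full_name)
        return full_name, _store[full_name]
    if full_name in _store:
        raise ValueError("Variable %s already exists" % full_name)
    _store[full_name] = [initial]
    return full_name, _store[full_name]

full_name, w = get_variable("my_variable_scope", "var1", initial=1.0)
_, w_again = get_variable("my_variable_scope", "var1", reuse=True)

w_again[0] = 2.0          # an update through one handle...
print(full_name, w[0])    # ...is visible through the other: the variable is shared
```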

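Three ways to create variables: tf.placeholder, tf.Variable, tf.get_variable

tf.placeholder(): a placeholder. * trainable==False *
tf.Variable(): ordinary variables are defined this way. * trainable can be chosen *
tf.get_variable(): generally used together with tf.variable_scope() to share variables. * trainable can be chosen *

tf.name_scope() has no effect at all on variables created with tf.get_variable().
tf.name_scope() is mainly for managing namespaces, which keeps the whole model better organized. tf.variable_scope(), on the other hand, exists to enable variable sharing, which it accomplishes together with tf.get_variable().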

posted @ 2019-09-29 13:05 SENTIMENT_SONNE