TensorFlow functions / implementing important features

I. Basic functions

1.1 tf.reduce_sum(input_tensor, axis): computes the sum of elements across dimensions of a tensor, i.e. it sums along dimension axis.

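x = [[1, 1, 1], [1, 1, 1]]  # rank 2

# Sum over all dimensions, i.e. over every element
tf.reduce_sum(x) ==> 6

# Reduce along dimension 0; in this example that sums down each column
tf.reduce_sum(x, 0) ==> [2, 2, 2]
# Equivalent to reducing along dimension -2
tf.reduce_sum(x, -2) ==> [2, 2, 2]

# Reduce along dimension 1; in this example that sums across each row
tf.reduce_sum(x, 1) ==> [3, 3]
# Equivalent to reducing along dimension -1
tf.reduce_sum(x, -1) ==> [3, 3]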
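
The axis semantics above can be checked without a TensorFlow session. The following is a minimal sketch that uses NumPy's np.sum as a stand-in, on the assumption that it reduces over axes the same way tf.reduce_sum does for this example.

```python
import numpy as np

x = np.array([[1, 1, 1], [1, 1, 1]])  # shape (2, 3), rank 2

total = np.sum(x)               # reduce over every axis
per_column = np.sum(x, axis=0)  # collapse axis 0, one sum per column
per_row = np.sum(x, axis=1)     # collapse axis 1, one sum per row

print(total)                # 6
print(per_column.tolist())  # [2, 2, 2]
print(per_row.tolist())     # [3, 3]

# Negative axes count from the end: for a rank-2 array, -2 is axis 0 and -1 is axis 1.
same_as_axis0 = np.array_equal(np.sum(x, axis=-2), per_column)
same_as_axis1 = np.array_equal(np.sum(x, axis=-1), per_row)
print(same_as_axis0, same_as_axis1)  # True True
```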

1.2 tf.concat(values, axis): concatenates tensors along one dimension. Joining along axis does not change the rank: if the inputs are 2-D, the result is also 2-D.

t1 = [[1, 2, 3], [4, 5, 6]]  # shape 2x3
t2 = [[7, 8, 9], [10, 11, 12]]  # shape 2x3
tf.concat([t1, t2], 0) ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
# Concatenating along dimension 0 grows the first dimension (here, more rows), giving a 4x3 matrix.

x = tf.ones((3, 2, 2))  # shape (3, 2, 2)
C = [x, x, x]
print(tf.concat(C, 2).shape) ==> (3, 2, 6)
# A 3-D example: concatenating along the third dimension (axis 2) grows that dimension, giving (3, 2, 6).

The other joining function, tf.stack, does change the rank: stacking 2-D tensors produces a 3-D tensor.

behaviour_feat_list = []
behaviour_feat1 = tf.placeholder(dtype=tf.string, shape=[128, 256], name='behaviour_feat1')
behaviour_feat2 = tf.placeholder(dtype=tf.string, shape=[128, 256], name='behaviour_feat2')
behaviour_feat3 = tf.placeholder(dtype=tf.string, shape=[128, 256], name='behaviour_feat3')

behaviour_feat_list.append(behaviour_feat1)
behaviour_feat_list.append(behaviour_feat2)
behaviour_feat_list.append(behaviour_feat3)


behaviour_feat_vec1 = tf.concat(behaviour_feat_list, -1) #(128, 768)
behaviour_feat_vec2 = tf.stack(behaviour_feat_list, 1) #(128, 3, 256)
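
The shape difference between the two calls can be reproduced with NumPy, whose concatenate and stack are assumed here to mirror tf.concat and tf.stack. The feature blocks are hypothetical zero arrays with the same (128, 256) shape as the placeholders above.

```python
import numpy as np

# Three hypothetical feature blocks, each of shape (128, 256).
feats = [np.zeros((128, 256)) for _ in range(3)]

joined = np.concatenate(feats, axis=-1)  # grows an existing axis, rank stays 2
stacked = np.stack(feats, axis=1)        # inserts a new axis of length 3, rank becomes 3

print(joined.shape)   # (128, 768)
print(stacked.shape)  # (128, 3, 256)
```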

https://blog.csdn.net/mch2869253130/article/details/89232653

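1.3 Adding and removing dimensions

tf.expand_dims(input, axis=None, name=None, dim=None): inserts a dimension of size 1 into a tensor's shape at position axis (dim is a deprecated alias of axis).

tf.squeeze(input, squeeze_dims=None, name=None): removes dimensions of size 1 from the shape of a tensor. By default every size-1 dimension is removed; to remove only some of them, list the ones to drop in squeeze_dims (a deprecated name; newer releases call this parameter axis).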
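
A small shape-only sketch of both operations, using np.expand_dims and np.squeeze as assumed analogues of the TensorFlow functions:

```python
import numpy as np

v = np.zeros((2, 3))
front = np.expand_dims(v, axis=0)   # new size-1 axis in front
back = np.expand_dims(v, axis=-1)   # new size-1 axis at the end

w = np.zeros((1, 2, 1, 3))
all_removed = np.squeeze(w)          # drops every size-1 axis
one_removed = np.squeeze(w, axis=0)  # drops only the named size-1 axis

print(front.shape)        # (1, 2, 3)
print(back.shape)         # (2, 3, 1)
print(all_removed.shape)  # (2, 3)
print(one_removed.shape)  # (2, 1, 3)
```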

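1.4 Extracting slices from a tensor

tf.slice(input_, begin, size, name=None) extracts a slice from input_. size is the shape of the output tensor: size[i] is the number of elements taken along dimension i, and -1 means everything remaining in that dimension. begin is the offset of the slice in each dimension of input_. For example, take the input below.

Reference: https://blog.csdn.net/qq_30868235/article/details/80849422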

input = [[1, 2, 3, 4, 5],
         [6, 7, 8, 9, 10],
         [11, 12, 13, 14, 15]
        ]

# Convert to a tensor
input2 = tf.constant(input)


split_behaviour_feats=[]
indices = [1, 1, 2, 1]
begin = 0
for index in indices:
    output2 = tf.slice(input2, [0, begin], [-1, index])
    begin = begin + index
    split_behaviour_feats.append(output2)

with tf.Session() as sess:
    for split_behaviour_feat in split_behaviour_feats:
        print(sess.run(split_behaviour_feat))

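# Printed in order (shapes (3, 1), (3, 1), (3, 2), (3, 1)):
# [[1] [6] [11]]
# [[2] [7] [12]]
# [[3 4] [8 9] [13 14]]
# [[5] [10] [15]]

tf.slice is very flexible: by choosing begin and the slice shape you can cut out any contiguous block of a tensor.

Another slicing function is tf.gather(), which picks, along dimension axis of params, the entries whose indices are listed in indices. It suits selecting one or several individual columns of a tensor; for a contiguous range, tf.slice is more convenient.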

https://blog.csdn.net/kkxi123456/article/details/103739404

import tensorflow as tf
import numpy as np
input = [[1, 2, 3],
         [2, 3, 4],
         [3, 4, 5],
         [4, 5, 6]
        ]
print(tf.shape(input)) # Tensor("Shape_3:0", shape=(2,), dtype=int32)
print(np.array(input).shape) # (4, 3)
with tf.Session() as sess:
    output = tf.gather(input, [1], axis=-1)  # take the column at index 1 (the second column) along the last axis
    print(output.shape) # (4, 1)
    print(sess.run(output))

# Output
'''
[[2]
 [3]
 [4]
 [5]]
'''

A third way to slice is tf.split(value, num_or_size_splits, axis=0), which divides a tensor into sub-tensors. num_or_size_splits is either the number of equal parts or a list giving the size of each part.

behaviour_vec = tf.placeholder(dtype=tf.string, shape=[128, 64, 256], name='behaviour_vec')

# Returns a list with len(list) == 64; each element has shape (128, 1, 256)
behaviour_vec_list = tf.split(value=behaviour_vec, num_or_size_splits=64, axis=1)

1.5 Clipping values

tf.clip_by_value(A, min, max): takes a tensor A and clamps every element to the range [min, max]. Elements below min are set to min, and elements above max are set to max.

https://blog.csdn.net/UESTC_C2_403/article/details/72190248
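
As an illustration of the clamping rule, here is a sketch with np.clip, assumed to behave like tf.clip_by_value on a small hand-picked array:

```python
import numpy as np

a = np.array([[-2.0, 0.5], [3.0, 7.5]])
clipped = np.clip(a, 0.0, 5.0)  # values below 0 become 0, values above 5 become 5

print(clipped.tolist())  # [[0.0, 0.5], [3.0, 5.0]]
```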

1.6 Tiling (replicating) a tensor

tf.tile(input, multiples, name=None):

  input: the tensor to tile. A Tensor, 1-D or higher.

  multiples: the repeat counts. A 1-D Tensor of type int32 or int64 whose length must equal the number of dimensions of input.

For example, if input is a 3-D tensor, multiples must be a 1-D tensor of length 3. Its three values give how many times the data is repeated along the 1st, 2nd and 3rd dimension of input.

Reference: https://blog.csdn.net/tsyccnh/article/details/82459859
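
A short sketch of the repeat rule on a 2-D input, using np.tile with one repeat count per dimension (assumed to match tf.tile in that case):

```python
import numpy as np

t = np.array([[1, 2], [3, 4]])  # shape (2, 2)
tiled = np.tile(t, (2, 3))      # repeat 2x along axis 0 and 3x along axis 1

print(tiled.shape)        # (4, 6)
print(tiled[0].tolist())  # [1, 2, 1, 2, 1, 2]
print(tiled.ndim)         # 2, the rank is unchanged
```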

1.7 tf.where

tf.where(condition, x=None, y=None, name=None): returns elements from either x or y, depending on condition.

condition, x and y have the same shape, and condition must be of type bool. Where condition is True the result takes the element of x at that position; where it is False it takes the element of y.

Reference: https://blog.csdn.net/ustbbsy/article/details/79564828
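
The element-wise selection can be sketched with the three-argument form of np.where, assumed to follow the same rule as tf.where(condition, x, y):

```python
import numpy as np

cond = np.array([[True, False], [False, True]])
x = np.array([[1, 2], [3, 4]])
y = np.array([[10, 20], [30, 40]])

picked = np.where(cond, x, y)  # x where cond is True, y elsewhere

print(picked.tolist())  # [[1, 20], [30, 4]]
```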

1.8 tf.range

Creates a tensor holding a sequence of numbers. It has two forms:

tf.range(limit, delta=1, dtype=None, name='range')
tf.range(start, limit, delta=1, dtype=None, name='range')

The sequence starts at start (0 in the first form) and advances in steps of delta up to but not including limit, like Python's range function.

Reference: https://www.cnblogs.com/cvtoEyes/p/9002843.html

1.9 Using crf_decode and viterbi_decode in TensorFlow

https://blog.csdn.net/baobao3456810/article/details/83388516

viterbi_decode and crf_decode implement the same functionality: the former is a NumPy implementation, the latter a tensor implementation.
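
To make the decoding step concrete, below is a minimal NumPy sketch of Viterbi decoding for a single sequence. It is not TensorFlow's source code: the function name, the argument names and the toy scores are all assumptions chosen for illustration. score holds one row of per-tag scores for each time step, and transition[i, j] is the score of moving from tag i to tag j.

```python
import numpy as np

def viterbi_decode_sketch(score, transition):
    """Return the highest-scoring tag path and its score (hypothetical helper)."""
    seq_len, num_tags = score.shape
    trellis = np.zeros((seq_len, num_tags), dtype=np.float64)
    backpointers = np.zeros((seq_len, num_tags), dtype=np.int64)
    trellis[0] = score[0]
    for t in range(1, seq_len):
        # candidate[i, j]: best score of a path ending in tag i at t-1, then moving to tag j
        candidate = trellis[t - 1][:, None] + transition
        backpointers[t] = np.argmax(candidate, axis=0)
        trellis[t] = score[t] + np.max(candidate, axis=0)
    # Walk the backpointers from the best final tag to recover the path.
    path = [int(np.argmax(trellis[-1]))]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(backpointers[t][path[-1]]))
    path.reverse()
    return path, float(np.max(trellis[-1]))

# Toy input: 3 time steps, 2 tags, switching tags costs 1.
score = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
transition = np.array([[0.0, -1.0], [-1.0, 0.0]])
path, best_score = viterbi_decode_sketch(score, transition)

print(path)        # [0, 1, 1]
print(best_score)  # 5.0
```

The loop keeps only the best predecessor for each tag at each step, which is what lets the search run in time proportional to seq_len * num_tags**2 instead of enumerating every path.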

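1.10 tf.reshape

Excerpted from: https://blog.csdn.net/lxg0807/article/details/53021859

tf.reshape(tensor, shape, name=None)

Reshapes tensor into the form given by shape. shape is a list, and it may contain -1: the size of that dimension is then not specified by hand but computed automatically. At most one entry may be -1 (with more than one, the equation would have several solutions).

The other point worth stressing is how the elements are rearranged for a given shape. A simple way to think about it is reshape(t, shape) => reshape(t, [-1]) => reshape(t, shape): first flatten t into a 1-D tensor in row-major order, then lay the elements out in the new shape.

Official examples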

# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
t = tf.constant([1, 2, 3, 4, 5, 6, 7, 8, 9], tf.int32) 
reshape(t, [3, 3]) ==> [[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]

# tensor 't' is [[[1, 1], [2, 2]],
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2],
                        [3, 3, 4, 4]]

# tensor 't' is [[[1, 1, 1],
#                 [2, 2, 2]],
#                [[3, 3, 3],
#                 [4, 4, 4]],
#                [[5, 5, 5],
#                 [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]

# -1 can also be used to infer the shape

# -1 is inferred to be 9:
reshape(t, [2, -1]) ==> [[1, 1, 1, 2, 2, 2, 3, 3, 3],
                         [4, 4, 4, 5, 5, 5, 6, 6, 6]]
# -1 is inferred to be 2:
reshape(t, [-1, 9]) ==> [[1, 1, 1, 2, 2, 2, 3, 3, 3],
                         [4, 4, 4, 5, 5, 5, 6, 6, 6]]
# -1 is inferred to be 3:
reshape(t, [ 2, -1, 3]) ==> [[[1, 1, 1],
                              [2, 2, 2],
                              [3, 3, 3]],
                             [[4, 4, 4],
                              [5, 5, 5],
                              [6, 6, 6]]]

# tensor 't' is [7]
# shape `[]` reshapes to a scalar
reshape(t, []) ==> 7

# Reshape to shape [3, 6] (here t is the [3, 2, 3] tensor shown earlier)

reshape(t, [3, 2*3]) ==>

[[1 1 1 2 2 2],
[3 3 3 4 4 4],
[5 5 5 6 6 6]]
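
The flatten-then-reshape view described in this section can be checked with NumPy, whose reshape is assumed to use the same row-major ordering as tf.reshape:

```python
import numpy as np

t = np.array([[[1, 1, 1], [2, 2, 2]],
              [[3, 3, 3], [4, 4, 4]],
              [[5, 5, 5], [6, 6, 6]]])  # shape (3, 2, 3)

flat = t.reshape(-1)           # 18 elements in row-major order
direct = t.reshape(3, 6)       # reshape in one step
via_flat = flat.reshape(3, 6)  # flatten first, then reshape
inferred = t.reshape(2, -1)    # -1 is inferred as 18 / 2 = 9

print(np.array_equal(direct, via_flat))  # True
print(direct.tolist())  # [[1, 1, 1, 2, 2, 2], [3, 3, 3, 4, 4, 4], [5, 5, 5, 6, 6, 6]]
print(inferred.shape)   # (2, 9)
```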

1.11 tf.tile()

TensorFlow's tile() function expands a tensor by replicating its existing data according to a fixed rule. The output tensor has the same rank as the input (see also 1.6).

https://blog.csdn.net/tsyccnh/article/details/82459859

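1.12 tf.reduce_mean

tf.reduce_mean computes the mean of a tensor along the given axis (one of the tensor's dimensions). It is mainly used to reduce dimensionality or to compute the mean of a tensor (for example an image).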

https://blog.csdn.net/dcrmg/article/details/79797826

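1.13 tf.string_to_hash_bucket_fast(input, num_buckets, name=None)

Converts a string feature into an integer feature by hashing. num_buckets is the number of buckets, so the hashed integer feature takes values in [0, num_buckets).

tf.string_to_hash_bucket_fast(tf.constant("tb"), 5)  # hashes 'tb' into one of the buckets 0, 1, 2, 3, 4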

https://www.w3cschool.cn/tensorflow_python/tensorflow_python-b7kc2mrg.html
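
The idea of hashing a string into a fixed number of buckets can be sketched in plain Python. The helper below is hypothetical and uses MD5 purely as a stand-in hash; it is not the hash TensorFlow uses, so the bucket ids it produces should not be expected to match TensorFlow's.

```python
import hashlib

def string_to_bucket(text, num_buckets):
    """Map a string to a stable bucket id in [0, num_buckets) (illustrative only)."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then fold into the bucket range.
    return int.from_bytes(digest[:8], "big") % num_buckets

bucket = string_to_bucket("tb", 5)
same_again = string_to_bucket("tb", 5)

print(0 <= bucket < 5)       # True
print(bucket == same_again)  # True, the mapping is deterministic
```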

1.14 Handling strings (tf.string) in TF

https://blog.csdn.net/u013921430/article/details/101221896

# Commonly used functions for the string type

tf.as_string()
tf.substr()
tf.string_to_number()
tf.string_split()
tf.string_join()
tf.reduce_join()

Note: in TF, a string cannot be converted to int or float with tf.cast(); use tf.string_to_number() instead.

1.15 Converting labels to one-hot form

import tensorflow as tf
index=[0,1,2,3]
one_hot=tf.one_hot(index, 8)
 
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(one_hot))

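1.16 Inspecting a tensor's shape

import tensorflow as tf
import numpy as np

# Take the shape of a tensor
tensor_x = tf.placeholder(tf.int64, [None, 42], name='tensor_x')

# Note the difference: tf.shape() returns the dynamic shape as a tensor, tensor_x.shape is the static shape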
x1 = tf.shape(tensor_x)
x2 = tensor_x.shape

print(x1) # Tensor("Shape:0", shape=(2,), dtype=int32)
print(x2) # (?, 42)
print(x2[1]) # 42

# Inspect the shape of a Python list
input = [[1, 2, 3],
         [2, 3, 4],
         [3, 4, 5],
         [4, 5, 6]
        ]

# Convert the list to an np.array, then inspect it
print(np.array(input).shape) # (4, 3)

1.17 one-hot

a) TensorFlow: the tf.one_hot() function, https://blog.csdn.net/nini_coded/article/details/79250600

b) Multi-label one-hot

import numpy as np

num_class = 4

# Sample y1 belongs to classes 1 and 3
y1 = np.array([1, 3])

# Build a 2-D matrix with one one-hot row per class of the sample
one_hot_array = np.eye(N=num_class, dtype=np.int32)[y1.reshape((-1))]
print(one_hot_array)

# Build the label vector
one_hot_vec = np.sum(one_hot_array, axis=0)  # sum the rows, column by column
print(one_hot_vec)  # [0 1 0 1]

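II. Network layers

2.1 1-D and 2-D convolution

2.1.1 Comparing 1-D convolution (tf.nn.conv1d) and 2-D convolution (tf.nn.conv2d)

2-D convolution slides a window over a feature map in both the width and height directions, multiplying corresponding positions and summing. 1-D convolution slides the window in only one direction (width, or equivalently height) and performs the same multiply-and-sum.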
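
The sliding-window description can be made concrete with a small NumPy sketch. Both helpers are hypothetical, handle a single channel with no padding and stride 1, and leave out the batch and channel dimensions that tf.nn.conv1d and tf.nn.conv2d expect.

```python
import numpy as np

def conv1d_valid(x, k):
    """Slide kernel k along the single axis of x, multiplying and summing at each position."""
    n = len(x) - len(k) + 1
    return np.array([np.sum(x[i:i + len(k)] * k) for i in range(n)])

def conv2d_valid(x, k):
    """Slide kernel k over x in both height and width."""
    kh, kw = k.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

signal = np.array([1.0, 2.0, 3.0, 4.0])
out_1d = conv1d_valid(signal, np.array([1.0, 1.0]))

image = np.arange(9.0).reshape(3, 3)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
out_2d = conv2d_valid(image, np.ones((2, 2)))

print(out_1d.tolist())  # [3.0, 5.0, 7.0]
print(out_2d.tolist())  # [[8.0, 12.0], [20.0, 24.0]]
```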

2.1.2 tf.nn.conv2d, tf.layers.conv2d

https://blog.csdn.net/Mundane_World/article/details/80894618

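TensorFlow offers many functions with the same functionality under different APIs. For 2-D convolution, for example, there are currently four conv2d methods:

tf.nn.conv2d, tf.layers.conv2d, tf.contrib.layers.conv2d, slim.conv2d

Under the hood they all call gen_nn_ops.conv2d(). Apart from their parameters they do not differ much, and they implement the same functionality. slim.conv2d is deprecated.

Reference: https://blog.csdn.net/mao_xiao_feng/article/details/53444333

1-D convolution example: https://blog.csdn.net/u011734144/article/details/84066928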

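2.2 Fully connected layer

# out_dim = 64, with relu as the activation function

self.output_tensor = tf.layers.Dense(64, activation=tf.nn.relu)(self.input_tensor)

A Dropout layer is usually added to fully connected layers to prevent overfitting and improve generalization, whereas Dropout after a convolutional layer is rare (mainly because convolutions have few parameters and overfit less easily). The notes below were collected from several blog posts.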

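2.3 Dropout layer

output_tensor = tf.layers.dropout(inputs=input_tensor, rate=dropout_rate, training=is_training)  # Method 1 (recommended); note that rate is the fraction of units dropped during training

output_tensor = tf.nn.dropout(input_tensor, keep_prob)  # Method 2; keep_prob is the fraction of units kept during training

How Dropout works: on each training step a random subset of neurons is dropped, i.e. each neuron's activation is switched off with some probability p. A dropped neuron takes no part in that step's computation and its weights are not updated, but the weights are kept (they are only left untouched for now), because the neuron may be active again for the next input. Dropout is usually added to fully connected layers to prevent overfitting and improve generalization; it is rarely placed after convolutional layers (mainly because convolutions have few parameters and overfit less easily).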

https://blog.csdn.net/qq_27292549/article/details/81092653
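
The mechanism can be sketched in NumPy as inverted dropout: units are zeroed at random during training and the survivors are scaled by 1 / keep_prob so the expected activation stays the same. The function name and the fixed seed are assumptions made for this illustration.

```python
import numpy as np

def dropout_sketch(x, rate, training, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return x  # at inference time the layer is the identity
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob     # True where a unit is kept
    return np.where(mask, x / keep_prob, 0.0)  # kept units are scaled by 1 / keep_prob

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = np.ones((4, 5))
train_out = dropout_sketch(x, rate=0.5, training=True, rng=rng)
eval_out = dropout_sketch(x, rate=0.5, training=False, rng=rng)

# With rate=0.5 every training-time entry is either 0.0 (dropped) or 2.0 (kept and rescaled).
print(set(np.unique(train_out).tolist()) <= {0.0, 2.0})  # True
print(np.array_equal(eval_out, x))                       # True
```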

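III. Implementing important features

3.1 Normalizing a tensor

3.1.1 tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None): normalizes a tensor along dimension dim using the L2 norm (the Euclidean length), i.e. output = x / sqrt(max(sum(x**2), epsilon)).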

https://blog.csdn.net/abiggg/article/details/79368982
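
A minimal NumPy sketch of L2 normalization along one axis, written as x / sqrt(max(sum(x**2), epsilon)). The helper name is an assumption and the input is a hand-picked example.

```python
import numpy as np

def l2_normalize(x, axis, epsilon=1e-12):
    """Scale x so that each slice along `axis` has unit Euclidean length."""
    square_sum = np.sum(np.square(x), axis=axis, keepdims=True)
    return x / np.sqrt(np.maximum(square_sum, epsilon))

x = np.array([[3.0, 4.0], [6.0, 8.0]])
unit = l2_normalize(x, axis=1)
row_norms = np.linalg.norm(unit, axis=1)

print(unit.tolist())  # [[0.6, 0.8], [0.6, 0.8]]
print(np.allclose(row_norms, 1.0))  # True
```

The epsilon floor only matters for all-zero slices, where it avoids a division by zero.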

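3.2 Regularizing network layers

Adding a regularization term to the loss function is an important way to prevent overfitting. In TensorFlow, applying a regularizer to parameters takes two steps:

a) Create a regularizer (a function/object)
b) Apply that regularizer (function/object) to the parameters

The L2 regularizer is tf.contrib.layers.l2_regularizer(scale, scope=None), where scale is the coefficient of the regularization term and scope is an optional scope name. L1 regularization is analogous.

Usage example: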

# Approach 1
# 1. Define the regularizer
l2_regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)

# 2. Apply L2 regularization to a layer (a fully connected layer here)
self.fc1 = tf.layers.Dense(units=128,
                           activation=tf.nn.relu,
                           kernel_initializer=tf.contrib.layers.xavier_initializer(),
                           bias_initializer=tf.zeros_initializer(),
                           kernel_regularizer=l2_regularizer)(self.input)

# 3. Add the L2 regularization loss to the loss function
self.l2_loss = tf.losses.get_regularization_loss()  # collects every regularization loss registered so far
self.ori_loss = ...  # the ordinary loss
# Add the L2 loss to the final loss; the regularizer already scales by 0.1,
# so the effective coefficient is 0.1 * l2_reg_lambda
self.loss = self.ori_loss + self.l2_reg_lambda * self.l2_loss


# Approach 2
# 1. Define an L2 loss accumulator
l2_loss = tf.constant(0.0)

# 2. Apply L2 regularization to a layer (a fully connected layer here)
with tf.name_scope("fc1"):
    W = tf.get_variable(
        "W_hidden",
        shape=[size1, size2],
        initializer=tf.contrib.layers.xavier_initializer())
    b = tf.Variable(tf.constant(0.1, shape=[size2]), name="b")  # bias length must match W's output size
    l2_loss += tf.nn.l2_loss(W)
    l2_loss += tf.nn.l2_loss(b)
    self.fc1_output = tf.nn.relu(tf.nn.xw_plus_b(self.input, W, b, name="fc1_output"))

# 3. Add the L2 regularization loss to the loss function
self.ori_loss = ...  # the ordinary loss
self.loss = self.ori_loss + self.l2_reg_lambda * l2_loss  # add the L2 loss to the final loss
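
To see what the accumulated penalty amounts to numerically, here is a NumPy sketch of the second approach. It relies on tf.nn.l2_loss(t) being defined as sum(t ** 2) / 2; the weights, the data loss and the coefficient are hypothetical values picked so the arithmetic is easy to follow.

```python
import numpy as np

def l2_loss(t):
    """Half the sum of squares, the quantity tf.nn.l2_loss computes."""
    return np.sum(np.square(t)) / 2.0

W = np.array([[1.0, -2.0], [3.0, 0.0]])  # hypothetical weights
b = np.array([0.5, -0.5])                # hypothetical bias
data_loss = 1.25                         # hypothetical ordinary loss
l2_reg_lambda = 0.1                      # hypothetical regularization coefficient

penalty = l2_loss(W) + l2_loss(b)                 # 7.0 + 0.25
total_loss = data_loss + l2_reg_lambda * penalty  # 1.25 + 0.1 * 7.25

print(penalty)               # 7.25
print(round(total_loss, 6))  # 1.975
```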

Reference: https://stackoverflow.com/questions/44232566/add-l2-regularization-when-using-high-level-tf-layers

https://zhuanlan.zhihu.com/p/27994404

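3.3 Batch Normalization

How to build it

# Example: applying batch normalization to a fully connected layer
with tf.variable_scope('fc1', reuse=tf.AUTO_REUSE):
    linear = tf.layers.Dense(64, activation=None)(self.input)
    norm_linear = tf.layers.batch_normalization(linear, training=is_training)
    self.fc1 = tf.nn.relu(norm_linear)
# Note: the moving mean/variance updates are placed in tf.GraphKeys.UPDATE_OPS and must be run together with the train op

Reference: http://ai.51cto.com/art/201705/540230.htm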

https://www.cnblogs.com/guoyaohua/p/8724433.html

3. Layer Normalization (LN)

# TF 1.x
hop_cur_m = tf.contrib.layers.layer_norm(inputs=x, begin_norm_axis=-1, begin_params_axis=-1)

# TF 2.x
tf.keras.layers.LayerNormalization()

References:

https://www.tensorflow.org/addons/tutorials/layers_normalizations?hl=zh-cn (official documentation)

https://blog.csdn.net/qq_34418352/article/details/105684488
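
The difference between batch normalization and layer normalization comes down to which axis the mean and variance are computed over. The NumPy sketch below shows only that core step for a 2-D input; it leaves out the learned scale and offset and the moving averages that the TensorFlow layers maintain, and the helper name, input data and epsilon are assumptions.

```python
import numpy as np

def normalize(x, axis, eps=1e-5):
    """Subtract the mean and divide by the standard deviation along `axis`."""
    mean = x.mean(axis=axis, keepdims=True)
    var = x.var(axis=axis, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4))  # hypothetical batch: 8 samples, 4 features

batch_normed = normalize(x, axis=0)   # statistics per feature, computed across the batch
layer_normed = normalize(x, axis=-1)  # statistics per sample, computed across the features

print(np.allclose(batch_normed.mean(axis=0), 0.0, atol=1e-7))   # True
print(np.allclose(layer_normed.mean(axis=-1), 0.0, atol=1e-7))  # True
```

Because layer normalization uses only the features of one sample, its result does not depend on the batch size or on the other samples in the batch.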

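Closing remarks:

L2 regularization prevents overfitting mainly by penalizing large parameter values. Whether the layer is fully connected or convolutional, using L2 to keep parameter values from growing too large is a reasonable choice, so it can be added to both.

Whether it needs to be added everywhere is another matter, because L2 is not the only regularization technique. In fully connected layers L2 competes with Dropout; in convolutional layers it competes with Weight Normalization. The answer depends on how the whole network architecture is designed, and usually has to be settled by experiment.

As for BN, although it is now a very common building block, its theory is still not well understood. BN replaces the true mean and variance with the mean and variance computed on a mini-batch, and this has a regularizing effect. Regularization is only a side effect of BN, though; its main purpose is to reduce internal covariate shift.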

https://www.zhihu.com/question/288370837

3.4 Residual connections

https://zhuanlan.zhihu.com/p/42706477 (to do)

posted @ 2018-09-29 13:45 chease
