Commonly used Python functions in machine learning

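The lstrip() method

lstrip() removes leading whitespace, or any leading characters from a given set, from the left end of a string.

str.lstrip([chars])    chars: the characters to strip; if omitted, whitespace is stripped

Returns a new string with the leading characters removed. The original string is not modified.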

s = "     this is string example....wow!!!     "   # avoid the name str, which shadows the built-in type
print(s.lstrip())       # this is string example....wow!!!     (trailing spaces are kept)
s = "88888888this is string example....wow!!!8888888"
print(s.lstrip('8'))    # this is string example....wow!!!8888888
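One detail that is easy to miss is that chars is treated as a set of characters, not as a prefix. The short sketch below illustrates this, together with the sibling methods rstrip() and strip(); the sample strings are made up for illustration.

```python
# chars is a set: every leading character found in the set is removed.
url = "www.example.com"
print(url.lstrip("w."))            # example.com

# Stripping stops at the first character that is not in the set.
print("aabbcc-abc".lstrip("ab"))   # cc-abc

# rstrip() works on the right end, strip() on both ends.
padded = "   text   "
print(repr(padded.lstrip()))       # 'text   '
print(repr(padded.rstrip()))       # '   text'
print(repr(padded.strip()))        # 'text'
```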

random.seed()

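Sets the seed of the random number generator. Each seed value fixes the sequence of random operations that follow (random number generation, shuffling), which makes results reproducible.

import random

random.seed([x])

x: the seed for the random number generator. If it is omitted, Python picks a seed for you (from the operating system's randomness source or the current time).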

import random

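random.seed(10)    # fix the seed
print("Random number with seed 10: ", random.random())
# Random number with seed 10:  0.5714025946899135

random.seed(10)    # same seed again, so the same number is produced
print("Random number with seed 10: ", random.random())
# Random number with seed 10:  0.5714025946899135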
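In larger programs it can be convenient to keep a private generator instead of seeding the shared module-level one. The following is a minimal sketch using random.Random from the standard library; the seed value 10 and the variable names are arbitrary choices for illustration.

```python
import random

# Two independent generators created with the same seed.
rng_a = random.Random(10)
rng_b = random.Random(10)

seq_a = [rng_a.random() for _ in range(3)]
seq_b = [rng_b.random() for _ in range(3)]
print(seq_a == seq_b)             # True: same seed, same sequence

# Re-seeding the module-level generator restarts its sequence as well.
random.seed(10)
first = random.random()
random.seed(10)
second = random.random()
print(first == second)            # True
```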

random.shuffle()

Randomly reorders all elements of a sequence. The sequence is shuffled in place and the function returns None.

import random

random.shuffle(lit)

import random

lit = [20, 16, 10, 5]

random.shuffle(lit)
print("Shuffled list: ", lit)     # e.g. Shuffled list:  [20, 10, 5, 16] (the order varies between runs)
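Because shuffle() works in place, a common mistake is to assign its return value. The sketch below shows that behaviour, one way to get a shuffled copy with random.sample(), and how seeding makes a shuffle repeatable. The lists and the seed 0 are illustrative.

```python
import random

lit = [20, 16, 10, 5]

# shuffle() reorders the list in place and returns None.
result = random.shuffle(lit)
print(result)                     # None
print(sorted(lit))                # [5, 10, 16, 20]  (same elements, new order)

# For a shuffled copy that leaves the original untouched, use random.sample().
original = [20, 16, 10, 5]
shuffled_copy = random.sample(original, len(original))
print(original)                   # [20, 16, 10, 5]

# Seeding first makes the shuffle reproducible.
x = list(range(5))
y = list(range(5))
random.seed(0)
random.shuffle(x)
random.seed(0)
random.shuffle(y)
print(x == y)                     # True
```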

tf.cast()

tf.cast(x, dtype, name=None)

Casts x to the data type dtype. For example, if x is of type bool, casting it to float turns it into a sequence of 0s and 1s.

import tensorflow as tf

a = tf.Variable([1,0,0,1,1])
b = tf.cast(a,dtype=tf.bool)
sess = tf.Session()
sess.run(tf.global_variables_initializer())  # TensorFlow 1.x; replaces the deprecated tf.initialize_all_variables()
print(sess.run(b))
#[ True False False  True  True]
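A typical reason to cast bool to float is to average a list of correct/incorrect flags into an accuracy. The snippet below is a plain-Python analogue of that idea, not TensorFlow code; the predictions and labels are hypothetical values chosen for illustration.

```python
# Plain-Python analogue of casting bool to float (not TensorFlow code).
predictions = [1, 0, 2, 1]        # hypothetical predicted classes
labels = [1, 0, 1, 1]             # hypothetical true classes

correct = [p == y for p, y in zip(predictions, labels)]
print(correct)                    # [True, True, False, True]

# The "cast": True becomes 1.0 and False becomes 0.0.
as_float = [float(c) for c in correct]
print(as_float)                   # [1.0, 1.0, 0.0, 1.0]

accuracy = sum(as_float) / len(as_float)
print(accuracy)                   # 0.75
```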

tf.concat

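tf.concat is the counterpart of NumPy's np.concatenate. It joins tensors along an existing dimension (axis), so the rank of the result is the same as that of the inputs.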

a = tf.constant([[1,2,3],[3,4,5]]) # shape (2,3)
b = tf.constant([[7,8,9],[10,11,12]]) # shape (2,3)
ab1 = tf.concat([a,b], axis=0) # shape(4,3)
ab2 = tf.concat([a,b], axis=1) # shape(2,6)
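To make the resulting values concrete, here is a sketch that reproduces the two concatenations with plain nested lists. It only mimics the semantics for this 2-D case and does not use TensorFlow.

```python
# Plain nested lists standing in for the tensors a and b.
a = [[1, 2, 3], [3, 4, 5]]        # shape (2, 3)
b = [[7, 8, 9], [10, 11, 12]]     # shape (2, 3)

# axis=0: the rows of b are appended below the rows of a -> shape (4, 3)
ab1 = a + b
print(ab1)   # [[1, 2, 3], [3, 4, 5], [7, 8, 9], [10, 11, 12]]

# axis=1: each row of a is extended with the matching row of b -> shape (2, 6)
ab2 = [row_a + row_b for row_a, row_b in zip(a, b)]
print(ab2)   # [[1, 2, 3, 7, 8, 9], [3, 4, 5, 10, 11, 12]]
```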

tf.stack

tf.stack joins tensors along a new dimension, so the result has a rank one higher than the inputs.

a = tf.constant([[1,2,3],[3,4,5]]) # shape (2,3)
b = tf.constant([[7,8,9],[10,11,12]]) # shape (2,3)
ab = tf.stack([a,b], axis=0) # shape (2,2,3)

The axis argument determines where the new dimension is inserted. With axis=2:

import tensorflow as tf
a = tf.constant([[1,2,3],[3,4,5]]) # shape (2,3)
b = tf.constant([[7,8,9],[10,11,12]]) # shape (2,3)
ab = tf.stack([a,b], axis=2) # shape (2,3,2)
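The difference between the axis choices is easier to see on the actual values. The sketch below rebuilds both results with plain nested lists; it mimics the semantics for these two inputs only and is not TensorFlow code.

```python
a = [[1, 2, 3], [3, 4, 5]]        # shape (2, 3)
b = [[7, 8, 9], [10, 11, 12]]     # shape (2, 3)

# axis=0: a new outermost dimension -> shape (2, 2, 3)
stack0 = [a, b]
print(stack0)
# [[[1, 2, 3], [3, 4, 5]], [[7, 8, 9], [10, 11, 12]]]

# axis=2: a new innermost dimension; matching elements are paired -> shape (2, 3, 2)
stack2 = [[[x, y] for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]
print(stack2)
# [[[1, 7], [2, 8], [3, 9]], [[3, 10], [4, 11], [5, 12]]]
```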

tf.unstack

tf.unstack is the inverse of tf.stack: it splits a higher-rank tensor along the given axis into a list of lower-rank tensors.

a = tf.constant([[1,2,3],[3,4,5]]) # shape (2,3)
b = tf.constant([[7,8,9],[10,11,12]]) # shape (2,3)
ab = tf.stack([a,b], axis=0) # shape (2,2,3)

a1 = tf.unstack(ab, axis=0)

# a1 is:
# [<tf.Tensor 'unstack_1:0' shape=(2, 3) dtype=int32>,
#  <tf.Tensor 'unstack_1:1' shape=(2, 3) dtype=int32>]

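The tf.transpose() function

This function permutes the dimensions of the input tensor. For a 2-D input it is the ordinary matrix transpose. For a 3-D tensor the dimensions are numbered 0, 1, 2, and each entry of the perm list names the input dimension that ends up at that position. For example, perm [2, 1, 0] swaps the first and third dimensions of the input.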

import tensorflow as tf
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
print(A.shape)      # (2,3)
x = tf.transpose(A, [1, 0])
print(x.shape)      # (3,2)

B = np.array([[[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12]],

              [[13, 14, 15, 16],
               [17, 18, 19, 20],
               [21, 22, 23, 24]]])
print(B.shape)      # (2,3,4)
y = tf.transpose(B, [2, 1, 0])
print(y.shape)      # (4,3,2)
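The shapes alone do not show where each element goes. As a sketch, the same two permutations can be written out with plain nested lists; this is an illustration of the index mapping, not TensorFlow code.

```python
# 2-D case: perm [1, 0] is the ordinary matrix transpose.
A = [[1, 2, 3],
     [4, 5, 6]]
A_t = [list(col) for col in zip(*A)]
print(A_t)                                  # [[1, 4], [2, 5], [3, 6]]

# 3-D case with perm [2, 1, 0]: y[k][j][i] == B[i][j][k]
B = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
     [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]   # shape (2, 3, 4)
d0, d1, d2 = len(B), len(B[0]), len(B[0][0])
y = [[[B[i][j][k] for i in range(d0)] for j in range(d1)] for k in range(d2)]
print((len(y), len(y[0]), len(y[0][0])))    # (4, 3, 2)
print(y[0])                                 # [[1, 13], [5, 17], [9, 21]]
```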

enumerate()

enumerate() wraps an iterable and yields (index, item) tuples, so each element comes together with its position. It is mostly used in for loops.

seasons = ['Spring', 'Summer', 'Fall', 'Winter']
print(list(enumerate(seasons)))
# [(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]

seq = ['one', 'two', 'three']
for i, element in enumerate(seq):
    print(i, element)

# 0 one
# 1 two
# 2 three
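enumerate() also accepts an optional start argument, which is handy for 1-based counters such as epoch or step numbers. A small sketch, reusing the seq list from the example (the variable names step and long_idx are illustrative):

```python
seq = ['one', 'two', 'three']

# start=1 gives 1-based numbering.
for step, element in enumerate(seq, start=1):
    print(step, element)
# 1 one
# 2 two
# 3 three

# Collect the indices of the elements that satisfy a condition.
long_idx = [i for i, element in enumerate(seq) if len(element) > 3]
print(long_idx)                   # [2]
```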

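zip()

Packs the elements of several iterables into tuples, position by position. In Python 3, zip() returns an iterator over these tuples (in Python 2 it returned a list), so wrap it in list() when you need all of them at once.

Syntax: zip([iterable, ...])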

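a = [1, 2, 3]
b = [4, 5, 6]
c = [4, 5, 6, 7, 8]
zipped = list(zip(a, b))     # materialize the tuples as a list
print(zipped)
# [(1, 4), (2, 5), (3, 6)]
print(list(zip(a, c)))       # the result is as long as the shortest input
# [(1, 4), (2, 5), (3, 6)]
print(list(zip(*zipped)))    # the inverse of zip: *zipped unpacks the pairs and gives the original rows back
# [(1, 2, 3), (4, 5, 6)]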
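In machine-learning code zip() is mostly used to walk over parallel sequences, for instance samples and labels. The following sketch shows that pattern plus two related idioms; the feature values, labels, and class names are hypothetical.

```python
from itertools import zip_longest

features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # hypothetical samples
labels = [0, 1, 1]                                # hypothetical labels

# Iterate over samples and labels in lockstep.
pairs = []
for x, y in zip(features, labels):
    pairs.append((x, y))
print(pairs[0])                   # ([0.1, 0.2], 0)

# Build a lookup table from two parallel lists.
name_to_id = dict(zip(['cat', 'dog'], [0, 1]))
print(name_to_id)                 # {'cat': 0, 'dog': 1}

# zip() stops at the shortest input; zip_longest() pads the shorter one instead.
padded = list(zip_longest([1, 2, 3], [4, 5], fillvalue=None))
print(padded)                     # [(1, 4), (2, 5), (3, None)]
```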

Understanding tf.clip_by_global_norm

Gradient clipping is typically used like this:

optimizer = tf.train.AdamOptimizer(self.learning_rate)
gradients, v = zip(*optimizer.compute_gradients(self.loss))
gradients, _ = tf.clip_by_global_norm(gradients, self.grad_clip)
updates_train_optimizer = optimizer.apply_gradients(zip(gradients, v), global_step=self.global_step)

The most direct purpose of gradient clipping is to prevent exploding gradients. It does so by capping the norm of the gradients.

tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)

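Parameters:

t_list: the list of input gradients (tensors)
clip_norm: the clipping threshold, i.e. the maximum allowed global norm
use_norm: an optional precomputed global norm to use instead of computing it

Returns:

list_clipped: the list of clipped gradients
global_norm: the global norm of the input gradients, computed before clipping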

Note that it is slower than clip_by_norm(), because all gradients must be available before the clipping can be performed.
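The computation behind the function is short: the global norm is the square root of the sum of squares of all gradient entries, and every gradient is scaled by clip_norm / max(global_norm, clip_norm). The sketch below writes this out in plain Python, assuming each gradient is a flat list of floats. It is an illustration of the formula, not the TensorFlow implementation, and the gradient values are made up.

```python
import math

def clip_by_global_norm(t_list, clip_norm):
    """Plain-Python sketch; each gradient in t_list is a flat list of floats."""
    # Global norm over all entries of all gradients.
    global_norm = math.sqrt(sum(g * g for t in t_list for g in t))
    # The scale is 1 when the global norm is already within the threshold.
    scale = clip_norm / max(global_norm, clip_norm)
    list_clipped = [[g * scale for g in t] for t in t_list]
    return list_clipped, global_norm

grads = [[3.0, 4.0], [12.0]]      # hypothetical gradients, global norm 13
clipped, norm = clip_by_global_norm(grads, clip_norm=6.5)
print(norm)                       # 13.0
print(clipped)                    # [[1.5, 2.0], [6.0]]

# Gradients whose global norm is below the threshold are left unchanged.
unchanged, _ = clip_by_global_norm(grads, clip_norm=20.0)
print(unchanged == grads)         # True
```

Because one common scale factor is applied to all gradients, their relative directions are preserved, which is the main difference from clipping each gradient separately.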

tf.contrib.layers.xavier_initializer_conv2d

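The basic idea of xavier_initializer is to keep the variance of a layer's outputs equal to the variance of its inputs, which keeps the outputs from all shrinking toward 0 as they pass through the layers.

The initializer is designed to keep the scale of the gradients roughly the same in all layers.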
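As a rough sketch of what this means numerically, the uniform variant of Xavier (Glorot) initialization samples weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The code below assumes that variant and a kernel of shape (height, width, in_channels, out_channels); the helper name xavier_uniform_conv2d and the chosen sizes are hypothetical, and this is not the TensorFlow implementation.

```python
import math
import random

def xavier_uniform_conv2d(kh, kw, in_ch, out_ch, seed=None):
    """Sketch: sample a (kh, kw, in_ch, out_ch) kernel as a flat list of floats."""
    # For a convolution, the receptive field size multiplies both fans.
    fan_in = kh * kw * in_ch
    fan_out = kh * kw * out_ch
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    n = kh * kw * in_ch * out_ch
    return [rng.uniform(-limit, limit) for _ in range(n)], limit

weights, limit = xavier_uniform_conv2d(3, 3, 16, 32, seed=0)
print(len(weights))               # 4608
print(round(limit, 4))            # 0.1179

# The variance of a uniform distribution on [-limit, limit] is limit**2 / 3,
# which equals 2 / (fan_in + fan_out).
target_var = limit ** 2 / 3
empirical_var = sum(w * w for w in weights) / len(weights)
print(target_var, empirical_var)
```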

posted @ 2019-01-06 10:17 凌逆战