Res-DenseNetSegmentation Model Debugging Notes

References: https://blog.csdn.net/AbstractSky/article/details/76769202

            https://blog.csdn.net/jsliuqun/article/details/62418778

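I. Function Notes

1. glob.glob("image path pattern")

Returns a list of all file paths that match the pattern; used here to collect the paths of the images in the dataset.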
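
As a small illustration of collecting image paths, here is a sketch that builds a throwaway directory and lists only the PNG files in it. The helper name list_images, the "*.png" pattern, and the file names are assumptions made for the demo, not part of the original project.

```python
import glob
import os
import tempfile

def list_images(root, pattern="*.png"):
    """Return the sorted image paths under `root` that match `pattern`."""
    # glob.glob returns matches in arbitrary order, so sort for reproducibility.
    return sorted(glob.glob(os.path.join(root, pattern)))

# Build a throwaway directory so the example runs anywhere.
demo_dir = tempfile.mkdtemp()
for name in ("b.png", "a.png", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

image_paths = list_images(demo_dir)
print([os.path.basename(p) for p in image_paths])
```

Sorting matters when images and label masks are listed separately and then paired by index.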

2. np.fliplr(matrix)

Flips the matrix left to right, i.e. reverses the order of its columns.

>>> A = np.diag([1.,2.,3.])
>>> A
array([[ 1.,  0.,  0.],
       [ 0.,  2.,  0.],
       [ 0.,  0.,  3.]])
>>> np.fliplr(A)
array([[ 0.,  0.,  1.],
       [ 0.,  2.,  0.],
       [ 3.,  0.,  0.]])

3. m[np.where(img_grey == i)] = 1

np.where(img_grey == i) returns the indices at which the condition holds, so this statement sets m to 1 at every position where img_grey equals i.
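
A minimal sketch of this masking step on a toy 2x3 label image. The array values and the choice i = 2 are made up for illustration.

```python
import numpy as np

img_grey = np.array([[0, 1, 2],
                     [2, 1, 0]])
i = 2  # class label to extract (arbitrary choice for the demo)

m = np.zeros_like(img_grey)
m[np.where(img_grey == i)] = 1

# Boolean indexing selects the same positions without calling np.where.
m_bool = np.zeros_like(img_grey)
m_bool[img_grey == i] = 1

print(m)
```

Both forms produce the same binary mask; the boolean form is simply shorter.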

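4. np.stack(list, axis=2)

Joins the arrays in the list along a new axis. With axis=2, a list of 2-D arrays of identical shape is stacked into one 3-D array whose last dimension indexes the list.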
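
One plausible use in a segmentation pipeline is turning a label image into a one-hot mask volume. The sketch below assumes three classes and a toy 2x3 image; neither comes from the original code.

```python
import numpy as np

img_grey = np.array([[0, 1, 2],
                     [2, 1, 0]])
num_classes = 3  # assumed number of labels in this toy image

# One binary mask per class, each of shape (H, W).
masks = [(img_grey == i).astype(np.uint8) for i in range(num_classes)]

# Stack along a new last axis -> shape (H, W, num_classes).
one_hot = np.stack(masks, axis=2)
print(one_hot.shape)
```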

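5. np.pad(array, pad_width, mode, **kwargs)

Pads an array.

array is the input array to pad.
pad_width is the number of values to add before and after each axis, e.g. ((2, 3), (4, 5)). For a 2-D array, (2, 3) pads 2 rows at the top and 3 at the bottom, and (4, 5) pads 4 columns on the left and 5 on the right. A single integer applies the same width to both sides of every axis.
mode is the padding method, such as "constant" or "edge". In "constant" mode the fill value is given by constant_values, which defaults to 0.

The return value is the padded ndarray.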
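
A short sketch of the two modes mentioned above, using the same ((2, 3), (4, 5)) widths. The fill value 9 is arbitrary and chosen only so the padding is easy to see.

```python
import numpy as np

a = np.ones((2, 2), dtype=int)

# 2 rows on top, 3 below, 4 columns left, 5 right, all filled with 9.
padded = np.pad(a, ((2, 3), (4, 5)), mode="constant", constant_values=9)
print(padded.shape)  # (7, 11)

# "edge" repeats the border values instead of using a constant.
edge = np.pad(np.array([[1, 2], [3, 4]]), 1, mode="edge")
print(edge)
```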

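6. scipy.ndimage.zoom(array, zoom=[a, b, c], order)

Upsampling and downsampling.

The factors apply to the array axes in order. For an image stored as (height, width, channels):

a is the scale factor along the height axis

b is the scale factor along the width axis

c is the scale factor along the channel axis

order = 1 gives bilinear interpolation,
order = 0 gives nearest-neighbor interpolation,
and cubic interpolation is the default (order = 3).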
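
A minimal sketch of upsampling a label map, assuming the array is laid out as (height, width, channels) and that the channel factor is left at 1 so the channel count does not change. The toy array is invented for the demo.

```python
import numpy as np
from scipy import ndimage

# Toy label map of shape (height, width, channels).
labels = np.arange(4 * 5 * 3).reshape(4, 5, 3) % 4

# Double height and width, leave channels alone.
# order=0 (nearest neighbor) never invents new label values.
up = ndimage.zoom(labels, zoom=[2, 2, 1], order=0)
print(up.shape)  # (8, 10, 3)
```

For label masks, order=0 is usually the safer choice, because higher orders interpolate between class indices and can produce values that are not valid labels.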

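7. with tf.control_dependencies([tf.group(*update_ops)]):

This is designed to control the computation graph by imposing an execution order on certain computations.

Before certain ops or tensors are executed, other ops or tensors must be run first.

In some machine learning programs we want to specify dependencies between operations; tf.control_dependencies() does this.
control_dependencies(control_inputs) returns a context manager that specifies control dependencies. Used with the with keyword, it makes every operation created inside the context run only after control_inputs have executed.

with g.control_dependencies([a, b, c]):
  # `d` and `e` will only run after `a`, `b`, and `c` have executed.
  d = ...
  e = ...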

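8. tf.train.Saver(max_to_keep, keep_checkpoint_every_n_hours)

max_to_keep: the maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If the value is None or 0, all checkpoint files are kept. The default is 5 (that is, the 5 most recent checkpoint files are kept).

keep_checkpoint_every_n_hours: in addition to keeping the most recent max_to_keep checkpoint files, you may want to keep one checkpoint file for every N hours of training. This is useful if you want to analyze how a model changes over a long training run. For example, keep_checkpoint_every_n_hours=2 ensures that one checkpoint file is kept for every 2 hours of training. The default of 10,000 hours effectively disables this feature.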

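tf.train.Saver.save(sess, save_path, global_step=None, latest_filename=None, meta_graph_suffix='meta', write_meta_graph=True)

Saves variables.

This method runs the ops added by the constructor for saving variables. It requires a session in which the graph was launched. The variables to save must already have been initialized.

The method returns the path of the newly created checkpoint file. This path can be passed directly to restore().

Parameters:

sess: the Session used to save the variables.
save_path: the path of the checkpoint file. If the saver is sharded, this is the prefix of the sharded checkpoint filename.
global_step: if provided, the global step number is appended to save_path to form the checkpoint filename. This optional argument can be a Tensor, a Tensor name, or an integer.

Returns:

A string: the path at which the variables were saved. If the saver is sharded, the string ends with '-?????-of-nnnnn', where 'nnnnn' is the number of shards.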

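tf.train.Saver.restore(sess, save_path) (used at test time)

Restores previously saved variables.

This method runs the ops added by the constructor for restoring variables. It requires a session in which the graph was launched. The variables to restore do not need to be initialized, because restoring is itself a way of initializing them.

The save_path argument is typically a value returned by an earlier save() call or by latest_checkpoint().

Parameters:

sess: the Session used to restore the variables.
save_path: the path where the parameters were previously saved.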

import tensorflow as tf

# Create a saver.
saver = tf.train.Saver(...variables...)

# Launch the graph and train, saving the model every 1,000 steps.
sess = tf.Session()
for step in range(1000000):
    sess.run(...training_op...)
    if step % 1000 == 0:
        # Append the step number to the checkpoint name:
        saver.save(sess, 'my-model', global_step=step)

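9. tf.train.get_checkpoint_state(checkpoint_dir): finds the model file names through the checkpoint file

This function returns the contents of the checkpoint file as a CheckpointState proto, which has two attributes: model_checkpoint_path and all_model_checkpoint_paths. model_checkpoint_path holds the file name of the latest TensorFlow model file, and all_model_checkpoint_paths lists the file names of all TensorFlow model files that have not been deleted.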

10. tf.reset_default_graph()

Clears the default graph and its nodes.

import tensorflow as tf
tf.reset_default_graph()  # clears the default graph and its nodes
with tf.variable_scope('Space_a'):
    a = tf.constant([1,2,3])
with tf.variable_scope('Space_b'):
    b = tf.constant([4,5,6])
with tf.variable_scope('Space_c'):
    c = a + b
d = a + b
with tf.Session() as sess:
    print(a)
    print(b)
    print(c)
    print(d)
    print(sess.run(c))
    print(sess.run(d))

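11. Using slim.arg_scope(list_ops_or_scope, **kwargs)

list_ops_or_scope: a list of ops, or a scope
kwargs: arguments given as keyword=value
It sets default argument values for the ops in list_ops. Every op in list_ops must be decorated with @add_arg_scope for this to work, so using slim.arg_scope() takes two steps:
1. Decorate the target function with @slim.add_arg_scope. Functions such as slim.conv2d(), slim.fully_connected(), and slim.max_pool2d() are already decorated with @add_arg_scope where they are defined. The example below uses slim.conv2d().
2. Use slim.arg_scope() to set default arguments for the target function.
Example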

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')

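In the example above,

weights_initializer=tf.truncated_normal_initializer(stddev=0.01)
weights_regularizer=slim.l2_regularizer(0.0005)

are repeated verbatim in every call, so we can make them defaults. The same goes for

padding='SAME'

since 'SAME' is the most common padding value; any exception can be declared explicitly in the individual call. With slim.arg_scope the code can be rewritten as follows: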

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], 4, scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')

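12. tf.random_normal | tf.truncated_normal | tf.random_uniform

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
tf.random_uniform(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)
All three generate a tensor of random numbers with the given shape.
random_normal: normally distributed random numbers with mean `mean` and standard deviation `stddev`
truncated_normal: random numbers from a truncated normal distribution with mean `mean` and standard deviation `stddev`; only values within [mean - 2*stddev, mean + 2*stddev] are kept, and values outside that range are re-drawn
random_uniform: uniformly distributed random numbers in the range [minval, maxval), i.e. minval is included and maxval is excluded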
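
To make the truncation rule concrete, here is a NumPy sketch that re-draws every sample lying more than two standard deviations from the mean. It only illustrates the rule and is not TensorFlow's implementation; the function name truncated_normal and the seed argument are my own choices for the demo.

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Sample a normal distribution, re-drawing values beyond 2 stddev."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(mean, stddev, size=shape)
    out_of_range = np.abs(samples - mean) > 2 * stddev
    while out_of_range.any():
        # Re-draw only the rejected entries, then check them again.
        samples[out_of_range] = rng.normal(mean, stddev, size=int(out_of_range.sum()))
        out_of_range = np.abs(samples - mean) > 2 * stddev
    return samples

# Same stddev as the weights_initializer used in the arg_scope example.
weights = truncated_normal((3, 3, 64), stddev=0.01, seed=0)
print(weights.min(), weights.max())
```

This is why truncated_normal is popular for weight initialization: no weight starts out as an extreme outlier.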

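13. tf.concat

tf.concat(values, axis, name='concat')
tf.concat joins tensors along one dimension. Apart from name, which names the operation, it takes two arguments. Note that TensorFlow versions before 1.0 used the order tf.concat(concat_dim, values); the examples here use the order from 1.0 onward.
The values argument is the list of two or more tensors to join.
The axis argument must be a single number giving the dimension along which to join.

If axis is 0, the tensors are joined along the first dimension of their shape, which in practice stacks the rows on top of each other.

t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
tf.concat([t1, t2], 0)  # ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

If axis is 1, the tensors are joined along the second dimension of their shape.

t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
tf.concat([t1, t2], 1)  # ==> [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]

Note that two vectors cannot be joined with tf.concat([t1, t2], 1): their shape has only one dimension, so there is no second dimension to join along. Although two vectors can conceptually be placed side by side, the call raises an error.
To join them this way, first add a dimension with tf.expand_dims:

t1 = tf.constant([1, 2, 3])
t2 = tf.constant([4, 5, 6])
# concated = tf.concat([t1, t2], 1)  # this raises an error
t1 = tf.expand_dims(tf.constant([1, 2, 3]), 1)
t2 = tf.expand_dims(tf.constant([4, 5, 6]), 1)
concated = tf.concat([t1, t2], 1)  # this works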
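
The axis rule is easy to check without a TensorFlow session, because NumPy's np.concatenate follows the same convention. The sketch below is a NumPy analogue of the examples above, not TensorFlow code; np.expand_dims plays the role of tf.expand_dims.

```python
import numpy as np

t1 = np.array([[1, 2, 3], [4, 5, 6]])
t2 = np.array([[7, 8, 9], [10, 11, 12]])

rows = np.concatenate([t1, t2], axis=0)  # shape (4, 3)
cols = np.concatenate([t1, t2], axis=1)  # shape (2, 6)

# 1-D vectors have no axis 1, so add one first.
v1 = np.expand_dims(np.array([1, 2, 3]), 1)  # shape (3, 1)
v2 = np.expand_dims(np.array([4, 5, 6]), 1)
pair = np.concatenate([v1, v2], axis=1)      # shape (3, 2)

print(rows.shape, cols.shape, pair.shape)
```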

14. tf.cast

cast(x, dtype, name=None)
Converts x to the data type dtype. For example, if x is of type bool, casting it to float turns it into a sequence of 0s and 1s. The reverse conversion also works.

a = tf.Variable([1,0,0,1,1])
b = tf.cast(a, dtype=tf.bool)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
print(sess.run(b))
# [ True False False  True  True]

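15. Functions in the tf.contrib.slim library

In a neural network, a convolutional layer is made up of several lower-level operations:

1. Create the weight and bias variables

2. Convolve the weights with the data from the previous layer

3. Add the biases to the convolution result

4. Apply the activation function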

input = ...
with tf.name_scope('conv1_1') as scope:
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                           stddev=1e-1), name='weights')
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
  bias = tf.nn.bias_add(conv, biases)
  conv1 = tf.nn.relu(bias, name=scope)

TF-Slim simplifies the process above:

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')

1) slim.conv2d(
    inputs,
    num_outputs,
    kernel_size,
    stride=1,
    padding='SAME',
    data_format=None,
    rate=1,
    activation_fn=nn.relu,
    normalizer_fn=None,
    normalizer_params=None,
    weights_initializer=initializers.xavier_initializer(),
    weights_regularizer=None,
    biases_initializer=init_ops.zeros_initializer(),
    biases_regularizer=None,
    reuse=None,
    variables_collections=None,
    outputs_collections=None,
    trainable=True,
    scope=None)

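inputs: the input images to convolve.
num_outputs: the number of convolution kernels (i.e. the number of filters).
kernel_size: the kernel dimensions (kernel height, kernel width).
stride: the convolution stride along each spatial dimension of the image.
padding: the padding scheme, either VALID or SAME.
data_format: the layout of the input data.
rate: the dilation rate for atrous (dilated) convolution. I do not fully understand this parameter or atrous convolution yet, and tf.nn.conv2d as used above has no counterpart for it.
activation_fn: the activation function; the default is ReLU.
normalizer_fn: the normalization function (for example batch normalization); it is not a regularizer.
normalizer_params: the parameters for the normalization function.
weights_initializer: the initializer for the weights.
weights_regularizer: an optional regularizer for the weights.
biases_initializer: the initializer for the biases.
biases_regularizer: an optional regularizer for the biases.
reuse: whether the layer and its variables should be reused.
variables_collections: an optional list or dictionary of collections for all the variables.
outputs_collections: the collection to which the outputs are added.
trainable: whether the layer's parameters can be trained.
scope: the variable_scope used for the layer's variables.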

Simplified slim.conv2d() API

tf.contrib.slim.conv2d(
    inputs,
    num_outputs,   [number of kernels]
    kernel_size,   [kernel height, kernel width]
    stride=1,
    padding='SAME',
)

tf.nn.conv2d(
    input,     (same as above)
    filter,    ([kernel height, kernel width, input channels, number of kernels])
    strides,
    padding,
)
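
Since stride and padding together decide the size of the feature map, a small helper for the output size is handy when debugging shape mismatches. The sketch below follows TensorFlow's documented rules for SAME and VALID padding; the function name conv_output_size and the 224-pixel input are assumptions made for illustration.

```python
import math

def conv_output_size(in_size, kernel_size, stride=1, padding="SAME"):
    """Spatial output size of a convolution under TensorFlow's padding rules."""
    if padding == "SAME":
        # Input is padded so that only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((in_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Hypothetical 224-pixel input with an 11x11 kernel and stride 4.
print(conv_output_size(224, 11, stride=4, padding="SAME"))   # 56
print(conv_output_size(224, 11, stride=4, padding="VALID"))  # 54
```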

2) slim.batch_norm

Batch normalization is convenient in slim: there is no need to add a batch normalization layer explicitly after every convolutional layer. It is enough to pass slim.batch_norm as normalizer_fn in slim's arg_scope, as follows:

batch_norm_params = {
    'decay': batch_norm_decay,
    'epsilon': batch_norm_epsilon,
    'scale': batch_norm_scale,
    'updates_collections': tf.GraphKeys.UPDATE_OPS,
    'is_training': is_training
}

with slim.arg_scope(
    [slim.conv2d],
    weights_regularizer=slim.l2_regularizer(weight_decay),
    weights_initializer=slim.variance_scaling_initializer(),
    activation_fn=tf.nn.relu,
    normalizer_fn=slim.batch_norm,
    normalizer_params=batch_norm_params):
  with slim.arg_scope([slim.batch_norm], **batch_norm_params):
    ...

Because updates_collections is set to tf.GraphKeys.UPDATE_OPS, the moving mean and variance updates are collected there and must be run together with the training op, which is what the tf.control_dependencies([tf.group(*update_ops)]) pattern in item 7 does.

3) slim.conv2d_transpose() (transposed convolution, often called "deconvolution")

convolution2d_transpose(
    inputs,
    num_outputs,
    kernel_size,
    stride=1,
    padding='SAME',
    data_format=DATA_FORMAT_NHWC,
    activation_fn=nn.relu,
    normalizer_fn=None,
    normalizer_params=None,
    weights_initializer=initializers.xavier_initializer(),
    weights_regularizer=None,
    biases_initializer=init_ops.zeros_initializer(),
    biases_regularizer=None,
    reuse=None,
    variables_collections=None,
    outputs_collections=None,
    trainable=True,
    scope=None)
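
In a segmentation decoder the transposed convolution is what upsamples the feature map, so it helps to know the size it produces. The sketch below uses the output-length rule as I understand it from TensorFlow's layers code; the helper name and the example sizes are assumptions for illustration.

```python
def conv_transpose_output_size(in_size, kernel_size, stride=1, padding="SAME"):
    """Spatial output size of a transposed convolution (TensorFlow convention)."""
    if padding == "SAME":
        # The output is exactly `stride` times larger.
        return in_size * stride
    if padding == "VALID":
        # A kernel wider than the stride adds extra border pixels.
        return in_size * stride + max(kernel_size - stride, 0)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Hypothetical 56-pixel feature map upsampled by a factor of 2.
print(conv_transpose_output_size(56, 4, stride=2, padding="SAME"))   # 112
print(conv_transpose_output_size(56, 3, stride=2, padding="VALID"))  # 113
```

With padding='SAME', a stride of 2 simply doubles the height and width, which is the usual choice when the decoder has to mirror an encoder that halves them.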

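II. Problems Encountered

1. AttributeError: 'NoneType' object has no attribute 'shape'
cv2.imread("image path") returns None when (1) the path is wrong or (2) the image file itself is bad.
I ran into both. Check the path carefully and inspect the dataset.
Fix: debug and print the image file names.
2. A batch_size that is too large also causes an error; the symptom is OOM in the error message.
Fix: reduce batch_size and resize the images to a smaller size.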
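
For problem 1, scanning the whole dataset once before training is quicker than hitting the error in the middle of an epoch. The sketch below is one possible way to do that: find_unreadable is a hypothetical helper, and the reader is passed in as an argument so the example runs without OpenCV. In real use you would pass cv2.imread, which returns None on failure.

```python
import os

def find_unreadable(paths, read_fn):
    """Return (path, reason) pairs for images that cannot be loaded.

    read_fn is any loader that returns None on failure, e.g. cv2.imread.
    """
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            problems.append((path, "file not found"))
        elif read_fn(path) is None:
            problems.append((path, "could not be decoded"))
    return problems

# Stand-in reader for the demo; pass cv2.imread in real use.
def fake_reader(path):
    return None

bad = find_unreadable(["/no/such/dir/img_001.png"], fake_reader)
print(bad)
```

Separating the two reasons matches the two causes above: a wrong path shows up as "file not found", and a damaged image shows up as "could not be decoded".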