local_response_normalization 和 batch_normalization

Normalization

 

Normalization

local_response_normalization

local_response_normalization出现在论文”ImageNet Classification with deep Convolutional Neural Networks”中,论文中说,这种normalization对于泛化是有好处的.

bix,y=aix,y(k+αmin(0,i+n/2)j=max(0,in/2)(ajx,y)2)β

经过了一个conv2d或pooling后,我们获得了[batch_size, height, width, channels]这样一个tensor.现在,将channels称之为层,不考虑batch_size
  • i代表第i
  • aix,y就代表 第i层的 (x,y)位置所对应的值
  • n个相邻feature maps.
  • k...α...n...β是hyper parameters
  • 可以看出,这个函数的功能就是, aix,y需要用他的相邻的map的同位置的值进行normalization
    在alexnet中, k=2,n=5,α=104,β=0.75
tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)
'''
Local Response Normalization.
The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within depth_radius. In detail,
'''
"""
input: A Tensor. Must be one of the following types: float32, half. 4-D.
depth_radius: An optional int. Defaults to 5. 0-D. Half-width of the 1-D normalization window.
bias: An optional float. Defaults to 1. An offset (usually positive to avoid dividing by 0).
alpha: An optional float. Defaults to 1. A scale factor, usually positive.
beta: An optional float. Defaults to 0.5. An exponent.
name: A name for the operation (optional).
"""
  • depth_radius: 就是公式里的n/2
  • bias : 公式里的k
  • input: 将conv2d或pooling 的输出输入就行了[batch_size, height, width, channels]
  • return :[batch_size, height, width, channels], 正则化后

batch_normalization

论文地址
batch_normalization, 故名思意,就是以batch为单位进行normalization
- 输入:mini_batch: In={x1,x2,..,xm}
- γ,β,需要学习的参数,都是向量
- ϵ: 一个常量
- 输出: Out={y1,y2,...,ym}
算法如下:
(1)mini_batch mean:

μIn1mi=1mxi

(2)mini_batch variance
σ2In=1mi=1m(xiμIn)2

(3)Normalize
x^i=xiμInσ2In+ϵ

(4)scale and shift
yi=γx^i+β

可以看出,batch_normalization之后,数据的维数没有任何变化,只是数值发生了变化
Out作为下一层的输入
函数:
tf.nn.batch_normalization()
def batch_normalization(x,
                        mean,
                        variance,
                        offset,
                        scale,
                        variance_epsilon,
                        name=None):

Args:

  • x: Input Tensor of arbitrary dimensionality.
  • mean: A mean Tensor.
  • variance: A variance Tensor.
  • offset: An offset Tensor, often denoted β in equations, or None. If present, will be added to the normalized tensor.
  • scale: A scale Tensor, often denoted γ in equations, or None. If present, the scale is applied to the normalized tensor.
  • variance_epsilon: A small float number to avoid dividing by 0.
  • name: A name for this operation (optional).
  • Returns: the normalized, scaled, offset tensor.
    对于卷积,x:[bathc,height,width,depth]
    对于卷积,我们要feature map中共享 γiβi ,所以 γ,β的维度是[depth]

现在,我们需要一个函数 返回mean和variance, 看下面.

tf.nn.moments()

def moments(x, axes, shift=None, name=None, keep_dims=False):
# for simple batch normalization pass `axes=[0]` (batch only).

对于卷积的batch_normalization, x 为[batch_size, height, width, depth],axes=[0,1,2],就会输出(mean,variance), mean 与 variance 均为标量。

posted @   bonelee  阅读(1163)  评论(0编辑  收藏  举报
编辑推荐:
· 记一次.NET内存居高不下排查解决与启示
· 探究高空视频全景AR技术的实现原理
· 理解Rust引用及其生命周期标识(上)
· 浏览器原生「磁吸」效果!Anchor Positioning 锚点定位神器解析
· 没有源码,如何修改代码逻辑?
阅读排行:
· 全程不用写代码,我用AI程序员写了一个飞机大战
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· 记一次.NET内存居高不下排查解决与启示
· 白话解读 Dapr 1.15:你的「微服务管家」又秀新绝活了
· DeepSeek 开源周回顾「GitHub 热点速览」
历史上的今天:
2017-01-31 Hive数据分析——Spark是一种基于rdd(弹性数据集)的内存分布式并行处理框架,比于Hadoop将大量的中间结果写入HDFS,Spark避免了中间结果的持久化
2017-01-31 Hive group by实现-就是word 统计
2017-01-31 Hive mapreduce SQL实现原理——SQL最终分解为MR任务,而group by在MR里和单词统计MR没有区别了
2017-01-31 SQL group by底层原理——本质是排序,可以利用索引事先排好序
点击右上角即可分享
微信分享提示