Normalization Variants
1 Understanding BatchNorm, InstanceNorm, and LayerNorm
[1] Batch Normalization, Instance Normalization, Layer Normalization: Structural Nuances
• The Transformer encoder uses Layer Normalization
• There is also Group Normalization; for details, see 《全面解读Group Normalization》. The sketch below shows the axes each method normalizes over.
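To make the structural differences concrete, here is a minimal PyTorch sketch (the tensor shape and layer arguments are illustrative assumptions) showing over which axes each layer computes its statistics:

import torch
import torch.nn as nn

x = torch.randn(8, 16, 32, 32)  # (N, C, H, W)

# BatchNorm: statistics over (N, H, W); one mean/var per channel.
bn = nn.BatchNorm2d(16)
# InstanceNorm: statistics over (H, W); one mean/var per sample and channel.
inorm = nn.InstanceNorm2d(16)
# LayerNorm: statistics over the normalized dims (here C, H, W) of each sample.
ln = nn.LayerNorm([16, 32, 32])
# GroupNorm: statistics over (H, W) and the channels within each group.
gn = nn.GroupNorm(num_groups=4, num_channels=16)

for layer in (bn, inorm, ln, gn):
    print(type(layer).__name__, layer(x).shape)  # all preserve the input shape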
2 BatchNorm
2.1 The momentum parameter acts as an importance factor when computing the running mean and running variance
[2] https://stats.stackexchange.com/questions/219808/how-and-why-does-batch-normalization-use-moving-averages-to-track-the-accuracy-o
[3] Batch Normalization Explained
running_mean = momentum * running_mean + (1 - momentum) * new_mean
running_var = momentum * running_var + (1 - momentum) * new_var
Momentum is the weight given to the existing running estimate, a.k.a. the "lag". If momentum is set to 0, the running mean and variance come entirely from the last seen mini-batch, which may be too noisy to be desirable for testing. Conversely, if momentum is set to 1, the running mean and variance are frozen at their initial values and never incorporate any mini-batch. Essentially, momentum controls how much each new mini-batch contributes to the running averages.
Ideally, momentum should be set close to 1 (> 0.9) so that the running statistics are learned slowly and the noise in any single mini-batch is smoothed out.
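A minimal sketch of this update rule (the feature dimension and batch size are arbitrary choices). Note that torch.nn.BatchNorm2d defines momentum the opposite way: there it is the weight on the new batch statistics (default 0.1), i.e. PyTorch's momentum corresponds to 1 - momentum in the formulas above.

import torch

momentum = 0.9  # weight on the existing running estimate, as in the formulas above
running_mean = torch.zeros(16)
running_var = torch.ones(16)

for _ in range(100):
    batch = torch.randn(32, 16)                # one mini-batch of 16-dim features
    new_mean = batch.mean(dim=0)
    new_var = batch.var(dim=0, unbiased=True)  # unbiased, as PyTorch uses for running_var
    running_mean = momentum * running_mean + (1 - momentum) * new_mean
    running_var = momentum * running_var + (1 - momentum) * new_var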
2.2 How torch.utils.checkpoint handles batch normalization
[4] Trading compute for memory in PyTorch models using Checkpointing
A batch normalization layer maintains running mean and variance statistics that depend on the mini-batches it sees; every time a forward pass runs in training mode, these statistics are updated according to the momentum value. With checkpointing, the forward pass of a checkpointed segment runs twice in the same iteration (once normally and once during recomputation in the backward pass), so the running statistics get updated twice. To compensate, use new_momentum = sqrt(momentum) as the momentum value (in the convention of the formulas above, where momentum weighs the old estimate).
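A sketch of applying this fix in PyTorch (the segment architecture is a made-up example). Because torch.nn.BatchNorm2d's momentum weighs the new batch statistics rather than the old running estimate, the equivalent correction under PyTorch's convention is new_momentum = 1 - sqrt(1 - momentum), so that two consecutive updates have the same effect as one ordinary update:

import math
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical segment whose forward pass is recomputed during backward.
segment = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# PyTorch convention: running = (1 - m) * running + m * batch_stat.
# Two updates per iteration should equal one: (1 - m')**2 == 1 - m.
for mod in segment.modules():
    if isinstance(mod, nn.BatchNorm2d):
        mod.momentum = 1.0 - math.sqrt(1.0 - mod.momentum)

x = torch.randn(4, 3, 32, 32, requires_grad=True)
out = checkpoint(segment, x, use_reentrant=True)  # forward is re-run during backward
out.sum().backward()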
3 AdaIN (Adaptive Instance Normalization)
AdaIN is a normalization technique commonly used in style transfer.
AdaIN receives a content input x and a style input y, and simply aligns the channel-wise mean and variance of x to match those of y. Unlike BN, IN, or CIN, AdaIN has no learnable affine parameters.
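Concretely, the paper defines AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y), where mu and sigma are computed per sample and per channel over the spatial dimensions. A minimal sketch (the eps stabilizer is an assumption, not part of the paper's formula):

import torch

def adain(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # x, y: (N, C, H, W) content and style feature maps
    mu_x = x.mean(dim=(2, 3), keepdim=True)
    sigma_x = (x.var(dim=(2, 3), keepdim=True) + eps).sqrt()
    mu_y = y.mean(dim=(2, 3), keepdim=True)
    sigma_y = (y.var(dim=(2, 3), keepdim=True) + eps).sqrt()
    # Normalize the content features, then rescale/shift with the style statistics.
    return sigma_y * (x - mu_x) / sigma_x + mu_y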
4 An insight from IBN-Net about Instance Normalization and Batch Normalization
IN learns features that are invariant to appearance changes, such as colors, styles, and virtuality/reality, while BN is essential for preserving content-related information.
IBN-Net, which combines both normalizations in one network, is widely used in person re-identification (ReID) models.
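A minimal sketch of an IBN-a-style block in the spirit of the IBN-Net paper (the 1:1 split ratio and affine settings are illustrative assumptions): part of the channels go through InstanceNorm for appearance invariance, and the rest go through BatchNorm to preserve content information.

import torch
import torch.nn as nn

class IBN(nn.Module):
    # Apply InstanceNorm to the first half of the channels and BatchNorm
    # to the second half, then concatenate along the channel dimension.
    def __init__(self, planes: int, ratio: float = 0.5):
        super().__init__()
        self.half = int(planes * ratio)
        self.IN = nn.InstanceNorm2d(self.half, affine=True)
        self.BN = nn.BatchNorm2d(planes - self.half)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.IN(a), self.BN(b)], dim=1)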