TODO
+1. Proximal gradient descent
https://zhuanlan.zhihu.com/p/82622940
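
A minimal sketch of the idea, for reference while reading up on it: proximal gradient descent alternates a gradient step on the smooth part of the objective with a proximal step on the non-smooth part. The toy problem here (a 1-D lasso, minimize 0.5*(x - b)^2 + lam*|x|) and the names `soft_threshold` and `proximal_gradient_lasso_1d` are illustrative assumptions, not taken from the linked article.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|x|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient_lasso_1d(b, lam, step=0.5, iters=100):
    """Minimize 0.5*(x - b)**2 + lam*|x| by proximal gradient descent.

    Hypothetical toy example: the smooth part is 0.5*(x - b)**2 and
    the non-smooth part is lam*|x|.
    """
    x = 0.0
    for _ in range(iters):
        grad = x - b                      # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)  # proximal step
    return x


x_star = proximal_gradient_lasso_1d(b=3.0, lam=1.0)
x_zero = proximal_gradient_lasso_1d(b=0.5, lam=1.0)
```

For this toy problem the minimizer has the closed form soft_threshold(b, lam), so `x_star` should approach 2.0 and `x_zero` should stay at 0.0.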
+2. Using nn.init.kaiming_uniform_: it is called right after the conv is created. Is it similar to a BN layer?
conv = Conv2d(
    in_channels,
    out_channels,
    kernel_size=kernel_size,
    stride=stride,
    padding=dilation * (kernel_size - 1) // 2,
    dilation=dilation,
    bias=False if use_gn else True
)
# Caffe2 implementation uses XavierFill, which in fact
# corresponds to kaiming_uniform_ in PyTorch
nn.init.kaiming_uniform_(conv.weight, a=1)
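
On the question in item 2: as far as I can tell, `nn.init.kaiming_uniform_` is not a layer like BN. The trailing underscore marks an in-place operation, so the call overwrites `conv.weight` once at construction time and plays no part in the forward pass. The sketch below checks the claim in the code comment using only the standard library. It assumes the documented PyTorch formula (gain = sqrt(2 / (1 + a^2)), bound = gain * sqrt(3 / fan_in)) and assumes that Caffe2's XavierFill samples from U(-sqrt(3 / fan_in), sqrt(3 / fan_in)). The helper names and the 256-channel 3x3 conv are hypothetical.

```python
import math


def kaiming_uniform_bound(fan_in, a=0.0):
    """Bound of the uniform range used by kaiming_uniform_ (mode='fan_in',
    default leaky_relu gain), following the documented formula."""
    gain = math.sqrt(2.0 / (1.0 + a * a))
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std


def xavier_fill_bound(fan_in):
    """Assumed bound of Caffe2's XavierFill: sqrt(3 / fan_in)."""
    return math.sqrt(3.0 / fan_in)


# Hypothetical 3x3 conv with 256 input channels.
fan_in = 256 * 3 * 3
bound_kaiming = kaiming_uniform_bound(fan_in, a=1)
bound_xavier = xavier_fill_bound(fan_in)
```

With a=1 the gain is exactly 1, so the two bounds coincide. Under these assumptions, that is why the code passes `a=1` instead of using the default.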