Summary: CanChen ggchen@mail.ustc.edu.cn AdaBatch Motivation: Current stochastic gradient descent methods use a fixed batch size. Small batch size with small learn...
posted @ 2020-02-23 22:33 Klaus-Chen