Summary: def noam_scheme(global_step, num_warmup_steps, num_train_steps, init_lr, warmup=True): """ decay learning rate if warmup > global step, the learning rate will be global_step/num_warmup_st...
posted @ 2019-07-22 16:53 下路派出所 Views (2514) Comments (0) Recommendations (0)
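
The summary above is cut off before the decay branch, so the full schedule is not visible here. As a rough illustration of the warmup rule it does state (the learning rate scales with global_step / num_warmup_steps while the step is below the warmup length), here is a minimal pure-Python sketch. The function name `warmup_then_linear_decay` is hypothetical, and the decay phase (a linear ramp down to zero at `num_train_steps`) is an assumption made for the example, not something taken from the post.

```python
def warmup_then_linear_decay(global_step, num_warmup_steps, num_train_steps,
                             init_lr, warmup=True):
    """Hypothetical sketch of a warmup schedule; not the post's implementation."""
    if warmup and global_step < num_warmup_steps:
        # Warmup phase (as described in the summary):
        # scale init_lr by global_step / num_warmup_steps.
        return init_lr * global_step / num_warmup_steps
    # ASSUMPTION: after warmup, decay linearly so the rate reaches 0
    # at num_train_steps. The original decay rule is truncated in the source.
    remaining_steps = max(num_train_steps - global_step, 0)
    return init_lr * remaining_steps / num_train_steps


if __name__ == "__main__":
    # Print the rate at a few steps for a 1000-step run with 100 warmup steps.
    for step in (0, 50, 100, 500, 1000):
        print(step, warmup_then_linear_decay(step, 100, 1000, init_lr=1e-3))
```

Passing `warmup=False` skips the ramp-up and applies only the assumed decay from step 0 onward.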