Summary: **Published:** 2020 (ICML 2020) **Key points:** This paper connects MCTS with policy optimization, arguing that AlphaZero-style algorithms can be viewed as regularized policy optimization (AlphaZero's search heuristics, along …
posted @ 2023-02-25 23:04 initial_h