Post archive for August 19, 2021: Learning and Planning in Complex Action ... - initial_h

August 19, 2021

Learning and Planning in Complex Action Spaces

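Summary: **Published:** 2021 **Key points:** The paper argues that when the action space is very large or continuous, enumerating every action in order to run MCTS is impractical. The authors propose a sample-based policy iteration framework that runs MCTS over sampled actions instead (Sampled MuZero). The rough idea is that, in MCT ...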
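To make the sampling idea concrete, here is a minimal sketch of only the candidate-selection step: instead of expanding every action at a search node, draw K actions from the policy and keep the distinct ones together with their empirical frequencies. The function name `sample_candidate_actions`, the parameter `k`, and the use of a plain list of probabilities as the policy are assumptions made for illustration; this is not the paper's implementation and it omits the search itself.

```python
import random
from collections import Counter


def sample_candidate_actions(policy_probs, k, rng=None):
    """Draw k actions (with replacement) from a categorical policy.

    Returns a dict mapping each distinct sampled action index to its
    empirical frequency among the k draws. Hypothetical helper, used
    only to illustrate sampling a subset of a large action space.
    """
    rng = rng or random.Random()
    actions = range(len(policy_probs))
    draws = rng.choices(actions, weights=policy_probs, k=k)
    counts = Counter(draws)
    return {action: count / k for action, count in counts.items()}


if __name__ == "__main__":
    # A large discrete action space with a uniform policy (assumed for the demo).
    probs = [1.0 / 1000] * 1000
    candidates = sample_candidate_actions(probs, k=20, rng=random.Random(0))
    # A search would then expand only these candidates, not all 1000 actions.
    print(len(candidates), "distinct candidate actions")
```

A tree search built on top of this would expand at most `k` children per node, which is what keeps the cost independent of the size of the full action space.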

posted @ 2021-08-19 02:12 initial_h

initial_h

https://github.com/initial-h