Post archive for April 27, 2022 - initial_h

April 27, 2022

Think Too Fast Nor Too Slow: The Computational Trade-off Between Planning And Reinforcement Learning

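Summary: **Published:** 2020 (ICAPS: PRL 2020) **Key points:** This paper examines the compute trade-off between planning and learning, and concludes that one should plan neither too much nor too little. Specifically, the authors first identify a class called multi-step approximate ...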

posted @ 2022-04-27 23:44 initial_h

Application of MCTS in Atari Black-box Planning

Summary: **Published:** 2018 (ICAPS 2018 workshop Heuristics and Search for Domain-independent Planning (HSDIP)) **Key points:** This paper is mainly an experimental study of how several tree search methods perform on Atari. In it ...

posted @ 2022-04-27 23:40 initial_h

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning

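Summary: **Published:** 2014 (NIPS 2014) **Key points:** This paper tests how Monte-Carlo Tree Search performs on Atari. It does not combine the search with reinforcement learning; instead, it first uses tree search to collect samples, then fits a neural network to that data to train either a Q-network or a policy network. The conclusion is that, compared with DQN, the performance ...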

posted @ 2022-04-27 23:34 initial_h

initial_h

https://github.com/initial-h
