Learning an Optimal Policy: Model-free Methods

 

http://www.mit.edu/~9.54/fall14/slides/Reinforcement%20Learning%202-Model%20Free.pdf

 

[Updates based on all samples vs. a single sample]
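The note above is terse. One reading, assumed here and not confirmed by the post, is that a value estimate can be computed either from all sampled returns at once or by updating it one sample at a time. The sketch below shows both on made-up numbers. The function names, the sample returns, and the step size `alpha` are hypothetical and are not taken from the linked slides.

```python
def batch_mean(returns):
    """Estimate a value by averaging all sampled returns at once."""
    return sum(returns) / len(returns)


def incremental_mean(returns):
    """Same estimate, updated per sample: V <- V + (1/n) * (G - V)."""
    value = 0.0
    for n, g in enumerate(returns, start=1):
        value += (g - value) / n
    return value


def constant_step(returns, alpha=0.1):
    """Per-sample update with a fixed step size: V <- V + alpha * (G - V).

    This weights recent samples more heavily than a plain average does.
    """
    value = 0.0
    for g in returns:
        value += alpha * (g - value)
    return value


# Made-up returns, for illustration only.
sampled_returns = [1.0, 0.0, 2.0, 1.0]

if __name__ == "__main__":
    print(batch_mean(sampled_returns))
    print(incremental_mean(sampled_returns))
    print(constant_step(sampled_returns, alpha=0.5))
```

The first two functions give the same number. The third is the form that single-sample updates usually take when the step size is held constant, so its result depends on the order of the samples.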

 

 

posted @ 2017-09-30 18:36  papering