Post archive for July 31, 2019 - Junfei_Wang

July 31, 2019

Temporal-Difference Control: SARSA and Q-Learning

Summary: SARSA. The SARSA algorithm also estimates action-value functions rather than state-value functions. The difference between SARSA and Monte Carlo is: SARSA do ...
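As a rough illustration of what "estimating action-value functions" means for the two methods named in the post title, here is a minimal tabular sketch. It is not taken from the post itself: the function names, the dictionary-based `Q` table, and the `alpha` (step size) and `gamma` (discount) parameters are assumptions chosen for the example.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: the target uses the action actually taken next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD update: the target uses the greedy action in s_next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Hypothetical two-action example; states and actions are plain strings.
actions = ["left", "right"]
Q = defaultdict(float)          # Q[(state, action)] defaults to 0.0
Q[("s1", "left")] = 1.0
Q[("s1", "right")] = 2.0

# Same transition (s0, right, reward 0, s1), but the two targets differ:
# SARSA bootstraps from the sampled next action, Q-learning from the max.
sarsa_update(Q, "s0", "right", 0.0, "s1", "left")
q_learning_update(Q, "s0", "left", 0.0, "s1", actions)
```

Both functions update a table indexed by (state, action) pairs, which is the action-value estimate the summary refers to; they differ only in how the bootstrap target is formed.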

posted @ 2019-07-31 21:52 Junfei_Wang
