koocn

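November Deep Learning Class, Lesson 9: Reinforcement Learning and DQN

Reinforcement Learning and DQN

Achievements of Reinforcement Learning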

• Learned to play backgammon at the level of the world's best player (Tesauro 1995)
• Learned acrobatic helicopter autopilots (Ng, Abbeel, Coates et al., 2006+)
• Widely used in the placement and selection of advertisements on the web (e.g. A-B tests)
• Used to make strategic decisions in Jeopardy! (IBM's Watson, 2011)
• Achieved human-level performance on Atari games from pixel-level visual input, in conjunction with deep learning (Google DeepMind 2015)
• In all these cases, performance was better than could be obtained by any other method, and was obtained without human instruction

posted on 2017-10-30 21:48 by koocn
