koocn


November Deep Learning Class, Lesson 9: Reinforcement Learning and DQN

Reinforcement Learning and DQN

Achievements of Reinforcement Learning

• Learned the world's best Backgammon player (Tesauro 1995)
• Learned acrobatic helicopter autopilots (Ng, Abbeel, Coates et al. 2006+)
• Widely used in the placement and selection of advertisements on the web (e.g. A/B tests)
• Used to make strategic decisions in Jeopardy! (IBM's Watson 2011)
• Achieved human-level performance on Atari games from pixel-level visual input, in conjunction with deep learning (Google DeepMind 2015)

In all these cases, performance was better than could be obtained by any other method, and was obtained without human instruction.

 

 
