koocn


November Deep Learning Class, Lesson 9: Reinforcement Learning and DQN

Reinforcement Learning and DQN

Achievements of Reinforcement Learning

• Produced the world's best Backgammon player (Tesauro 1995)
• Learned acrobatic helicopter autopilots (Ng, Abbeel, Coates et al. 2006+)
• Widely used in the placement and selection of advertisements on the web (e.g. A/B tests)
• Used to make strategic decisions in Jeopardy! (IBM's Watson 2011)
• Achieved human-level performance on Atari games from pixel-level visual input, in conjunction with deep learning (Google DeepMind 2015)
• In all these cases, performance was better than could be obtained by any other method, and was obtained without human instruction


posted on 2017-10-30 21:48 by koocn