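November Deep Learning Class, Lesson 9: Reinforcement Learning and DQN
Reinforcement Learning and DQN
Achievements of Reinforcement Learning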
Learned to play backgammon at the level of the world’s best players (Tesauro 1995)
Learned acrobatic helicopter autopilots (Ng, Abbeel, Coates et al. 2006+)
Widely used in the placement and selection of advertisements on the web (e.g. A/B tests)
Used to make strategic decisions in Jeopardy! (IBM’s Watson 2011)
Achieved human-level performance on Atari games from pixel-level visual input, in conjunction with deep learning (Google DeepMind 2015)
In all these cases, performance was better than could be obtained by any other method, and it was obtained without human instruction.