Awesome Reinforcement Learning

A curated list of resources dedicated to reinforcement learning.

We have pages for other topics: awesome-rnn, awesome-deep-vision, awesome-random-forest

Maintainers: Hyunsoo Kim, Jiwon Kim

We are looking for more contributors and maintainers!

Contributing

Please feel free to send pull requests.

Codes

Code for examples and exercises from Richard Sutton and Andrew Barto's Reinforcement Learning: An Introduction
Simulation code for Reinforcement Learning Control Problems
- Pole-Cart Problem
- Q-learning Controller
MATLAB Environment and GUI for Reinforcement Learning
Reinforcement Learning Repository - University of Massachusetts, Amherst
Brown-UMBC Reinforcement Learning and Planning Library (Java)
Reinforcement Learning in R (MDP, Value Iteration)
Reinforcement Learning Environment in Python and MATLAB
RL-Glue (standard interface for RL) and RL-Glue Library
PyBrain Library - Python-Based Reinforcement Learning, Artificial Intelligence, and Neural Network Library
Maja - Machine learning framework for problems in Reinforcement Learning in Python
TeachingBox - Java-based Reinforcement Learning framework
Implementation of RL algorithms in Python/C++
Policy Gradient Reinforcement Learning Toolbox for MATLAB
PIQLE - Platform Implementing Q-LEarning and other RL algorithms

Theory

Lectures

[UCL] COMPM050/COMPGI13 Reinforcement Learning by David Silver
[UC Berkeley] CS188 Artificial Intelligence by Pieter Abbeel
[Udacity (Georgia Tech.)] Machine Learning 3: Reinforcement Learning (CS7641)
[Stanford] CS229 Machine Learning - Lecture 16: Reinforcement Learning by Andrew Ng

Books

Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction [Book] [Code]
Csaba Szepesvari, Algorithms for Reinforcement Learning [Book]
David Poole and Alan Mackworth, Artificial Intelligence: Foundations of Computational Agents [Book Chapter]
Dimitri P. Bertsekas and John N. Tsitsiklis, Neuro-Dynamic Programming [Book (Amazon)] [Summary]
Mykel J. Kochenderfer, Decision Making Under Uncertainty: Theory and Application [Book (Amazon)]

Surveys

Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore, Reinforcement Learning: A Survey, JAIR, 1996. [Paper]
S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, Sadhana, 1994. [Paper]
Jens Kober, J. Andrew Bagnell, Jan Peters, Reinforcement Learning in Robotics, A Survey, IJRR, 2013. [Paper]
Michael L. Littman, Reinforcement Learning Improves Behaviour from Evaluative Feedback, Nature 521(7553): 445-451, 2015. [Paper]
Marc P. Deisenroth, Gerhard Neumann, Jan Peters, A Survey on Policy Search for Robotics, Foundations and Trends in Robotics, 2014. [Book]

Papers / Thesis

Foundational Papers
- Marvin Minsky, Steps toward Artificial Intelligence, Proceedings of the IRE, 1961. [Paper]
  - discusses issues in RL such as the "credit assignment problem"
- Ian H. Witten, An Adaptive Optimal Controller for Discrete-Time Markov Environments, Information and Control, 1977. [Paper]
  - the earliest publication on the temporal-difference (TD) learning rule
Solution Methods
- Dynamic Programming (DP):
  - Christopher J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, 1989. [Thesis]
- Monte Carlo:
  - Andrew Barto, Michael Duff, Monte Carlo Inversion and Reinforcement Learning, NIPS, 1994. [Paper]
  - Satinder P. Singh, Richard S. Sutton, Reinforcement Learning with Replacing Eligibility Traces, Machine Learning, 1996. [Paper]
- Temporal-Difference:
  - Richard S. Sutton, Learning to Predict by the Methods of Temporal Differences, Machine Learning 3: 9-44, 1988. [Paper]
- Q-Learning (Off-policy TD algorithm):
  - Christopher J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, 1989. [Thesis]
- Sarsa (On-policy TD algorithm):
  - G.A. Rummery, M. Niranjan, On-line Q-learning using connectionist systems, Technical Report, Cambridge Univ., 1994. [Report]
  - Richard S. Sutton, Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, NIPS, 1996. [Paper]
- R-Learning (learning of relative values)
  - Andrew Schwartz, A Reinforcement Learning Method for Maximizing Undiscounted Rewards, ICML, 1993. [Paper-Google Scholar]
- Function Approximation methods (Least-Squares Temporal Difference, Least-Squares Policy Iteration)
  - Steven J. Bradtke, Andrew G. Barto, Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, 1996. [Paper]
  - Michail G. Lagoudakis, Ronald Parr, Model-Free Least Squares Policy Iteration, NIPS, 2001. [Paper] [Code]
- Policy Search (in application to Robotics)
  - Nate Kohl, Peter Stone, Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA, 2004. [Paper]
  - Marc Deisenroth, Carl Rasmussen, PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, 2011. [Paper]
  - Jan Peters, Sethu Vijayakumar, Stefan Schaal, Natural Actor-Critic, ECML, 2005. [Paper]
  - Scott Kuindersma, Roderic Grupen, Andrew Barto, Learning Dynamic Arm Motions for Postural Recovery, Humanoids, 2011. [Paper]
- Hierarchical RL
  - Richard Sutton, Doina Precup, Satinder Singh, Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, Artificial Intelligence, 1999. [Paper]
  - George Konidaris, Andrew Barto, Building Portable Options: Skill Transfer in Reinforcement Learning, IJCAI, 2007. [Paper]
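
The list above labels Q-learning as an off-policy and Sarsa as an on-policy TD algorithm. As a rough illustration of that distinction, here is a minimal tabular sketch. It is not taken from any of the listed papers; the function names, the dictionary-based Q-table, and the default step size `alpha` and discount `gamma` are all assumptions made for this example.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Off-policy TD update: bootstrap from the greedy (max) action in s_next.

    Q is a hypothetical table mapping (state, action) pairs to value estimates.
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy TD update: bootstrap from the action a_next actually taken."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


if __name__ == "__main__":
    actions = ["left", "right"]

    # Two identical tables so the two updates can be compared side by side.
    Q_off = defaultdict(float, {("s1", "left"): 1.0, ("s1", "right"): 2.0})
    Q_on = defaultdict(float, {("s1", "left"): 1.0, ("s1", "right"): 2.0})

    # Same transition (s0, left, reward 1, s1); the behaviour policy then
    # happens to pick the non-greedy action "left" in s1.
    q_learning_update(Q_off, "s0", "left", 1.0, "s1", actions, alpha=0.5, gamma=1.0)
    sarsa_update(Q_on, "s0", "left", 1.0, "s1", "left", alpha=0.5, gamma=1.0)

    print(Q_off[("s0", "left")], Q_on[("s0", "left")])
```

The only difference between the two functions is the bootstrap target: the maximum over next actions for Q-learning, and the value of the next action the agent actually chose for Sarsa. Unvisited entries default to zero here, which also serves as the value of a terminal state in this sketch.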

Applications

Game Playing

Traditional Games
- Backgammon - "TD-Gammon" game play using TD(λ) (ACM 1995) [Paper]
- Chess - "KnightCap" program using TD(λ) [Paper-arXiv]
- Chess - Giraffe: Using deep reinforcement learning to play chess [Paper-arXiv]
Computer Games
- Human-level Control through Deep Reinforcement Learning (Nature 2015) [Paper] [Code] [Video]
- Flappy Bird Reinforcement Learning [Video]
- MarI/O (learning to play Mario with evolutionary reinforcement learning using artificial neural networks) [Paper] [Video]

Robotics

Reinforcement Learning for Humanoid Robotics (ICHR 2003) [Paper]
Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (ICRA 2004) [Paper]
Robot Motor Skill Coordination with EM-based Reinforcement Learning (IROS 2010) [Paper] [Video]
Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (ICRA 2010) [Paper] [Video]
Autonomous Skill Acquisition on a Mobile Manipulator (AAAI 2011) [Paper] [Video]
PILCO: A Model-Based and Data-Efficient Approach to Policy Search (ICML 2011) [Paper]
Incremental Semantically Grounded Learning from Demonstration (RSS 2013) [Paper]
Efficient Reinforcement Learning for Robots using Informative Simulated Priors (ICRA 2015) [Paper] [Video]

Control

An Application of Reinforcement Learning to Aerobatic Helicopter Flight (NIPS 2006) [Paper] [Video]
Autonomous helicopter control using Reinforcement Learning Policy Search Methods (ICRA 2011) [Paper]

Operations Research

Scaling Average-reward Reinforcement Learning for Product Delivery (AAAI 2004) [Paper]
Cross Channel Optimized Marketing by Reinforcement Learning (KDD 2004) [Paper]

Human Computer Interaction

Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (JAIR 2002) [Paper]

Tutorials / Websites

Mance Harmon and Stephanie Harmon, Reinforcement Learning: A Tutorial
Short introduction to some Reinforcement Learning algorithms
C. Igel, M.A. Riedmiller, et al., Reinforcement Learning in a Nutshell, ESANN, 2007. [Paper]
UNSW - Reinforcement Learning
ROS Reinforcement Learning Tutorial
POMDP for Dummies
Scholarpedia articles on:
- Reinforcement Learning
- Temporal Difference Learning
Repository with useful MATLAB Software, presentations, and demo videos
Bibliography on Reinforcement Learning

Online Demos

Real-world demonstrations of Reinforcement Learning

Posted 2015-11-16 19:11 by 菜鸡一枚
