ecoflex - cnblogs (博客园)

May 30, 2018

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.20 Guest lecture: John Schulman (PPO and Applications)

posted @ 2018-05-30 18:19 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.19 Guest lecture: Igor Mordatch (Optimization and Reinforcement Learning in Multi-Agent Settings)

Summary: skipped this lecture

posted @ 2018-05-30 17:51 ecoflex

May 29, 2018

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.18 Advanced imitation learning and open problems

Summary: ...

posted @ 2018-05-29 17:24 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.17 Meta-learning and parallelism

posted @ 2018-05-29 17:23 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.16 Multi-task learning and transfer

posted @ 2018-05-29 17:22 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.15 Exploration 2

Summary: skipped this lecture

posted @ 2018-05-29 17:21 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.14 Exploration 1

Summary: ...

posted @ 2018-05-29 17:17 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.13 Advanced policy gradients

Summary: ...

posted @ 2018-05-29 16:27 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.12 Inverse reinforcement learning

Summary: After the break, we'll extend IRL to continuous spaces.

posted @ 2018-05-29 14:55 ecoflex

May 28, 2018

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.11 Connection between inference and control

Summary: The yellow region corresponds to β, the blue region to α.

posted @ 2018-05-28 20:46 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.10 Guest Lecture: Advanced model learning and images

Summary: ...

posted @ 2018-05-28 17:13 ecoflex

May 27, 2018

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.9 Learning policies by imitating optimal controllers

Summary: Make a compromise between the learned policy and minimal cost! π̂ uses states; π_θ uses observations.

posted @ 2018-05-27 23:01 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.8 Learning dynamical system from data

Summary: MPC means replanning at every step. Every N steps, rebuild the dynamics model.

posted @ 2018-05-27 18:15 ecoflex

May 26, 2018

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.7 Optimal control and planning

Summary: The transition probability is unknown, and we don't even need to estimate it.

posted @ 2018-05-26 23:04 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.6 Value functions introduction NO.7 Advanced Q learning

Summary: Understand that correlated samples cause problems, and how parallelism solves the problem. Another solution is replay buffers, fully utilizing the advantag...

posted @ 2018-05-26 19:57 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.5 Actor-critic introduction

Summary: In most actor-critic algorithms, we actually just fit the value function; it is less common to fit the Q-function as well. Batch: offline, Monte Carlo. Online: bootstrap, TD in...

posted @ 2018-05-26 12:28 ecoflex

May 24, 2018

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.4 Policy gradients introduction

Summary: The green bars are the reward function; the blue curve is the probability of different trajectories. If the green bars are all increased equally to the yellow bars, the res...

posted @ 2018-05-24 23:13 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.3 Reinforcement learning introduction

Summary: First-order Markov chain. On-policy algorithms are easier to parallelize. Off-policy algorithms have to fit a transition net and a policy net. much more comp...

posted @ 2018-05-24 18:13 ecoflex

CS294-112 Deep Reinforcement Learning, Fall Semester (Berkeley) NO.1 Introduction NO.2 Supervised learning and imitation

Summary: I got it wrong earlier: I should have been watching the Fall 2017 course, but ended up watching the Spring course. A neural network controls a virtual robot by imitating human motion. Domain shift causes the failure of supervised learning in...

posted @ 2018-05-24 16:43 ecoflex

CS294-112 Deep Reinforcement Learning Course (UC Berkeley, 2017) NO.5 Guest lecture: Igor Mordatch (OpenAI)

Summary: Initialization dramatically influences the trajectory. The current state depends on all past decisions. The ones indicate the dimensions being counted.

posted @ 2018-05-24 13:59 ecoflex