(10,000-Hour Plan) - December 1 Summary

December 1 study summary

Code:

Deep reinforcement learning course: https://github.com/simoninithomas/Deep_reinforcement_learning_Course/tree/master/PPO%20with%20Sonic%20the%20Hedgehog

https://medium.com/deep-reinforcement-learning-course/launching-deep-reinforcement-learning-course-v2-0-38fa3c24bcbc

Deep reinforcement learning with PyTorch: https://github.com/sweetice/Deep-reinforcement-learning-with-pytorch, https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

REINFORCE explained: https://blog.csdn.net/lrt366/article/details/91359230

Related code: https://github.com/chingyaoc/pytorch-REINFORCE/blob/master/assets/algo.png

Algorithm blogs:

DDPG algorithm explained: https://blog.csdn.net/kenneth_yu/article/details/78478356

Policy gradient: https://developer.ibm.com/zh/articles/ba-lo-deep-introduce-policy-gradient/

On-policy vs. off-policy: https://www.zhihu.com/question/57159315

TD algorithm explained: https://zhuanlan.zhihu.com/p/25913410

DQN algorithm: https://blog.csdn.net/qq_30615903/article/details/80744083, https://zhuanlan.zhihu.com/p/21421729

Fix for the "No module named ..." error: https://github.com/openai/spinningup/issues/60

Courses

MIT probability: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systems-analysis-and-applied-probability-fall-2010/video-lectures/

Applied Stochastic Processes: Introduction to Probability Models

Probability in Electrical Engineering and Computer Science: An Application-Driven Course

Convex optimization and stochastic processes; CS285.

Math background: https://www.msra.cn/zh-cn/news/features/book-recommendation-machine-learning-math

Related paper: https://arxiv.org/abs/1701.08936

posted @ 2020-12-01 20:40  Ethancode