Some notes on DQN

I downloaded several implementations, and only two of them ran without errors.

DQN playing FlappyBird

https://github.com/yenchenlin/DeepLearningFlappyBird

DQN playing CartPole

https://www.cnblogs.com/caorui/p/6431156.html

https://blog.csdn.net/xiewenbo/article/details/84959579

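The basic idea is as follows. In FlappyBird, the bird keeps going as long as it gets past the obstacles; the rewards are 1 and -100, and the actions are 1 and 0. In CartPole, the reward is 1 for every step the pole stays upright, so the longer it stands, the higher the score. The score is therefore really the length of time survived, not any single reward value.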
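To make the reward description above concrete, here is a minimal sketch of how a score falls out of per-step rewards. The function names are hypothetical. The notes do not say exactly when each FlappyBird reward is given, so the sketch assumes +1 per obstacle passed and -100 on the crash that ends the episode.

```python
def episode_score(rewards):
    """Undiscounted episode score: the sum of the per-step rewards."""
    return sum(rewards)


def cartpole_rewards(steps_survived):
    # Per the notes: reward 1 for every step the pole stays upright.
    return [1.0] * steps_survived


def flappy_rewards(obstacles_passed):
    # Assumption (not confirmed by the notes): +1 per obstacle passed,
    # then -100 on the crash that ends the episode.
    return [1.0] * obstacles_passed + [-100.0]


if __name__ == "__main__":
    # In CartPole the score equals the number of steps survived.
    print(episode_score(cartpole_rewards(200)))
    print(episode_score(flappy_rewards(3)))
```

With a constant reward of 1 per step, the CartPole score is just the episode length, which is why standing longer scores higher.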

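The target is the value the neural network is expected to output, that is, the value its Q-value prediction is trained toward.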
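As a rough illustration of what such a target looks like, the standard one-step Q-learning target is the reward plus the discounted best Q-value of the next state, or just the reward when the episode has ended. The sketch below assumes that form; the function name `dqn_target` and the default `gamma` are hypothetical choices, not taken from either of the linked implementations.

```python
def dqn_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target for a single transition.

    reward        -- reward received for the action taken
    next_q_values -- the network's Q-value estimates for every action
                     in the next state
    done          -- True if the transition ended the episode
    gamma         -- discount factor (assumed value, tune as needed)
    """
    if done:
        # No future reward after a terminal state.
        return reward
    # Bootstrap from the best action available in the next state.
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    print(dqn_target(1.0, [0.5, 2.0], done=False, gamma=0.9))
    print(dqn_target(-100.0, [0.5, 2.0], done=True))
```

The network is then trained so that its prediction for the action actually taken moves toward this number.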

Location of the CartPole environment in the gym package on my machine: D:\virtualenv\venv\env37\gym\envs\classic_control\cartpole.py

Keras documentation in Chinese: https://keras-cn.readthedocs.io/en/latest/

Explanation of batch_size: the number of samples fed to the neural network at a time. http://www.360doc.com/content/17/0809/15/42392246_677820641.shtml
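In DQN, batch_size usually shows up when a minibatch of stored transitions is drawn from a replay memory for each training step. The following is a minimal sketch of that idea using only the standard library. The `ReplayBuffer` class and its method names are hypothetical and are not the API of Keras or of the linked repositories.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state, done)."""

    def __init__(self, capacity=10000):
        # A deque with maxlen drops the oldest transition once full.
        self.memory = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw batch_size transitions without replacement, or everything
        # stored so far if the memory holds fewer than that.
        k = min(batch_size, len(self.memory))
        return random.sample(list(self.memory), k)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=100)
    for step in range(50):
        buf.add(step, step % 2, 1.0, step + 1, False)
    batch = buf.sample(batch_size=32)
    print(len(batch))
```

Each call to `sample` returns the group of transitions that would be fed to the network in one training step.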

This series is interesting; it explains reinforcement learning starting from installation: https://blog.csdn.net/u012465304/article/details/80888684

Zhihu column on DQN: https://zhuanlan.zhihu.com/intelligentunit

1 https://zhuanlan.zhihu.com/p/21262246

2 https://zhuanlan.zhihu.com/p/21292697

3 https://zhuanlan.zhihu.com/p/21340755

4 https://zhuanlan.zhihu.com/p/21378532

5 https://zhuanlan.zhihu.com/p/21421729

6 https://zhuanlan.zhihu.com/p/21547911

7 https://zhuanlan.zhihu.com/p/21609472

posted @ 2019-09-05 17:03 洛圣熙
