摘要:
1.Delayed, sparse reward(feedback), Long-term planning Hierarchical Deep Reinforcement Learning, Sub-goal, SAMDP, optoins, Thompson sampling, Boltzman 阅读全文
摘要:
Zahavy, Tom, Nir Ben-Zrihem, and Shie Mannor. "Graying the black box: Understanding DQNs." International Conference on Machine Learning. 2016. 这篇论文想要做 阅读全文