Safe RL: Constrained Policy Optimization (CPO)
Summary: Safe RL: Constrained Policy Optimization (CPO). Author: 凯鲁嘎吉 - 博客园 http://www.cnblogs.com/kailugaji/ This post walks through the formula derivations of Constrained Policy Optimization (CPO) in detail. The source literature is from
posted @ 2022-11-19 10:44 凯鲁嘎吉