Archive: May 2023

Basic Concepts of Reinforcement Learning

Summary: probability density function; expectation; state s; action a; agent; policy π(a|s); reward r; state transition p(s'|s,a); return (cumulative future reward); discounted return (γ …

posted @ 2023-05-09 17:26 阿Qi早起了吗

