Post archive for November 13, 2024 - penuel

November 13, 2024

Summary: 1. value iteration algorithm: value iteration was already introduced in the previous section: 1.1 policy update: 1.2 value update: at this point, \(\pi_{k+1}\) and \(v_k\) are both known 1.3 procedure summary: 1.4 example: 2. p
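The summary names the two steps of value iteration (policy update, then value update) but the formulas are cut off here. As a rough illustration only, and not the post's own code or example, the loop can be sketched as below. The two-state MDP `P`, the discount `gamma`, the tolerance `tol`, and the function name `value_iteration` are all assumptions made up for this sketch.

```python
# Minimal sketch of value iteration on a hypothetical 2-state MDP.
# P[s][a] is a list of (probability, next_state, reward) tuples.
# This toy model is an assumption for illustration, not from the post.
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],  # state 0: action 0 stays, action 1 moves to state 1
    [[(1.0, 1, 1.0)], [(1.0, 0, 0.0)]],  # state 1: action 0 stays, action 1 moves to state 0
]


def value_iteration(P, gamma=0.9, tol=1e-8, max_iters=10_000):
    n_states = len(P)
    v = [0.0] * n_states          # v_0: arbitrary initial guess
    policy = [0] * n_states
    for _ in range(max_iters):
        # Action values q_k(s, a) computed from the current estimate v_k.
        q = [
            [sum(p * (r + gamma * v[s2]) for p, s2, r in P[s][a])
             for a in range(len(P[s]))]
            for s in range(n_states)
        ]
        # Policy update: pi_{k+1} is greedy with respect to q_k.
        policy = [q[s].index(max(q[s])) for s in range(n_states)]
        # Value update: pi_{k+1} and v_k are both known, so v_{k+1}
        # is simply the action value of the greedy action.
        v_new = [q[s][policy[s]] for s in range(n_states)]
        delta = max(abs(a - b) for a, b in zip(v_new, v))
        v = v_new
        if delta < tol:           # stop once the update is negligible
            break
    return v, policy


v, policy = value_iteration(P)
```

In this toy model the greedy policy moves from state 0 to state 1 and then stays there, so both state values approach 1 / (1 - 0.9) = 10.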

posted @ 2024-11-13 11:12 penuel
