initial_h
https://github.com/initial-h
Posts: 187 | Articles: 0 | Comments: 37 | Views: 180,000
My Posts
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
initial_h 2024-08-06 01:17
Views: 188
Comments: 0
Recommendations: 0
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
initial_h 2024-06-11 11:15
Views: 144
Comments: 0
Recommendations: 0
Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods
initial_h 2024-05-23 13:38
Views: 507
Comments: 0
Recommendations: 0
RETROFORMER: RETROSPECTIVE LARGE LANGUAGE AGENTS WITH POLICY GRADIENT OPTIMIZATION
initial_h 2024-05-13 23:56
Views: 147
Comments: 0
Recommendations: 0
REACT: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS
initial_h 2024-05-04 23:05
Views: 324
Comments: 0
Recommendations: 0
Reflexion: Language Agents with Verbal Reinforcement Learning
initial_h 2024-04-30 11:24
Views: 479
Comments: 0
Recommendations: 0
Large Language Models Are Semi-Parametric Reinforcement Learning Agents
initial_h 2024-04-24 13:48
Views: 98
Comments: 0
Recommendations: 0
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
initial_h 2024-03-04 10:13
Views: 216
Comments: 0
Recommendations: 0
Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience
initial_h 2024-03-01 03:22
Views: 40
Comments: 0
Recommendations: 0
State Distribution-aware Sampling for Deep Q-learning
initial_h 2024-02-24 01:04
Views: 42
Comments: 0
Recommendations: 0
Large Batch Experience Replay
initial_h 2024-02-17 00:50
Views: 29
Comments: 0
Recommendations: 0
Prioritized Experience Replay
initial_h 2024-02-14 08:29
Views: 76
Comments: 0
Recommendations: 0
Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
initial_h 2024-02-11 02:46
Views: 29
Comments: 0
Recommendations: 0
Experience Replay with Likelihood-free Importance Weights
initial_h 2023-08-13 23:20
Views: 58
Comments: 0
Recommendations: 0
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling
initial_h 2023-08-12 08:00
Views: 46
Comments: 0
Recommendations: 0
About
Nickname: initial_h
Member for: 6 years 9 months
Followers: 51
Following: 2
Post Categories (353)
coding bug(3)
leetcode(2)
LLM(5)
Reinforcement Learning(175)
tools(2)
Quick paper reads(166)
Most Viewed
1. The Gumbel-Softmax Trick and the Gumbel Distribution(63670)
2. RuntimeWarning: invalid value encountered in true_divide(29657)
3. Paper notes: "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"(18496)
4. Converting vector graphics made in PowerPoint to EPS format(8582)
5. Paper notes: "Population Based Training of Neural Networks"(5665)
Most Recommended
1. The Gumbel-Softmax Trick and the Gumbel Distribution(19)
2. Paper notes: "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"(7)
3. A Parallel AlphaZero Gomoku AI(4)
4. RuntimeWarning: invalid value encountered in true_divide(2)
5. Heuristic-Guided Reinforcement Learning(1)
6. Teachable Reinforcement Learning via Advice Distillation(1)
7. Value targets in off-policy AlphaZero: a new greedy backup(1)
8. Deep Exploration via Bootstrapped DQN(1)
9. Encoding Human Domain Knowledge to Warm Start Reinforcement Learning(1)
10. MOReL: Model-Based Offline Reinforcement Learning(1)
11. Solving for Value Functions in an MDP(1)