Understanding RLHF in One Article: Reinforcement Learning from Human Feedback
posted @ 2025-03-06 16:08 ExplorerMan Views(2010) Comments(0) Recommendations(1)