Post archive for January 8, 2019 - 乐乐章

January 8, 2019

Summary: https://github.com/zle1992/Reinforcement_Learning_Game Main function

posted @ 2019-01-08 22:37 乐乐章 Views (565) Comments (0) Recommendations (0)

Summary: Given an m x n matrix, if an element is 0, set its entire row and column to 0. Do it in-place. Example 1: Input: [ [1,1,1], [1,0,1], [1,1,1] ] Output:

posted @ 2019-01-08 17:13 乐乐章 Views (120) Comments (0) Recommendations (0)
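The matrix problem summarized above can be sketched in Python as follows. This is only one possible approach, not necessarily the one used in the full post: it reuses the first row and first column as markers so that no extra matrix is allocated. The function name `set_zeroes` is an assumption made for illustration.

```python
def set_zeroes(matrix):
    """Zero every row and column that contains a 0, modifying `matrix` in place."""
    if not matrix or not matrix[0]:
        return
    rows, cols = len(matrix), len(matrix[0])

    # The first row and column will be overwritten with markers,
    # so record up front whether they contain a 0 themselves.
    first_row_zero = any(matrix[0][c] == 0 for c in range(cols))
    first_col_zero = any(matrix[r][0] == 0 for r in range(rows))

    # Mark: a 0 at (r, c) is recorded in matrix[r][0] and matrix[0][c].
    for r in range(1, rows):
        for c in range(1, cols):
            if matrix[r][c] == 0:
                matrix[r][0] = 0
                matrix[0][c] = 0

    # Clear every inner cell whose row or column is marked.
    for r in range(1, rows):
        for c in range(1, cols):
            if matrix[r][0] == 0 or matrix[0][c] == 0:
                matrix[r][c] = 0

    # Finally clear the first row and column if they originally held a 0.
    if first_row_zero:
        for c in range(cols):
            matrix[0][c] = 0
    if first_col_zero:
        for r in range(rows):
            matrix[r][0] = 0


grid = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]  # the input from Example 1
set_zeroes(grid)
print(grid)
```

Using the first row and column as markers keeps the extra space constant, at the cost of two boolean flags and a slightly more careful ordering of the passes.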

Reinforcement Learning: Q-Learning

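Summary: 1. Overview: Q-learning is a value-function-based method, unlike policy gradient methods. Q-learning estimates a value function and selects the action with the highest estimated value, whereas policy gradient predicts the action directly. Q-learning is a reinforcement learning method based on value-function estimation, Policy G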

posted @ 2019-01-08 14:46 乐乐章 Views (816) Comments (0) Recommendations (0)
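To make the value-based idea in the summary concrete, here is a minimal tabular Q-learning sketch. It is not taken from the post or the linked repository. The five-state corridor environment, the helper names (`step`, `greedy_action`, `train`), and the hyperparameters are all assumptions chosen for illustration. The agent learns a table of action values with the standard Q-learning update and then acts by picking the action with the largest estimated value.

```python
import random
from collections import defaultdict

N_STATES = 5      # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q, state):
    """Choose the action whose estimated value Q(state, action) is largest."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.5, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


q = train()
policy = [greedy_action(q, s) for s in range(N_STATES - 1)]
print(policy)  # greedy action for each non-goal state
```

The policy is never stored explicitly: it is read off the value table through `greedy_action`, which is the contrast with policy gradient methods that the summary draws.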

乐乐章

NLP / recommender systems. I'm still a beginner.
